If you’ve been following this blog, you’ll know that I’ve started a new role that requires me to build a deep learning system, and I’ve been catching up on the 10+ years of research since I completed my PhD. With a background in computing and mathematics, I jumped straight into what I thought would be a skim through the literature. I soon realised it would be better all round to go back to first principles rather than be constrained by the methods I had learned over a decade ago.
So, I found a lot of universities that have put their machine learning courses online, and I’ve decided to work through what’s out there as if I were an undergraduate, then use my experience to build on top of that. I don’t want to miss an advantage because I wasn’t aware of it.
So I picked up two key tutorials from different professors:
- CS281: Advanced Machine Learning – this requires a good understanding of mathematics and computing
- Deep Learning from Stanford – this requires an understanding of machine learning principles
So I purchased the one mandatory book for the machine learning course: Machine Learning: A Probabilistic Perspective by Kevin Murphy. I bought it for my Kindle, although the reader I have (the old non-touchscreen version) didn’t support it, and neither did the reading app on my Nexus 7, which was a shame. However, my little Android mobile and the desktop app handled it with no problem. Not a huge deal, as with a widescreen laptop I have enough space for the book and a working area, but something to be aware of.
The book itself is fantastic and will give you a great primer. However, you will also need to install the following to be able to do all the exercises:
- Matlab: this will require a licence, the cost of which depends on whether you are in industry or a student. There is a GNU equivalent, Octave, which is free and covers *most* of Matlab’s features, but not all, so not all of the examples in the book will be possible. If you’ve never used Matlab before, please take time to read the introduction, if only to familiarise yourself with the interface. This will also give you a kick to make sure your maths skills are up to scratch 😉
- PMTK3: this is freely available on GitHub and provides the examples and code for the exercises in the book. Just download it and follow the install instructions.
Okay, so you work your way through Chapter 1 quite happily and get to the exercises. The first exercise is simple enough, but then you reach Exercise 1.2, which asks you to download and use FLANN to explore speed improvements for nearest-neighbour search. So off you go to the link, download and unzip, and then realise that, unlike the PMTK3 library, you’ll have to build this before you can use it. Then you realise that you need Visual Studio 🙂
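For anyone wondering what FLANN is actually speeding up: it builds approximate index structures over the data so you don’t have to compare a query against every point. As a rough illustration (in Python rather than the book’s Matlab, with my own function names), here is the brute-force linear scan that is the baseline FLANN improves on:

```python
# A minimal sketch of brute-force nearest-neighbour search -- the O(n)
# per-query baseline that libraries like FLANN accelerate with
# approximate index structures (kd-trees, hierarchical k-means, etc.).
import numpy as np

def nearest_neighbour(points, query):
    """Return the index of the row of `points` closest to `query`."""
    # Squared Euclidean distance from the query to every point.
    dists = np.sum((points - query) ** 2, axis=1)
    return int(np.argmin(dists))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.standard_normal((1000, 2))
    query = np.array([0.1, -0.2])
    idx = nearest_neighbour(points, query)
    print(idx, points[idx])
```

On a thousand 2-D points this is instant; the exercise is about what happens when you have millions of high-dimensional points and many queries, which is where the exact scan becomes the bottleneck.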
I was taught to code in vi and have never felt the need for an IDE. When I’ve used Visual Studio in the past I haven’t really developed any faster, and I don’t seem to be slower than my teams… So I downloaded all 10 GB of VS goodness, just to build the project (yes, I know I could have trimmed that down, but I suspected I’d need it all going forward anyway 😉). (Update: you can get prebuilt versions here for a quick start.)
So, in order to read a book, I had to download two major software packages, an app, and two extension modules before I’d barely got started!
My one criticism of the book so far: if you are a bit rusty with mathematics, particularly statistics, you will want to find a better refresher than Chapter 2. Having recently completed the OU’s M140 Introduction to Statistics, I found all of the material fresh in my head, although the different order of presentation and terminology could easily trip up the less confident rather than the less able. I’m hoping to finish this and get at least part way through the Deep Learning tutorial before I go to Boston next week.
All in all, if you are new to machine learning, I would recommend the machine learning book and the two courses as a primer before you start diving into the academic literature – it’s a relatively small time investment for the return in knowledge.
Update: I’ve just found this excellent introduction to how to build a neural network from first principles in Python – far better than some of the heavier introductions above: http://iamtrask.github.io/2015/07/12/basic-python-network/
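The linked tutorial builds a tiny network with nothing but NumPy. The sketch below is in the same spirit (my own variable names and data, not the tutorial’s exact code): a single weight layer with a sigmoid activation, trained by full-batch gradient descent to learn a mapping where the output is simply the first input column.

```python
# A minimal "from first principles" neural network sketch: one weight
# layer, sigmoid activation, full-batch gradient descent.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy training data: the target is just the first input column.
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)
y = np.array([[0, 0, 1, 1]], dtype=float).T

rng = np.random.default_rng(1)
weights = 2 * rng.random((3, 1)) - 1  # random weights in [-1, 1)

for _ in range(10000):
    output = sigmoid(X @ weights)  # forward pass
    error = y - output             # how far off are we?
    # Backpropagate: the derivative of the sigmoid is output * (1 - output),
    # so scale the error by it and push the update back through the inputs.
    weights += X.T @ (error * output * (1 - output))

print(output.round(2))  # predictions should end up close to y
```

After training, the rounded predictions match the targets; working through why the weight-update line is the gradient of the squared error is most of the value of the exercise.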