## ImageNet in 4 Minutes? What the paper really shows us

ImageNet has been a deep learning benchmark data set since it was created.  It was the competition that showed that DL networks could outperform non-ML techniques and it’s been used by academics as a standard for testing new image classification systems.  A few days ago an exciting paper was published on arXiv for training ImageNet in four minutes.  Not weeks, days or hours but minutes.  This is on the surface a great leap forward but it’s important to dig beneath the surface.  The Register subheadline says all you need to know:

So if you don’t have a relaxed view on accuracy or thousands of GPUs lying around, what’s the point? Is there anything that can be taken from this paper?

## Source Code Control for Data Scientists

I work with many people who are recently out of academia. While they know how to code and are experts in their fields, they are lacking some of the rigour of computer science that experienced developers have. In addition to not understanding the problems of data in the wider world or how to test their solutions properly, they are also unaware of the importance of source code control and deployment. This is another aspect missing from these courses – you cannot exist as a professional developer without it. While there are many source control setups, I’m most familiar with git.

I’ve recently written a how-to guide for my team and was going to make that the focus of this post, although I’ve seen some very good guides out there that are more generic, so I’d like to explain why source code control is important and then give you the tools to learn this yourself.

## Learning BASIC – blast from the past

Back in those heady pre-internet days, if you wanted to learn something that you weren’t taught at school, it pretty much meant a trip to the library.  I was pretty lucky, if I wanted a book and there was even a hint of anything educational in it, then it was bought for me.

I was further fortunate in that with a teacher as a parent, I had access to the Acorn Archimedes and BBC computers as they were rolled out to schools for the entirety of the school holidays.  There was one rule: if you want to play games, write them yourself.  While rose-tinted memory has me at the tender age of 7 fist-pumping and saying “challenge accepted”,  I’m sure there was much more complaint involved, but I’m glad that I was encouraged.

## Why are data scientists so bad at science?

It’s rare that I am intentionally provocative in my post titles, but I’d really like you to think about this one. I’ve known and worked with a lot of people who work with data over the years, many of whom call themselves data scientists and many who do the role of a data scientist but by another name. One thing that worries me when they talk about their work is an absence of scientific rigour. This is a huge problem, and one I’ve talked about before.

The results that data scientists produce are becoming increasingly important in our lives; from determining what adverts we see to how we are treated by financial institutions or governments. These results can have direct impact on people’s lives and we have a moral and ethical obligation to ensure that they are correct.

## Learning Fortran – a blast from the past

Over the weekend, I was clearing out some old paperwork and I found the notes from one of the assessed practical sessions at University.  Although I was studying biochemistry, an understanding of basic programming was essential, with many extra optional uncredited courses available.  It was a simple chemical reactions task and we could use BASIC or Fortran to code the solution.  I’d been coding in BASIC since I was a child so decided to go for the Fortran option as what’s the point in doing something easy…

## Literate programming – effect on performance

After my introductory post on Literate Programming, it occurred to me that while the concept of being able to create documentation that includes variables from the code being run is amazing, this will obviously have some impact on performance.  At best, this would be the resource required to compile the $\LaTeX$ document as if it were static, while the “at worst” scenario is conceptually unbounded.  Somewhere along the way, pweave is adding extra code to pass the variables back and forth between the Python and the $\LaTeX$.  How and when it does this could have implications that you wouldn’t see in a simple example but could be catastrophic when running the kind of neural nets that my department are putting together. So, being a scientist, I decided to run a few experiments…

## Using Literate Programming in Research

Over my career in IT there have been a lot of changes in documentation practices, from the heavy detailed design up front to lean and now the adoption of literate programming, particularly in research (and somewhat contained to it because of the reliance on $\LaTeX$ as a markup language).  While there are plenty of getting started guides out there, this post is primarily about why I’m adopting it for my new Science and Innovations department and the benefits that literate programming can give.

## Testing applications

As part of a few hours catching up on machine learning conference videos, I found this great talk on what can go wrong with machine recommendations and how testing can be improved.  Evan gives some great examples of where the algorithms can give unintended outputs.  In some cases this is an emergent property of correlations in the data, and in others, it’s down to missing examples in the test set.  While the talk is entertaining and shows some very important examples, it made me realise something that has been missing.  The machine learning community, having grown out of academia, does not have the same rigour in its development processes as standard software development.

Regardless of the type of machine learning employed, testing and quantification of results both come down to the test set that is used.  Accuracy against the test set, simpler architectures and fewer training examples are the focal points.  If the test set is lacking, this is not uncovered as part of the process.  Models that show high precision and recall can often fail when “real” data is passed through them, sometimes in spectacular ways as outlined in Evan’s talk:  adjusting pricing for Mac users, Amazon recommending inappropriate products or Facebook’s misclassification of people.  These problems are solved either with manual hacks after the algorithms have run or by adding specific issues to the test set.  While there are businesses that take the same approach with their software, they are thankfully few and far between and most companies now use some form of continuous integration, automated testing and then rigorous manual testing.
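To make the gap between test-set metrics and “real” data concrete, here is a minimal sketch of how precision and recall are calculated.  The labels are made up purely for illustration and are not results from any real model; `precision_recall` is a hypothetical helper name.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative, invented labels: the model looks perfect on the curated test set...
test_true = [1, 1, 1, 1, 0, 0, 0, 0]
test_pred = [1, 1, 1, 1, 0, 0, 0, 0]

# ...but makes mistakes on inputs the test set never covered.
field_true = [1, 1, 1, 1, 0, 0, 0, 0]
field_pred = [1, 0, 0, 1, 1, 1, 0, 0]

print("test set:", precision_recall(test_true, test_pred))
print("in the field:", precision_recall(field_true, field_pred))
```

Both numbers are only as trustworthy as the examples they are measured against, which is why the composition of the test set matters so much.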

The only part of this process that will truly solve the problem is the addition of rigorous manual testing by professional testers.  Good testers are very hard to find, in some respects harder to find than good developers.  Testing is often seen as a second-class profession to development and I feel this is really unfair.  Good testers can understand the application they are testing on multiple levels, create the automated functional tests and make sure that everything you expect to work, works.  But they also know how to stress an application – push it well beyond what it was designed to do, just to see whether these cases will be handled.  What assumptions were made that can be challenged?  A really good tester will see deficiencies in test sets and think “what happens if…”, and they’ll sneak the bizarre examples in for the challenge.

One of the most difficult things about being a tester in the machine learning space is that in order to understand all the ways in which things can go wrong, you do need some appreciation of how the underlying system works, rather than treating it as a complete black box.  Knowing that most vision networks look for edges would prompt a good tester to throw in random patterns, from animal prints to static noise.  A good tester would look for examples not covered by the test set and make sure the negatives far outweigh the samples the developers used to create the model.
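The static noise idea could be sketched as a small test harness like the one below.  This is only an illustration under stated assumptions: `stub_classifier` is a hypothetical stand-in for a real model, the “images” are plain nested lists of pixel values, and the 0.8 confidence threshold is an arbitrary choice.

```python
import random


def static_noise(width, height, rng):
    """Return a greyscale 'image' of uniform random pixels in [0, 1)."""
    return [[rng.random() for _ in range(width)] for _ in range(height)]


def stub_classifier(image):
    """Hypothetical stand-in for a real model: returns {label: confidence}.

    It keys only on mean brightness, so it is always confident, even on noise.
    """
    mean = sum(sum(row) for row in image) / (len(image) * len(image[0]))
    cat = 0.9 if mean > 0.5 else 0.1
    return {"cat": cat, "dog": 1.0 - cat}


def confident_on_noise(classify, n_images=20, size=8, threshold=0.8, seed=0):
    """Feed pure noise to `classify` and collect every confident prediction.

    A well-behaved model should not be sure that static is a cat or a dog,
    so each entry in the returned list is a case for a tester to investigate.
    """
    rng = random.Random(seed)  # fixed seed keeps the test repeatable
    failures = []
    for i in range(n_images):
        scores = classify(static_noise(size, size, rng))
        label, confidence = max(scores.items(), key=lambda kv: kv[1])
        if confidence >= threshold:
            failures.append((i, label, confidence))
    return failures


failures = confident_on_noise(stub_classifier)
print(f"{len(failures)} of 20 noise images were classified with high confidence")
```

The same harness could be pointed at animal prints or any other pattern the developers did not think of; the part that matters is that the tester, not the model’s author, chooses the inputs.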

So where are all the specialist testers for machine learning?  I think the industry really needs them before we have (any more) decision engines in our lives that have hidden issues waiting to emerge…

## Girls Can Code – episode 2 thoughts

After somewhat mixed reviews of last week’s episode I was interested to see whether episode 2 of Girls Can Code had any more emphasis on the coding side.

It started with a comment that the girls were building a tech business rather than actually learning to code themselves.  This justified one of the major criticisms of the show – that it was nothing about coding.  While I’m sure that this was all fixed months ago, I did wonder if the voice over for the start of the show had been rerecorded after the response to the first episode.

## Girls Can Code – episode 1 thoughts

As part of the BBC’s Make it Digital season, there was a great program on BBC3 showcasing that “Girls Can Code”.  Such a shame it was on a minor channel at 9pm rather than BBC1 or BBC2 earlier.  However, BBC3 is aimed at the youth market so I’m hoping that enough young women watched it to be inspired.

If you missed it, it’s available on iPlayer (UK only) for the next month, with the second episode next Monday.

This isn’t the apprentice – they don’t need to crush each other to get ahead – Alice Levine