Over the weekend, I was clearing out some old paperwork and found the notes from one of the assessed practical sessions at university. Although I was studying biochemistry, an understanding of basic programming was essential, with many extra optional uncredited courses available. It was a simple chemical reactions task and we could use BASIC or Fortran to code the solution. I’d been coding in BASIC since I was a child, so I decided to go for the Fortran option; after all, what’s the point in doing something easy… Continue reading Learning Fortran – a blast from the past
After my introductory post on Literate Programming, it occurred to me that while the concept of being able to create documentation that includes variables from the code being run is amazing, this will obviously have some impact on performance. At best, this would be the resource required to compile the document as if it were static, while the “at worst” scenario is conceptually unbounded. Somewhere along the way, pweave is adding extra code to pass the variables back and forth between the Python and the LaTeX; how and when it does this could have implications that you wouldn’t see in a simple example but could be catastrophic when running the kind of neural nets that my department are putting together. So, being a scientist, I decided to run a few experiments… Continue reading Literate programming – effect on performance
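For readers who haven’t seen it, the weaving being measured looks something like the sketch below: a minimal pweave source file mixing LaTeX prose with an executed Python chunk. The chunk name, the toy calculation and the exact option spelling are my own illustrative assumptions, not taken from the post.

```
\documentclass{article}
\begin{document}

% The chunk below is executed by pweave when the document is woven,
% and its variables remain available to later chunks and inline code.
<<rate_constant, echo=True>>=
# toy decay-constant calculation (hypothetical example)
k = 0.693 / 5730
@

The computed value can then appear inline in the prose,
e.g. via <%= round(k, 8) %>, which is where the extra
variable-passing machinery (and its overhead) comes in.

\end{document}
```

It is exactly this round-trip between the Python session and the document that the performance experiments above set out to quantify.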
Over my career in IT there have been a lot of changes in documentation practices, from the heavy detailed design up front, to lean, and now the adoption of literate programming, particularly in research (and somewhat confined to it because of the reliance on LaTeX as a markup language). While there are plenty of getting started guides out there, this post is primarily about why I’m adopting it for my new Science and Innovations department and the benefits that literate programming can give. Continue reading Using Literate Programming in Research
As part of a few hours catching up on machine learning conference videos, I found this great talk on what can go wrong with machine recommendations and how testing can be improved. Evan gives some great examples of where the algorithms can give unintended outputs. In some cases this is an emergent property of correlations in the data, and in others, it’s down to missing examples in the test set. While the talk is entertaining and shows some very important examples, it made me realise something that has been missing. The machine learning community, having grown out of academia, does not have the same rigour in its development processes as standard software engineering.
Regardless of the type of machine learning employed, testing and quantification of results come down to the test set that is used. The focal points are accuracy against the test set, simpler architectures and fewer training examples. If the test set is lacking, this is not uncovered as part of the process. Models that show high precision and recall can often fail when “real” data is passed through them, sometimes in spectacular ways, as outlined in Evan’s talk: adjusting pricing for Mac users, Amazon recommending inappropriate products or Facebook’s misclassification of people. These problems are either solved with manual hacks after the algorithms have run or by adding the specific issues to the test set. While there are businesses that take the same approach with their software, they are thankfully few and far between, and most companies now use some form of continuous integration, automated testing and then rigorous manual testing.
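To make the point concrete, here is a minimal sketch (with an entirely made-up toy “model” and data) of how precision and recall computed on a curated test set can look perfect, then collapse on broader real-world inputs that the test set never sampled:

```python
def precision_recall(predictions, labels):
    """Standard precision/recall from paired predictions and true labels."""
    tp = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(predictions, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(predictions, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def model(x):
    # Toy "model": only recognises the positives its developers thought of.
    return 1 if x in {"cat", "dog"} else 0

# Curated test set: drawn from the same assumptions as the model.
curated = [("cat", 1), ("dog", 1), ("car", 0)]
# Real-world data: contains positive cases the test set never covered.
real = [("cat", 1), ("fox", 1), ("wolf", 1), ("car", 0)]

for name, data in [("curated", curated), ("real", real)]:
    preds = [model(x) for x, _ in data]
    labels = [y for _, y in data]
    print(name, precision_recall(preds, labels))
# curated scores a perfect (1.0, 1.0); on real data, recall drops to 1/3.
```

The metrics themselves are computed correctly in both cases; it is the composition of the test set that hides the failure, which is exactly why a lacking test set is never uncovered by the process itself.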
The only part of this process that will truly solve the problem is the addition of rigorous manual testing by professional testers. Good testers are very hard to find, in some respects harder to find than good developers. Testing is often seen as a second-class profession next to development, and I feel this is really unfair. Good testers understand the application they are testing on multiple levels, create the automated functional tests and make sure that everything you expect to work, works. But they also know how to stress an application: push it well beyond what it was designed to do, just to see whether these cases will be handled and which of the assumptions made can be challenged. A really good tester will see deficiencies in test sets, think “what happens if…” and sneak the bizarre examples in for the challenge.
One of the most difficult things about being a tester in the machine learning space is that, in order to understand all the ways in which things can go wrong, you do need some appreciation of how the underlying system works, rather than treating it as a complete black box. Knowing that most vision networks look for edges would prompt a good tester to throw in random patterns, from animal prints to static noise. A good tester would look for examples not covered by the test set and make sure the negatives far outweigh the samples the developers used to create the model.
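That tester’s mindset can be sketched in a few lines. The `classify` function below is a deliberately crude stand-in for an edge-sensitive vision model (everything here is hypothetical, not any real network); the point is the probing inputs, structureless noise and a high-contrast stripe pattern, which a purely edge-driven detector will happily fire on:

```python
import random

def classify(pixels):
    # Stand-in "model": scores an image by how many sharp brightness
    # jumps it contains, mimicking an edge-obsessed vision network.
    jumps = sum(abs(a - b) > 128 for a, b in zip(pixels, pixels[1:]))
    return min(1.0, jumps / len(pixels))

random.seed(42)
# Static noise: random brightness values with no real structure.
noise = [random.randint(0, 255) for _ in range(100)]
# Animal-print-like stripes: maximal edges, no meaningful content.
stripes = [0, 255] * 50

for name, image in [("noise", noise), ("stripes", stripes)]:
    # A confident score on structureless input is exactly the kind of
    # deficiency a good tester goes looking for.
    print(name, "confidence:", round(classify(image), 2))
```

The stripe pattern scores near-maximal confidence despite containing nothing to recognise, which is the sort of “what happens if…” result that never surfaces when only the curated test set is run.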
So where are all the specialist testers for machine learning? I think the industry really needs them before we have (any more) decision engines in our lives that have hidden issues waiting to emerge…
After somewhat mixed reviews of last week’s episode I was interested to see whether episode 2 of Girls Can Code had any more emphasis on the coding side.
It started with a comment that the girls were building a tech business rather than actually learning to code themselves. This acknowledged one of the major criticisms of the show – that it had little to do with coding. While I’m sure that this was all fixed months ago, I did wonder if the voice-over for the start of the show had been re-recorded after the response to the first episode. Continue reading Girls Can Code – episode 2 thoughts
As part of the BBC’s Make it Digital season, there was a great programme on BBC3 showcasing that “Girls Can Code”. Such a shame it was on a minor channel at 9pm rather than BBC1 or BBC2 earlier. However, BBC3 is aimed at the youth market, so I’m hoping that enough young women watched it to be inspired.
If you missed it, it’s available on iPlayer (UK only) for the next month, with the second episode next Monday.
This isn’t The Apprentice – they don’t need to crush each other to get ahead – Alice Levine