One of the things that I find I have to teach data scientists and ML researchers almost universally is how to test their own code. Too often it’s all about testing the results and not enough about the code. I’ve been saying for a while that a lack of proper testing can trip you up, and recently we saw a paper that rippled through academia about a “bug” in some code that everyone used…
A Code Glitch May Have Caused Errors In More Than 100 Published Studies
The short version of this is that back in 2014, a Python protocol was released for calculating molecule structure through NMR shifts, and many other labs have been using this script over the past 5 years.
After my introductory post on Literate Programming, it occurred to me that while the concept of being able to create documentation that includes variables from the code being run is amazing, this will obviously have some impact on performance. At best, this would be the resource required to compile the document as if it were static, while the “at worst” scenario is conceptually unbounded. Somewhere along the way, pweave is adding extra code to pass the variables back and forth between the Python and the LaTeX. How and when it does this could have implications that you wouldn’t see in a simple example but could be catastrophic when running the kind of neural nets that my department are putting together. So, being a scientist, I decided to run a few experiments… Continue reading Literate programming – effect on performance
Over my career in IT there have been a lot of changes in documentation practices, from the heavy detailed design up front to lean and now the adoption of literate programming, particularly in research (and somewhat contained to it because of the reliance on LaTeX as a markup language). While there are plenty of getting started guides out there, this post is primarily about why I’m adopting it for my new Science and Innovations department and the benefits that literate programming can give. Continue reading Using Literate Programming in Research
Two months ago I hadn’t looked at a line of Python code – it was never a requirement when I was a developer, and as I moved into management I worked with teams and projects using everything from C and COBOL through LAMP to .NET, while Python sat on the periphery. I’d always considered it to be a modern BASIC – something you did to learn how to code or for a quick prototype but not something to be taken seriously in a professional environment.
I’ve always believed that really good programmers understand the boundaries and strengths of multiple languages, and are able to choose the right tool for the job and find the correct compromise for consistency and maintainability. People like this are really hard to find, although I do tend to veer away from individuals who can only evangelise a single language and say all the others are rubbish. Due to the projects I’ve been involved with, Python ability has been irrelevant and never considered part of that toolbox. Continue reading Python: serious language or just for beginners?