Data Science Courses – the missing skills you need

One of my complaints about many data science master’s courses is that they miss a lot of the basic skills that are essential for being effective in a business situation. It’s one of the things I was going to talk about at the Women in AI event that was postponed this week, and I’m more than happy to work with universities that want to help build a course. That said, some universities are realising this is missing and adding it as optional courses.


A diagnostic tale of docker

[Image: a twenty-sided die showing common excuses developers give for not fixing problems; the top face shows "Can't reproduce". Caption: Developer d20 gives the answer 🙂 (from Pretend Store)]

If you’ve been to any of my technical talks over the past year or so then you’ll know I’m a huge advocate for running AI models as API services within Docker containers and using services like cloud formation to give scalability. One of the issues with this is that when you get problems in production they can be difficult to trace. Methodical diagnosis of code rather than data is a skill that is not that common in the AI community and something that comes with experience. Here’s a breakdown of one of these types of problems, the diagnostics used to find the cause and the eventual fix, all of which you’re going to need to know if you want to use these types of services.


Mathematics of player levels in game development

My husband is a game developer and my contributions are usually of the sort where I look at what he’s done and say “hey wouldn’t it be great if it did this”. While these are usually positive ideas, they’re mostly a pain to code in. Today however, I was able to contribute some of my maths knowledge to help balance out one of his games.

Using an open API, he’d written a simple Pokémon battle game to be used on Twitch by one of our favourite streamers, FederalGhosts. He needed a way of determining a player’s level from their number of wins, and the number of wins required to reach the next level, without recursion. While this post is specifically about the win-to-level relationship, you could use any progression statistic by applying scaling. Here we want to determine:

  • Number of wins (w) required for a given level (l)
  • The current player level (pl) given a number of wins (pw)
  • Wins remaining to the next level (wr) for a player based on current wins (pw)

Let’s take a look at a few ways of doing this. Each section below has the equations and code examples in Python. Assume all code samples have the following at the top:

import math

# Stand-in for the real player store: one record per player with their win count
database = [
    {"name": "player1", "wins": 5},
    {"name": "player2", "wins": 15},
    {"name": "player3", "wins": 25},
]
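
As an illustration of the three quantities listed above, here is a minimal sketch that assumes a quadratic progression curve, w = b(l - 1)^2, which inverts in closed form to pl = floor(sqrt(pw / b)) + 1, so no recursion is needed. The curve, the constant BASE_WINS and the function names are all assumptions made for this example and are not taken from the game itself. It needs Python 3.8 or later for math.isqrt.

```python
import math

# Hypothetical tuning constant (b): wins needed to go from level 1 to level 2.
BASE_WINS = 5

database = [
    {"name": "player1", "wins": 5},
    {"name": "player2", "wins": 15},
    {"name": "player3", "wins": 25},
]


def wins_for_level(level: int) -> int:
    """Wins (w) required to reach a given level (l), assuming w = b * (l - 1)^2."""
    return BASE_WINS * (level - 1) ** 2


def level_for_wins(wins: int) -> int:
    """Current level (pl) for a win count (pw), by inverting the curve directly.

    Integer arithmetic (floor division, then integer square root) avoids
    floating-point rounding at the level boundaries.
    """
    return math.isqrt(wins // BASE_WINS) + 1


def wins_remaining(wins: int) -> int:
    """Wins remaining (wr) until the next level is reached."""
    return wins_for_level(level_for_wins(wins) + 1) - wins


for player in database:
    pw = player["wins"]
    print(player["name"], "level", level_for_wins(pw), "needs", wins_remaining(pw), "more wins")
```

With these assumed numbers, player1 on 5 wins is level 2 and needs 15 more wins to reach level 3. Any curve that can be inverted algebraically would work the same way; the quadratic is only the simplest one that makes each level take longer than the last.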

Data: access and ethics

Last week I attended two events back to back discussing all things data, but from different angles. The first, Open Data, hosted by the Economist, looked at how businesses want to use data and the ethical (legal) means by which they can acquire it. The second was a round-table discussion among practitioners, hosted by Ammonite Data, which I chaired. There we mainly focussed on the need for compliance and on balancing protection of personal data with the access that our companies need in order to do business effectively.

We’re in a world driven by data. If you don’t have data then you can’t compete. While individuals are getting more protective over their data and understanding its value, businesses are increasingly wanting access to more and more – at what point does legitimate interest or consumer need cross the line?


Facebook’s Maths Solving AI

In December, Lample and Charton from Facebook’s Artificial Intelligence Research group published a paper stating that they had created an AI application that outperformed systems such as Matlab and Mathematica when presented with complex equations. Is this a huge leap forward or just an obvious extension of maths solving systems that have been around for years? Let’s take a look.


Let’s talk about testing

One of the things that I find I have to teach data scientists and ML researchers almost universally is understanding how to test their own code. Too often it’s all about testing the results and not enough about the code. I’ve been saying for a while that a lack of proper testing can trip you up and recently we saw a paper that rippled through academia about a “bug” in some code that everyone used…

A Code Glitch May Have Caused Errors In More Than 100 Published Studies

https://www.vice.com/en_us/article/zmjwda/a-code-glitch-may-have-caused-errors-in-more-than-100-published-studies

The short version of this is that back in 2014, a Python protocol was released for calculating molecule structure through NMR shifts, and many other labs have been using this script over the past five years.
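
To make “testing the code, not just the results” concrete, here is a minimal sketch of a unit test with a hand-checkable input. The function load_values and the failure mode it guards against (output that silently depends on the order in which the operating system lists files) are hypothetical illustrations chosen for this example; they are not taken from the script discussed above.

```python
import tempfile
from pathlib import Path


def load_values(directory):
    """Return the numbers stored in the directory's *.txt files, in file-name order."""
    # sorted() makes the order explicit instead of relying on the OS listing order.
    paths = sorted(Path(directory).glob("*.txt"))
    return [float(p.read_text()) for p in paths]


def test_load_values_is_ordered_by_name():
    with tempfile.TemporaryDirectory() as tmp:
        # Create the files in a deliberately non-alphabetical order.
        for name, value in [("b.txt", "2.0"), ("c.txt", "3.0"), ("a.txt", "1.0")]:
            (Path(tmp) / name).write_text(value)
        # The expected answer is known in advance, independent of any model output.
        assert load_values(tmp) == [1.0, 2.0, 3.0]


test_load_values_is_ordered_by_name()
```

The point of a test like this is that the expected answer is worked out by hand before the code runs, so a wrong result cannot hide behind a plausible-looking downstream number.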


Starting your first AI project – a guide for businesses

I’ve been speaking at several events recently, giving practical advice on getting started with AI projects. There is a huge chasm between the high-level inspirational business pieces on all the usual sites that business leaders read and the “getting started in AI” guides that pretty much start with installing TensorFlow. There was nothing aimed at the non-AI CTO who didn’t want to fall behind. Nothing to indicate to them how to start a project, what talent they’d need or even which problems to start with. Sure, there are a lot of expensive consulting companies out there, but this knowledge shouldn’t be hidden.

This time last year, I sat down with David Kelnar of MMC Ventures and we talked about why so many AI projects don’t succeed.  He asked me to contribute some ideas to be included in the new State of AI report for 2019, to which I gladly agreed.  It soon became clear that to do this justice, it was more than just a chapter, and the MMC AI Playbook was born, which we recently launched.  Contributing to this amazing publication took a lot of time and research, and this blog was the thing that had to give.

If you are trying to find the right time to start your first project and need help on where to begin, please take a look at the playbook.  Here’s a taster, based on talks I gave at Austin Fraser’s #LeadersInTech event and the Barclays AI Frenzy event both in July 2019.


A year of Apple Watch addiction and motivation

In early 2017 I got an Apple Watch. I wasn’t fussed about them at the time as I never normally wear a watch of any sort. But when my husband didn’t want his any more, I thought I’d give it a go. A few months later and I was addicted. While I used the word lightly at the time, what really worked for me were the regular achievements and challenges. It was the same thing that got me hooked into World of Warcraft many years ago, and I know that if I do something, I throw everything at it, but once I can’t complete a challenge I usually drop it. After my initial post about the watch I found myself in a situation where I couldn’t achieve the challenges. Towards the end of 2017 I had a few too many days in front of the computer with work and just wasn’t active. What I noticed was that as soon as I missed a day of activity, and thus couldn’t get a “perfect month” achievement, I stopped even trying to be active until the start of the next month. If there was no reward, even a completely irrelevant badge in an app, then why try… Long-term health benefits don’t give the same level of accomplishment in the short term for most people, myself included. So after a particularly gluttonous December 2017 I made myself a promise.

Agile Data Science: your data point is probably an outlier

It’s not often that I feel the need to write a reactionary post as mainly the things that tend to inflame me are usually by design.  However today I read something on LinkedIn that caused a polarisation in debate within a group of people who should really appreciate learning from different data: Data Scientists.

What was interesting was how the responses fell neatly into one of two camps: the first praising the poster for speaking out and saying this, supported by nearly an order of magnitude more likes than the total number of comments, and the second disagreeing and pointing out that it can work. What has been lost in this was that “can” is not synonymous with “always” – it really needs a good team and better explanation than many companies sometimes use. What irked me most about the whole thread was the accusation that people doing data science with agile obviously “didn’t understand what science was”. I hate these sweeping generalisations and I really do expect a higher standard of debate from anyone with either “data” or “science” anywhere near their profile.

ReWork Deep Learning London September 2018 part 3

This is part 3 of my summary of ReWork Deep Learning London September 2018. Part 1 can be found here, and part 2 here.

Day 2 of ReWork started with some fast start-up pitches. Due to a meeting at the office I missed all of these and only arrived at the first coffee break. So if you want to check out what 3D Industries, Selerio, DeepZen, Peculium and PipelineAI are doing, check their websites.