Conference season online

October has always been a super busy month for me. I’m usually starting a new OU module and travelling around speaking at conferences and meetups, all while doing my day job, spending time with my family and enjoying my hobbies. Sometimes I haven’t got the balance right! 2019, I remember, was particularly hectic. I optimistically submitted conference sessions at the start of the year on a variety of topics and, as the year went on, I was invited to speak at various meetups in the UK and even stepped in to do some last-minute presentations where other speakers had dropped out. This time last year I had just finished an eight-week stretch in which I took a week’s holiday, spoke at 5 conferences and 2 breakfast briefings, and presented at 8 meetups, all on slightly different topics!

I really enjoy speaking at these events, otherwise I simply wouldn’t do them! As an attendee I get to learn from my peers and be inspired by steps forward in areas that I just don’t have time to keep up to date on. As a speaker, I get to pass on some of the things I’ve learned over the years in what I hope is an entertaining way, and I always love the conversations after the talks.

This year has, inevitably, been very different. February was my first event, where I spoke at The European Information Security Summit in London on the risks that Deep Fakes pose to the security sector. I spoke to a lot of security professionals at the event who were unaware of how AI was progressing in both voice and face cloning. As an attendee, I learned a lot about the state of security in many of the systems we take for granted. If you can justify the time and cost, attending a conference outside of your area of expertise can be incredibly informative.

A mere few weeks later, I had several sessions at Big Data and AI World. I sat on a panel with the amazing Sue Daley and Vitaliy Yuryev on why the basics are often overlooked in data projects, followed a few hours later by my main presentation on learning from projects that go wrong. This was the 12th of March. While the organisers were doing everything they could, practically all the international speakers and attendees had decided not to attend. The sessions were reorganised to prevent large gaps in the programme, and many of the sessions I had personally been looking forward to were no longer happening. I really enjoyed both my sessions and got some great questions afterwards, but it was clear that people were nervous about the crowds, and that conferences and meetups would be on hold from that point onwards.

As I headed home on the train that afternoon I knew I wouldn’t be back in London for a while. My company was considering a trial of homeworking for a few days a week1, but I’d already decided to swap to home working for the foreseeable future and told my team to do the same if they wanted. My team had been at the conference with me and I didn’t realise then that it would be the last time I’d see them2.

March and April would normally be the time that I’d be submitting keynote suggestions for the autumn conference season and spending my evenings talking to university students at meetups, and I really missed those interactions.

While I was interviewed over the summer (Humans of AI, Agile Data Science), I really did miss the chance to interact with a wider audience. You can’t respond to questions in a pre-recorded video.

I was delighted when Barclays Eagle Labs asked me if I would rerun a talk on Deep Fakes, which I had given in person late in 2019, as a series of three online events. Despite the strangeness of talking into a camera without the feedback of the audience’s faces, and the ever-present anxiety that one of my neighbours would start noisily doing DIY during the session3, it was great to see so many people take 30 minutes out of their day for three consecutive weeks to learn. After the final session, I got a lot of messages from people who had made their own fakes, really understood both the positive and negative aspects of the technology, and thanked me for making it accessible. It’s this type of interaction that makes these events worthwhile. Sadly these sessions were not recorded, but the slides are on my SlideShare (Part 1, Part 2 and Part 3) and a variation of the talk that I gave at Tech Exeter in 2019 is available on SitePoint.

At the end of September, one of the events cancelled in March was resurrected as an online event sponsored by DevelopHer. I had 5 minutes, which is both an eternity (if you’ve ever heard Just a Minute) and the blink of an eye (if you have more than a single thing you want to say)! I managed to condense the 25-minute talk on getting into data science and AI into (just over) 5 minutes, alongside an amazing line-up of other women in AI.

What really stood out to me from this event is how many people attended who might not otherwise have been able to go to an in-person meetup. Not everyone has the luxury of being able to stay late after work or travel in for these events, and some may not want to even if they could. One of the huge benefits of everything moving online is that it has made many of these events far more accessible, and I hope that this continues in some form.

Post by Bethan Reeves watching my talk at home in comfort 😀

Last week I spoke at the online version of one of my favourite conferences, Minds Mastering Machines. The invite advertised me as one of their veteran speakers :D. I’ve done some heavily technical talks at their event over the past few years, but for 2020 I decided to be a bit lighter, and given world events I’m glad I did. One of the things I’ve noticed in all the projects I’ve led, advised on, or done due diligence for, is that testing never seems to be a priority for data science and AI. This drives me crazy, so I thought I’d approach it in a light-hearted way and try to convert the attendees to testing thinking with a talk titled: Your testing sucks – what should you be doing? I paired seven best practices of testing thinking with examples (mostly) from spacecraft. I think it went down well, and hopefully it was memorable enough that people will make time for testing by remembering the various missions.

My presentation from MCubed. Don’t coerce your data.

While I’ve nothing else planned for this year or even 2021, I intend to speak at more conferences. Even when large gatherings are safe again, I hope that there will still be online streams for those that cannot attend. Let’s keep tech accessible.

WSL2 and GPU powered ML

It’s been possible to run Linux on Windows for a few years now. Windows Subsystem for Linux (WSL) was released in 2016, allowing native Linux applications to run from within Windows without the need for a dual boot or a virtual machine. In 2019, WSL2 was released, shipping a full Linux kernel for better performance and system call compatibility. A few weeks ago, Microsoft and NVIDIA announced GPU support in WSL2 and, with it, the potential for CUDA-accelerated ML on Ubuntu from within Windows. Before I dive into this in detail, I want to take a quick aside into why you might want or need to do this…

Continue reading WSL2 and GPU powered ML

STEM toy review: hydraulic robot arm

While it’s no secret that I love Lego and tech in general, I also love the educational STEM toys that are released. Sometimes the ages on the toys don’t match their complexity, leaving a child either frustrated by something too tricky or bored by something too simplistic. Both can leave a young person disengaged with STEM, the exact opposite of what these toys are for!

Robot Arm DIY kit, suitable for ages 10 and up.

Christmas 2019 I was given this Hydraulic Robot Arm kit, suitable for ages 10+1. With work, OU study and general life, I’ve only just got around to building it2. So, let’s take a look – is it suitable for ages 10 and up, both for the build and for the principles it teaches?

Continue reading STEM toy review: hydraulic robot arm

How to be a Rockstar Neural Network Developer

There’s a trend in job descriptions where companies say they’re looking for “Data Science Unicorns”, “Python Ninjas”, “Rockstar developers”, or, more recently, the dreaded “10x developer”. When companies ask for this, it either means they’re not sure what they need but want someone who can do the work of a team, or that they’re deliberately targeting people who describe themselves that way. A couple of years ago this got silly with “Rockstar”, to the point that many less reputable recruitment agencies were overusing the term, inspiring this tweet:

Many of us in the community saw this and smiled. One man went further. Dylan Beattie created Rockstar, an actual programming language, and it now has a community of enthusiasts supporting it with interpreters and transpilers.

While in lockdown I’ve been watching a lot of recordings from conferences earlier in the year that I didn’t have time to attend. One of these was NDC London, where Dylan gave the closing session on the Art of Code. It’s well worth an hour of your time, and he introduces Rockstar through the ubiquitous FizzBuzz coding challenge.
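If you haven’t come across it, FizzBuzz asks you to count upwards, replacing multiples of 3 with “Fizz”, multiples of 5 with “Buzz”, and multiples of both with “FizzBuzz”. For comparison with the Rockstar version in the talk, a plain Python sketch of the challenge looks like this:

```python
def fizzbuzz(n):
    """Return the first n FizzBuzz terms as strings."""
    out = []
    for i in range(1, n + 1):
        # Build the word from the divisibility rules; fall back to the number.
        word = ("Fizz" if i % 3 == 0 else "") + ("Buzz" if i % 5 == 0 else "")
        out.append(word or str(i))
    return out

print(fizzbuzz(15))  # the 15th term is "FizzBuzz"
```

Part of the fun of the talk is seeing how much more lyrical the same logic becomes when it has to read like rock ballad lyrics.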

Recorded at NDC London 2020

After watching this I asked myself: could I write a (simple) neuron-based machine learning application in Rockstar and call myself a “Rockstar Neural Network” developer?

Continue reading How to be a Rockstar Neural Network Developer

Data Science Courses – the missing skills you need

One of the things I have been complaining about with many data science master’s courses is that they miss a lot of the basic skills essential for being effective in a business situation. It’s one of the things I was going to talk about at the Women in AI event that was postponed this week, and I’m more than happy to work with universities who want help building such a course1. That said, some universities are realising this is missing and are adding it as optional courses.

Continue reading Data Science Courses – the missing skills you need

A diagnostic tale of docker

twenty sided die showing common excuses for developers not to fix problems, the top of the die shows "Can't reproduce"
Developer d20 gives the answer 🙂 (from Pretend Store)

If you’ve been to any of my technical talks over the past year or so, you’ll know I’m a huge advocate for running AI models as API services within Docker containers, using services like CloudFormation for scalability. One of the issues with this is that problems in production can be difficult to trace. Methodical diagnosis of code, rather than data, is a skill that is not that common in the AI community and one that comes with experience. Here’s a breakdown of one such problem, the diagnostics to find the cause and the eventual fix, all of which you’ll need to know if you want to use these types of services.

Read more

Mathematics of player levels in game development

My husband is a game developer, and my contributions are usually of the sort where I look at what he’s done and say “hey, wouldn’t it be great if it did this”. While these are usually positive ideas, they’re mostly a pain to code. Today, however, I was able to contribute some of my maths knowledge to help balance one of his games.

Using an open API, he’d written a simple Pokémon battle game to be used on Twitch by one of our favourite streamers, FederalGhosts, and needed a way of determining player level from the number of wins, and the number of wins required to reach the next level, without recursion. While this post is specifically about the win-to-level relationship, you could use any progression statistic by applying scaling. Here we want to determine:

  • Number of wins (w) required for a given level (l)
  • The current player level (pl) given a number of wins (pw)
  • Wins remaining to the next level (wr) for a player based on current wins (pw)

Let’s take a look at a few ways of doing this. Each section below has the equations and code examples in Python1. Assume all code samples have the following at the top:

import math

database = [
    {"name": "player1", "wins": 5},
    {"name": "player2", "wins": 15},
    {"name": "player3", "wins": 25}
]
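As a taster before the full derivation, here’s a minimal sketch assuming a simple quadratic progression, w(l) = l² (a hypothetical curve chosen for illustration, not necessarily the one used in the game). Inverting it gives the current level directly from the win count, so no recursion is needed:

```python
import math

# Hypothetical progression curve for illustration: w(l) = l**2.
def wins_for_level(level):
    # Total wins required to reach a given level.
    return level ** 2

def level_for_wins(wins):
    # Invert w(l) = l**2: the current level is floor(sqrt(wins)).
    return math.floor(math.sqrt(wins))

def wins_to_next_level(wins):
    # Wins remaining to the next level boundary.
    return wins_for_level(level_for_wins(wins) + 1) - wins

# Same sample database as above.
database = [
    {"name": "player1", "wins": 5},
    {"name": "player2", "wins": 15},
    {"name": "player3", "wins": 25}
]

for player in database:
    print(player["name"],
          level_for_wins(player["wins"]),
          wins_to_next_level(player["wins"]))
```

With this curve, player1 (5 wins) is level 2 and needs 4 more wins for level 3; any steeper or shallower curve just swaps out the pair of functions.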
Continue reading Mathematics of player levels in game development

Data: access and ethics

Last week I attended two events back to back discussing all things data, but from different angles. The first, Open Data, hosted by the Economist, looked at how businesses want to use data and the ethical (and legal) means by which they can acquire it. The second was a round-table discussion of practitioners, hosted by Ammonite Data, which I chaired; there we mainly focussed on the need for compliance and on balancing the protection of personal data with the access our companies need in order to do business effectively.

We’re in a world driven by data. If you don’t have data then you can’t compete. While individuals are getting more protective over their data and understanding its value, businesses are increasingly wanting access to more and more – at what point does legitimate interest or consumer need cross the line?

Continue reading Data: access and ethics

Facebook’s Maths Solving AI

In December, Lample and Charton from Facebook’s Artificial Intelligence Research group published a paper stating that they had created an AI application that outperformed systems such as Matlab and Mathematica when presented with complex equations. Is this a huge leap forward, or just an obvious extension of maths-solving systems that have been around for years? Let’s take a look.

Continue reading Facebook’s Maths Solving AI

Let’s talk about testing

One of the things I find I have to teach data scientists and ML researchers almost universally is how to test their own code. Too often it’s all about testing the results and not enough about testing the code. I’ve been saying for a while that a lack of proper testing can trip you up, and recently a paper rippled through academia about a “bug” in some code that everyone used…

A Code Glitch May Have Caused Errors In More Than 100 Published Studies

https://www.vice.com/en_us/article/zmjwda/a-code-glitch-may-have-caused-errors-in-more-than-100-published-studies

The short version is that back in 2014, a Python script was released for calculating molecular structure from NMR shifts1, and many other labs have been using this script over the past 5 years.
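The bug itself is a nice illustration of why code needs testing as much as results do: the scripts used Python’s glob module to list their data files, but glob makes no guarantee about ordering, which depends on the underlying filesystem, so the same code could produce different results on different operating systems. A minimal sketch of the failure mode and the one-line fix:

```python
import glob
import os
import tempfile

# Create three data files; depending on the filesystem, glob may return
# them in creation order, alphabetically, or something else entirely --
# the order is explicitly undefined.
with tempfile.TemporaryDirectory() as workdir:
    for name in ["c.out", "a.out", "b.out"]:
        open(os.path.join(workdir, name), "w").close()

    unordered = glob.glob(os.path.join(workdir, "*.out"))          # order not guaranteed
    ordered = sorted(glob.glob(os.path.join(workdir, "*.out")))    # deterministic

print([os.path.basename(p) for p in ordered])  # always ['a.out', 'b.out', 'c.out']
```

A single test asserting that the files were processed in the expected order would have caught this years earlier.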

Continue reading Let’s talk about testing