Conference season online

October has always been a super busy month for me. I’m usually starting a new OU module and travelling around speaking at conferences and meetups, all while doing my day job, spending time with my family and enjoying my hobbies. Sometimes I’ve not got the balance right! I remember 2019 as particularly hectic. I optimistically submitted conference sessions at the start of the year on a variety of different topics and, as the year went on, I was invited to speak at various meetups in the UK and even stepped in to do some last minute presentations where other speakers had dropped out. This time last year I had just finished 8 weeks in which I had taken a week’s holiday and spoken at 5 conferences, 2 breakfast briefings and 8 meetups, all of which were on slightly different topics!

I really enjoy speaking at these events, otherwise I simply wouldn’t do them! As an attendee I get to learn from my peers and be inspired by steps forward in areas that I just don’t have time to keep up to date on. As a speaker, I get to pass on some of the things I’ve learned over the years in what I hope is an entertaining way, and I always love the conversations after the talks.

This year has, inevitably, been very different. February was my first event, where I spoke at The European Information Security Summit in London on the risks that Deep Fakes pose to the security sector. I spoke to a lot of security professionals at the event who were unaware of how AI was progressing in both voice and face cloning. As an attendee, I learned a lot about the state of security in many of the systems we take for granted. If you can justify the time and cost, attending a conference outside of your area of expertise can be incredibly informative.

A mere few weeks later, I had several sessions at Big Data and AI World. I had a panel session with the amazing Sue Daley and Vitaliy Yuryev on why basics are often overlooked in data projects, followed a few hours later by my main presentation on learning from projects that go wrong. This was the 12th of March. While the organisers were doing everything they could, practically all the international speakers and attendees had decided not to attend. The sessions were reorganised to prevent large gaps in the program and many of the sessions I had been personally looking forward to were no longer happening. I really enjoyed both my sessions and got some great questions after them, but it was clear that people were nervous about the crowds, and that conferences and meetups would be on hold from that point onwards.

As I headed home on the train that afternoon I knew I wouldn’t be back in London for a while. My company was considering a trial of home working for a few days a week, but I’d already decided to swap to home working for the foreseeable future and told my team to do the same if they wanted. My team had been at the conference with me and I didn’t realise then that it would be the last time I’d see them.

March and April would normally be the time that I would be submitting keynote suggestions for the autumn conference season and spending my evenings talking to university students at meetups, and I really missed those interactions.

While I was interviewed over the summer (Humans of AI, Agile Data Science), I really did miss the chance to interact with a wider audience. You can’t respond to questions in a pre-recorded video.

I was delighted when Barclays Eagle Labs asked me if I would rerun a talk on Deep Fakes that I had given in person late in 2019, as a series of three online events. Despite the strangeness of talking into a camera without the feedback of the audience’s faces and the ever present anxiety that one of my neighbours would start noisily doing DIY during the session, it was great to see so many people take 30 minutes out of their day for three consecutive weeks to learn. After the final session, I got a lot of messages from people who had made their own fakes and really understood both the positive and negative aspects of the technology, thanking me for making it accessible. It’s this type of interaction that makes these events worthwhile. Sadly these sessions were not recorded, but the slides are on my SlideShare (Part 1, Part 2 and Part 3) and a variation of the talk that I gave at Tech Exeter in 2019 is available on SitePoint.

At the end of September, one of the events that was cancelled from March was resurrected as an online event sponsored by DevelopHer. I had 5 minutes, which is both an eternity (if you’ve ever heard Just a Minute) and the blink of an eye (if you have more than a single thing you want to say)! I managed to condense the 25 minute talk on getting into Data Science and AI into (just over) 5 minutes alongside an amazing line-up of other women in AI.

What really stood out to me from this event is how many people attended who may not otherwise have been able to go to an in person meetup. Not everyone has the luxury of being able to stay late after work, or travel in for these events, or may not want to even if they could. One of the huge benefits of everything moving online is that it has made many of these events far more accessible, and I hope that this continues in some form.

Post by Bethan Reeves watching my talk at home in comfort 😀

Last week I spoke at the online version of one of my favourite conferences, Minds Mastering Machines. The invite advertised me as one of their veteran speakers :D. I’ve done some heavily technical talks at their event over the past few years, but for 2020 I decided to be a bit lighter and given world events I’m glad I did. One of the things I’ve noticed in all the projects I’ve led, advised on, or done due diligence for, is that testing never seems to be a priority for data science and AI. This is something that drives me crazy so I thought I’d approach it in a light hearted way and try to convert the attendees to testing thinking with a talk titled: Your testing sucks – what should you be doing? I paired seven best practices of testing thinking with examples (mostly) from spacecraft. I think it went down well and hopefully it was memorable enough that remembering the various missions will make people want to make time for testing.

My presentation from MCubed. Don’t coerce your data.

While I’ve nothing else planned for this year or even 2021, I intend to speak at more conferences. Even when large gatherings are safe again, I hope that there will still be online streams for those that cannot attend. Let’s keep tech accessible.

Remote Data Science – Interview

Last week I was interviewed by Keith Robinson of Ammonite Data, with a topic of managing data science teams remotely and all the challenges this brings. We had a much more wide ranging conversation where I looked at challenges of communication and even the impact on models that the current extraordinary events will have.

Part 1: Where I discuss communication and mental health while isolated
Part 2: Where I discuss the current data blip, security and consent, and prioritising work in crisis mode.

I hope you find these enjoyable and helpful.

Data Science Courses – the missing skills you need

One of the things that I have been complaining about with many of the data science masters courses is that they are missing a lot of the basic skills that are essential for you to be able to be effective in a business situation. It’s one of the things I was going to talk about at the Women in AI event that was postponed this week and I’m more than happy to work with universities who want to help build a course. That said, some universities are realising this is missing and adding it as optional courses.


How much maths do you really need for data science?

My LinkedIn news feed was lit up last week by a Medium post from Dario Radečić, originally posted in December 2019, discussing how much maths is really needed for a job in data science. He starts by berating the answers from the Quora posts by the PhD brainiacs who demand you know everything… While the article is fairly light hearted and is probably more an encouragement piece to anyone currently studying or trying to get that first job in data science, I felt that, as someone who hires data scientists, I could add some substance from the other side.


Getting a first job in AI or data science – what candidates need to know

Me in Lego – well not really, but it does look a lot like me 😉 – this was a very fortuitous collector fig from Series 18.

Getting any role in IT can be daunting as a first timer, whether it’s your first ever job or you’ve changed career or you’ve had a break and are returning as a junior in a new field or anything else. Getting one in any part of AI can be even more of an uphill struggle. Job postings and recruitment agencies are asking for PhDs, academic papers and post-doctoral research as well as years of experience in industry. How can you get past that first barrier? I get a lot of people asking me this when I present at meetups so thought I’d collate everything into one post.

I’m going to break down how you can demonstrate the skills that businesses need and how to talk confidently about what you can offer without the fluff.


Agile Data Science: your data point is probably an outlier

It’s not often that I feel the need to write a reactionary post, as the things that inflame me are usually doing so by design. However, today I read something on LinkedIn that caused a polarisation in debate within a group of people who should really appreciate learning from different data: Data Scientists.

 

What was interesting was how the responses fell neatly into one of two camps: the first praising the poster for speaking out and saying this, supported by nearly an order of magnitude more likes than the total number of comments, and the second disagreeing and pointing out that it can work. What has been lost in this was that “can” is not synonymous with “always” – it really needs a good team and a better explanation than many companies use. What irked me most about the whole thread was the accusation that people doing data science with agile obviously “didn’t understand what science was”. I hate these sweeping generalisations and I really do expect a higher standard of debate from anyone with either “data” or “science” anywhere near their profile.

Cambridge Analytica: not AI’s ethics awakening

From the wonderful XKCD, research ethics

By now, the majority of people who keep up with the news will have heard of Cambridge Analytica, the whistleblower Christopher Wylie, and the news surrounding the harvesting of Facebook data and micro-targeting, along with accusations of potentially illegal activity. In amongst all of this news I’ve also seen articles claiming that this is the “awakening” moment for ethics and morals in AI and data science in general: the point where practitioners realise the impact of their work.

“Now I am become Death, the destroyer of worlds”, Oppenheimer


Incentivising data scientists

One of the regular Data Science discussion breakfasts. Thanks to all who attended.

I chaired a breakfast meeting for Women in Data Science recently, and one of the topics for discussion was how to retain talent. While demand is outstripping supply and the market is going crazy, it’s enough of a minefield finding good people in the first place.

Add to this that even after you’ve made an offer to someone, recruiters will be contacting them regularly to try to tempt them away to other roles. It’s impossible to prevent this. I’m a big believer in not playing games with recruitment – I know what I can afford and won’t get into a bidding war. If I’m paying a fair salary and they go elsewhere for money, then they are more likely to jump when a recruiter calls regardless of how well you incentivise them. This isn’t a big company or small company thing: if you want to keep hold of your team after you’ve done the very hard job of hiring them, then you need to understand what motivates them and either make sure that you continue to provide those needs or plan to be hiring again in the next 12-24 months.

Source Code Control for Data Scientists

XKCD explains git source code control.. 🙂

I work with many people who are recently out of academia. While they know how to code and are experts in their fields, they are lacking some of the rigour of computer science that experienced developers have. In addition to not understanding the problems of data in the wider world or how to test their solutions properly, they are also unaware of the importance of source code control and deployment. This is another missing aspect from these courses – you cannot exist as a professional developer without it. While there are many source control setups, I’m most familiar with git.

I’ve recently written a how-to guide for my team and was going to make that the focus of this post, although I’ve seen some very good guides out there that are more generic, so I’d like to explain why source code control is important and then give you the tools to learn this yourself.

Why are data scientists so bad at science?

Do you check your inputs?

It’s rare that I am intentionally provocative in my post titles, but I’d really like you to think about this one. I’ve known and worked with a lot of people who work with data over the years, many of whom call themselves data scientists and many who do the role of a data scientist but by another name. One thing that worries me when they talk about their work is an absence of scientific rigour. This is a huge problem, and one I’ve talked about before.

The results that data scientists produce are becoming increasingly important in our lives, from determining what adverts we see to how we are treated by financial institutions or governments. These results can have a direct impact on people’s lives and we have a moral and ethical obligation to ensure that they are correct.