## Why are data scientists so bad at science?

It’s rare that I am intentionally provocative in my post titles, but I’d really like you to think about this one. I’ve known and worked with a lot of people who work with data over the years, many of whom call themselves data scientists and many who do the job of a data scientist under another name. One thing that worries me when they talk about their work is an absence of scientific rigour. This is a huge problem, and one I’ve talked about before.

The results that data scientists produce are becoming increasingly important in our lives, from determining what adverts we see to how we are treated by financial institutions or governments. These results can have a direct impact on people’s lives and we have a moral and ethical obligation to ensure that they are correct.

## Professional body for data science? Yes Please

This week I was delighted to be at the Royal Statistical Society as a business representative for the launch of their Data Science Section. At over 160 years old, the RSS is one of the more established professional bodies, and I like that it is asking questions and making a difference as the application of statistics changes and as it faces a growing problem of statistical methods being abused. I wish the general public had a greater understanding of statistics so they wouldn’t be so easily swayed by the media with a simple graph “proving” a point.

## MST210 – Exam and modelling exercise reflections

This week was the exam for my level 2 OU module MST210 on methods, models and modelling.  This was a compulsory module, but had it not been I would never have chosen it.  The module has been mostly applied maths, which has been really interesting, but the problem for me has been the mandatory teamwork modelling exercise, which makes up 16% of the continuous assessment.  It was enough of a problem that I lost motivation to do the final TMA or revise for the exam as much as I wanted to.  I thought it would be worth a short reflection on why I disliked this aspect so much (especially as it led to a repeat of last year when it came to revision…).

Mathematical modelling is important, really important.  If you’re going to use maths in any real world application, you really need to understand how to create, test and revise models.  This is a given.  However, I was uncomfortable with how this is enforced for MST210.

One of the problems with the OU compared to traditional universities is that you don’t have the sense of community with your classmates.  There are face to face tutorials, but they’re not mandatory and, if you wish, you could do the entire course without ever speaking to anyone else.  With other (non-maths) courses there are residential schools and group exercises, and even part of your module score depends on being active on the forums, so it appears that there was felt to be a need to include group work in the maths modules.

While I’m not the most extroverted person in the world, I don’t consider myself socially awkward, but when it comes to academic attainment, I have a very strong sense of wanting to be measured on my work and not wanting to affect or be affected by anyone else.  As a result, I admit I went into the process with a preconceived reluctance.

An earlier TMA had two questions where we had to mark fake reports using a tutor’s mark scheme. The results from this exercise also contributed to our overall score.  While tutors get considerable feedback on marking, we had one go.  Unlike normal mathematics, where marks are awarded for the result and the correct method, marking modelling reports is subjective: “did the student state the problem clearly?”.  I did not choose to study maths for subjectivity, and it’s why I regretted taking DB123 as my “spare” level 1 module.

With the uncomfortable feeling of subjective exercises in a TMA fresh in my mind, we were assigned groups by the tutors.  I’m not sure how these groups were allocated, but there was no concept of geographical closeness.  I was told I was in a group of 5, one of whom couldn’t contribute to the forums for valid reasons, meaning that there would be four of us making the model.  We had a private forum and wiki.  One other person and I introduced ourselves.  The remaining two didn’t even bother saying hi, so I can only assume they’ve either dropped out or decided that the marks for contributing weren’t worth the effort.  My hackles were again raised by the thought of individuals taking my work and passing it off as their own without being part of the process.  This made me reluctant to post too much.  I heard about other groups having calls, screen shares and even in-person meet-ups.  For my group there was nothing.  After a week, I found I was talking to myself – even the other person had stopped posting and I discovered that I was posting all my ideas for no net gain.

Like most people doing OU courses, I’m pretty busy, so the motivation to write up thoughts and take the time to post them for the benefit of the group when I was getting nothing in return just wasn’t there.  I spoke to my tutor about it and then team members were “nudged” to no avail.

The actual maths of the model we had to create was pretty easy, which may have been the cause of the lack of engagement.  However, as a group we could have done far more varied experiments to test our model and even each worked on different revisions.  This would have made all our reports much richer.

Since attendance at tutorials is not mandatory, the OU has no way to enforce collaboration without changing the distribution of marks for the TMA.  This would then start to detract from the maths itself.  I’m not sure how to fix this.

The point of the group task is to show teamwork and collaboration for future employers.  As an employer, I can get this during the interview stage in other ways and would never assume from a line listing a degree on a CV that an individual had this sort of experience – it’s usually clear elsewhere.  Maybe this could be an elective TMA which does not count towards your overall classification but shows the skills.  This way, those who were engaged enough to collaborate would get the most out of it.  Similarly, I would never ask my team to all do the same thing – as a manager I assign tasks based on skills and/or aspiration and then combine the results, so the actual task itself is not a good representation of how things work in business.

Doing the depth of model that I wanted on my own took a long time, so I was behind on the study of the last three modules and the final TMA, with a real lack of motivation for it.  I was pretty annoyed and burned out.  I did the final assignment with two sessions at a soft play centre while my daughter played, but didn’t read the chapters other than to find relevant examples.

Surprisingly, I got a decent mark on both the modelling assignment and the final TMA, so much so that I felt somewhat motivated for the exam with less than a week to go.  Really not enough.  I know that had the modelling report mark been low enough to drop down my overall assessment mark (which directly contributes to the overall result) then I’d’ve not bothered much with the exam.  Where’s the motivation to do your best when it doesn’t make a difference?  However, with my newfound motivation I was in the same situation as last year – a looming exam for which I’d done no preparation.

I managed the MST209 past papers untimed on my commute (when I got a seat) and did the handbook annotations as necessary, and then the day before the exam I did two past papers under exam conditions, and the practice quizzes.  Finally, the morning before the exam I did the specimen paper.

Really not a good approach, but the exam itself seemed to go okay and I’m feeling pretty confident.  Results are out in the middle of July.  I’m also looking forward to being able to start level 3, which is where the really interesting modules are and I’ve not chosen which ones yet…

## Smart watch insights – now I can’t do without it

A couple of weeks ago I got an iWatch.  I’d had a Nike FuelBand before and am no stranger to wearable tech, but I’ve never really worn a watch.  I’ve been surrounded by things that tell us the time since I was a child so I’ve got used to not wearing anything on my wrist.  However, when my other half decided not to wear his, I thought I’d give it a go before we sold it.

I’ve had it for a couple of weeks now and, while I’m still not used to the feel of a watch on my wrist, I’ve fallen head first into its features, particularly the fitness aspect.  If you follow this blog, you’ll know just how goal-oriented I am and the achievements system for the fitness app on the iWatch feeds those addictions in me.  I can’t get to the end of the day without reaching my stand, exercise and move goals.  When I did my latest OU exam, I was frustrated at missing out on stand hours and didn’t make my targets for that day.  I don’t know why, but because of a small amount of automated digital acceptance I want to make sure I do all my exercise.  This can only be a good thing for my general fitness.

However, I have noted the following oddities:

• 10 hours of thorough housework that left me exhausted only counted as 100 active calories and 8 minutes of exercise, while walking my daughter to school and back (which I barely count as exercise) hit my targets without trying.  I’m not sure if I’m doing the housework wrong or if it really isn’t as intensive as it felt.
• After putting my watch back on after my latest OU exam, it logged 100 active calories while I was sat still in the car.  I can only conclude my pulse was still racing after 3 hours of intense concentration.
• Sit-ups and push-ups don’t seem to count (for me): I had to do over 30 mins of exercise to get 5 mins logged.

Even so, after 2 weeks, it’s still upping my calorie goals, and I know that while I have a target to meet, I’ll keep exceeding it.  I wonder if it will level out at some point or if I’ll just end up being super fit.  I hope it’s the latter.

One of the things I love is being able to have my phone buried away in my bag while I’m commuting and still respond to texts, emails and Slack alerts.  Now I’ve got the hang of the scribble feature, I’m almost as fast as I am when typing my responses, so I don’t need to stick to short answers. I try to avoid taking calls on it as I feel quite conspicuous talking into my watch, not to mention everyone around me hearing both sides of the call.

I have the smart home app linked through so I can control the lights in my house, as I don’t have an Echo for every room (yet).  It took a few minutes to set up the routines I wanted to control within the SmartThings app on the phone and then they were there.  It just worked, seamlessly.

It’s also really great having the iPhone wallet contents mirrored to the watch – I attended a conference last week and had the QR code for my ticket quickly available without having to scrabble around in my bag.  I don’t know whether shops will get used to us paying for things with our watches (definitely more odd the further you get out of London), but it’s just so much easier (particularly when wrangling one or more children) even than using contactless.

I’m not sure how I managed before I got the watch.  I know it’d be difficult doing without it now, and I definitely wouldn’t be finishing this post so quickly, as I’ve just been reminded that I’ve been inactive for too long… 🙂

## Evidence in our AI future

If you’ve been following this blog you’ll know that there have been great advances in the past few years with artificial image generation, to the stage where having a picture of something does not necessarily mean that it is real.  Image advances are easy to talk about, as there’s something tangible to show, but there have been similar large leaps forward in other areas, particularly in voice synthesis and handwriting.

## Artificial image creation takes another step forward in advertising

Earlier this month, Dove launched their new baby range with another of their fantastic adverts challenging stereotypes and questioning whether there is a “perfect mum”.  As a mum myself I can relate to the many hilarious bloggers who are refreshingly honest about the unbrushed hair, lack of make-up, generally being covered in whatever substances your new tiny human decides to produce, and all other parenting frustrations.  I’m really pleased that there are lots of women out there challenging the myths presented in the media – we don’t all have a team to make us beautiful, nor someone photoshopping the results to perfection, and the pressure can be immense.  This is where Dove’s campaign is fantastic.  Rather than just creating a photoshoot with a model and doctoring the results, the image is actually completely artificial, having been generated by AI.

## Submitting evidence to parliament committees

I love the fact that here in the UK everyone can be involved in shaping the future of our country, even if a large number of individuals choose not to and, in my eyes, if you don’t get involved then you don’t have the right to complain.  While this is most generally applied to the election of our representatives, from local parish councils to our regional MPs (or actually standing yourself), there are also a lot of other ways to be involved.  In addition to raising issues with your local representative, parliament has cross bench committees that seek input from the public to help create policy or consider draft legislation.

Our elected parliament is not made up of individuals who are experts in all fields.  Even government departments are not necessarily headed by individuals with large amounts of relevant experience.  It is critical that these individuals are informed by those with the experience and expertise in the issues that are being considered.  Without this critical input, our democracy is weakened.

## Choosing a Laptop for Deep Learning Development

If you’re starting out in deep learning and would prefer a laptop over a desktop, basic research will lead you to a whole host of blogs, Q&A sites and opinions that basically amount to “don’t do it”: get a desktop or remote into a server instead.  However, if you want a laptop, whether this is for college, conferences or even because you have a job where you can work from anywhere, then there are plenty of options available to you.  Here I’ll lay out what I chose and why, along with how it’s performing.

## True Type Fonts in LaTeX: a brief guide

One of the things I love about $\LaTeX$ is how customisable it is.  It was separating content from design a long time before web design cottoned on to the idea.  However, out of the box, $\LaTeX$ comes with very limited fonts and most people just use these defaults, mainly because setting up other fonts isn’t as easy as it should be.

One of the great things about drawing diagrams in $\LaTeX$ is that the fonts match; it’s always a little jarring to my eye when I see papers with a mismatch between diagrams and main text.  However, sometimes you just can’t control what’s in your diagram or you want something a little more modern than Times New Roman for whatever you’re putting together.

So how do you go about doing this?  Like most things, the answer is “it depends”… let’s start with an assumption that you’re starting from scratch and if you’re already a few steps down the process then that’s just less work for you to do 🙂

## Algorithmic transparency – is it even possible?

The Science and Technology Select Committee here in the UK have launched an inquiry into the use of algorithms in public and business decision making and are asking for written evidence on a number of topics.  One of these topics is best practice in algorithmic decision making and one of the specific points they highlight is whether this can be done in a ‘transparent’ or ‘accountable’ way.  If there were such transparency then the decisions made could be understood and challenged.

It’s an interesting idea.  On the surface, it seems reasonable that we should understand the decisions to verify and trust the algorithms, but the practicality of this is where the problem lies.