This week I was delighted to be at the Royal Statistical Society as a business representative for the launch of their Data Science Section. At over 160 years old, the RSS is one of the more established professional bodies, and I like that it is asking questions and making a difference as the application of statistics changes and as the abuse of statistical methods becomes an increasing challenge. I wish the general public had a greater understanding of statistics so they wouldn’t be so easily swayed by the media with a simple graph “proving” a point.
The launch itself had some fantastic speakers from industry and academia covering what data science is all about and the twelve key questions being put forward as areas that need debate:
- What does great data science look like?
- What does a good data science workflow look like?
- What kind of problems can be addressed by data science?
- What are the characteristics of the ideal data scientist?
- How should an organisation start a data science function?
- How should data science fit into the structure of an organisation?
- How should business practices change to make a success of data science?
- What do executives and managers need to learn about data science?
- How can an organisation build a coherent data science capability from a collection of data science projects?
- What career paths are available to a data scientist?
- How can data scientists measure the value they create?
- What is a data scientist’s responsibility to wider society?
One of the interesting things about data science is the sheer number of skills you need to do it well: you may think this is “just” mathematics, programming and communication, but it is far more than that, and you also need to really understand whatever specialist domain your data represents.
What does the ideal #datascientist look like? Not a person but a team, cos no one person can do all this… #dsslaunch pic.twitter.com/sewS99mnbU
— Katie Metzler (@KMetzlerSAGE) June 19, 2017
Given this large number of skill requirements, it’s apparent that data scientists will come from all different backgrounds and a successful data science team will be made from a gestalt skill set, so can any one professional body represent the field appropriately?
Professional bodies are becoming less of a “thing” these days. It used to be all about getting the correct accredited degree for your chosen career, joining the body and then working your way up to Senior Fellow, which would go in parallel with your business seniority. Paying over a hundred pounds a year to be a member of something is ceasing to be fashionable when many of the benefits they provide can be found for free in the plethora of meet-ups that occur. There was much talk of the “Shoreditch Data Scientist”, which I think does encompass this different outlook. So is there still a place for a professional body like the RSS for Data Science?
The Royal Statistical Society are bringing this debate forward and I think they’re the correct body to lead it. One of the great quotes from the evening was:
Data Science is not just statistics but statistics is a part of data science.
What underpins all data science for me is the statistics. Without the statistical rigour applied to the data, it’s just conjecture and ceases to be a science. With the almost shocking lack of understanding of anything but the most basic elements that I see in the AI space, it’s clear that someone needs to get some standards in place, and who better than the people who live and breathe the underlying mathematics? The RSS isn’t seeking to be the gatekeeper of who can and who can’t be a data scientist – you don’t need a specific degree or experience, but what you do need is the desire to question the data.
I see a lot of people who claim the title of Data Scientist who wouldn’t know how to perform a simple Bayesian inference even if they had all day with the Wikipedia page[1]. I also know a lot of people from all different backgrounds who are actively doing data science but with a different title. For me, the job title isn’t important; the key is the practice: the scientific approach to the understanding of data.
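For anyone wondering what a “simple” Bayesian inference looks like in practice, here is a minimal sketch in Python. The scenario (a diagnostic test) and all of the numbers are made up purely for illustration, and the function name `posterior` is my own choice rather than anything from a library.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """Return P(hypothesis | positive result) using Bayes' theorem.

    prior               -- P(hypothesis) before seeing any evidence
    sensitivity         -- P(positive | hypothesis is true)
    false_positive_rate -- P(positive | hypothesis is false)
    """
    # Total probability of seeing a positive result at all
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / evidence


# Hypothetical numbers: 1% of people have the condition, the test catches
# 95% of true cases and wrongly flags 5% of healthy people.
p = posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"{p:.3f}")  # prints 0.161
```

With these assumed numbers, a positive result only takes you to roughly a 16% chance of actually having the condition, which is exactly the kind of counter-intuitive answer that questioning the data is meant to uncover.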
Even if you don’t want to join the society, please do get involved. You can access the questions at https://github.com/rssdatascience/industrialisation and add your comments via Twitter, Slack, LinkedIn or just by raising an issue. These questions don’t have simple answers and we need to think about them as a community.
May your results be ever statistically significant.
[1] You know who you are 😉