Data scientist jobs: Where does the big data talent gap lie?
Europe needs 346,000 more data scientists by 2020, but why is the gap so big?
Data science is one area of the digital sector that is desperately short of talent. In fact, IBM predicts data science will account for 28% of all digital jobs by 2020, but worryingly, the same report revealed that on average, each of these positions remains unfilled for up to 45 days because too few candidates have the necessary skills.
"Machine learning, big data and data science skills are the most challenging to recruit for, and can potentially create the greatest disruption if not filled," according to IBM's The Quant Crunch report.
The Royal Society found that demand for workers with specialist data skills in the UK has more than tripled over the last five years, rising by 231%, compared with a 36% increase in demand for workers in general.
And with the European Commission forecasting that 100,000 new data-related jobs will be created in the region by 2020, the fact there aren't enough people with the right skills to fill those roles is certainly worrying.
What has triggered the data scientist skills gap?
Simply put, there aren't enough data scientists to go around - the demand for data analysis has grown exponentially over the last few years, and there aren't enough people being trained to meet the demand.
This growing gap between demand and available talent has meant that almost half of all European companies are thought to be struggling to fill their data scientist positions, according to recent data published by O'Reilly Media.
The report found that European organisations are heavily investing in products that help improve the accessibility and usability of data. Specifically, 59% are building platforms that allow for data integration and extract, transform and load (ETL) processes. Similarly, 53% said they were already working on data science platforms, while 53% said they were prioritising data preparation and cleaning, something that is core to the development of AI and analytics services.
However, the lack of talent is holding back progress. As many as 47% of the region's organisations are currently struggling to fill their data science roles, with 37% also struggling with data engineering positions. That's compared to just 25% for security-based roles.
The findings are broadly in line with data published by Gartner, which found that organisations are not only unable to source talent for full-time positions, but also do not have enough data scientists consistently available throughout the business.
The amount of data businesses are generating compared to even just a few years ago is staggering, and it's growing fast, with Forbes stating that 90% of data present in the world today was produced in the last two years alone. IDC thinks a mammoth 163 zettabytes of data a year will be generated by business by 2025.
More and more devices that collect data are coming onto the market, including wearables and smart home and office devices. That doesn't even include the rising number of people using the internet to shop, interact with brands and more, generating insights that are powering business decisions.
Although the amount of data is impressive and is a positive thing, data is useless unless it is first analysed and then used to inform business transformation. Without the people to work out what all of the information means, the analytical insights that add value to the business can't be drawn out. Indeed, an organisation that lacks the means to extract information from the data it has gathered may even be better off scrapping its data initiatives and saving the cost. Simply put, an investment in data tools must be matched by an investment in personnel.
How can data scientists benefit businesses?
Data scientists transform masses of data into analytical insights which help businesses make faster, more informed decisions.
Approximately 80% of the data generated over the next five years will be unstructured, highlighting the need for analytical tools operated by skilled personnel in order to extract insights. Using such tools, data scientists drag numbers and statistics into tables to form predictive models that can simulate a variety of possibilities.
Armed with knowledge of the potential outcomes of each course of action, businesses can select whichever solution aligns best with their initiatives. Logical, best-case scenario actions can be prescribed that improve overall performance. Moreover, as businesses build a record of performance metrics, they form an internal database of business insights that lets them base decisions on recurring trends.
Boiled down, the skill set of a data scientist consists primarily of mathematics and statistics, programming knowledge and analytical thinking. Automated programmes can now replicate these skills, but a significant advantage of the data scientist is that they can think creatively. In this emerging field, it is certain that data science can be used in as yet undiscovered ways to create further insight.
The possibilities are endless. A creative data scientist may forge new methods of gathering, interpreting and analysing data to produce a profitable data strategy. According to Bernard Marr, "the corporate data superstars of the future will be people who can find new data to solve business problems and come up with new and innovative methods of applying data statistics".
In that sense, the impact of data science is only as good as the data scientist. And with the skills gap only widening, sourcing such talent comes at a cost.
What is industry doing to help close the skills gap?
The European Commission has committed to narrowing the data scientist skills gap across the whole of Europe. It is running sessions that explore what businesses can do to ease the problem, addressing the demands of organisations and how they can use the data they collect to gain key insights about their customers and grow their business.
Businesses are also starting to react to the country's data scientist shortage and are collaborating with other firms and educational establishments to try and close the gap before it becomes too large to manage.
For example, data management and analytics firm SAS has teamed up with HSBC and the Data Lab to introduce an MSc course in Data Science for Business. The course will run at the University of Stirling and has been designed for those looking to start their career in data analytics. It teaches students how to use advanced analytics and apply these skills to real-life scenarios.
"There is a shortage of graduates emerging with the skills to apply the technical aspects of data science and use analytics to make sound business decisions," said Dr Kepa Mendibil, course leader of the MSc in Data Science at the University of Stirling's School of Management.
"Through this course, we have focused on the practical challenges that organisations are experiencing by merging disciplines to develop a teaching programme that makes the link between business, management and data analytics."
The Office for National Statistics is also innovating to ensure the country has the resources it needs, collaborating with universities across the country to implement data science courses, including both MSc and short courses.
Although the data scientist skills gap cannot be closed immediately, it's important that businesses and the government work together to ensure they are taking steps towards creating a better-equipped workforce.
The Royal Society's report recommends a fundamental change in education. To provide young people with the necessary platform to pursue data-specialist roles, transferable skills such as communication and problem-solving should be embedded into the national curriculum. And to ensure the skills gap does not completely spiral out of control, the change is needed within the next ten years.
A two-pronged approach is needed: teaching young people about the merits and opportunities of data, starting as early as possible, and training existing staff to analyse data, so that the true meaning of the information companies hold isn't simply lost forever.
What does the data scientist role entail?
While data scientists are in high demand, the role can differ quite substantially from company to company, explains Jessica Kirkpatrick, a data scientist at tech recruitment firm Hired.
"In simple terms, data science is taking the scientific method and applying it to a business' data set to derive insights that help the company make strategic decisions," she says. "However, that can mean a wide variety of things in terms of day-to-day duties."
While one data scientist might be tasked with taking raw data from an application or website, for instance, and organising it in a database in a way that makes it easy to analyse, another might find themselves running A/B tests on a website to see which version performs better with users.
"For example at Hired, we might pilot a new interview scheduling tool with a small group of clients and see how this new feature affects their ability to hire candidates," Kirkpatrick adds.
"We then compare how efficiently this test group is able to hire candidates with the rest of the clients who do not have access to the tool. This allows us to understand if the tool helps our clients or not and if we should add it as a permanent feature to our product."
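To make that comparison concrete, here is a minimal sketch in Python of one common way to judge an A/B test of this kind: a two-proportion z-test on the share of clients in each group who made a hire. This is an illustration only, not a description of how Hired analyses its pilots. The function name, the choice of test and the client counts are all hypothetical.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions.

    Group A is the test group (has the new tool), group B is the control.
    Uses the pooled standard error, the usual choice under the null
    hypothesis that both groups share the same underlying rate.
    """
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("No variation in outcomes; the test is undefined.")
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical figures: 60 of 100 pilot clients made a hire,
# against 40 of 100 clients without the tool.
z, p = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in hiring rates is unlikely to be down to chance alone, although in practice a data scientist would also weigh sample size, how clients were assigned to each group and whether the effect is large enough to matter commercially.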
But the role is increasingly focused not just on the data, but on company strategy, Kirkpatrick explains, helping executives identify the most efficient areas of the business and which areas require extra resources, or where companies should scale down.
"We have been seeing an increase in [the number of] Chief Data Officer roles in recent years, because companies are realising that it's important to have someone with deep analytical knowledge of the business in the boardroom," Kirkpatrick says.