IDF: How Intel wants to tackle the Big Data “algorithm economy”

With the growth of the Internet of Things (IoT), the amount of data available to business is going through the roof, yet there is a dearth of skills to properly manage and make sense of what is being collected, Intel has warned.

At the Intel Developer Forum (IDF) in San Francisco, leading figures in the company pointed to the crevasse that lies at the heart of Big Data analysis and called for the conversation to shift from the nature of the data itself to creating algorithms that can make sense of vast quantities of information.

Diane Bryant, senior vice president and general manager of Intel's Data Centre Group, took to the stage at a session on Big Data and the IoT yesterday to talk about the company's data strategy.

After noting that the role of data scientist was named the sexiest job of the 21st century by Harvard Business Review in 2012, Bryant waxed lyrical about the value of data in today's economy.

"Although information has always had value, with the move to the digital service economy and the massive explosion of data, data becomes the new currency - the currency of the digital world," she said.

Bryant went on to speak about the biggest problem currently facing Big Data analysis: a lack of people who can do the job.

While a traditional data analysis role has relatively defined areas of expertise, Big Data analysis requires a wider, less defined skillset, she said. Traditional analysts may not, for example, have knowledge of Java and Linux systems, but these are important tools for making sense of Big Data.

This is leading to a gap in ability and, as Bryant said, "there are very few people with the skills to jump that crevasse".

Addressing the Big Data skills gulf

Later in the presentation, Bryant brought out data scientist Owen Zhang, dubbed the Superman of data scientists for ranking number one on the data outsourcing site Kaggle.

Reiterating Bryant's earlier sentiment, Zhang noted that "data science requires very specialist techniques," and that dealing with Big Data is, ultimately, a very complex task with a steep learning curve.

The learning curve here is a significant obstacle to the adoption of data science. Sites such as Kaggle help companies sidestep this problem by providing a platform where businesses post their data and a community of data scientists competes for prize money to make sense of it.

Crowdsourcing in this manner may suit some situations, but Bryant stressed concerns around scalability as well as data confidentiality.

She said: "It's clear that if we're going to put the world's data to use and enable the algorithm economy that the industry needs an open standards-based platform that is easy to use by existing IT talent that enables rapid customisation and enables an accelerated pace of innovation to match the rapidly evolving data analytics world."

Discovery Peak

This all led to Intel's announcement of Discovery Peak, an open source analytics Platform-as-a-Service (PaaS) for data scientists and application developers. In development for three years, it is now available to use in both public and private cloud environments.

The tool comprises data platform components such as Hadoop and Spark, a data preparation process, tools for model selection and model deployment, and machine learning algorithms. It also has a data scientist development kit that includes visualisation tools as well as an analytics engine.
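
Intel did not show detailed code for Discovery Peak at the session, but as a rough illustration, the kind of Spark-based workflow such a platform wraps might look like the minimal sketch below. The input path, column names and label are hypothetical, and only standard PySpark MLlib calls are used; this is not Discovery Peak's own API.

```python
# Minimal sketch of the data preparation -> model selection -> model
# deployment flow an analytics PaaS like Discovery Peak is built around.
# Paths and column names are hypothetical; the API is plain PySpark MLlib.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("sensor-analytics").getOrCreate()

# Data preparation: read raw IoT sensor readings from the data platform.
readings = spark.read.csv("hdfs:///data/sensor_readings.csv",
                          header=True, inferSchema=True)

# Assemble the numeric columns into the single feature vector MLlib expects.
assembler = VectorAssembler(
    inputCols=["temperature", "vibration", "voltage"],
    outputCol="features")

# Model selection: a simple classifier predicting a hypothetical "failure" label.
lr = LogisticRegression(featuresCol="features", labelCol="failure")

pipeline = Pipeline(stages=[assembler, lr])
model = pipeline.fit(readings)

# Model deployment: persist the fitted pipeline so applications can score
# new readings without retraining.
model.write().overwrite().save("hdfs:///models/failure-model")
```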

3D XPoint DIMMs

Bryant then proceeded to talk up 3D XPoint technology and how it will increase the capacity of the main memory systems of Intel's next generation of Xeon platforms.

"First you're going to get a huge increase in capacity," she said. "That next generation server platform will get a 4x increase in memory capacity relative to the current Zeon platform. Up to 6TB of memory capacity on a two-socket system. It is the first time non-volatile memory can be used in main memory."

Intel also plans to give Big Data analysis a hefty push towards wider adoption. Reducing the complexity of dealing with vast quantities of data, alongside substantially boosting main memory capacity, could bring Big Data analysis to companies daunted by the specialist skillset it currently demands, the firm believes.

The chipmaker painted a picture of a shift in the way businesses deal with information in the 21st century, pointing to the increasing need to make sense of the enormous quantities of information about to be collected as the IoT enters our homes, cities and workplaces.