Do we need data boot camps?

Stephen Pritchard

Advances in data collection and information management over the last two decades have encouraged businesses to create ever larger databases, to store an ever increasing number of records.

With the cost of servers and storage falling in real terms, the cost of doing so on a per-record basis is lower than ever. In 1981, 1GB of storage cost $300,000 (£185,000); today, it is estimated to cost just 10 cents, according to US-based researcher David Isenberg.

Advances in sensor, instrument and even point of sale technology have made it much easier to collect raw data, often in real time. Even a humble shop till or traffic light contains sensors that can gather thousands of data points in an hour.

To a large extent, the growth in data collection has outpaced the tools available for businesses to manage or analyse it. A common theme in conversations about data analytics, with CIOs at least, is that businesses have more data than they quite know what to do with. That data is only a useful asset to the business, rather than a source of cost, if it can be turned into information and, hopefully, insight.

Of course, one answer is to invest more in data mining, business intelligence, knowledge management and predictive analytics. Technologies such as in-memory databases have also sped up businesses' ability to sift through their data stores for that nugget of useful information.

But the problem is by no means just a technological one. When quizzed, CIOs often admit the business collects data because it can, perhaps as a result of newer technologies such as RFID, but they do not always consider how they will use it.

This is part of the thinking behind an initiative by IBM to provide boot camps for developers and other IT professionals tasked with making sense of those growing volumes of information.

Increasingly, the large IT vendors and consultancies are turning their attention to the notion of "big data." This embraces not only data storage and management but also ideas such as real-time data analytics (including stream computing, another IBM initiative) and natural language queries.

Businesses are also becoming more concerned with how to catalogue and analyse unstructured data, such as Office documents or audio and video, alongside conventional database entries.

But perhaps the bigger question is whether companies need to gather quite so much information in the first place. The more data a company holds, the more sophisticated the tools they will need to process it, and the more likely they are to breach data protection or other rules.

A sound information strategy should come first, with technology, tools and training to support it.

Stephen Pritchard is a contributing editor at IT PRO.
