Big Data - it's exciting, it's innovative but it's not for everyone

Much like cloud computing, Big Data is one of the buzzwords in IT at the moment. But unlike other buzzwords, Big Data is a genuinely exciting movement and it deserves the interest that it is receiving. However, as usual, overzealous marketing departments are trying to push Big Data into places it doesn’t fit.

I follow Big Data news closely and attend the tradeshows, and even now there are still companies trying to persuade the market that Big Data has a role to play in SMBs.

A bit like cloud computing, this is partly because we all tend to use variations when it comes to the definition of Big Data. As a result we’re renaming any and all analytics as Big Data.

There are several definitions of Big Data, but at a high level it means a company has so much data that it is hard or impossible to manipulate, because the data exceeds the company’s capacity to store or process it. It can involve unstructured data, but equally it can be structured data that simply exceeds what relational databases can handle.

The classic examples of Big Data usage are where very large companies like Google, Facebook or Tesco use it to gain deep insights into their customers’ behaviour using the data they collect from them on a daily basis.

So, for example, Big Data is how Tesco can measure in real time the impact an outdoor advert has on the Twittersphere, and how that might then influence people’s buying habits all over the country.

However, there are some crimes being committed by marketing departments of analytics companies all over the world, branding anything vaguely related to analytics as “Big Data”.

There has even been a concerted push to say that “Big Data” is for all. Absolute rubbish!

The idea of most SMEs having a real use for Big Data is absurd. Unless the SME in question is a social media analytics agency there just isn’t the justification for them taking on such a toolset.

Take an average small business. How about, for example, a 30-person accountancy firm in Dorking?

Will they ever really need Big Data? Are they ever really going to need to process data that is beyond their storage or processing capacity? That is even less likely now that cloud computing lets you scale your infrastructure up or down according to your requirements.

They just aren’t going to have so much information that the analytics couldn’t be handled by a traditional database or a NoSQL store such as MongoDB or Cassandra.

According to some commentators, there are three characteristics that define Big Data – the so-called three Vs – variety, volume and velocity (other commentators see more than three, but that's a different question).

It is these three Vs that separate those who do need real Big Data solutions from those who don’t.

Firstly, I actually don’t fully agree with “variety” being included at all. This implies that “Big Data” is one solution to all of the problems – for all types of data. But that is like using a hammer when you need a fork.

Volume - how much data have you got that needs to be processed? This is all relative. I know companies that say, “we don’t have a lot of data - only a few hundred TB” and others who think they have huge amounts when they have just a few hundred GB.

If you are talking about tens of TB, you aren’t necessarily in the Big Data arena. Hundreds, probably yes. More important, however, is the velocity at which you are creating that data and the speed at which you need the analysis back. If you need real-time analytics to dictate business actions, then this pushes you into the Big Data arena even if the data volumes are small.

For example, gambling companies that need to make real-time changes at a large scale need Big Data solutions. Small accountancy firms don’t. Their decisions are not, and do not need to be, made in real time or based on huge data sets.
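To make that rule of thumb concrete, here is a rough sketch in Python. It is only an illustration of the reasoning above, not a formal test: the 100 TB cut-off is my reading of “hundreds, probably yes”, and the `Workload` class, the function name and the example volumes are all hypothetical.

```python
from dataclasses import dataclass

# Assumed cut-off, taken from the rule of thumb in the text:
# tens of TB is not necessarily Big Data, hundreds of TB probably is.
VOLUME_THRESHOLD_TB = 100


@dataclass
class Workload:
    name: str
    volume_tb: float        # data to be processed, in terabytes
    needs_realtime: bool    # must analysis dictate actions in real time?


def likely_needs_big_data(w: Workload) -> bool:
    """Rough heuristic: real-time needs or hundreds of TB."""
    return w.needs_realtime or w.volume_tb >= VOLUME_THRESHOLD_TB


# Illustrative figures only; they are not measurements of real companies.
gambling = Workload("gambling company", volume_tb=5, needs_realtime=True)
accountancy = Workload("30-person accountancy firm", volume_tb=0.5,
                       needs_realtime=False)

for w in (gambling, accountancy):
    print(f"{w.name}: {likely_needs_big_data(w)}")
```

Velocity is deliberately allowed to override volume here, because a need for real-time answers puts you in Big Data territory even when the data set itself is small.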

Yes - there will be exceptions to the rule but honestly, the most pressing business issues for our 30-person accountancy firm in Dorking aren’t Big Data and analytics.

Even if this firm is incredibly progressive and an avid user of social media, it doesn’t need Big Data to take the social temperature of its customer base like Coca-Cola might.

Big Data is also a useless exercise if this real-time information isn’t then used to make or support key business decisions. In fact, even in larger organisations, this is where it sometimes falls down.

It can give you some pretty analytics, but it’s useless until it creates actual knowledge that can help you move your organisation in the right direction.

If, for example, a national department store has a range of clothing that is successful in 90 per cent of its shops but not selling in the remaining 10 per cent, then analytics, if used properly, can pinpoint the reason why. If the analytics say it’s simply a case of the items being displayed in the wrong place, then the merchandising teams can do a quick re-jig and maximise the potential of the top item.

However, if you run a chain of five shops, you just won’t have the volume, velocity and variety of data to need a ‘Big Data’ solution to solve the problem.

There is undoubtedly a place for Big Data in the business world. But it has become a victim of the marketing machine. It is surrounded by so much hype that it is unclear just where the benefits lie and what type of organisation can take advantage.

Like cloud computing, it is not a one-size-fits-all solution. Now more than ever, organisations need to consider carefully the pros and cons and likely ROI before making any investment. Ignoring the hype and looking internally at your own business issues will make it very clear whether this is a worthwhile investment.

Radek Dymacz is head of R&D for cloud provider Databarracks. Radek studied computer science in Poland and joined Databarracks as an open source and Linux/Unix specialist five years ago. His other areas of expertise include object storage, enterprise storage and cloud security.

Get more of Radek’s musings on his Twitter account @ChroniclesOfRD