
What is big data?

Big data is a big deal for the business world - here's what it's all about...

Data is constantly referred to as "the new oil", while politicians compare tech giants to the US oil companies that rose to power over a century ago.

This "new oil" isn't being sucked up from the ground. Instead, it's being harvested in large volumes from people using online services, tools and applications.

There's so much data, in fact, that without the right tools to store and process it, organisations can struggle to make sense of it. This huge array of information is collectively termed 'big data'.

You only have to think of all the times you fill out an online form, sign up for a digital service, or complete a questionnaire to have an idea of the volumes being generated every day. Add to this the vast quantities of data generated by web-connected devices, social media and sensors all over the world, and you have an unimaginably large amount of information to contend with.

The growth of big data is incredibly valuable to businesses. If they can collect and store it properly, and analyse it effectively, they can extract valuable information and insights that can help them make important decisions.

Elements of big data

Before taking any steps towards implementing a big data analytics programme, it's important to know the fundamental principles that make it different to other data a company may traditionally find in its data stores.

Although there's some disagreement over what exactly constitutes big data, most experts agree on five core elements: volume, velocity, variety, veracity and value.

Volume: This is the defining component of big data. In the past, most of an organisation's data was generated by its employees; today, it is mostly generated by systems, networks, social media and IoT devices, producing a massive amount of data that needs analysing.

Velocity: With information arriving from so many different sources, the pace at which data flows into an organisation matters enormously. This flow is huge and continuous, spanning emails, text messages, social media posts and more, all arriving every minute of every day. Valuable business decisions should be based on the real-time data available, which needs to be processed and analysed as it arrives. Doing so requires highly available systems with failover capabilities to cope with the data pipeline.
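As a loose illustration of what processing data on arrival (rather than in periodic batches) can look like, the Python sketch below maintains a running aggregate over a simulated stream. The event values are invented for the example; in production the stream would typically be messages from a queue or socket.

```python
def event_stream():
    # Simulated continuous feed of numeric events. A real pipeline
    # would read these from a message queue or network connection.
    for value in [3, 7, 2, 9, 4]:
        yield value

# Process each event as it arrives, keeping a running total and count
# rather than waiting to collect a full batch before analysing.
total = 0
count = 0
running_avg = 0.0
for value in event_stream():
    total += value
    count += 1
    running_avg = total / count

print(running_avg)  # average updated incrementally as events arrive
```

The point of the sketch is the shape of the loop: state is updated per event, so an up-to-date figure is always available without re-reading the whole dataset.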


Variety: Data types and sources vary widely, and they come in two forms: structured and unstructured. Structured data typically comes from a database, so it's well organised and clear. Unstructured data, on the other hand, comes from elsewhere, including social media sites like Facebook or Twitter, and is generally more chaotic, spanning formats such as photos, videos and audio files. Because unstructured data is so varied, it can be difficult to process, analyse and store; making sense of this chaotic portion is a core part of what big data tools are designed to do.
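To make the distinction concrete, here is a minimal Python sketch (the field names and sample records are invented for illustration). Structured records map straight onto named columns and can be queried directly, while an unstructured social media post needs some parsing before even a simple question can be answered.

```python
import csv
import io
import re

# Structured data: rows with a fixed schema, queryable as-is.
structured = io.StringIO("customer_id,country,spend\n101,UK,49.99\n102,DE,12.50\n")
rows = list(csv.DictReader(structured))
uk_spend = sum(float(r["spend"]) for r in rows if r["country"] == "UK")

# Unstructured data: free text with no schema; extracting anything
# useful requires pattern-matching or more sophisticated analysis.
post = "Loving the new app! Ordered twice this week #happycustomer"
hashtags = re.findall(r"#\w+", post)

print(uk_spend)
print(hashtags)
```

Scaled up to photos, video and audio, the gap widens further, which is why handling variety is treated as a distinct challenge in its own right.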

Value: You might have a huge quantity of data to work with, but it won't matter unless you use it intelligently and understand how much value it can add. When you put the other Vs together, ask yourself whether the insights you gain from analysis will actually be worthwhile for your business or organisation. If your data isn't used intelligently, it may not end up providing much value at all.

Veracity: With so much data flowing in at such volume, variety and velocity, it can be challenging to evaluate its quality, and the quality of the data directly shapes the quality of any analysis built on it. When launching a big data project, it's wise to make sure the data is clean and that processes are in place to prevent unwanted information from building up and degrading your analysis, and therefore your results.

