What is machine learning and why is it important?
No longer confined to the world of science fiction, machine learning represents a new frontier in technology
Machine learning (ML) is the process of teaching a computer system to make predictions based on a set of data. By feeding a system a series of trial and error scenarios, machine learning researchers strive to create artificially intelligent systems that can analyse data, answer questions, and make decisions on their own.
Machine learning algorithms are typically trained on sample data, which helps them with inference and pattern recognition in future decisions and removes the need for the explicit instructions from humans that traditional computer software requires.
What is machine learning?
Machine learning relies on a large amount of data, which is fed into algorithms to produce a model that the system then uses to make its predictions. For example, if the data you’re using is a list of the fruit you’ve eaten for lunch every day for a year, you could use a prediction algorithm to build a model of which fruits you are likely to eat, and when, in the following year.
The process is based on trial and error scenarios, usually using more than one algorithm. These algorithms are classed as linear models, non-linear models, or even neural networks. Which of them you use will ultimately depend on the set of data you’re working with and the question you’re trying to answer.
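The fruit example can be sketched in a few lines of Python. This is a minimal illustration rather than a realistic prediction algorithm: the "model" is simply a count of which fruit was eaten on which weekday, and the function names and the shortened lunch log are hypothetical.

```python
from collections import Counter, defaultdict

def build_fruit_model(lunch_log):
    """Count how often each fruit was eaten on each weekday."""
    counts = defaultdict(Counter)
    for weekday, fruit in lunch_log:
        counts[weekday][fruit] += 1
    return counts

def predict_fruit(model, weekday):
    """Return the fruit eaten most often on that weekday, or None if the day is unseen."""
    if weekday not in model:
        return None
    return model[weekday].most_common(1)[0][0]

# Hypothetical data: a real log would cover every lunch for a year.
lunch_log = [
    ("Mon", "apple"), ("Tue", "banana"), ("Mon", "apple"),
    ("Tue", "banana"), ("Mon", "pear"), ("Tue", "orange"),
]

model = build_fruit_model(lunch_log)
print(predict_fruit(model, "Mon"))  # the most frequent Monday fruit in the log
```

The more data the log contains, the more reliable the counts become, which is why machine learning depends so heavily on the amount of data available.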
What are the types of machine learning algorithms?
Machine learning algorithms learn and improve over time using data, and do not require explicit human instruction. The algorithms are split into three types: supervised, unsupervised, and reinforcement learning. Each type of learning has a different purpose and enables data to be used in different ways.
Supervised learning involves labelled training data, which is used by an algorithm to learn the mapping function that turns input variables into an output variable. There are two types of supervised learning: classification, which is used to predict the outcome of a given sample when the output is in the form of a category, and regression, which is used to predict the outcome of a given sample when the output variable is a real value, such as a 'salary' or a 'weight'.
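To make the regression case concrete, here is a minimal sketch that learns a mapping from one input variable to a real-valued output using ordinary least squares. The choice of a straight-line model, the function name, and the years-of-experience and salary figures are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to labelled examples by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labelled training data: years of experience -> salary.
years = [1, 2, 3, 4, 5]
salaries = [32000, 34000, 36000, 38000, 40000]

slope, intercept = fit_line(years, salaries)

# Use the learned mapping to predict the output for an unseen input.
predicted_salary = slope * 6 + intercept
print(predicted_salary)
```

A classification model is trained on labelled examples in the same way, but its output is a category rather than a number.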
An example of a supervised learning model is the K-Nearest Neighbors (KNN) algorithm, which is a method of pattern recognition. KNN essentially involves using a chart to reach an educated guess on the classification of an object based on the spread of similar objects nearby.
In the chart above, the green circle represents an as-yet unclassified object, which can only belong to one of two possible categories: blue squares or red triangles. In order to identify what category it belongs to, the algorithm will analyse what objects are nearest to it on the chart – in this case, most of its nearest neighbours are red triangles, so the algorithm will reasonably assume that the green circle should belong to the red triangle category.
Unsupervised learning models are used when there are only input variables and no corresponding output variables. They use unlabelled training data to model the underlying structure of the data.
There are three types of unsupervised learning algorithms: association, which is extensively used in market-basket analysis; clustering, which is used to group samples so that objects within the same cluster are more similar to each other than to objects in another cluster; and dimensionality reduction, which is used to trim the number of variables within a data set while keeping its important information intact.
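The nearest-neighbour vote can be sketched as follows. The coordinates are invented for illustration and are not taken from the chart, and the choice of k = 3 is an assumption; in practice k is tuned to the data.

```python
from collections import Counter
from math import dist

def knn_classify(labelled_points, query, k=3):
    """Label the query point by majority vote among its k nearest labelled points."""
    nearest = sorted(labelled_points, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical coordinates for the two known categories.
points = [
    ((1.0, 1.0), "blue square"),
    ((1.5, 2.0), "blue square"),
    ((2.0, 1.0), "blue square"),
    ((5.0, 5.0), "red triangle"),
    ((5.5, 4.5), "red triangle"),
    ((6.0, 5.5), "red triangle"),
]

green_circle = (5.0, 4.0)  # the unclassified object
print(knn_classify(points, green_circle, k=3))
```

Because the vote depends on how many neighbours are consulted, a different value of k can change the answer for objects that sit near the boundary between two categories.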
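As an illustration of clustering, the sketch below uses k-means, one common clustering algorithm that the article does not name specifically. The points and the two starting centroids are hypothetical, and fixing the starting centroids by hand (rather than choosing them randomly) is a simplification to keep the example repeatable.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Group unlabelled points around centroids by repeated assign-and-update steps."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabelled data with two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(clusters)
```

No labels are supplied at any point: the grouping emerges purely from how close the samples are to one another.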
Reinforcement learning allows an agent to decide its next action based on its current state by learning behaviours that will maximise a reward. It's often used in gaming environments where an algorithm is provided with the rules and tasked with solving the challenge in the most efficient way possible. The model will start out by acting randomly, but over time, through trial and error, it will learn where and when it needs to move in the game to maximise points.
In this type of training, the reward is simply a state associated with a positive outcome. For example, an algorithm will be 'rewarded' with a task completion if it is able to keep a car on a road without hitting obstacles.
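A minimal sketch of this trial-and-error loop is shown below, using tabular Q-learning, one standard reinforcement learning method that the article does not name specifically. The environment is a hypothetical five-position track where reaching the last position earns the reward; the learning rate, discount factor, and exploration rate are illustrative assumptions.

```python
import random

N_STATES = 5         # positions 0..4 on a short track; position 4 is the goal
ACTIONS = (-1, +1)   # index 0 moves left, index 1 moves right

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:
                # Explore: occasionally try a random action.
                action = rng.randrange(len(ACTIONS))
            else:
                # Exploit: take the best-known action, breaking ties randomly.
                best = max(q[state])
                action = rng.choice([a for a, v in enumerate(q[state]) if v == best])
            # Moving past either end of the track leaves the agent at the edge.
            next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # Nudge the estimate towards reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
# Read off the learned behaviour for every non-goal position.
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(policy)
```

Early episodes are close to a random walk because every action value starts at zero. As rewards propagate back through the table, the agent increasingly heads straight for the goal.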
What is machine learning used for?
Nowadays, lots of businesses possess a huge amount of information, produced by actions, computers, events, people, and gadgets, and its sheer volume makes it tricky to analyse or learn anything from it.
Thanks to ML, we can harness this data to make something useful out of it. For example, in medical analysis, it would take people a very long time to find patterns in thousands of MRI scans. A machine, on the other hand, can be fed the data and discover patterns within a matter of seconds, as long as the information has been labelled correctly.
Where is machine learning used?
You might be unsurprised to learn that machine learning forms the basis of Google Search, which is arguably one of the most successful deployments of the technology to date. A number of different ML algorithms feed this beast of a search engine, helping it to read and analyse the text you enter into it. The results are then customised based on the user’s search history and behaviour in the digital world. If someone were to search for a term like “Java”, they might receive results about either coffee or the programming language, depending on their internet behaviour and browsing history.
Smart cities and driverless cars also rely heavily on the advancement and development of ML technologies. With smart cities, for example, many of the technologies used to power this innovation are now entering the public realm, including facial recognition, which utilises ML technology to identify people based on their facial characteristics. However, technology like this has also been branded as controversial since it usually involves some kind of frequent surveillance of citizens.
What is machine learning data bias?
The ML developer community has long grappled with the problem of bias – the implanting of unfairness into public-facing and critical software – particularly as machine learning technologies improve and are more widely adopted.
Bias is not always easy to spot, and can exist in the data itself. If an organisation seeks to employ more diversely, for example, but only uses CVs belonging to its present workers as the training data, then the ML application will inadvertently favour candidates of a similar make-up.
Some governments have been spooked by this form of machine learning, and a number of them have implemented regulations that aim to limit its use. In the UK, the Cabinet Office's Race Disparity Unit and the Centre for Data Ethics and Innovation (CDEI) teamed up to research potential bias in algorithmic decision-making. The US government also decided to pilot diversity regulations for AI research, intended to minimise the risk of racial or sexual bias in computer systems.
What machine learning can and cannot do
If the idea of machine learning or artificial intelligence causes you to break out in a nervous sweat as you think back to futuristic science fiction films you’ve seen in the past, then the good news is that there’s only so much that this technology can do. Contrary to the idea of a robot uprising, or an omniscient AI taking on the human race, there are fixed limitations to what we can do with this technology.
This doesn’t mean it's useless, or can’t be harnessed properly. It’s just that, as it stands now, the technology can only be used for very specific and fairly inflexible purposes – the idea of an all-knowing, multi-purpose AI is still very much confined to science fiction.
What machine learning can be used for:
- Voice recognition
- Speech-to-text transcription
- Providing recommendations based on search terms
- Image recognition
What machine learning can’t be used for:
- Recognising human intentions
- Market analysis
- Recognising cause and effect relationships
- Making ethical or moral decisions by itself