Computer vision – have you seen the light?

Computer vision sensor on a production line

published 6 July 2020

The popularity of computer vision (CV) has skyrocketed in recent years and the technology is finding its way into a wide variety of sectors, from retail and logistics through to healthcare and travel.

CV is a form of technology that allows machines to ‘see’ by acquiring visual data, interpreting images, and making decisions based on what they find. It enables automation of some tasks where human vision would usually be needed, but can also move beyond typical human capabilities to look for patterns and associated meaning that might not be obvious to the human eye.

“For example, being able to identify the tell-tale signs of metal fatigue in an industrial part and then make distinctions in classification such as estimated time to failure,” highlights Dr Jabe Wilson, consulting director of text and data analytics at Elsevier, a data and analytics organisation.


CV can also ‘see’ beyond a human’s limited range of visibility, from X-ray to infrared, giving increased understanding of the environment and heightened capabilities.

The technology has been around for several decades, but advances in machine learning and deep learning have dramatically improved its capabilities. This has led to a leap in popularity, and analyst and consultancy firm Omdia expects to see global CV software revenue grow from US$2.9bn in 2018 to US$33.5bn by 2025.

What’s driving CV uptake?

“Huge boosts in computational processing, as well as the volume of data over the past five years, have helped to advance the accuracy of algorithms in the field. This has in turn breathed new life into the technology,” says Nick McQuire, VP of enterprise research at market research firm CCS Insight.

At the same time, the rise of open source frameworks, such as OpenCV, has enabled developers to focus on the actual applications at hand, and not have to worry about developing the underlying image processing capabilities.

“This allows acceleration of development as engineers no longer have to start from scratch creating the necessary functions,” notes Adam Taylor, a fellow of the Institution of Engineering and Technology (IET). “The combination of sensor technology, processing capability and open source frameworks have enabled the acceleration of CV solutions for an increasing range of applications.”
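To give a sense of what starting from scratch can involve, the minimal Python sketch below hand-codes one basic image processing step: a horizontal Sobel edge filter applied to a tiny greyscale image. The image values and function names are illustrative assumptions rather than anything taken from the products or reports mentioned here. In OpenCV, the same kind of operation is available as a single library call (cv2.Sobel), which is the sort of groundwork developers no longer need to write themselves.

```python
# Hand-written horizontal-gradient (Sobel) filter on a tiny greyscale image.
# Illustrative only: frameworks such as OpenCV expose this as one call
# (cv2.Sobel), so application developers rarely need to write it.

SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def sobel_x(image):
    """Return the horizontal gradient for the interior pixels of a 2D list."""
    height, width = len(image), len(image[0])
    result = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            # Weighted sum of the 3x3 neighbourhood centred on (x, y).
            for ky in range(3):
                for kx in range(3):
                    total += SOBEL_X[ky][kx] * image[y + ky - 1][x + kx - 1]
            result[y - 1][x - 1] = total
    return result


# Hypothetical 4x6 image: dark on the left (0), bright on the right (255).
IMAGE = [
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
]

edges = sobel_x(IMAGE)
for row in edges:
    print(row)
```

The non-zero values in each output row mark the columns where the image switches from dark to bright, which is the vertical edge the filter is designed to find.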

From PoCs to production

While many organisations are still experimenting with AI and allied technologies such as natural language processing (NLP), CV is one of the areas where the largest number of AI production deployments are taking place.

A recent Omdia market report profiled more than 100 use cases in 25 different industries, highlighting the reach of CV. Many proofs of concept (PoCs) that started in 2016 and 2017 have migrated to production, with new PoCs being undertaken. It also reported that several companies have created the position of chief AI strategy officer, whose job is to work out how the business can best benefit from AI, and CV has been the prime application category.

Popular applications for computer vision

According to Laura Petrone, author of GlobalData’s Computer Vision Thematic Research report, some of CV’s most popular applications include autonomous vehicles, high-level medical diagnostics, smart cities and advertising.

“CV is one of the primary technologies for ambient commerce in retail. Using sensors and machine learning in physical stores, CV technology detects when an item is removed from a shelf and who took it,” says Petrone.

“In advertising, companies like GumGum apply the technology, alongside text analytics, to uncover key contextual information from text and image content from premium publishers. In warehouses, machine vision can apply CV to industrial and manufacturing functions,” she adds, noting that in Amazon's warehouses, AI cameras and scanners watch the products stocked and automatically track which products go into which bins.

“In healthcare one of the most promising applications is automated image processing used in detecting tumors,” she continues. “Here, deep learning algorithms learn important features related to the disease from a collection of medical images and then make predictions based on that learning.”
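The learn-then-predict pattern Petrone describes can be illustrated with a deliberately tiny Python sketch. It is not a deep learning model: it swaps in a much simpler technique, a nearest-centroid classifier, which "learns" an average image for each label and then assigns a new image to the closest average. The four-pixel "scans", their values and the labels are all made up for illustration.

```python
# Toy "learn, then predict" example using a nearest-centroid classifier.
# This is a simple stand-in, NOT a deep learning model, and every pixel
# value and label below is invented for illustration.


def centroid(images):
    """Average a list of equal-length pixel vectors, pixel by pixel."""
    count = len(images)
    return [sum(pixels) / count for pixels in zip(*images)]


def squared_distance(a, b):
    """Sum of squared pixel differences between two images."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train(labelled_images):
    """Learn one average image (centroid) per label."""
    return {label: centroid(images) for label, images in labelled_images.items()}


def predict(model, image):
    """Return the label whose centroid is closest to the image."""
    return min(model, key=lambda label: squared_distance(model[label], image))


# Hypothetical 2x2 scans flattened to four brightness values (0.0 to 1.0).
TRAINING_SCANS = {
    "clear": [
        [0.1, 0.2, 0.1, 0.0],
        [0.0, 0.1, 0.2, 0.1],
        [0.2, 0.0, 0.0, 0.1],
    ],
    "suspicious": [
        [0.9, 0.8, 0.1, 0.0],
        [0.8, 0.9, 0.0, 0.1],
        [1.0, 0.7, 0.2, 0.1],
    ],
}

model = train(TRAINING_SCANS)
prediction_a = predict(model, [0.7, 0.9, 0.1, 0.1])
prediction_b = predict(model, [0.1, 0.0, 0.2, 0.1])
print(prediction_a, prediction_b)
```

A real diagnostic system would learn far richer features from thousands of full-resolution images, but the two stages are the same: fit a model to labelled examples, then apply it to images it has not seen before.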

CV can help scientists and health practitioners do their jobs more quickly and with more accuracy, adds Wilson. “Efforts such as Google’s collaboration with Moorfields Eye Hospital – where its AI, DeepMind, ‘read’ more than one million 3D eye-scan images via its neural networks to learn how to detect eye disease and alert clinicians – show us how this could benefit businesses in reality.

“In many areas of research, scientists must manually view images, a time-consuming task which requires a high degree of accuracy. CV has the potential to augment researchers by automating image analysis and helping them accelerate research and treatment timelines,” he says.

Challenges ahead

But CV isn’t without its challenges. For example, it takes a long time to develop and start monetising CV applications, and pricing and business models are still in a state of flux.

In addition, chipsets that address CV algorithms’ need for significant compute performance are only just starting to appear. Experts are also seeing an increase in methods that fool image recognition, which will need to be addressed.
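How an image recogniser can be fooled is easier to see with a toy example. The Python sketch below uses a hypothetical linear classifier with made-up weights and a made-up four-pixel image, and nudges every pixel by a small amount in the direction that lowers the classifier's score, in the spirit of the fast gradient sign method used in adversarial-example research. It is a sketch under those assumptions, not a description of any specific attack the experts cited here have observed.

```python
# Toy illustration of fooling a classifier with a small, targeted change.
# The linear "classifier", its weights and the image are all hypothetical.

WEIGHTS = [1.0, -1.0, 0.5, -0.5]


def score(image):
    """Weighted sum of pixel values; positive means the classifier says 'match'."""
    return sum(w * p for w, p in zip(WEIGHTS, image))


def classify(image):
    return "match" if score(image) > 0 else "no match"


def sign(value):
    """Return 1, 0 or -1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def perturb(image, epsilon):
    """Shift each pixel by at most epsilon in the direction that lowers the score."""
    return [p - epsilon * sign(w) for w, p in zip(WEIGHTS, image)]


original = [0.6, 0.5, 0.4, 0.3]
altered = perturb(original, epsilon=0.1)

print(classify(original))  # classification of the untouched image
print(classify(altered))   # classification after the small perturbation
```

No pixel moves by more than 0.1 on a 0-to-1 scale, yet the classification flips. Deep networks are far more complex than this linear model, but small, carefully chosen changes of this general kind are what such methods rely on.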

However, the single biggest issue with CV at the moment revolves around facial recognition.

“The technology has been fraught with problems over the past several years due to ‘stale male pale’ bias in the data. This has particularly come to the fore when used in law enforcement, as we’re seeing during the Black Lives Matter protests,” says McQuire. “It’s why we’ve seen big players, such as Google several years ago, and more recently IBM and Amazon, either pause or pull out of the facial recognition market entirely.”

Whether facial recognition technology has a future is currently unknown, but the other challenges are considered small bumps in the road rather than deal breakers.

Research in this field will continue to advance and the deep learning models that power the progression will get larger and more complex.

“It’s been a fascinating few years for CV as it migrated from universities to commercial application,” says Anand Joshi, principal analyst at Omdia. “The technology’s moved at a rapid pace and frankly we’ve just scratched the surface of what we can do with it. We still have a long way to go, but it’s definitely shaping up to be a disruptive technology.”

Keri Allan is a freelancer with 20 years of experience writing about technology and has written for publications including the Guardian, the Sunday Times, CIO, E&T and Arabian Computer News. She specialises in areas including the cloud, IoT, AI, machine learning and digital transformation.