The future of virtual assistants might lie in the metaverse

(Image: graphic symbolising a virtual assistant. Credit: Getty Images)

In a corner of half of the UK’s internet-connected households sits a smart speaker. Whether beside the sofa, alongside family photos and a coaster for a drink, or in the kitchen among toasters, blenders, and utensils, the phrases “Alexa,” “Hey Google,” and “Hey Siri” are being used to control household functions.

When you consider that smart speaker popularity in the US mirrors that of the UK, it’s no wonder the global smart speaker market was valued at $6.42 billion in 2020, with projections to reach $61.87 billion by 2028. This growth may well be music to the ears of tech giants the world over, but the market has shown signs of slowing lately. According to one analysis, the smart speaker market has slowed from year-on-year growth of 42% and 32% between 2018 and 2020 respectively, to 3.4% and 4.6% since the turn of the decade.
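For context, the projected jump from $6.42 billion to $61.87 billion implies a steep compound annual growth rate. A quick back-of-the-envelope check (a hypothetical sketch based on the figures quoted above, not part of the cited research) shows why the recent single-digit growth rates fall so far short of that trajectory:

```python
# Implied compound annual growth rate (CAGR) from the market
# figures quoted in the article: $6.42bn (2020) to $61.87bn (2028).

start_value = 6.42    # market size in $bn, 2020
end_value = 61.87     # projected market size in $bn, 2028
years = 2028 - 2020   # 8-year projection window

# CAGR = (end / start) ** (1 / years) - 1
cagr = (end_value / start_value) ** (1 / years) - 1

print(f"Implied CAGR: {cagr:.1%}")
```

That works out to roughly a third of the market's value added every year, which makes the reported 3.4% and 4.6% annual growth look like a significant shortfall against the projection.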

Similarly, in the UK, growth from 2019 to 2020 reached 136.5%, but shrank to less than a fifth of that figure (24.5%) from 2020 to 2021. Given this slowdown, applied futurist Tom Cheesewright tells IT Pro that virtual assistants have hit a ceiling in the current market.

"There's two core things that happened with voice that took a lot of the heat out of the market,” says Cheesewright. “Firstly, these assistants didn’t turn out to be a good application platform. There is a limited range of applications in which voice makes sense and, while there are interesting things you can do, it's not the sort of generalised application platform that the smartphone or the PC are. There was a lot of the excitement about virtual assistants that was driven by the possibility of a generalised application platform, but that leads to the second point, which is people realised that voice isn't appropriate. In a lot of contexts people don't want to be using voice as an interface, because fundamentally, it is going to disrupt your mental flow.”

Despite these issues, he stresses that these devices are widely adopted and offer genuine value; while they’re limited, there are still things people like doing with them. Nevertheless, the slowing market may force manufacturers and customers to rethink how virtual assistants are used. With artificial intelligence (AI) and the metaverse gaining traction, the technology might need to adapt to a very different tech landscape to the one it encountered when it first burst onto the scene.

Finding new uses for virtual assistants

As Cheesewright points out, consumer use of virtual assistants has been limited to basics such as setting timers, playing music, and turning lights on and off.

There are more uses for virtual assistants, though, and having a device that can answer a simple question with useful information justifies the investment made in the sector over the past five years, points out Joshua Kaiser, managing director at Tovie AI.


“The healthcare industry embraced virtual assistants during the COVID-19 outbreak,” he says. “In 2019, Voicebot found 7.5% of U.S. adults had used a virtual assistant for a healthcare need; in 2021, that figure surged to 21%, and there was also a boost in AI-powered conversational apps that help to screen, triage, and diagnose people who are potentially infected with COVID-19.

“Healthcare wasn’t the only service to transition to online during the pandemic, and, as a result, more companies are integrating virtual assistants to offer a smooth online experience for customers. This development has fostered a growing trend in the conversational AI industry, specifically custom virtual assistants embedded within products, web apps, and mobile apps.”

The future might be meta

As Kaiser says, despite the fairly pedestrian use cases currently in the market, voice interaction is a technology that will come into its own in the future.

Yes, it's about time we mentioned the metaverse, which has the capacity to change our lives in many ways – but one of the fundamental differences between the metaverse and the virtual reality (VR) and augmented reality (AR) technologies we have now is the interaction method.

Input devices such as keyboards and mice will be redundant in virtual worlds, paving the way for people to talk to agents as they travel through the environments they find themselves in.

“This year is going to be the year of the metaverse as giants like Meta, Baidu, Hyundai and LEGO are entering the field, and virtual assistants are likely to feature heavily,” says Kaiser. “This is not just a thing for the entertainment industry; we should expect virtual offices for employees or customers in VR, populated by virtual characters.

“Going forward, I think that all major developments in virtual assistants will be related to the metaverse and virtual character trends. Emotion management in AI is one of this year’s major trends because it allows us to make synthetic speech sound human, allowing virtual assistants to sound more natural.”

Cheesewright agrees, adding that virtual assistants will be a vital vehicle through which the metaverse is delivered. “Voice allows the user interface to make AR useful and not a flood of pop-ups, or even trying to replicate the smartphone experience on a pair of glasses.

“A shift is coming where most people will spend ten hours a day in mixed reality. In this context, voice starts to be a lot more useful, both in a consumer context and in a business context, because when your primary interface isn't a keyboard and mouse, and your personal AI gets to know you, it can be a great help and even make decisions for you.

“Voice is a natural interface, and when you have always got a pair of headphones in, or a pair of glasses, you can have that subtle, private, conversation with your virtual assistant. I think that's the moment where voice becomes much more important. There are issues to overcome with virtual assistants now, for example, even with ten years of evolution of array mics, interpretation, and algorithms, there's still huge comprehension issues. But when the microphone is physically connected to your head, the clarity, comprehension, and usability should be an awful lot better.”

Elliot Mulley-Goodbarne

Elliot Mulley-Goodbarne is a freelance journalist and content writer with six years of experience writing for B2B technology publications, notably Mobile News and Comms Business. He specialises in mobile, business strategy, and cloud technologies, with interests in environmental impacts, innovation, and competition. You can follow Elliot on Twitter and Instagram.