Want to deliver a successful agentic AI project? Stop treating it like traditional software

Designing and building agents is one thing, but testing and governance are crucial to success

An MIT study last year found that 95% of generative AI pilots fail to even reach the production stage, and with agentic AI the situation isn’t much better.

But while headlines and critics focused heavily on the 95%, Dael Williamson, EMEA CTO at Databricks, believes IT leaders should take stock of what the five percent club are up to – and they’re making notable tweaks to traditional software processes.

Many businesses stuck on this front are still working based on time-tested rules, but with agents the game has changed and enterprises face a multitude of new considerations and challenges.

“Most of the companies that are struggling, so the 95%, what we see with them is they’re kind of treating it like it’s still traditional software,” he told ITPro.

“Traditional software was handwritten by humans, but if you follow the software development lifecycle, 80% of time was spent designing and building.”

With agentic AI development, however, Williamson said the situation has almost flipped on its head. Simply put, if you’re still spending 80% of your time designing and building, you’re not testing – and that is where the friction occurs.

“These systems are far more probabilistic, and that means you’ve got to test more,” he explained. “The companies that are in the 5%, they’re building evals, the companies in the 5% are thinking about security, they’re thinking about narrowing the context of what an agent does.”

Evaluations are critical within the adoption process, helping fine-tune and hone agents and ensuring they deliver high quality outputs.
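To make the idea of an eval concrete, here is a minimal sketch in Python. It is not taken from Databricks or from any company Williamson describes: `fake_agent`, `pass_rate`, and `CASES` are hypothetical names, and the "agent" is a random stand-in rather than a real model. The point it illustrates is that a probabilistic system is run many times per test case and judged on its pass rate, not on a single run.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical stand-in for a probabilistic agent: it answers correctly
# most of the time and occasionally returns a wrong answer.
def fake_agent(prompt: str, rng: random.Random) -> str:
    return "4" if rng.random() < 0.9 else "5"

# Each eval case pairs a prompt with a check that accepts or rejects an output.
Case = Tuple[str, Callable[[str], bool]]

CASES: List[Case] = [
    ("What is 2 + 2?", lambda out: out.strip() == "4"),
]

def pass_rate(agent, cases: List[Case], runs: int, seed: int = 0) -> float:
    """Run every case `runs` times and return the fraction of passing outputs."""
    rng = random.Random(seed)  # seeded so the eval itself is repeatable
    passed = 0
    total = 0
    for prompt, check in cases:
        for _ in range(runs):
            total += 1
            if check(agent(prompt, rng)):
                passed += 1
    return passed / total

rate = pass_rate(fake_agent, CASES, runs=200)
print(f"pass rate: {rate:.1%}")
```

In practice a team might set a threshold on the pass rate and block a release that falls below it, in the same way a failing test blocks a conventional build. The threshold and the checks themselves would depend on the use case.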

According to figures from Databricks’ State of AI Agents report, enterprises that focused heavily on this front were found to get nearly six times as many AI projects into production as those slacking on evals.

That focus on narrowing exactly where agents operate within IT environments and what specific tasks they’ve been allocated is critical, Williamson noted.

Many businesses view agents as a set of catch-all, click-and-go tools. However, curation based on specific individual use-cases is critical. Databricks’ report found 40% of the top AI use-cases focused on addressing practical concerns, for example.

Tasks such as onboarding are prime fodder for agents while business functions like customer support are ripe for agent infusion.

Research from Cisco last year projected that agents will be handling more than two-thirds of customer support interactions by 2028, and this function has been among the top areas set for automation with agents.

The rise of multi-agent systems

The type of agents and their role within the broader ecosystem is equally important, Williamson told ITPro. Businesses are increasingly shifting toward “multi-agent workflows”, according to Databricks.

Databricks recorded a 327% increase on this front over the last year, with enterprises moving away from harnessing individual chatbots toward multi-agent setups. This involves multiple specialist agents working in tandem and contributing to individual workflows.

Williamson compared this to a home renovation, whereby the homeowner would have multiple different contractors with specific skills contributing to the overall project.

“Think of it like when you do renovation work on a house,” he explained. “You’ll have a plumber, you’ll have somebody doing the rendering, you’ll have somebody doing the windows, so very specialist tasks.”

Analysis of the company’s Agent Bricks service shows adoption of dedicated specialist agents is growing across three key specialisms: Information Extraction agents, Knowledge Assistant agents, and Supervisor agents.

Information extraction agents accounted for 31% of all agent usage across the Databricks platform, with the report noting this popularity “reflects companies’ needs to leverage both structured and unstructured data”.

This particular type of agent is helping drastically improve data sourcing, Williamson noted, allowing enterprises to tap into all their data and helping to improve visibility.

“The information extraction actually points to a really challenging problem in a lot of use cases that exist in enterprises where you're mining a lot of documents, you’re mining a lot of unstructured stuff,” Williamson said.

“And the fact that we can extract information out of large repositories and actually utilize that in raw material ways has had quite a huge breakthrough.”

Robust guardrails with supervisor agents

Meanwhile, supervisor agents are proving vital in agentic workflows. These essentially oversee and orchestrate other agents, validating activities and streamlining processes.

Williamson told ITPro this is where Databricks is seeing “the biggest emphasis” among enterprises adopting the technology.

“You have the agent that does the work, and then the agent that inspects the work, and then the supervisor oversees that all of this goes in accordance with the outcome expected,” he said.
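The three roles Williamson describes can be sketched in a few lines of Python. This is an illustrative assumption about how such a workflow might be wired together, not Databricks’ implementation: `worker`, `inspector`, `supervisor`, and `Review` are hypothetical names, and the agents are plain functions standing in for model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Review:
    approved: bool
    reason: str

def worker(task: str) -> str:
    """Hypothetical worker agent: produces a draft for the task."""
    return f"Summary of {task}"

def inspector(task: str, draft: str) -> Review:
    """Hypothetical inspector agent: applies simple checks to the draft."""
    if not draft.strip():
        return Review(False, "empty output")
    if task not in draft:
        return Review(False, "output does not mention the task")
    return Review(True, "ok")

def supervisor(
    task: str,
    worker_fn: Callable[[str], str] = worker,
    inspector_fn: Callable[[str, str], Review] = inspector,
    max_attempts: int = 3,
) -> dict:
    """Retry the worker until the inspector approves, else escalate."""
    for attempt in range(1, max_attempts + 1):
        draft = worker_fn(task)
        review = inspector_fn(task, draft)
        if review.approved:
            return {"status": "accepted", "output": draft, "attempts": attempt}
    # Nothing passed inspection: hand the task to a human instead of
    # letting unverified output through.
    output: Optional[str] = None
    return {"status": "escalated", "output": output, "attempts": max_attempts}

result = supervisor("Q3 invoices")
print(result["status"], result["attempts"])
```

The design choice worth noting is that the supervisor never returns output the inspector has rejected. A task that fails every attempt is escalated, which is one way to keep noisy output out of downstream systems.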

Supervisor agents have already been touted as a means of improving guardrails and governance for enterprises dabbling in agentic AI. Analysis from Gartner last year predicted that ‘guardian agents’ will account for up to 15% of all agents deployed globally by 2030.

These are designed specifically to ensure other agents within IT environments are working according to company policies and to bolster security.

Williamson said that the use of supervisor agents tackles a long-running concern with generative AI, and more recently agentic AI: hallucinations and the risks associated with rogue bot activities.

“Most companies freak out about the fact that this could be nondeterministic and they talk about hallucinations and things like that,” he said. “So in order to reduce that you want to almost be able to test the agent outputs and kind of kill the stuff that's noise.”

“It’s actually not that different from how unit tests were used in software engineering to test code quality and integration tests were used in software engineering to figure out how code integrated with other code. This is just building on that kind of thinking.”

Databricks’ report noted that companies embracing stronger AI governance policies put more than 12 times as many projects into production. Indeed, this domain is now a top investment priority among those hoping to capitalize on the success of industry counterparts.

This is a tricky area for many businesses, however. Williamson told ITPro that those recording success with agentic AI adoption are often tech companies themselves and digital natives.

These companies might be the ones to watch as far as successful adoption rates go, but the majority of businesses don’t have the same workforce capabilities.

“They have a more early adopter software engineering workforce,” he said. “You probably find that putting in the governance guardrails is almost more natural to them. So that's kind of one thing to just account for.”

“Governance is actually quite a tricky thing. Let's take a sort of single market enterprise. So they're not a multinational, but a single market enterprise, but they’re in the upper billions in revenue,” Williamson added.

“They still have a lot of different departments, they have a lot of different functions. So doing enterprise wide governance is actually incredibly tricky.”

Ross Kelly
News and Analysis Editor
