‘There is no law of computer science that says that AI must remain expensive and must remain large’: IBM CEO Arvind Krishna bangs the drum for smaller AI models
Lightweight, domain-specific models will be the go-to option for enterprises moving forward


IBM CEO Arvind Krishna believes smaller, domain-specific AI models represent a prime opportunity for enterprises to truly capitalize on the technology.
Speaking at IBM’s 2025 Think conference in Boston, Krishna reflected on the course of the generative AI boom so far, noting that the industry has matured significantly compared to last year - as have enterprises and their expectations of the technology.
“As I think about last year to this year, there’s one really big difference that I’m feeling, which is that AI has moved from experimentation to a focus on unlocking business value,” he told attendees.
“People are worrying about what is the use-case? How do I get my business to scale leveraging AI? And I think that's a real big difference, because that means, if I sort of think about hype cycles, that means the hype cycle’s kind of fading, and we are now thinking about adoption, we're thinking about ROI, we're thinking about business value.”
The initial focus on large, cumbersome AI models has waned, Krishna claimed, and is instead shifting towards more finely curated options aimed at tackling specific business challenges.
A key factor is that businesses adopting AI solutions are dead set on supercharging productivity. Larger models have helped here, but they often amount to a one-size-fits-all approach.
In contrast, smaller models are now proving vital by enabling enterprises to harness internal data in a more efficient and strategic manner, Krishna noted.
“I think AI is the source of productivity for this era,” he said. “But not all AI is built the same and not all AI is built for the enterprise.
“Why do I make that claim? 99% of all enterprise data has been untouched by AI. So if you need to unlock the value from that 99%, you need to take an approach to AI that is tailored for the enterprise.
“If you think about the massive general purpose models, those are very useful, but they are not going to help you unlock the value from all of the data inside the enterprise.”
Smaller models have a key advantage in that they inherently allow enterprises to derive greater value from enterprise data on a case-by-case basis. Ultimately, this not only helps businesses drive productivity, but also allows them to balance costs and target specific areas where the technology has maximum impact.
“To win, you are going to need to build special purpose models, much smaller, that are tailored for a particular use-case that can ingest the enterprise data and then work,” he explained.
“So you'll think about it, and then you’ll say, ‘well what about accuracy?’ - go look at leaderboards. Smaller models are now more accurate than larger models. What about intellectual property? That is where the open nature of some of the smaller models comes in.”
When assessing specific domains in which to deploy AI solutions, Krishna added that it’s “a lot easier to build a smaller model that can go after that”.
“So as we begin to go forward in that, you can think about these models that are three, eight, 13, 20 billion parameters, as opposed to 300 or 500 and then you begin to think about, well what advantage do I get? They're incredibly accurate. They are much, much faster,” he said.
“They're much more cost effective to run, and you can choose to run them where you want.”
IBM can find its niche with small AI models
It’s in this approach that Krishna believes IBM has a major advantage. The company’s watsonx platform, for example, allows enterprises to build applications designed to tackle specific challenges and areas of the business.
Similarly, Krishna pointed to IBM’s Granite model range, which he said is “a lot cheaper than some of the other alternatives”. These products have become core focus areas for IBM, and it intends to continue investing heavily in both moving forward.
As the industry matures and progresses, Krishna believes this focus on smaller model ranges will have a profound impact, opening up opportunities for enterprises in a trickle-down-style effect.
“As technology comes down the cost curve, it opens up many, many more opportunities,” he told attendees.
“It opens up a huge amount of aperture in terms of the problems you can afford to solve. That’s kind of been the curve it is on. We sometimes capture it in the phrase of, well, that's how we democratize technology, or rather, make it accessible to all, because the cost has come down.
“That is what we are very, very focused on. There is no law of computer science that says that AI must remain expensive and must remain large.”
Small language models are all the rage

Krishna isn’t the first to suggest that small language models will make all the difference for enterprises in the coming year. As organizations weigh the largest models against the lowest-cost models that meet their needs, looking to speed up returns on their AI investments, developers are pivoting to lighter models that perform well at the edge.
The likes of Google’s Gemma 3 and Meta’s Llama 4 Scout can be run on a single Nvidia H100 GPU, while in the public cloud OpenAI’s GPT-4.1 nano is its most cost-competitive model for text-only tasks.
By leaning into this burgeoning field with its lightweight Granite model family, IBM could finally carve out a meaningful place for itself in the AI market.
Ensuring that each of its SLMs has a clear, competitive, domain-specific use case will be the key here. While there’s little chance of taking on the AI leaders at their own game – namely general purpose models – there’s clear demand for models that simply do one kind of task very well.
Mixture of experts (MoE) models complicate the picture of how ‘small’ SLMs actually are. In this architecture, an AI model is made up of several smaller sub-models, each an ‘expert’ at specific tasks, which are only activated when needed.
The MoE approach allows models which, on paper, have many billions more parameters than the smallest models available on the market to run at a similar latency and cost per token. Take Llama 4 Scout, for example. Meta’s smallest frontier model has a total of 109 billion parameters, but only 17 billion active parameters at any given time, giving it a more lightweight computational footprint.
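The total-versus-active distinction can be sketched with a toy routing layer. This is a minimal illustration of top-k expert gating, not Meta's implementation: the 16 experts and top-1 routing loosely mirror Llama 4 Scout's reported setup, but the dimensions are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 64          # toy hidden dimension
N_EXPERTS = 16  # total experts in the layer
TOP_K = 1       # experts activated per token

# Each expert is a simple feed-forward weight matrix; the router scores experts per token.
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D, N_EXPERTS)) / np.sqrt(D)

def moe_forward(x):
    """Route a token vector to its top-k experts; only those weights are used."""
    scores = x @ router                        # one score per expert
    top = np.argsort(scores)[-TOP_K:]          # indices of the chosen experts
    w = np.exp(scores[top])
    w /= w.sum()                               # softmax over the chosen experts only
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

total_params = sum(e.size for e in experts)    # parameters stored in memory
active_params = TOP_K * D * D                  # parameters used per token
print(f"total expert params:  {total_params}")
print(f"active expert params: {active_params}")
```

The layer stores all 16 experts' weights but multiplies each token through only one of them, which is why a model can carry a large total parameter count while its per-token compute resembles a much smaller model's.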
As these grow in popularity, SLM developers will have to demonstrate concrete benefits to sticking with their approach or adapt to offer hybrid models. The jury is still out on which approach delivers the best quality or lowest latency for AI output, so there is all to play for here.
For example, OpenAI boasts that GPT-4.1 nano takes just five seconds to generate its first output tokens when given an input of up to 128,000 tokens. If IBM could demonstrate a comparable or faster response time through its open source Granite models, enterprises would be able to rely more heavily on AI at the edge.
The more domain-specific IBM gets, the more potential for wins in very niche markets.

Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.
He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.
For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.