‘There is no law of computer science that says that AI must remain expensive and must remain large’: IBM CEO Arvind Krishna bangs the drum for smaller AI models
Lightweight, domain-specific models will be the go-to option for enterprises moving forward
IBM CEO Arvind Krishna believes smaller, domain-specific AI models represent a prime opportunity for enterprises to truly capitalize on the technology.
Speaking at IBM’s 2025 Think conference in Boston, Krishna reflected on the course of the generative AI boom so far, noting that the industry has matured significantly compared to last year - as have enterprises and their expectations of the technology.
“As I think about last year to this year, there’s one really big difference that I’m feeling, which is that AI has moved from experimentation to a focus on unlocking business value,” he told attendees.
“People are worrying about what is the use-case? How do I get my business to scale leveraging AI? And I think that's a real big difference, because that means, if I sort of think about hype cycles, that means the hype cycle’s kind of fading, and we are now thinking about adoption, we're thinking about ROI, we're thinking about business value.”
The initial focus on large, cumbersome AI models is waning, Krishna claimed, with attention shifting towards more finely curated options aimed at tackling specific business challenges.
A key factor here is that businesses adopting AI solutions are dead set on supercharging productivity. While larger models have helped on this front, they often amount to a one-size-fits-all approach.
In contrast, smaller models are now proving vital by enabling enterprises to harness internal data in a more efficient and strategic manner, Krishna noted.
“I think AI is the source of productivity for this era,” he said. “But not all AI is built the same and not all AI is built for the enterprise.
“Why do I make that claim? 99% of all enterprise data has been untouched by AI. So if you need to unlock the value from that 99%, you need to take an approach to AI that is tailored for the enterprise.
“If you think about the massive general purpose models, those are very useful, but they are not going to help you unlock the value from all of the data inside the enterprise.”
Smaller models have a key advantage in that they inherently allow enterprises to derive greater value from enterprise data on a case-by-case basis. Ultimately, this not only helps businesses drive productivity, but also allows them to balance costs and target specific areas where the technology has maximum impact.
“To win, you are going to need to build special purpose models, much smaller, that are tailored for a particular use-case that can ingest the enterprise data and then work,” he explained.
“So you'll think about it, and then you’ll say, ‘well what about accuracy?’ - go look at leaderboards. Smaller models are now more accurate than larger models. What about intellectual property? That is where the open nature of some of the smaller models comes in.”
When assessing specific domains in which to deploy AI solutions, Krishna added that it’s “a lot easier to build a smaller model that can go after that”.
“So as we begin to go forward in that, you can think about these models that are three, eight, 13, 20 billion parameters, as opposed to 300 or 500 and then you begin to think about, well what advantage do I get? They're incredibly accurate. They are much, much faster,” he said.
“They're much more cost effective to run, and you can choose to run them where you want.”
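As a rough back-of-envelope illustration of that cost argument (our own sketch, not IBM’s figures), the memory needed just to hold a model’s weights scales directly with parameter count, which is what lets the smaller sizes Krishna cites run on a single GPU or on-premises hardware:

```python
# Back-of-envelope sketch: weights alone need roughly
# parameter_count * bytes_per_parameter of memory.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory (GB) needed just to hold the model weights."""
    return params_billions * BYTES_PER_PARAM[precision]

# The sizes Krishna mentions, versus a 300B/500B-class general purpose model
for size in (3, 8, 13, 20, 300, 500):
    print(f"{size:>4}B params -> ~{weight_memory_gb(size):6.1f} GB at fp16, "
          f"~{weight_memory_gb(size, 'int4'):6.1f} GB at int4")

# An 8B model (~16 GB at fp16) fits on one commodity GPU, and quantized
# it can run on a laptop; a 500B model (~1 TB at fp16) needs a multi-GPU
# server, which is one reason small models can run "where you want".
```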
IBM can find its niche with small AI models
It’s in this approach that Krishna believes IBM has a major advantage. The company’s watsonx platform, for example, allows enterprises to build applications designed to tackle specific challenges and areas of the business.
Similarly, Krishna pointed to IBM’s Granite model range, which he said is “a lot cheaper than some of the other alternatives”. These products have become core focus areas for IBM, and it intends to continue investing heavily in both moving forward.
As the industry matures and progresses, Krishna believes this focus on smaller model ranges will have a profound impact, opening up opportunities for enterprises in a trickle-down-style effect.
“As technology comes down the cost curve, it opens up many, many more opportunities,” he told attendees.
“It opens up a huge amount of aperture in terms of the problems you can afford to solve. That’s kind of been the curve it is on. We sometimes capture it in the phrase of, well, that's how we democratize technology, or rather, make it accessible to all, because the cost has come down.
“That is what we are very, very focused on. There is no law of computer science that says that AI must remain expensive and must remain large.”
Small language models are all the rage

Krishna isn’t the first to suggest that small language models will make all the difference for enterprises in the coming year. As organizations weigh the benefits of the largest models against the lowest-cost models that meet their needs, and look to speed up returns on their AI investments, developers are pivoting to lighter models that perform well at the edge.
The likes of Google’s Gemma 3 and Meta’s Llama 4 Scout can run on a single Nvidia H100 GPU, while in the public cloud OpenAI’s GPT-4.1 nano is its most cost-competitive model for text-only tasks.
By leaning into this burgeoning field with its lightweight Granite model family, IBM could finally carve out a meaningful place for itself in the AI market.
Ensuring that each of its SLMs has a clear, competitive, domain-specific use case will be the key here. While there’s little chance of taking on the AI leaders at their own game – namely general purpose models – there’s clear demand for models that simply do one kind of task very well.
Mixture of experts (MoE) models, an architecture in which an AI model is made up of several smaller sub-models that act as ‘experts’ at specific tasks and are only activated when needed, complicate the picture when it comes to how ‘small’ SLMs actually are.
The MoE approach allows models which, on paper, have many billions more parameters than the smallest models available on the market to run at a similar latency and cost per token. Take Llama 4 Scout, for example. Meta’s smallest frontier model has a total of 109 billion parameters, but only 17 billion active parameters at any given time, giving it a more lightweight computational footprint.
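To make the active-versus-total parameter distinction concrete, here is a minimal, hypothetical routing sketch (a toy illustration, not Meta’s or IBM’s actual code): a small gating network scores every expert for each token, and only the top-k winners are run, so compute per token tracks the active parameters rather than the total.

```python
# Toy mixture-of-experts routing sketch (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
N_EXPERTS, TOP_K, DIM = 16, 2, 64  # expert count matches Llama 4 Scout's 16
gate_w = rng.normal(size=(DIM, N_EXPERTS))                         # gating network
experts = [rng.normal(size=(DIM, DIM)) for _ in range(N_EXPERTS)]  # expert layers

def moe_forward(token: np.ndarray) -> np.ndarray:
    scores = token @ gate_w                    # score every expert for this token
    top = np.argsort(scores)[-TOP_K:]          # keep only the top-k experts
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over winners
    # Only TOP_K of the N_EXPERTS weight matrices are touched per token,
    # so roughly TOP_K / N_EXPERTS of the expert parameters are "active".
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.normal(size=DIM))
print(f"Experts consulted per token: {TOP_K} of {N_EXPERTS}")
```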
As MoE models grow in popularity, SLM developers will have to demonstrate concrete benefits to sticking with their approach, or adapt to deliver a hybrid offering. The jury’s still out on which approach yields the best or lowest-latency AI output, so there’s all to play for here.
For example, OpenAI boasts that GPT-4.1 nano takes just five seconds to generate its first output token when given an input of up to 128,000 tokens. If IBM could demonstrate comparable or better response times from its openly licensed Granite models, enterprises could lean more heavily on AI at the edge.
The more domain-specific IBM gets, the more potential for wins in very niche markets.
MORE FROM ITPRO
- IBM eyes Oracle expertise gains with latest acquisition
- IBM’s CEO just said the quiet part out loud on AI-related job losses
- IBM and SAP expand partnership to drive generative AI capabilities

Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.
He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.
For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.