DeepSeek’s R1 model training costs pour cold water on big tech’s massive AI spending
Chinese AI developer DeepSeek says it created an industry-leading model for a pittance
In mid-2024, Anthropic CEO Dario Amodei projected AI training costs to soar to such an extent that building a new model could cost upwards of $100 billion.
Amodei’s lofty projection appears to have been undercut by a new research paper from DeepSeek. In the paper, published in the academic journal Nature, the Chinese AI developer claims it spent a paltry amount training its flagship R1 model.
All told, training costs amounted to $294,000, with the company using 512 Nvidia H800 chips to build the model that had US companies sweating earlier this year.
It’s worth noting that these costs come in addition to around $6 million spent by the firm to create the base LLM on which R1 is built. Regardless, the results are impressive given the far higher training costs associated with competing models.
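To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted in this article; the variable and function names are illustrative, and the comparison is loose because it sets a reported cost against an estimate and a projection.

```python
# Figures as quoted in the article, in US dollars.
R1_TRAINING_COST = 294_000           # reported cost of training R1
BASE_MODEL_COST = 6_000_000          # approximate cost of the base LLM
ALTMAN_FIGURE = 100_000_000          # "upwards of $100 million"
AMODEI_PROJECTION = 100_000_000_000  # "upwards of $100 billion"


def combined_cost(training_cost: int, base_cost: int) -> int:
    """Add the R1 training cost to the base model cost."""
    return training_cost + base_cost


def times_larger(reference: int, cost: int) -> float:
    """Return how many times larger the reference figure is than cost."""
    return reference / cost


total = combined_cost(R1_TRAINING_COST, BASE_MODEL_COST)
print(f"Combined DeepSeek cost: ${total:,}")
print(f"Altman's figure is about {times_larger(ALTMAN_FIGURE, total):.0f}x larger")
print(f"Amodei's projection is about {times_larger(AMODEI_PROJECTION, total):,.0f}x larger")
```

Even with the base model included, the combined figure stays in the single-digit millions.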
So what makes DeepSeek R1 tick?
Under the hood of DeepSeek R1
DeepSeek R1 is a ‘reasoning model’, meaning it’s designed specifically to excel at tasks such as mathematics and coding. It’s also an ‘open weight’ model, so it is freely available for anyone to download.
As ITPro reported earlier this year, the decision to offer R1 as an open weight model was welcomed by industry stakeholders, and also gave competitors a vital insight into how the model performed compared to others out there on the market.
R1 is among the most popular models currently available on the AI community platform Hugging Face, having been downloaded over 10 million times.
AI reasoning models are purposefully trained on real-world data, enabling them to “learn” how to solve specific problems. It’s a costly, long-winded process, but it has been a key focus for AI providers over the last 18 months as they look to offer users more intuitive tools.
According to this latest research paper, DeepSeek was able to reduce training costs through a carrot-and-stick approach to reinforcement learning. Researchers essentially rewarded the model for producing correct answers to user queries.
In a blog post dissecting the paper, researchers at Carnegie Mellon University noted that this was akin to a child playing video games, constantly learning new ways to progress.
“As the child navigates their avatar through the game world, they learn through trial and error that some actions (such as collecting gold coins) earn points, whereas others (such as running into enemies) set their score back to zero,” researchers explained.
“In a similar vein, DeepSeek-R1 was awarded a high score when it answered questions correctly and a low score when it gave wrong answers.”
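The scoring idea the researchers describe can be illustrated with a toy Python sketch. This is not DeepSeek's training code: the function names, the 1.0 and 0.0 score values, and the exact-match check are all assumptions made purely for illustration.

```python
def score_answer(model_answer: str, reference_answer: str) -> float:
    """Toy reward: high score for a correct answer, low score otherwise.

    Correctness here is a simple case-insensitive exact match, which is
    an assumption for this sketch rather than the paper's actual check.
    """
    is_correct = model_answer.strip().lower() == reference_answer.strip().lower()
    return 1.0 if is_correct else 0.0


def average_reward(answers: list[str], reference_answer: str) -> float:
    """Average the toy reward over several attempts at the same question."""
    scores = [score_answer(answer, reference_answer) for answer in answers]
    return sum(scores) / len(scores)


# Four hypothetical attempts at a question whose correct answer is "42".
attempts = ["42", " 42 ", "41", "forty-two"]
rewards = [score_answer(attempt, "42") for attempt in attempts]
print(rewards)                         # [1.0, 1.0, 0.0, 0.0]
print(average_reward(attempts, "42"))  # 0.5
```

In reinforcement learning, scores like these are the signal that nudges a model toward the behavior that earned the higher reward, much like the gold coins in the video game analogy.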
So much for huge training costs
DeepSeek has previously hinted that its training processes were highly cost efficient. A preprint release of the study, published in January, pointed toward far lower costs compared to US competitors.
Some Silicon Valley figures questioned the veracity of DeepSeek’s claims at the time, but with R1 becoming the first peer-reviewed LLM, the company appears vindicated.
AI training has long been lamented as a costly, compute-intensive process for big tech providers. Amodei isn’t alone in highlighting the huge costs associated with this, either.
In 2023, OpenAI CEO Sam Altman hinted that foundation model training had cost upwards of $100 million.
These figures appear to have been confirmed during his appearance at a 2024 MIT event, where he said it’s “more than that”, per reports from Wired.
Ross Kelly is ITPro's News & Analysis Editor, responsible for leading the brand's news output and in-depth reporting on the latest stories from across the business technology landscape. Ross was previously a Staff Writer, during which time he developed a keen interest in cyber security, business leadership, and emerging technologies.
He graduated from Edinburgh Napier University in 2016 with a BA (Hons) in Journalism, and joined ITPro in 2022 after four years working in technology conference research.
For news pitches, you can contact Ross at ross.kelly@futurenet.com, or on Twitter and LinkedIn.