Google blows away competition with powerful new Gemini 3 model
Gemini 3 is the hyperscaler’s most powerful model yet and state of the art on almost every AI benchmark going
Google has officially unveiled Gemini 3 Pro, its new state of the art LLM with record-breaking scores across almost every AI benchmark.
The new model is intended to improve every Google service that uses Gemini, including its dedicated app, coding tools, and AI in search.
Google stated that Gemini 3 Pro is much better at handling requests in their intended context and providing useful answers that don’t resort to flattery.
The model debuted in the number one spot across text, WebDev, and vision in LMArena, with Google having claimed Gemini 3 Pro is the “best model in the world for complex multimodal understanding”.
Across multimodal reasoning benchmarks, Gemini 3 Pro was found to consistently outperform competition such as GPT-5.1 and Claude Sonnet 4.5.
In MMMU-Pro, for example, the model scored 81% versus GPT-5.1’s 76% and Claude Sonnet 4.5’s 68%.
ARC-AGI-2 is a rigorous benchmark for testing the capability of AI model reasoning across a series of abstract visual puzzles. Easy for humans to complete but difficult for today’s LLMs, it’s considered a true challenge for frontier models.
Sign up today and you will receive a free copy of our Future Focus 2025 report - the leading guidance on AI, cybersecurity and other IT challenges as per 700+ senior executives
Gemini 3 Pro scored 31.1% in tests, far in excess of GPT 5.1’s 17.6% and Claude Sonnet 4.5’s 13.6%.
Gemini 3 is a game changer for Google shops
In other areas, the performance gap between Google’s model and the competition is even more stark.
Gemini 3 Pro scored a new record score of 23.4% at MathArena Apex, a benchmark that tests LLMs on their ability to solve mathematical problems upon which they weren’t trained, compared to just 1% by GPT-5.1 and 1.6% by Claude Sonnet 4.5.
Although the model doesn’t beat the latest version of Claude at the most common coding benchmarks, it greatly improves upon Gemini 2.5 Pro.
Google has also emphasized how its reasoning and visual understanding helps Gemini 3 Pro to do more with the coding capabilities it has, for quicker overall resolution of common developer tasks.
In a demo, Google showed how Gemini 3 Pro could turn an image of a chessboard into an interactive game, as well as a back-of-the-napkin sketch of a website into a functioning page.
Gemini 3 Deep Think, the multi-step reasoning variant of the new model, improves upon its benchmarks at the cost of far slower responses.
In Humanity’s Last Exam, an intense benchmark designed to press LLMs with 2,500 difficult questions across a wide range of subject areas, Gemini 3 Deep Think scores 41%, compared to Gemini 3 Pro’s 37.5% and GPT-5 Pro’s 30.7%.
Gemini 3 Deep Think is still undergoing safety tests but will become available to Google AI Ultra subscribers within the next few weeks.
Gemini 3 Pro was trained using Google’s tensor processing units (TPUs), dedicated hardware for AI training and inference that Google cites as key to its efforts to develop AI sustainably.
Google Antigravity
Alongside the much-anticipated launch of Gemini 3 today, Google also revealed Google Antigravity.
The product is described as a new ‘agent-first’ IDE, with a focus on elevating developers into a manager of AI agents.
Using Gemini 3’s agentic coding features, as well as its reasoning and tool use, Antigravity is intended to allow developers to set agents multi-step, complex coding tasks and receive evidence proving the job has been completed to a high standard.
In a demo, Google showed how a developer could use the tool to build a flight lookup web app, returning flight data based on a flight number provided by the user.
The tool is then capable of creating an implementation plan, writing the code, and then testing the app by opening the Chrome browser. Finally, it provides the user with screenshots of its tests, which the user can critique in order to immediately update the UI of the finalized app.
These features are powered by a combination of Gemini 3, Gemini 2.5 Image (better known as Nano Banana), and Google 2.5 Computer Use.
In another demo, Google showed how enterprise developers can delegate to multiple ‘background agents’ within Antigravity, without needing to stay in the same chat window while tasks are being completed.
Google added that Antigravity is capable of supporting Claude 4.5 Sonnet in addition to its own models.
Antigravity is now in public preview for free, with support for other models including Claude Sonnet 4.5 and OpenAI’s GPT-OSS.
Enterprises with the Vertex AI and Gemini Enterprise subscriptions have access to Gemini 3 from today and the model is also available in the Gemini app, the Gemini API within AI Studio, Antigravity, and Gemini CLI.
The model will be priced at $2 per million input tokens and $12 per million output tokens, for prompts under 200,000 tokens.
Prompts over this limit will cost $4 per million input tokens and $18 per million output tokens.
Make sure to follow ITPro on Google News to keep tabs on all our latest news, analysis, and reviews.
MORE FROM ITPRO

Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.
-
2025 marked the beginning of the end for OpenAIOpinion OpenAI has its fingers in too many pies and it’s rapidly losing favor with consumers and enterprises alike
-
Will 2026 be another challenging year for technology?Feature SMBs will be looking at how they can prepare for the challenges ahead, from global regulations to AI implementation to digital IDs…
-
OpenAI says prompt injection attacks are a serious threat for AI browsers – and it’s a problem that’s ‘unlikely to ever be fully solved'News OpenAI details efforts to protect ChatGPT Atlas against prompt injection attacks
-
Google DeepMind CEO Demis Hassabis thinks startups are in the midst of an 'AI bubble'News AI startups raising huge rounds fresh out the traps are a cause for concern, according to Hassabis
-
OpenAI turns to red teamers to prevent malicious ChatGPT use as company warns future models could pose 'high' security riskNews The ChatGPT maker wants to keep defenders ahead of attackers when it comes to AI security tools
-
Google DeepMind partners with UK government to boost AI researchNews The deal includes the development of a new AI research lab, as well as access to tools to improve government efficiency
-
AWS has dived headfirst into the agentic AI hype cycle, but old tricks will help it chart new watersOpinion While AWS has jumped on the agentic AI hype train, its reputation as a no-nonsense, reliable cloud provider will pay dividends
-
AWS CEO Matt Garman says AI agents will have 'as much impact on your business as the internet or cloud'News Garman told attendees at AWS re:Invent that AI agents represent a paradigm shift in the trajectory of AI and will finally unlock returns on investment for enterprises.
-
Westcon-Comstor partners with Fortanix to drive AI expertise in EMEANews The new agreement will help EMEA channel partners ramp up AI and multi-cloud capabilities
-
Microsoft quietly launches Fara-7B, a new 'agentic' small language model that lives on your PC — and it’s more powerful than GPT-4oNews The new Fara-7B model is designed to takeover your mouse and keyboard