Google launches flagship Gemini 3 model and Google Antigravity, a new agentic AI development platform
Gemini 3 is the hyperscaler’s most powerful model yet and state of the art on almost every AI benchmark going
Google has officially unveiled Gemini 3 Pro, its new state-of-the-art LLM, with record-breaking scores across almost every AI benchmark.
The new model is intended to improve every Google service that uses Gemini, including its dedicated app, coding tools, and AI in search.
Google stated that Gemini 3 Pro is much better at handling requests in their intended context and providing useful answers that don’t resort to flattery.
The model debuted in the number one spot across text, WebDev, and vision on LMArena, with Google claiming Gemini 3 Pro is the “best model in the world for complex multimodal understanding”.
Across multimodal reasoning benchmarks, Gemini 3 Pro consistently outperformed competitors such as GPT-5.1 and Claude Sonnet 4.5.
In MMMU-Pro, for example, the model scored 81% versus GPT-5.1’s 76% and Claude Sonnet 4.5’s 68%.
ARC-AGI-2 is a rigorous benchmark that tests AI models’ reasoning across a series of abstract visual puzzles. Easy for humans to complete but difficult for today’s LLMs, it’s considered a true challenge for frontier models.
Gemini 3 Pro scored 31.1% on the benchmark, far in excess of GPT-5.1’s 17.6% and Claude Sonnet 4.5’s 13.6%.
Gemini 3 is a game changer for Google shops
In other areas, the performance gap between Google’s model and the competition is even more stark.
Gemini 3 Pro set a new record of 23.4% on MathArena Apex, a benchmark that tests LLMs on their ability to solve mathematical problems on which they weren’t trained, compared to just 1% for GPT-5.1 and 1.6% for Claude Sonnet 4.5.
Although the model doesn’t beat the latest version of Claude on the most common coding benchmarks, it greatly improves upon Gemini 2.5 Pro.
Google has also emphasized how its reasoning and visual understanding help Gemini 3 Pro do more with the coding capabilities it has, for quicker overall resolution of common developer tasks.
In a demo, Google showed how Gemini 3 Pro could turn an image of a chessboard into an interactive game, as well as a back-of-the-napkin sketch of a website into a functioning page.
Gemini 3 Deep Think, the multi-step reasoning variant of the new model, improves on Gemini 3 Pro’s benchmark scores at the cost of far slower responses.
On Humanity’s Last Exam, an intense benchmark designed to press LLMs with 2,500 difficult questions across a wide range of subject areas, Gemini 3 Deep Think scored 41%, compared to Gemini 3 Pro’s 37.5% and GPT-5 Pro’s 30.7%.
Gemini 3 Deep Think is still undergoing safety tests but will become available to Google AI Ultra subscribers within the next few weeks.
Gemini 3 Pro was trained using Google’s tensor processing units (TPUs), dedicated hardware for AI training and inference that Google cites as key to its efforts to develop AI sustainably.
Google Antigravity
Alongside today’s much-anticipated launch of Gemini 3, Google revealed Google Antigravity.
The product is described as a new ‘agent-first’ IDE, with a focus on elevating developers into managers of AI agents.
Using Gemini 3’s agentic coding features, as well as its reasoning and tool use, Antigravity is intended to allow developers to set agents multi-step, complex coding tasks and receive evidence proving the job has been completed to a high standard.
In a demo, Google showed how a developer could use the tool to build a flight lookup web app, returning flight data based on a flight number provided by the user.
The tool then creates an implementation plan, writes the code, and tests the app by opening the Chrome browser. Finally, it provides the user with screenshots of its tests, which the user can critique in order to immediately update the UI of the finalized app.
These features are powered by a combination of Gemini 3, Gemini 2.5 Image (better known as Nano Banana), and Gemini 2.5 Computer Use.
In another demo, Google showed how enterprise developers can delegate to multiple ‘background agents’ within Antigravity, without needing to stay in the same chat window while tasks are being completed.
Antigravity is now in free public preview and, in addition to Google’s own models, supports others including Claude Sonnet 4.5 and OpenAI’s GPT-OSS.
Enterprises with Vertex AI and Gemini Enterprise subscriptions have access to Gemini 3 from today, and the model is also available in the Gemini app, the Gemini API within AI Studio, Antigravity, and Gemini CLI.
The model will be priced at $2 per million input tokens and $12 per million output tokens, for prompts under 200,000 tokens.
Prompts over this limit will cost $4 per million input tokens and $18 per million output tokens.
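To put those rates in context, the sketch below estimates the cost of a single request from the figures quoted above. It is illustrative only: the function name `estimate_cost` and the constants are hypothetical, and it rests on two assumptions the pricing lines don't spell out, namely that the prompt's input token count alone decides which tier applies to the whole request, and that a prompt of exactly 200,000 tokens is billed at the lower rate.

```python
# Rates in USD per million tokens, as quoted in the article.
TIER_LIMIT = 200_000            # prompt size at which the higher tier starts
STANDARD_RATES = (2.00, 12.00)  # (input, output) for prompts up to the limit
LONG_RATES = (4.00, 18.00)      # (input, output) for prompts over the limit


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    Assumption: the input (prompt) size selects the tier for both input
    and output tokens, and exactly 200,000 tokens counts as the lower tier.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    in_rate, out_rate = LONG_RATES if input_tokens > TIER_LIMIT else STANDARD_RATES
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # A 100,000-token prompt with a 10,000-token response.
    print(f"${estimate_cost(100_000, 10_000):.2f}")
    # A 300,000-token prompt with the same response length.
    print(f"${estimate_cost(300_000, 10_000):.2f}")
```

Under these assumptions, a 100,000-token prompt with a 10,000-token response comes to roughly $0.32, while a 300,000-token prompt with the same response length comes to roughly $1.38. Actual bills may differ, so check Google’s published pricing before budgeting.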
Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.