‘AI tools are now able to transcend their initial training’: Researchers taught GPT-5 to learn an obscure programming language on its own
OpenAI’s GPT-5 learned to code in Idris despite a lack of available data, baffling researchers
AI coding tools are surging in popularity, but often require a huge amount of oversight by developers and are trained on specific programming languages.
But recent research suggests these tools could start filling in gaps in their own knowledge.
That's according to a paper by researchers at the University of Southern California's Viterbi School of Engineering (USC Viterbi), who developed a technique that lets an AI model improve its skills in an area it has barely been trained on, expanding beyond its original training data.
To test the idea, they had OpenAI's GPT-5 write code in an obscure programming language called Idris, which the researchers said they didn't even know how to use themselves.
Researcher Minda Li raised the model's success rate from 39% to 96% using a system that told the AI when its code was incorrect and then let it try again.
"Our AI tools are now able to transcend their initial training," said Prof. Bhaskar Krishnamachari in a blog post detailing the project. "Used to be, maybe a year or two ago, you would say an AI model is only as good as the data it has seen. This paper is saying something different."
The work comes amidst a sharp rise in the use of AI coding tools over the last two years. A study published in Science earlier this year found that the share of Python contributions on GitHub in the US that were written by, or with the help of, AI climbed from 5% in 2022 to 29% at the end of 2024.
Teaching GPT-5 to code in Idris
The project made use of Idris, which has just 2,000 code repositories publicly available online versus 24 million for Python – meaning the AI model had much less data to train on.
Most of the researchers involved in the project had never even heard of Idris and therefore couldn't tell whether the AI's code was correct.
"We were hunting for a language so obscure that we hadn’t heard of it," Krishnamachari said. "I think we were just in my office together, googling around, trying to find some crazy language that no one’s ever heard of."
That meant Li needed to figure out a way of teaching a skill she couldn't do herself. At first, she set GPT-5 on solving a series of Idris coding exercises on a training platform. It managed a 39% score.
To boost the score, she gave the model a range of assistance, including access to documentation, manuals, and reference guides – that helped, but only enough for a score around 60%.
To take it further, Li built a compiler feedback loop, taking error messages from the compiler and giving those to GPT-5 with a prompt to fix the issues. It was given 20 chances per error.
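The loop described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' actual code: the function names (`repair_loop`, `generate`, `compile_check`), the prompt wording, and the toy stand-ins for the model and compiler are all assumptions made for the example. In a real setup, `generate` would call a model API and `compile_check` would run the Idris compiler and return its error output.

```python
from typing import Callable, Optional, Tuple

# The article reports 20 chances; how retries were counted is an assumption here.
MAX_ATTEMPTS = 20


def repair_loop(
    task: str,
    generate: Callable[[str], str],
    compile_check: Callable[[str], Optional[str]],
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[Optional[str], int]:
    """Ask the model for code, then retry with compiler errors until it compiles.

    Returns (code, attempts_used), or (None, max_attempts) if nothing compiled.
    """
    prompt = task
    for attempt in range(1, max_attempts + 1):
        code = generate(prompt)
        error = compile_check(code)  # None means the compiler accepted the code
        if error is None:
            return code, attempt
        # Feed the compiler's message back to the model and ask for a fix.
        prompt = (
            f"{task}\n\nYour previous attempt:\n{code}\n\n"
            f"The compiler reported:\n{error}\n\nPlease fix the issues."
        )
    return None, max_attempts


# Hypothetical stand-ins so the sketch runs without a model or a compiler.
def fake_compiler(code: str) -> Optional[str]:
    return None if "total" in code else "Error: function is not total"


def fake_model(prompt: str) -> str:
    # Only "improves" once the error text has been echoed back in the prompt.
    return "total\nf : Nat -> Nat" if "not total" in prompt else "f : Nat -> Nat"


if __name__ == "__main__":
    code, attempts = repair_loop("Write f in Idris.", fake_model, fake_compiler)
    print(f"compiled after {attempts} attempt(s)")
```

Passing the model and compiler in as plain callables keeps the loop itself independent of any particular API or toolchain, which is the point of the technique: the only requirement is a checker that returns clear feedback.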
Li said she expected an improvement of 10%, but instead the system scored nearly perfectly.
"I was surprised that just that alone, seemingly one simple thing, just keep recompiling, keep trying, was able to get to 96%," she said.
What's next?
The results suggest it's possible to teach AI when there's a lack of available or relevant data, even after the model has been trained. Krishnamachari said the technique could be applied more widely and have huge long-term implications.
He suggested a feedback loop could help with creating 3D models of buildings or advanced mathematical reasoning, while a former student is working to translate endangered languages.
However, the technique worked well for Idris because the compiler supplies clear feedback. Tasks without such a signal will still depend on long-standing processes such as reinforcement learning and extensive training.
"What I’ve learned from this project is that so long as you can figure out how to provide that kind of clear and correct feedback, there’s a chance we can now significantly improve the quality of AI outputs," he noted.
If that problem can be solved, it should be possible to build "an AI tool to do a task that we cannot do ourselves," Krishnamachari added.
Freelance journalist Nicole Kobie first started writing for ITPro in 2007, with bylines in New Scientist, Wired, PC Pro and many more.
Nicole is the author of a book about the history of technology, The Long History of the Future.
