Meta has officially released Code Llama, a new open-source large language model (LLM) for code completion, generation, and testing that can run on local hardware and competes with ChatGPT on coding benchmarks.
The model is available for free for both research and commercial use, and comes in a number of variations to best suit user needs. It can produce or complete lines of code in languages such as Python, C++, Java, and Bash.
Code Llama is a specialized version of Meta’s free LLM Llama 2, and was created by subjecting Llama 2 to additional training based on 500 billion tokens of code and programming data.
The model comes in three different parameter sizes: 7-billion (7B), 13-billion (13B) and 34-billion (34B).
Meta stated that while the 34B model is the most accurate, the 7B and 13B models operate faster and can be more beneficial for low-latency demands such as real-time code completion.
Code Llama 34B scored 48.8% accuracy on HumanEval, a benchmarking dataset made by OpenAI to run AI models through programming challenges, better than the 30.5% achieved by the base model Llama 2 and a slight improvement on the 48.1% scored by OpenAI’s GPT-3.5 model, which is the backbone for ChatGPT.
All models still fell short of OpenAI’s multimodal GPT-4, which can generate code in a wide range of programming languages and is the base model for Microsoft’s advanced code AI programming assistant Copilot X.
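To make the benchmark figures concrete, here is a minimal sketch of how a HumanEval-style score can be computed: each generated solution is run against unit tests, and the score is the share of problems whose tests pass. The two toy problems and the helper name `passes` are invented for illustration. The real benchmark uses its own problem set and a sandboxed harness, so treat this as a simplification and not as OpenAI's implementation.

```python
def passes(candidate_source: str, test_source: str) -> bool:
    """Return True if the candidate code runs and its tests raise no error.

    Hypothetical helper: a real harness would sandbox and time-limit this step,
    because exec() on untrusted model output is unsafe.
    """
    namespace = {}
    try:
        exec(candidate_source, namespace)  # define the generated function
        exec(test_source, namespace)       # run the unit tests against it
    except Exception:
        return False
    return True


# Two made-up problems: (model output, unit test). The second is deliberately wrong.
problems = [
    ("def add(a, b):\n    return a + b\n", "assert add(2, 3) == 5\n"),
    ("def is_even(n):\n    return n % 2 == 1\n", "assert is_even(4)\n"),
]

results = [passes(candidate, test) for candidate, test in problems]
score = 100 * sum(results) / len(results)
print(f"{score:.1f}% of problems solved")
```

On this reading, a score of 48.8% means that just under half of the benchmark's problems were solved by the generated code.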
In addition to the variety of Code Llama model sizes, Meta released two fine-tuned models titled ‘Code Llama — Python’ and ‘Code Llama — Instruct’.
The former was subjected to additional training based on a vast dataset of 100 billion Python-specific tokens, to ensure that it is especially accurate at generating code in the language.
Meta stated it was created because Python is among the most used languages within the AI community, has been heavily benchmarked to date, and is the basis for the open-source machine learning (ML) framework PyTorch.
The Instruct variant has been trained on 5 billion tokens to fine-tune it for natural language input, and is the model Meta recommends for users who wish to generate answers or code from plain-text questions, as one would with a tool like ChatGPT.
While the generalist Llama 2 can be used in a similar fashion, its code responses are less accurate because it has not been subjected to the same fine-tuning steps as Code Llama.
The 7B model can also be run on a single graphics processing unit (GPU), though Meta did not specify the minimum hardware requirements for achieving this.
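Although Meta gave no minimum specification, a rough lower bound on GPU memory can be estimated from the parameter count alone. The sketch below assumes the weights dominate memory use and ignores activations, the attention cache, and framework overhead, so real requirements will be higher. The precision table and the function name are illustrative assumptions, not figures published by Meta.

```python
# Assumed bytes per parameter at common precisions (illustrative, not from Meta).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * BYTES_PER_PARAM[precision]


for size in (7, 13, 34):
    fp16 = weight_memory_gb(size)
    int4 = weight_memory_gb(size, "int4")
    print(f"{size}B: about {fp16:.0f} GB at fp16, about {int4:.1f} GB at 4-bit")
```

Under these assumptions, the 7B model's weights need around 14 GB at 16-bit precision, which is why it is a plausible fit for a single high-end consumer GPU, while the 34B model needs several cards or heavier quantization.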
Software engineer Anton Bacaj posted a video in which Code Llama 34B generated code at a rate of around 49ms per token, running on four Nvidia RTX 3090 GPUs.
"Code llama 34B on 4x3090s, 49ms~ per token," Bacaj wrote in a post dated August 25, 2023.
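For a sense of what 49ms per token means in practice, the short calculation below converts the reported latency into throughput and estimates the wait for a single completion. The 200-token completion length is an arbitrary assumption chosen for illustration, and the estimate ignores prompt processing time.

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert per-token latency in milliseconds to tokens generated per second."""
    return 1000.0 / ms_per_token


def completion_seconds(num_tokens: int, ms_per_token: float) -> float:
    """Estimate generation time for a completion of num_tokens tokens."""
    return num_tokens * ms_per_token / 1000.0


rate = tokens_per_second(49)           # latency reported in the post
seconds = completion_seconds(200, 49)  # 200 tokens is an assumed completion length

print(f"{rate:.1f} tokens per second")
print(f"{seconds:.1f} seconds for a 200-token completion")
```

That works out to roughly 20 tokens per second, or just under ten seconds for a 200-token response.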
Running the model locally could prove beneficial for programmers who wish to use it for generating, testing, or completing code based on sensitive data or proprietary information.
Although this will require up-front investment in hardware, smaller businesses may weigh up these costs against subscriptions for services such as ChatGPT Plus or Copilot X.
The cost of keeping data local may also be seen as worthwhile compared with the ‘black hole’ of oversight that comes with passing code to companies such as Google and OpenAI.
Meta has not stated the origins of some of the data used to train Llama 2, which could open firms up to legal action under legislation such as the EU’s AI Act if they are later found to have generated code based on copyrighted data.
Llama 2’s predecessor LLaMA was leaked online in March 2023, and some hackers called for it to be stored on Bitcoin for anonymous, easy access. Some experts had expressed concerns that in the wrong hands, LLaMA could be used to boost cyber crime.
Unlike LLaMA, Llama 2 and Code Llama are freely available outside of academia. Meta stated that Code Llama has been put through additional testing to iron out malicious outputs.
“As with all cutting edge technology, Code Llama comes with risks. Building AI models responsibly is crucial, and we undertook numerous safety measures before releasing Code Llama,” the firm stated.
“As part of our red teaming efforts, we ran a quantitative evaluation of Code Llama’s risk of generating malicious code. We created prompts that attempted to solicit malicious code with clear intent and scored Code Llama’s responses to those prompts against ChatGPT’s (GPT3.5 Turbo). Our results found that Code Llama answered with safer responses.”
Beyond its handling of overtly malicious prompts, Code Llama will be judged on the day-to-day usefulness of its code generation and debugging.
ChatGPT was recently found to give incorrect answers to programming questions more than 50% of the time.
Rory Bathgate is a staff writer at ITPro covering the latest news on artificial intelligence and business networks. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, after four years in student journalism. You can contact Rory at email@example.com or on LinkedIn.