Why Python is the programming language of choice for AI developers

Python has emerged as the go-to programming language for developers building generative AI applications, according to new research. 

“Python is the language of choice for AI programming,” said a report from cloud data company Snowflake, which analyzed usage data from 9,000 of its customers.

Snowflake said Python use on its Snowpark platform grew by 571% year-over-year, considerably more than any other language. Use of other languages also grew, including Scala (up 387%) and Java (up 131%), but not as fast.
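Percentage growth figures above 100% are easy to misread, so the short Python sketch below converts each reported increase into a multiplier on the previous year's usage. The percentages are the ones quoted from the report; the helper name `growth_multiplier` is purely illustrative, and the report does not spell out exactly how "use" is measured, so treat the result as arithmetic on the headline numbers rather than a statement about absolute usage.

```python
def growth_multiplier(percent_increase: float) -> float:
    """Convert a percentage increase into a multiplier on the starting value."""
    return 1 + percent_increase / 100


# Year-over-year growth figures as quoted from the Snowflake report.
reported_growth = {"Python": 571, "Scala": 387, "Java": 131}

for language, pct in reported_growth.items():
    multiplier = growth_multiplier(pct)
    print(f"{language}: up {pct}% -> {multiplier:.2f}x the prior year's usage")
```

In other words, a 571% increase means Python use ended the year at roughly 6.7 times its starting level, compared with about 4.9 times for Scala and 2.3 times for Java.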

“Python skills will be increasingly essential to development teams as they venture into advanced AI,” the report said.

That’s because Python has a lot going for it, the report said. The programming language is easy to learn and read, enabling developers to “focus on solving AI problems rather than parsing abstract syntax”.

It also boasts a large ecosystem of libraries and frameworks to simplify otherwise daunting AI tasks, as well as an active community of contributors to help with learning and problem-solving.

“Overall, Python lets devs focus on the problem, not the language. They can work fast, accelerating prototyping and experimentation — and therefore overall learning as dev teams make early forays into cutting-edge AI projects,” the report said.

The research from Snowflake follows analysis earlier this month that showed enthusiasm for Python continues to grow. The Tiobe programming language ranking noted that the gap between Python and the rest of the pack has never been greater.

Python supporting unstructured data improvements

The Snowflake report also found that enterprises are tapping their unstructured data. Most data, perhaps as much as 90%, is unstructured, taking the form of videos, images and documents. The company said it saw processing of this unstructured data grow by 123%.

“That’s good news for many uses, not the least of which is advanced AI,” the report said. “Proprietary data will give large language models their edge, so unlocking that underutilized 90% has huge value.”

Notably these types of data are being processed with Python, Java and Scala.

“Given that Python in particular is the language of choice for many developers, data engineers and data scientists, its fast-growing adoption suggests that these unstructured data workflows are not just for building data pipelines, but also involve AI applications and ML models,” Snowflake said.

Snowflake records an “LLM explosion”

Certainly, building generative AI-powered apps on top of large language models (LLMs) is now a priority for many developers.

“The LLM explosion is happening now—probably at your office,” said Snowflake.

The company said that, in the last year across its Streamlit developer community, it saw 20,076 developers work on 33,143 LLM-powered apps. Nearly two-thirds of developers said they were working on work projects.

While generative AI is yet to be an all-encompassing technology, Snowflake said “we’re definitely seeing a lot of effort to get us there ASAP”, underlining the intense enterprise interest in the use of AI tools and applications.

The kind of apps developers are making is also evolving – Snowflake said that between May 2023 and January 2024 in Streamlit, chatbots went from 18% of LLM apps to 46%.

This most likely doesn’t represent a shift in the market’s appetite for LLM apps, but shows how developers are increasing their skills and are able to build more complex chatbot apps.

Developers said their top concern when building generative AI apps was whether the LLM response was accurate – a reference to the ongoing issue of AI hallucinations – followed by concerns about data privacy.

Coupled with this, businesses are also taking a more proactive approach to data governance. The number of tags applied to an object rose 72%, the number of objects with a directly assigned tag rose almost 80%, and the number of applied masking or row-access policies increased 98%.

But Snowflake also said the cumulative number of queries run against policy-protected objects is up 142%. This is particularly significant, it said, because it showed that companies were increasing their use of data while ensuring responsible use.

“We’re seeing more and more governance through the use of tags and masking policies, but the amount of work being done with this more carefully governed data is rising rapidly.”
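For readers unfamiliar with the idea, the Python sketch below illustrates in miniature what tag-driven masking means: a column carries a tag, and a policy decides whether a given role sees the real value or a masked one. It is a conceptual toy under stated assumptions, not Snowflake's API or implementation; the `Column` class, the `pii` tag, the role names and the `read_value` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """A column with governance tags attached (hypothetical model)."""
    name: str
    tags: set[str] = field(default_factory=set)


# Hypothetical policy: this tag triggers masking, except for these roles.
MASKED_TAG = "pii"
UNMASKED_ROLES = {"compliance_officer"}


def read_value(column: Column, value: str, role: str) -> str:
    """Return the raw value, or a masked one if the policy applies to this role."""
    if MASKED_TAG in column.tags and role not in UNMASKED_ROLES:
        return "*" * len(value)
    return value


email = Column("email", tags={"pii"})
country = Column("country")

print(read_value(email, "ana@example.com", "analyst"))             # masked
print(read_value(email, "ana@example.com", "compliance_officer"))  # raw value
print(read_value(country, "PT", "analyst"))                        # untagged, raw value
```

The point the report makes is that queries keep running against data governed this way: the analyst in the sketch can still query the table, but only sees what the policy allows.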

Jennifer Belissent, Principal Data Strategist at Snowflake, said while data security has long been a key focus, the rapid acceleration of AI applications has brought the issue to the fore. Addressing issues such as privacy and security “delivers peace of mind”.

“When the data is protected, it can be used securely,” she said.

“Taken individually, each of these trends is a single data point that shows how organizations across the globe are dealing with different challenges. When considered together, they tell a larger story about how CIOs, CTOs, and CDOs are modernizing their organizations, tackling AI experiments, and solving data problems, all necessary steps to take advantage of the opportunities presented by advanced AI.”

Steve Ranger

Steve Ranger is an award-winning reporter and editor who writes about technology and business. Previously he was the editorial director at ZDNET and the editor of silicon.com.