GDPR in the age of AI raises a “data dilemma” for enterprises

(Image credit: Getty Images)

AI developers could fall foul of GDPR in the years to come, given the vast amounts of data companies must process and store while building models.

This week, Wojciech Wiewiórowski, head of the European Data Protection Supervisor (EDPS), pointed out the threat posed to the GDPR principles of purpose limitation and data minimization, which stand opposed to the data collection objectives of AI companies.

“I am of the opinion that the discussion about GDPR is going to start again in the next term of the parliament,” Wiewiórowski said at a recent European Parliament committee meeting on civil liberties.

“We have … quite strong attacks on the principles themselves,” Wiewiórowski added.

Wiewiórowski said that purpose limitation in particular was likely to be “questioned” going forward, because it fundamentally restricts the data collection vital to the training and deployment of large language models (LLMs).

The principle of data minimization is similarly restrictive for the AI supply chain: it demands that organizations collect only the minimum amount of data needed to deliver a service.

Sarah Pearce, partner at Hunton Andrews Kurth, told ITPro that both principles will constitute a gray area in the burgeoning field of AI, as the exact purpose of collection and the minimum amount of data required can rarely be defined.

“Companies will inevitably be accused of taking their collection activities towards the excessive and may be asked to explain whether and why they are retaining the data for longer than may be perceived necessary,” she said. 

“Often, this is to help improve and further replicate algorithms – it is not necessarily being used for additional commercial gain,” Pearce added. Regardless of this, companies will need to comply, she noted.

These regulatory concerns about data are mirrored by commercial ones, as the successful development of AI depends on a healthy pool of resources for training.

“AI development and the GDPR face a data dilemma. The GDPR's principles of purpose limitation and data minimization restrict data use, hindering the vast datasets needed for AI training,” Mayur Upadhyaya, CEO of APIContext, told ITPro.

“This tension intensifies with APIs facilitating cross-border data flows,” he added. 

While businesses will want to argue their own cases for leniency around data collection practices, regulators will need to keep close tabs on what companies are doing and whether infringements are taking place.

Though EU regulators are still riding high following the successful passage of the region’s AI Act earlier this year, Wiewiórowski’s comments point to the problems that could arise moving forward.

With such a landscape-shifting technology in play, governments in the EU and across the globe will need not only to design new legislation, but also to consider how their existing legislation will be affected.

GDPR compliance will become more complex

The fundamental tension between data protection legislation and AI training means that enterprises will be forced to navigate a compliance minefield.

Building privacy and data compliance requirements into AI systems from the outset is one way to keep on top of these issues, Upadhyaya said, as it helps to make regulation synergetic with developments in the space.

“The key lies in ‘privacy-by-design’ – integrating data protection from the outset in both AI and API development,” Upadhyaya said.

“Aligning with emerging regulations like DORA, which focuses on data security and resilience, is also crucial. As AI evolves, fostering a synergy between innovation, data protection practices (like GDPR), and responsible regulations like DORA will be vital for its future,” he added.

While the EDPS and others turn a critical eye to new and existing legislation, organizations will have to stay on top of current demands and requirements.

“Organizations implementing such AI technology developed or provided by third parties will need to ensure they have robust third-party vetting policies and procedures in place,” Pearce said.

George Fitzmaurice
Staff Writer

George Fitzmaurice is a staff writer at ITPro, ChannelPro, and CloudPro, with a particular interest in AI regulation, data legislation, and market development. After graduating from the University of Oxford with a degree in English Language and Literature, he undertook an internship at the New Statesman before starting at ITPro. Outside of the office, George is both an aspiring musician and an avid reader.