Google adds Python support to privacy-preserving data analysis tool

The addition of Python opens up the open-source differential privacy library to nearly half of all developers worldwide

Google has expanded its open-source differential privacy (DP) platform to support the Python programming language, widening availability to millions more developers and data analysts.

The announcement makes Python the fourth language supported by the project, which launched in 2019 with support for C++, Java, and Go, the Google-created language sometimes referred to as Golang.

It comes after Google reported that a significant number of developers had contacted the company to express interest in using the open-source library for their Python projects. Google worked for more than a year with OpenMined on Python support and said numerous projects have already used its DP library, including Australian developers who have accelerated scientific discoveries by analysing medical data in a private way.

DP is a system used by data analysts to preserve the privacy of the individuals whose data appears in an analysed dataset. Work to develop strong DP dates back decades, but only in recent years have tech giants such as Google and Apple embraced the system.

One of the key areas of DP development for Google in the past year has been providing a tool within the library for developers to fine-tune the 'epsilon', a mathematical measure of privacy. Finding the optimum epsilon requires a great deal of trial and error. A lower epsilon indicates a more private release, so a tool that lets developers adjust their parameters to bring epsilon down means individual projects can be tuned to be as private as possible.
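
As a rough illustration of why a lower epsilon means a more private but noisier release, the sketch below adds Laplace noise to a count and measures the average error at several epsilon values. It assumes the textbook Laplace mechanism with a sensitivity of 1 and uses only Python's standard library. It is not the API of Google's library, and the function names are hypothetical.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, epsilon, rng):
    """Release a count with Laplace noise of scale 1/epsilon.

    Assumes adding or removing one person changes the count by at most 1.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(epsilon, trials=10_000, seed=0):
    """Average absolute error of the noisy count over many simulated releases."""
    rng = random.Random(seed)
    true_count = 1000
    return statistics.mean(
        abs(noisy_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    )


if __name__ == "__main__":
    for epsilon in (0.1, 1.0, 10.0):
        print(f"epsilon={epsilon}: mean error ~ {mean_abs_error(epsilon):.2f}")
```

The average error should shrink roughly in proportion to 1/epsilon, which is the trade-off a tuning tool has to balance: smaller epsilon values protect individuals better but make the released figure less accurate.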

Google said that, now Python is supported, the DP library is available to nearly half of all developers worldwide, which means more developers and researchers will be able to analyse data and make new discoveries while preserving the privacy of the users to whom the data belongs.

Python is among the most popular programming languages currently in use and won 'Language of the Year 2021' from the TIOBE index, which ranks programming languages based on their popularity. Python is useful for a wide range of programming activities but is especially well-known for its capabilities in data analysis, making it a natural progression for Google's DP library.

As part of the launch, Google has released a new web-based product, pipelinedp.io, which allows any Python developer to analyse their dataset with differential privacy. Google also said it has seen organisations experimenting with new use cases such as showing a website's most visited web pages by country, in an anonymised fashion.

The library is compatible with the Spark and Beam frameworks, two leading large-scale data processing engines, and Google will be launching an additional tool to help users "visualise and better tune the parameters used to produce differentially private information".

"We encourage developers around the world to take this opportunity to experiment with differential privacy use cases like statistical analysis and machine learning, but most importantly, provide us with feedback," said Google announcing the news. "We are excited to learn more about the applications you all can develop and the features we can provide to help along the way.

"We will continue investing in democratising access to critical privacy-enhancing technologies and hope developers join us in this journey to improve usability and coverage. As we’ve said before, we believe that every Internet user in the world deserves world-class privacy, and we’ll continue partnering with organisations to further that goal."

What is differential privacy?

Differential privacy is a tool that has gained acclaim in recent years as data and identity protection have become focal points for researchers, businesses, and regulators alike.

Some argue it is fundamentally necessary in data analytics to preserve the privacy and hide the identity of people whose data is being analysed. For technology companies especially, it has become central to how users expect them to handle the data they hold.

DP works by adding 'controlled noise' to datasets so that people cannot be individually identified by the data they provide. Without it, individuals can be exposed even by aggregate statistics. For example, if residents of a neighbourhood supplied their salaries for analysis, which were then represented as an average, and one resident left the neighbourhood, that resident's salary could be tied to their identity by looking at the difference in the data pre- and post-move.
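
The salary example can be made concrete with a few lines of arithmetic. The sketch below uses made-up salaries and hypothetical function names to show how two exact averages, taken before and after one resident leaves, reveal that resident's salary when no noise is added.

```python
def average(values):
    """Exact (non-private) average of a list of numbers."""
    return sum(values) / len(values)


def recover_departed_salary(avg_before, n_before, avg_after):
    """Infer the leaver's salary from two exact averages and the head count."""
    total_before = avg_before * n_before
    total_after = avg_after * (n_before - 1)
    return total_before - total_after


# Hypothetical salaries for five residents; the last resident then moves away.
salaries = [30_000, 45_000, 52_000, 61_000, 80_000]
avg_before = average(salaries)       # published before the move
avg_after = average(salaries[:-1])   # published after the move

recovered = recover_departed_salary(avg_before, len(salaries), avg_after)
print(recovered)  # 80000.0
```

Anyone who knows how many residents were surveyed and who moved away can run the same subtraction, which is why publishing exact aggregates is not enough on its own.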

Similarly, if two databases were analysed, one with a single data point on 50 people and one with a single data point on 51 people, the analysis results for both would have to be indistinguishable from each other to avoid identifying that 51st person, in order to qualify as differentially private.
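
One way to read 'indistinguishable' is that any published value must be almost equally likely under either database. The sketch below checks this for a count released with Laplace noise, assuming the standard definition in which the likelihood ratio may not exceed e raised to the power of epsilon. The epsilon value, the range of candidate outputs, and the function names are illustrative assumptions.

```python
import math


def laplace_pdf(x, centre, scale):
    """Density of a Laplace distribution centred on the true count."""
    return math.exp(-abs(x - centre) / scale) / (2 * scale)


def max_density_ratio(count_a, count_b, epsilon, outputs):
    """Largest likelihood ratio between the two databases over the outputs."""
    scale = 1.0 / epsilon  # noise scale for a count with sensitivity 1
    worst = 0.0
    for x in outputs:
        p_a = laplace_pdf(x, count_a, scale)
        p_b = laplace_pdf(x, count_b, scale)
        worst = max(worst, p_a / p_b, p_b / p_a)
    return worst


epsilon = 0.5
outputs = [i / 10 for i in range(300, 701)]  # candidate releases, 30.0 to 70.0
worst = max_density_ratio(50, 51, epsilon, outputs)

# The small tolerance only absorbs floating-point rounding.
print(worst <= math.exp(epsilon) * (1 + 1e-9))
```

Because the ratio never exceeds that bound, an observer who sees the noisy count cannot say with confidence whether the 51st person was in the data.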

Adding controlled noise to a dataset would remove the possibility of identifying an individual by skewing the statistics just enough to remove the element of identification, without significantly compromising the accuracy of the results.
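
A minimal sketch of how controlled noise might be applied to an average is shown below. It assumes the textbook approach of clamping each value to a known range, adding Laplace noise to the sum and to the count, and splitting epsilon evenly between the two. It is not the implementation used in Google's library, the salaries are randomly generated, and the simple floating-point noise here is for illustration only, since production libraries typically use hardened noise generation.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_mean(values, upper_bound, epsilon, rng):
    """Noisy mean: clamp values, add noise to sum and count, then divide."""
    # Clamping limits how much any one person can change the sum.
    clamped = [min(max(v, 0), upper_bound) for v in values]
    half = epsilon / 2  # half the privacy budget for each noisy quantity
    noisy_sum = sum(clamped) + laplace_noise(upper_bound / half, rng)
    noisy_n = len(clamped) + laplace_noise(1 / half, rng)
    # Guard against a tiny or negative noisy count on very small datasets.
    return noisy_sum / max(noisy_n, 1.0)


rng = random.Random(42)
salaries = [rng.randint(20_000, 90_000) for _ in range(10_000)]  # made-up data
true_mean = sum(salaries) / len(salaries)
private_mean = dp_mean(salaries, upper_bound=100_000, epsilon=1.0, rng=rng)
print(f"true mean: {true_mean:.0f}, private mean: {private_mean:.0f}")
```

On a dataset of this size the noisy average should stay close to the true one, while the contribution of any single salary is masked by the noise.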

All major Big Tech firms have embraced DP in different ways. Microsoft's AI Lab works with Harvard University on projects to facilitate DP-enabled research. Apple has used DP in its products since macOS Sierra and iOS 10, and Facebook and Amazon also have experience working with the system.
