4 million Facebook users' details 'exposed by Cambridge academics'
Personal data including age, location and gender was accessible to anyone - report


The personal information of more than four million Facebook users was stored in an easily accessible online database for the last four years, it has been revealed.
The database, which was created by researchers from the University of Cambridge and features highly sensitive information, was left exposed with minimal protections.
The data was compiled through a Facebook app called myPersonality - a personality quiz similar to the ThisIsYourDigitalLife app that Aleksandr Kogan used to collect the data that would later be passed to Cambridge Analytica.
The database included details such as age, gender, location and relationship status for 4.3 million people, as well as 22 million status updates from 150,000 users, according to the New Scientist, which broke the news following its own investigation.
The purpose of the quiz was to gather information on users' 'Big Five' personality traits - five broad factors that are used in some branches of psychology to analyse an individual's personality. The database included the 'Big Five' scores of 3.1 million Facebook users.
While the datasets were supposedly anonymised, with users' names stripped out prior to publication, the rest of each user's details remained linked together, making it comparatively easy to determine users' identities, according to the publication.
The men behind the project are Michal Kosinski and David Stillwell, two researchers from Cambridge University's Psychometrics Centre - although the university said that Stillwell created the app prior to his employment.
The data was made available to other researchers who wanted to use it in their own studies, with the proviso that they use it in such a manner that it could not be traced back to the individual participants of the quiz. Researchers from private companies could also use it for non-commercial purposes, provided they agreed to be bound by strict data protection policies.
Researchers from Google, Microsoft and Yahoo registered to use the data, as well as academics from around 150 universities, totalling almost 300 people. However, one university lecturer apparently gave students the credentials to access the database as part of a course on analysing Facebook data.
The login details were then uploaded to GitHub, where they were easily accessible via a simple Google search, allowing anyone to access the database. The GitHub page remained up for around four years.
Facebook suspended the myPersonality quiz last month over concerns that the language describing how data was shared violates its terms of service, and has launched an investigation to determine if a violation has taken place. The Information Commissioner's Office is also examining the case, although an official investigation has not been announced.
Stillwell told the New Scientist that the project had experienced only one data breach over the course of its lifespan, saying "we believe that academic research benefits from properly controlled sharing of anonymised data among the research community".
The news comes after the Cambridge Analytica scandal, in which the analytics firm allegedly siphoned millions of users' data from an app created by Cambridge academic Aleksandr Kogan to try to influence voters in the 2016 US presidential election.
The scandal has raised questions around what third parties do with Facebook users' data, and the lack of controls Facebook had in place to monitor their usage of such information. Last week Facebook revealed it had suspended 200 apps amid an investigation into whether third parties misused users' information.
Adam Shepherd has been a technology journalist since 2015, covering everything from cloud storage and security, to smartphones and servers. Over the course of his career, he’s seen the spread of 5G, the growing ubiquity of wireless devices, and the start of the connected revolution. He’s also been to more trade shows and technology conferences than he cares to count.
Adam is an avid follower of the latest hardware innovations, and he is never happier than when tinkering with complex network configurations, or exploring a new Linux distro. He was also previously a co-host on the ITPro Podcast, where he was often found ranting about his love of strange gadgets, his disdain for Windows Mobile, and everything in between.
You can find Adam tweeting about enterprise technology (or more often bad jokes) @AdamShepherUK.