Yandex data breach reveals source code littered with racist language
Yandex source code for a range of key services was leaked to a popular hacker forum last week
Russian tech company Yandex has issued an apology after racial slurs were discovered in source code leaked in a recent data breach.
Several racial slurs, including the ‘N-word’, were found in the company’s leaked source code last week.
A researcher first revealed the use of offensive terminology in a series of posts on Twitter on 26 January, sparking heavy criticism.
In a statement, Yandex told IT Pro that an initial investigation showed that the leaked code "appears to be old fragments differing from the current version of the company’s repository".
The company added that the leaked code "would never have affected any of the company’s services".
"We deeply regret that this word ever appeared in our internal codes," Yandex said. "It is unacceptable and a blatant violation of our corporate ethics."
"We are currently conducting an internal review to better understand how this happened, and will be taking appropriate measures, including to ensure that this does not happen again."
Yandex source code leak
The discovery follows a recent data breach at Yandex in which 44.7 gigabytes of source code was leaked on the popular online hacker forum BreachForums.
The leaked files were found to contain code for a range of Yandex products. The company is one of Russia’s largest tech firms and provides email, advertising, cloud computing and online sales services.
Responding to the breach, Yandex insisted that its systems were not hacked and attributed the leak to a former employee.
In a blog post detailing the scale of the leak, security researcher Arseniy Shestakov said the exposed files date back to February 2022, coinciding with the Russian invasion of Ukraine.
Shestakov said that while the leaked files included source code for a range of services, they did not contain sensitive user data.
"Since this leak only contains contents of git repositories there is no personal data," he wrote. "There are at least some API keys, but they are likely only been used for testing deployment only."
Racial slurs were dotted throughout Yandex's leaked Git codebase. They appeared in function and variable names, printed messages, and configuration files.
Programmers frequently use specific terms or names to enable other developers to understand what function or action a certain line of code performs.
The use of easy-to-read terms is a common approach which helps cut the time engineers need to modify or update code later.
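As a rough illustration of that naming convention, the short Python sketch below uses neutral, descriptive names so a reader can tell what each piece does without tracing the logic. Every name in it (`resize_image`, `resize_all`, `worker_count`, `worker_pool`) is hypothetical and chosen for this example; none is taken from the leaked Yandex code.

```python
from concurrent.futures import ThreadPoolExecutor


def resize_image(width, height, scale):
    """Return the scaled (width, height), rounded to whole pixels."""
    return round(width * scale), round(height * scale)


def resize_all(sizes, scale, worker_count=4):
    """Resize every (width, height) pair in parallel and keep the input order."""
    # "worker_count" and "worker_pool" are hypothetical names: they tell the
    # reader that the number controls how many worker threads run at once.
    with ThreadPoolExecutor(max_workers=worker_count) as worker_pool:
        return list(worker_pool.map(lambda size: resize_image(*size, scale), sizes))
```

Calling `resize_all([(640, 480)], 0.5)` halves each dimension. A developer reading that call for the first time can infer its purpose from the names alone, which is the point of the convention.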
In this instance, Yandex developers appear to have substituted a generic term for a function with offensive language.
Exactly why these terms were included is unclear. However, the use of offensive language in code violates both best practice and, as Yandex pointed out in its statement, the company's own code of ethics.
Yandex did not provide additional information on why the ‘N-word’ was used in this case, but onlookers noticed it also appeared to have been used in place of 'workers' in various parts of the codebase.