Google Cloud deep into second day of fire and flood data center fiasco
Services have still not fully recovered in the French cloud region
Google Cloud services continue to be heavily affected at one of its West European data centers, after a flood and subsequent fire caused widespread outages at the site.
The issues appear to have begun with an unexpected leak caused by a failure in the air conditioning systems at the Global Switch data center in Paris’ Clichy commune.
Battery components affected by the water sparked a fire, which caused a major outage across the entire French cloud region.
“Water intrusion in europe-west9-a has caused a multi-cluster failure and has led to an emergency shutdown of multiple zones,” read Google Cloud’s 20:51 PDT update on April 25.
“We expect general unavailability of the europe-west9 region. There is no current ETA for recovery of operations in the europe-west9 region at this time, but it is expected to be an extended outage.”
Over 100 services were affected in the initial outage, though Google Cloud Storage (GCS), Cloud Key Management Service (KMS), Cloud Identity and Access Management (IAM), and Google Kubernetes Engine (GKE) have now fully recovered across europe-west9.
Google Cloud Console also experienced worldwide issues, and the firm’s service dashboard appeared to link this to the Paris incident.
Management tasks for users outside the French cloud region are now functioning normally.
On the fibre optic forum La Fibre, an administrator laid out a timeline of events using data gathered by the French Network Operators Group (FRnOG), a voluntary community of mainly technical experts who work to collect and distribute information relating to French internet services.
It stated that a “cooling system water pump problem” led to water being unexpectedly released in the early hours of the morning on April 25.
This leaked into the battery room and caused a fire that emergency services had to let burn out for some time before more could be done.
Air conditioning systems came back online just before noon, but fire teams were unable to enter the battery room and instead cooled the walls in an attempt to stem the spread of the fire.
A Global Switch LinkedIn post from 13:00 local time stated that the “Fire Brigade has been in attendance and the fire is now contained”.
“The fire response systems in the building have performed as designed and no one has been injured,” it continued.
“A number of customers have been temporarily affected and our site team is working to restore services to those customers as soon as possible.”
At 17:22 local time, officials at the data center had reportedly not shut down power as the air conditioning and critical server components were unaffected.
Google launched europe-west9, its French region, in June 2022. It has not provided an ETA for when the region will operate normally again.
Some users commenting on a YCombinator forum post about the incident noted that the region-wide outage implied that europe-west9-a, europe-west9-b, and europe-west9-c are all located within the Clichy site to some extent.
“Ouch. Isn't part of separate zones being protected against something, say, like a terrorist attack or a natural disaster that can take down a whole datacenter?” one user noted.
At time of writing, Google Cloud’s latest update for the incident reports that services are still impacted in europe-west9-a but that europe-west9-b and europe-west9-c have largely recovered.
Asked for more information on when the outage will be fully resolved, Google referred ITPro to its service dashboard and the statement released by Global Switch.
Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.
In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism. You can contact Rory at rory.bathgate@futurenet.com or on LinkedIn.