The AWS outage explained: What happened, who was impacted, and what services are back online?
Overheating at a single data center has been identified as the cause of the AWS outage, which impacted customers such as Coinbase
Amazon Web Services (AWS) has confirmed that a recent outage that impacted customers was caused by overheating at a Northern Virginia data center.
The disruption affected use1-az4, one of the six Availability Zones in AWS' US-EAST-1 region. This is one of the company's most heavily used regions globally.
Notably, the incident hit platforms including cryptocurrency exchange Coinbase, disrupting core exchange functions for more than five hours. Other reported victims include the CME Group trading platform and major gambling company FanDuel.
Coinbase last night warned that some users might experience delayed sends and receives on the Solana network and for ALEO, but said it was working on the issue.
The crypto trading platform has since resumed operations. In a statement, Coinbase said: "All markets have been re-enabled for trading on coinbase.com and in the Coinbase iOS and Android apps. Coinbase customers can log in to trade."
FanDuel, meanwhile, posted a statement on X, saying it was working to troubleshoot the issue. "Our team is aware and investigating the current technical difficulties prohibiting users from accessing our platform," it said.
Overheating behind AWS outage
In an update to customers, AWS attributed the cause of the outage to overheating. The hyperscaler is yet to confirm how the overheating occurred.
"We have experienced an increase in temperatures within a single data center, which in some cases has caused impairments for instances in the Availability Zone," AWS said in a status report.
"EC2 instances and EBS volumes hosted on impacted hardware are affected by the loss of power during the thermal event."
In its latest update, AWS said it had shifted traffic away from the impacted zone. The hyperscaler said it was still carrying out mitigation efforts.
Work to bring additional cooling system capacity online and to recover the remaining affected infrastructure safely and in a controlled manner is taking longer than expected, it added.
AWS warned some customers will continue to see their affected EC2 instances and EBS volumes as impaired until it can achieve full recovery. It said it currently didn't have an ETA for this.
What services are back online?
A number of services are back online following the outage, according to the hyperscaler. These include:
- AWS IoT Core
- AWS NAT Gateway
- Amazon Elastic Kubernetes Service
- Amazon Elastic Load Balancing
- Amazon Redshift
Some services are still impacted at the time of writing, including:
- Amazon ElastiCache
- Amazon Managed Streaming for Apache Kafka
- Amazon OpenSearch Service
- Amazon SageMaker
Yet another AWS outage
It's not the first time that a major AWS outage has caused chaos. Last year, hundreds of apps and websites including Slack, Zoom, Coinbase, Snapchat, and Signal were taken down in a global outage.
Banking applications including Lloyds and Halifax also saw customers unable to access services. On that occasion, AWS attributed the outage to a DNS issue.
The incident highlights the extent to which major websites and apps are dependent on just a few tech giants.
In 2024, for example, issues with CrowdStrike saw hospitals, banks, and airports in Australia, New Zealand, India, Japan, the US, Germany, and the UK seriously affected.
Emma Woollacott is a freelance journalist writing for publications including the BBC, Private Eye, Forbes, Raconteur and specialist technology titles.
