Microsoft and AWS hit by Christmas cloud outages

While many of us have been enjoying some much-needed downtime over the Christmas period, several cloud providers were forced to act quickly after their services went offline, leaving (one assumes) users a little less full of festive cheer.

Here, we take a look at what happened and how it was sorted.

Netflix

Christmas Eve and Christmas Day should have been the busiest period of the year for online film service Netflix. However, when developers at Amazon Web Services (AWS) accidentally deleted data critical to keeping Netflix running, customers in the Americas were left searching for other forms of entertainment over the festive period.

The disruption started at lunchtime on Christmas Eve, when a portion of AWS’ Elastic Load Balancing Service (ELB) state data was deleted. While most of the issue was resolved by 8.15am on Christmas Day, it took close to 24 hours for the service to fully return to normal.

AWS has since published an explanation of what happened, and apologised for the outage, the last of several that hit the company during 2012.

Xbox Live

Millions of people will have received a new computer game for Christmas. Sadly for Xbox 360 users, though, the thrill of playing the latest release was put on hold, after Microsoft’s Cloud Save feature broke down on 28 December.

The outage continued for the whole weekend, with users unable to access saved games held in the cloud until 31 December. Streaming services such as Netflix and HBO Go were also affected for some users.

By way of apology, Xbox Live has said it will be automatically applying a one-month extension to the Gold membership of all those who were affected.

Alex Garden, Xbox Live general manager, said in a blog post: “It took longer than we expected to get back to full performance as we needed to ensure the integrity of everyone’s game saves.

“Whether you couldn’t access your game saves for a couple of hours or a couple of days, we sincerely apologise for the delay and inconvenience.”

Garden also said his team would be doing a “thorough post mortem” to avoid a recurrence of the issue.

Microsoft Azure

It was not just gamers who were affected by Microsoft’s technical hitch; the company’s Azure service was also disrupted between 28 and 30 December.

Microsoft initially reported that only users of its storage service in the South Central US region were affected. However, it quickly became apparent the outage was also affecting its global Management Portal.

The problem, which was blamed on ‘faulty nodes’, took over 36 hours to resolve in full, with Microsoft issuing an apology “for the interruption and issues it has caused our customers”.

Jane McCallion
Deputy Editor

Jane McCallion is ITPro's deputy editor, specialising in cloud computing, cyber security, data centres and enterprise IT infrastructure. Before becoming deputy editor, she held the role of features editor, managing a pool of freelance and internal writers while continuing to specialise in enterprise IT infrastructure and business strategy.

Prior to joining ITPro, Jane was a freelance business journalist writing as both Jane McCallion and Jane Bordenave for titles such as European CEO, World Finance, and Business Excellence Magazine.