Can cloud help deal with peaks and troughs of demand?

One of the benefits of cloud computing is the ability to respond to the elasticity of demand. When it comes to e-commerce there are some predictable peaks in demand such as the Christmas period, and therefore it should be easy to plan for any sizeable increases in website and back end activity.

Yet there are times when even the forecast levels of demand are exceeded, which can cause havoc to an organisation’s ability to deliver information or to fulfil any financial transactions.

The ability to access cloud services could also be affected if misguided demand and capacity forecasting leads to poor planning. Organisations therefore need to risk assure their on-premise IT and cloud services, predict as accurately as possible what would happen if demand exceeded their expectations, and decide how they would deal with the extra workload on their systems.

Complacency can lead to denials of service in the form of website or application crashes, slower front-end performance and revenue losses, and it can damage an organisation’s reputation. The reputational damage arises because customers will remember the failure as an example of bad customer service, and may go elsewhere.

Students spike UCAS

The Universities & Colleges Admissions Service (UCAS) suffered a spike in demand on 18 August 2011, which caused its website to crash. The spike was driven by students, their friends and families, who were spurred on to save money ahead of next year’s prospect of university fees trebling to as much as £9,000 per year. This is more than treble the current cost of attending a university course.

“The volume of applications this year is roughly five times that of 2010, but we have a five-fold increase in users trying to log into Track”, says James Woodward – a spokesman at UCAS. Rather than being prepared for the increase in demand to ensure that the service remained up and running for those accessing it to find a university place, he explains that a decision was made to take the organisation's website down. This was strangely seen as the safest option, and he claims the students’ ability to “add a clearing choice was not affected and it opened at 6pm on that day as planned.” The spike involved 644 visits a second at its peak.

Questions need answering

Bryan Foss, a visiting professor at Bristol Business School and an independent commercial and public sector board-level advisor, asks some pertinent questions about how UCAS determined the amount of demand that was expected. He wonders whether the estimate was based on its 2010 figures, and whether they were a good predictor of demand given that next year’s fee increases were on the near horizon. “It appears that the students, their friends and families – the users - are changing their access patterns”, he explains. This is because as soon as the students received their A-level results they began to access UCAS’s website using a variety of channels, such as mobiles, home PCs and tablets.

He asks: “When students have worked very hard for two years for an A’ Level result, and committed themselves to their first choice of university and spend the last month waiting for an exam result that tells them whether or not they have gained their chosen place, why wouldn’t they want to log on immediately to see if they have got it?” In his view a situation like this can cost an organisation much of its credibility. He believes that UCAS should and could have planned to meet the increase in demand by “applying elastic production resources which would reduce the possibility of a service failure from occurring during times that are perceived as critical by its customers.”

Unforeseen causes

Steve Palmer, product manager for data solutions at Azzurri Communications, says the key message is that “you can’t expect to foresee what will generate the flash crowd, but you can plan your response to it by considering the worst case scenario, then double the demand that causes it and then do so once again.” Planning in this way would enable an organisation like UCAS to have the right level of capacity to meet any spikes in demand as and when they happen. He notes that CIOs are making decisions about whether a day’s outage out of 365 days is worth worrying about, but says that organisations should consider how they are going to use cloud services to meet additional demand.
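
Palmer’s rule of thumb amounts to simple arithmetic: take the worst case and double it twice. The following is a minimal sketch of that calculation, not a method attributed to Palmer or UCAS. It treats the 644 visits a second reported for UCAS as the worst case purely for illustration, and the function names and the figure of 50 visits a second per server are hypothetical assumptions.

```python
import math


def planned_capacity(worst_case_rate: float, doublings: int = 2) -> float:
    """Return the rate to plan for: the worst case doubled `doublings` times."""
    if worst_case_rate < 0 or doublings < 0:
        raise ValueError("rate and doublings must be non-negative")
    return worst_case_rate * 2 ** doublings


def servers_needed(target_rate: float, rate_per_server: float) -> int:
    """Return the number of servers needed, rounded up to a whole server."""
    if rate_per_server <= 0:
        raise ValueError("rate_per_server must be positive")
    return math.ceil(target_rate / rate_per_server)


# Peak reported in this article, used here as the assumed worst case.
worst_case = 644
target = planned_capacity(worst_case)  # doubled, then doubled again

# Hypothetical assumption: each server handles 50 visits a second.
servers = servers_needed(target, rate_per_server=50)

print(target)   # 2576
print(servers)  # 52
```

In practice the per-server figure would come from load testing rather than a guess, and the number of doublings is a judgement about how much headroom the organisation is prepared to pay for.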

A holistic approach is required to avoid any downtime, argues Capgemini security consultant Steve Allen. “Start with a threat analysis, which would have helped UCAS to predict that a significant spike in demand was likely because of the increase in university fees”, he explains. If UCAS had bought four to five times more hardware than it had available, it might have been able to cope with the spike in demand. Yet most of that hardware would have been left redundant for most of the year. “This is why they would have been very unwilling to use this approach”, he believes. The squeeze on public funding makes life more difficult as it constrains public organisations’ ability to purchase new hardware.
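
The trade-off Allen describes can be made concrete by counting server-days. The sketch below compares owning peak-sized hardware all year with running a baseline and adding capacity only on peak days. Every figure in it is a hypothetical assumption chosen for illustration (10 baseline servers, a five-fold peak lasting three days), and none of them comes from UCAS.

```python
def fixed_server_days(baseline: int, multiplier: int, days: int = 365) -> int:
    """Server-days used when peak-sized hardware is owned all year."""
    return baseline * multiplier * days


def elastic_server_days(baseline: int, multiplier: int,
                        peak_days: int, days: int = 365) -> int:
    """Server-days used when extra capacity is added only on peak days."""
    if not 0 <= peak_days <= days:
        raise ValueError("peak_days must be between 0 and days")
    extra_servers = baseline * (multiplier - 1)
    return baseline * days + extra_servers * peak_days


# Hypothetical figures for illustration only.
baseline = 10     # servers needed on a normal day
multiplier = 5    # peak demand relative to a normal day
peak_days = 3     # days per year on which the peak occurs

fixed = fixed_server_days(baseline, multiplier)
elastic = elastic_server_days(baseline, multiplier, peak_days)

print(fixed)    # 18250
print(elastic)  # 3770
print(f"Share of fixed capacity left idle: {1 - elastic / fixed:.0%}")
```

Server-days are only a proxy for cost: on-demand cloud capacity is usually priced differently from owned hardware, so a real comparison would need to weight each figure by its own price.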

Comparison: London 2012

London 2012, the organisation behind next year’s Olympics, faced a similar situation with its e-commerce ticketing services. It claims that its website didn’t crash, but a spokesman admitted that the site slowed down significantly. Allen says that if a customer can’t get what he or she wants, the perception will be that the system has crashed – even if it hasn’t.

“The situation for the Olympic site was much simpler than the one faced by UCAS as people knew what they wanted”, he says. Customers were also less likely to want to return to London 2012’s ticketing site once a bid for a ticket had been made. Due to the nature of UCAS’s service, students and other interested parties are likely to return several times, and so the nature and frequency of each transaction needs to be considered.

Customers tend to see only the front end, of which the UCAS and London 2012 websites are examples. Behind it sit the back-end systems that enable an organisation to deliver its services. In the case of London 2012’s Olympic ticketing website, the credit card payments and other related transactional activities would occur in the back end. This could involve on-premise, hybrid or cloud-based systems. Allen therefore argues that it’s important to ensure that the back-end servers can cope with any increase in the volume of activity in order to fulfil a customer’s expectations. “Each layer of your system has to be working at maximum efficiency to enable the customer to receive an optimal service”, says Allen.

Top risk assurance tips

After speaking to a number of industry experts, including those quoted in this article, CloudPro recommends that your organisation consider the following top tips, which will help you to risk assure your IT and cloud services so that they remain operational at peak performance:

  1. Ensure that the risks are clearly identified and managed throughout the development, procurement and operation of an IT service;
  2. Be agile by predicting and planning for the occurrences of expected and unpredictable peaks in demand to ensure that your e-commerce and other transactional services are not affected in an adverse way;
  3. “Use service assurance solutions to link real end user experience, transactions and applications with the network infrastructure supporting them”, advises Colin Bannister – CTO and Vice-President of UK and Ireland at CA Technologies;
  4. Don’t think of it as an IT problem alone, but one that the whole organisation faces as this will enable you to better resolve the issues;
  5. Make sure that your cloud and on-premise systems are all contributing to your ability to manage the spikes;
  6. Use the flexibility and scalability of the cloud to full effect rather than buy hardware that will lie redundant for the rest of the year;
  7. Make sure you work with a reputable cloud provider that has the capacity and the ability to increase your capacity as and when a spike in demand occurs;
  8. Ask experts to help you to analyse, risk assure and plan for spikes in demand, allowing you to look forward to them rather than dread them;
  9. Ensure that the cloud provider is compliant with data protection requirements and any other regulatory or legal obligations;
  10. Plan as you would for disaster recovery by making sure you have a range of options to allow your operations and services to continue without any hindrance;
  11. Understand the implications of failing to meet your stated objectives and consider the true costs of an outage, while also challenging any demand assumptions. This includes making sure that senior executives understand the risks that are involved with such a failure.

The cloud or a hybrid model provides an ideal platform for delivering e-commerce and related online services. Yet, as Foss says, these services need a high level of assurance whether they are run as internal, outsourced or even offshore operations. He concludes: “There needs to be confidence that these systems can properly support the achievement of the organisation’s service objectives, and that they are not significantly disrupted by capacity, business continuity, security, data protection and other potential issues.” So to risk assure your IT and cloud services, what you need to do is plan, plan and plan ahead.