Dealing with disaster recovery

Ten years ago, only major financial companies could be counted on to have a comprehensive strategy in place to ensure the recovery of key business data and the restoration of major systems in the wake of a disaster. But now, with the range of possible disasters on the increase - from flooding to terrorism - and with the sharpening of regulations mandating the need for better data management, the urgency for a sound disaster recovery (DR) policy has spread to many more types of organisation. Here we talk to two senior IT professionals to find out what recovery potential means to them.

Richard McGrail, head of IT with investment management firm Baillie Gifford:

Baillie Gifford is one of the UK's leading privately owned investment management firms. Headquartered in Edinburgh, the company does business across the UK, North America, Japan and Europe. With 430 staff and around 25 billion of client funds to manage, it has had a disaster recovery strategy in place since 1994, making it an early adopter, even by the standards of its own market sector.

Data recovery provision has been put in the hands of SunGard and its Recovery Centre in nearby Livingston.

But responsibility for recovery has by no means been outsourced, nor marginalised within the IT function, says McGrail. "There are a large number of people here with responsibility for disaster recovery, deliberately not simply one," he says. "We take it very seriously, and have regular test runs to ensure everyone knows what to do."

He says that as an investment company, Baillie Gifford is typical of the sort of organisation that is expected to have a recovery strategy in place. "But I think all organisations have a duty to see to it that they keep going in the event of a disaster - not just financial services ones. It's probably an easier job for us than for, say, a manufacturing company with a huge and elaborate supply chain. In financial services, our supply chain is electronic, not physical, so that makes key data easier to protect."

With nearly 14 years elapsed since the firm decided to protect its livelihood in the event of a disaster, it has moved a long way, says McGrail: "There wasn't that much technology involved back then - some PCs and a few servers," he recalls. "We've now evolved our recovery strategy so that all key systems are now replicated so that we can be sure of being up and running again within four hours of all but the very worst kind of disaster."

He says that there has been no serious disaster in that time, but that there have been situations where world events have affected the business. "Our disaster recovery committee met a few times in the week after 9/11," he says. "It's interesting how the language around disaster recovery has changed in 14 years. Back then we were guarding against things like flood and fire, but now we're talking about bombs and terrorists."

He says this change means that it's not just a disaster at your own premises that you need to consider. "Our offices are near Edinburgh railway station, and a terror disaster there might well affect us," he says.

And it's not just physical disasters to property that matter, he says. "You've got to consider what would happen if there was a pandemic and some of your staff fell ill," he says. "The rest might not want to come into the office, so you'd need provisions for them to work at home and a command and control centre to manage that. You have to think of it from all different angles."

He says the firm gets more and more questions about disaster recovery from clients. "It's important that we can allow them to tick their boxes and know that their business is safe with us," he says. "Some go further than simply asking if we have a disaster recovery strategy. They want proof that it works and has been tested. It's not a 'nice to have' for our clients, it's a 'must have'."

He says that, to begin with, there were a lot of people in the organisation who thought that disaster recovery was a waste of time and money. "But I'm not getting that impression now," he says. "There is a cost attached, but it would be difficult to say what percentage of our IT budget that is. That's because some of this spend relates to day-to-day contingency, and not just to disaster recovery. Obviously too big a spend would start to nullify the benefit, so you need proportion."

Dave Lipsey, IS infrastructure manager for Ordnance Survey:

Ordnance Survey has been providing accurate and detailed geographic information for customers for more than 200 years. Employing over 1,400 people, it offers a range of products for both business and leisure uses - from complex digital information to traditional walking maps.

Its surveyors use high-tech measuring equipment to gather information, capturing details as fine as the shapes of individual buildings, the precise alignment of roads and pavements and the exact location of public telephone boxes.

"Our servers hold data from our GIS systems, as well as map information," says Lipsey. "Also, we have aerial shots of the whole of the UK in high definition. We hold 450 million items in our databases, many of them images that are up to 700MB in size."

To store all this information securely, the organisation has a 700TB SAN that supports a number of servers running on different operating systems and platforms, including VMS, Oracle, Windows and Sun Solaris.

"We have, as you might expect, major backup requirements, with 48 tape libraries," says Lipsey. "Until we put a disaster recovery strategy in place, we had problems here, as we had to stream the backed-up data onto the same reused tapes."

To establish a continuity strategy, Lipsey chose Data Domain and its enterprise protection storage systems both to help with backup and to deliver network-based disaster recovery protection.

"It's a much better way of doing things than we had before," he says. "We're able to keep 30 days of backed-up data at a time, but without risk of any loss as it's all replicated into identical appliances on the disaster recovery site. It's much faster to use, you can store more on it, and if we lose a major system we have the data off site. We could restore it in two hours."

He says that even if the whole central headquarters or the primary data centre were lost, he would still have the majority of the data, which would 'give a head start in recovering'.

"We used to have the whole of our database spread over many tapes, so if one was lost that could affect the whole backup," he says. "That's not the case any more. We've also improved compression ratios by 15 to 1. And there's no more scrambling around for a tape to reuse on a Friday afternoon."

He says the organisation's customers include local government departments, central government and energy companies, all of which now impose demands that make recovery essential: "We have SLAs in place with them, so if we don't supply data to them on time we suffer a great deal of pain," he says. "We're perhaps not under the same pressure as, say, a financial services company, but we absolutely need to provide for certain scenarios."

Five years ago, he says, the organisation had no disaster recovery provision at all. "We do now, thanks to senior management buying into the idea," he says. "We weighed the cost up, and it looked favourable. There's a good case financially for it. Although what we have now wasn't a hard case to make, the next stage might be. I'd like to expand what we have and get rid of tape altogether. We have a two-hundred-year history of keeping data responsibly, and the accuracy and currency of our data is key to our continued success, so hopefully I'll be allowed to make further improvements and keep this going."