
Practical business continuity

How would you cope if your head office suffered a fire or some other disaster? In this special ITadviser feature, we look at the experiences of organisations that have had to deal with that very situation and assess how their business continuity plans stood up to the test. We also examine the different approaches taken to backing-up data, including the requirement for SOX compliance.

The Erith Group

Background

When disaster struck construction services company The Erith Group in September 2006, IT manager Paul Driscoll was already investigating ways to remove as much complexity as possible from his systems management strategy. In many ways, the fire which razed the £60 million company's head office to the ground simply reinforced his determination to streamline the IT infrastructure so that it would be more resilient in the wake of future catastrophes - and easier to manage on a day-to-day basis.

With a 200-strong workforce nationwide, Erith Group is heavily dependent on systems availability to manage and deliver a range of services that includes asbestos removal, demolition and haulage. The fire, caused by a neighbouring company's illegal storage of gas cylinders - one of which torpedoed the boardroom and knocked out the firm's systems - could have disrupted business indefinitely. Nobody was even allowed access to the site for 48 hours.

But Driscoll was midway through a pilot test for Google Apps, and half the board was already using the hosted suite. 'Despite the back-up working correctly, such a catastrophic incident cannot fail to have a major impact on the business,' says Driscoll.

'My job was to minimise the impact and get the systems running again. Almost half our senior executives had migrated to the hosted software so were able to access e-mail, calendar and document applications from laptops and home PCs with an internet connection. This meant we could conduct 'business as usual' while I recovered the wider systems.'

But even with the Dunkirk spirit in full flow and staff accessing the system from Starbucks, McDonald's and their home broadband connections, the fire exposed the disadvantage of those who hadn't yet migrated. Fortunately, although one 'old school' colleague had neglected his back-ups and so had no document access, most people found that their essential content and data were being held 'virtually' by colleagues: the system had already become a substantial document repository.

Business impact

Driscoll describes himself as a 'belt and braces' IT manager - his last action after dialling 999 was to pick up the hard disk (everything was backed up using Solarsys) as he left the building.

'In my experience, people are either arrogant or forgetful when it comes to backing-up, so it has to be done without human intervention,' he says. 'Every Friday I swap the hard disk and take it home! It isn't a big deal and it's much better to lose a week's worth of back-ups than to have no access to your system because you can't get near the building!'

So despite the premises being reduced to a 'pile of ash,' Driscoll was able to rebuild his IT systems from scratch, and put together a strong business case for wholesale migration to Google Apps. He discounts concerns about entrusting corporate data to a third party - albeit a household name - in California.

'They are hundreds of times bigger than us and haven't done badly at storing data so far,' he says. 'And frankly, if Google burnt to the ground, there'd probably be something a lot more sinister going on in the world than the fire we experienced.

'After our disaster, it was a great deal easier to convince the board to migrate. There is no server to log onto, so it was a simple process of moving across.'

Since implementation, Erith Group has considerably reduced its IT costs: it now pays £25 per account, per year, and benefits from a significant increase in e-mail capacity. Employees can access their work and collaborate on documents from any location, virtual meetings can be held and engineers report graphically on demolition progress by uploading photographs.

'To achieve the same level of functionality, the business would have had to allocate an additional budget of £38,000 per year - and we would still not have seen the productivity benefits delivered by improvements in working practices,' says Driscoll.

One of the problems exposed by the fire - the vulnerability of the paper trail - prompted Driscoll to implement a document scanning regime, which means that staff from any of the company's departments can access this content at any time.

Lessons learned

'The figures speak for themselves,' says Driscoll. 'For an outlay of £2,000 per annum, I don't need to buy a server that I'd have to replace every two or three years, plus all the software licences and the VPN infrastructure.

'If I was going through the process again, I'd make sure I had all the technology-savvy managers on side before going to the board. There was the inevitable reluctance from people who don't traditionally understand IT. Even if there had been no fire, with more technology-aware managers on side, it would have been easier to show them the business benefits and roll the system out.

'My main piece of advice, though, is that if you're backing-up to a portable hard disk, put a physical barrier up behind your disk to protect it from flying missiles!'

Tagish

Background

High-tech companies are often the worst hit when disaster strikes and the need for a business continuity strategy is graphically highlighted. The irony is not lost on Andrew Fisk, managing director of Alnwick-based Tagish, which provides web technology services for the public sector. He'd only been in charge for six weeks when fire destroyed the company's head office one Saturday morning in February 2006.

'We did have some disaster recovery plans in place, but they were mainly focused on the loss of individual systems and servers,' he says. 'But we had completely under-estimated the impact of losing everything. The company had been in business since 1995, so there was a lot of paper-based storage in the office - signed contracts and so on. After a six-hour fire, nothing came out. So yes, it's often the high-tech businesses that are caught out when these things happen.'

Although Tagish's disaster recovery plans enabled the business to re-establish and maintain most client services within a few days, the sobering fact was that new sales were effectively stopped in their tracks for six months after the fire. The firm has now fully recovered, but Fisk admits that it simply wasn't properly prepared for business continuity.

'Our disaster recovery plan was invoked by Saturday afternoon but only got going big-style on Sunday,' he says. 'We split into teams, tasked with getting the technology up and running, communicating with our suppliers and partners, liaising with the insurers and maintaining ongoing projects.

'Because we were in a rural location, there was no alternative office space available. We were able to make a few interim arrangements, borrowing meeting rooms from other companies, but the majority of our 20 staff were working from home - and that's a challenge when you're thrown straight into it.

'We knew of the services we could use because of the business that we're in, and of course we were already using different internet services including hosted project management, so we switched some internal project management over to external services. One of the most important things was to be able to record information in a secure location online.'

Business impact

Fisk says that thanks to the speed of communication and restoration of services, Tagish only lost one customer following the disaster. Most were supportive and several, inspired by Tagish's experience, took the opportunity to review their own business continuity strategies.

'But we didn't make any new sales for six months,' he says. 'So the knock-on impact is considerable, and of course setting up a new physical office takes time and resources which are diverted from other areas of the business.'

Maintaining cash flow was also a priority - a challenge complicated by the destruction of paper archives, including records of money owed, which were clearly vital to the restoration of the business. The experience has had a radical impact on the company's IT and communications strategy.

'We now use a lot more off-site hosting of systems and have them replicated between different data centres,' says Fisk. 'We've become less reliant on the physical office, so a lot of our communications don't need the office to be enabled. We use VoIP for our main telecoms platform for example, although it isn't yet integrated with our mobiles. I think the system we now have is resilient. It's a balancing act between how much you pay for it, the level of resilience and the amount of control you have. We've gone for a modular approach which gives us control over specific key areas.'

Lessons learned

Fisk says the human element of any disaster recovery strategy can't be under-estimated. And some of the most important lessons learned from an experience like this are practical rather than at the cutting edge of technology.

'The first thing to do, after you've got the basics sorted out, is to calm down rather than rushing into things immediately,' he says. 'Anyone who can do anything useful is likely to be in shock. Some of our staff had been with the company for 20 years (Tagish was a spin-off from a previously existing firm) so it was a lot to take in.

'We were on the top floor - and we have now relocated to ground floor premises specifically because of the fire experience. We also learned the importance of knowing exactly what you have in the office. We've taken a video recording and we now store it off-site. It's such a simple thing to do, but it would have saved us many hours of trying to remember!'

Crucially, the fire also revealed that some staff had fallen into the habit of storing information on their local machines without backing-up - a practice that has now been firmly stamped out. In addition, says Fisk, the company now keeps all of its software licence agreements off-site, to counter any possible disputes about software assets should disaster ever strike again.

Fisk says the company was under-insured in terms of business continuity implications and has 'massively' upped its level of insurance to counter similar situations in the future. Usefully for an IT services company, Tagish also found that the fire gave it a chance to clarify service level agreements with its customers - some of whom had expectations well above what they were actually paying for!

'We're always looking at technology to support the business and the risks associated with the business as a matter of course,' says Fisk. 'But the most important aspect of business continuity, even for an IT company like us, is the staff. Without their commitment, we wouldn't still be here and the best-laid disaster recovery plans wouldn't have made any difference.'

OmnicomMediaGroup

Background

OmnicomMediaGroup is a prolific global media agency with a strong UK presence: more than 850 media planners and buyers in London and Manchester. In 2005, with data volumes growing at a rate of 30 per cent per year, the five-strong IT team was juggling a rising tide of back-up tapes and courier costs with the pressing need for a more forward-looking disaster recovery strategy and - particularly as a US company - SOX compliance.

'We were using a classic server and back-up routine, relying on a relatively local tape storage company,' says associate director, shared technology services, David King. 'But as the Buncefield fire showed, close proximity storage wasn't necessarily the smartest idea. Plus, the company was looking at the cost of consumables and the price of disk storage was coming down significantly.'

Omnicom's data was backed-up to tapes on a Compaq storage device, requiring an expensive multi-tape drive library, and with the ongoing headache of tape rotation management. When a specific document was required, it could take up to two weeks to locate and retrieve. At the same time, the tape drive itself was coming to the end of its warranty and the number of 'dirty' read and write tapes was rising - together with the need for more care packs, maintenance call-outs and cleaning cartridges.

After a comprehensive study of alternative options, Omnicom decided to pursue a hosted storage strategy that would meet its back-up, disaster recovery and business continuity requirements in one fell swoop. It chose Thinking SAFE's web-enabled platform, initially outsourcing the hosted element of the strategy to the vendor before bringing it back to its principal data centre in London's Docklands.

Business impact

The biggest difference to the IT operation, King says, has been the invisibility of the back-up process: 'You get right away from the tape-changing scenario, which is replaced by an automated file copy every evening,' he adds.

'That leaves the IT team to get on with their real desktop support roles. Apart from checking their e-mail in the morning for any alerts, there are no logs for them to trawl through. As far as restores are concerned - and we don't have many requests - they are simply drag and drop processes. You open up the web interface, drill down through the file structure, find the version of the document you want, and restore it.'

'This approach provides the speed of recovery that's so essential to a media agency like ours,' says King. 'What used to take around two weeks can now be achieved in just 15 minutes, so once somebody knows there's a problem with a document, it is very quick to retrieve.'
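The evening file copy and version-based restore that King describes can be illustrated with a minimal Python sketch. It is not Thinking SAFE's implementation: the function names, folder layout and date stamps are all assumptions made for illustration, and a real product would copy only changed files rather than the whole tree each night.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def back_up(source: Path, store: Path, stamp: str) -> Path:
    """Copy every file under `source` into a new version folder `store/stamp`."""
    target = store / stamp
    shutil.copytree(source, target)  # creates `store` and `target` if missing
    return target


def list_versions(store: Path, relative_path: str) -> List[str]:
    """Return the version stamps that contain `relative_path`, oldest first."""
    return sorted(
        version.name
        for version in store.iterdir()
        if (version / relative_path).is_file()
    )


def restore(store: Path, stamp: str, relative_path: str, destination: Path) -> Path:
    """Copy one file from the chosen version into `destination`."""
    restored = destination / relative_path
    restored.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(store / stamp / relative_path, restored)
    return restored


def demo() -> str:
    """Back up two drafts of a hypothetical document, then restore the older one."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "shared"
        source.mkdir()
        document = source / "media_plan.txt"

        document.write_text("first draft")
        back_up(source, root / "store", "2008-01-14")

        document.write_text("second draft")
        back_up(source, root / "store", "2008-01-15")

        versions = list_versions(root / "store", "media_plan.txt")
        restored = restore(root / "store", versions[0], "media_plan.txt", root / "restored")
        return restored.read_text()
```

Calling `demo()` walks through the same steps as the web interface described above: find the versions of a document, pick one, and copy it back.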

From a disaster recovery perspective, King explains, the instant availability of a copy of the data means that it is a relatively quick process to build a new server, replicate the data and point the user at the source. Once the physical site becomes available again, the IT team can rebuild the server offline and point them back without anyone noticing the difference.

He estimates that investment in this type of web-based hosted back-up model pays for itself within two years. But compliance benefits are equally important. When a client recently queried the length of time their data was being held for, it required a simple process of location and deletion rather than a lengthy trawl through the old tape library.

'The system has allowed us to implement a document management policy,' says King. 'It marks data for deletion and alerts us when it's time. When you have 90 days of logs in a tape system, it's a much more complicated issue.'
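As a rough illustration of a policy that marks data for deletion and raises an alert when the time comes, the Python sketch below flags documents whose retention period has expired. The `StoredDocument` record, its field names and the per-document retention period are hypothetical; the article does not describe how the Omnicom system stores this information.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class StoredDocument:
    name: str
    stored_on: date
    retention_days: int  # hypothetical per-document retention period


def due_for_deletion(documents: List[StoredDocument], today: date) -> List[StoredDocument]:
    """Return the documents whose retention period has ended on or before `today`."""
    return [
        doc
        for doc in documents
        if doc.stored_on + timedelta(days=doc.retention_days) <= today
    ]


def deletion_alerts(documents: List[StoredDocument], today: date) -> List[str]:
    """Build one alert line per document that is due for deletion."""
    return [
        f"{doc.name}: retention ended, due for deletion"
        for doc in due_for_deletion(documents, today)
    ]
```

With disk-based storage, a check like this can run against every stored document each day, which is the kind of task that is far harder when the data sits on rotated tapes.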

Lessons learned

'With hindsight we possibly put too much reliance on disk,' says King. 'You should be very careful how you design your disk architecture. We've never had two fail at once but if we'd suffered multiple failures, there'd be a strong case for not having all our eggs in one basket.'

King also recommends mixing up disk speeds in order to save cache, and expects advances in caching technology and solid state memory to play an important role in the company's future strategy.

'It depends on what you're looking for - proper data management or just replication,' he says. 'But I'd recommend our solution to anyone. As a group, we have over 37 offices across Europe, and there's the real possibility of a global roll out. At the moment, we're discussing an expansion of the Thinking SAFE platform to allow our European offices to buy storage on it so we can extend the recovery benefits to them. With such a flat model, it would be very easy to work out the pricing.'

(ITadviser, Issue 56, Winter 2008)

 
