Dealing with key trends during an IT disaster recovery
Cloud services, virtualisation, BYOD and social CRM are key factors influencing disaster recovery
By Bob Violino | CSO | Published: 17:00, 20 July 2012
As we've seen in recent years, natural disasters can lead to long-term downtime for organizations. Because earthquakes, hurricanes, snow storms or other events can put data centers and other corporate facilities out of commission for a while, it's vital that companies have in place a comprehensive disaster recovery plan.
Disaster recovery (DR) is a subset of business continuity (BC), and like BC, it's being influenced by some of the key trends in the IT industry. Foremost among these are
- cloud services,
- server and desktop virtualisation,
- the proliferation of mobile devices in the workforce,
- and the growing popularity of social networking as a business tool.
These trends are forcing many organizations to rethink how they plan, test and execute their DR strategies. CSO previously looked at how these trends are specifically affecting IT business continuity; as with BC, much of the impact they are having on DR is for the better. Still, IT and security executives need to consider how these developments can best be leveraged so that they improve, rather than complicate, DR efforts.
Here's a look at how these four trends are having an impact on IT disaster recovery.
Cloud services
As organisations use more internal and external cloud services, they're finding that these resources can become part of a disaster recovery strategy.
Marist College in Poughkeepsie, New York, provides numerous private cloud services to internal users and customers. It also hosts services for 17 school districts and large enterprise clients.
"The cloud configuration allows us to perform software upgrades across the multiple tenant systems quickly, easily and without disruptions," says Bill Thirsk, vice president of IT and CIO at the college.
"Because our storage is virtualized, we can replicate data across SANs [storage-area networks] that we have placed strategically on our campus in numerous locations and in our data centre. A loss of a SAN means only that production operations switches over to another."
Because Marist can perform server-level backups across partitions, it can move data from one server platform to another should an event occur, Thirsk says.
There's big potential value in cloud-based DR services, says Rachel Dines, senior analyst, Infrastructure & Operations, at Forrester Research in Cambridge, Mass.
To date, adoption of these offerings has been low, Dines says, "but there is a huge amount of interest and planning going on at end-user companies. Instead of buying resources in case of a disaster, cloud computing and its pay-per-use pricing model allows companies to pay for long-term data storage while only paying for servers if they have a need to spin them up for a disaster or test."
Cloud-based DR has the potential to give companies lower costs yet faster recovery, with easier testing and more flexible contracts, Dines says.
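The pay-per-use arithmetic Dines describes can be sketched in a few lines of Python. This is a rough illustration only: every rate, server count and storage volume below is a hypothetical placeholder, not a figure from Forrester or from any provider's price list, and the function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CloudDrPricing:
    """Hypothetical pay-per-use rates; every figure here is a placeholder."""
    storage_per_gb_month: float  # cost of keeping replicated data at rest
    server_per_hour: float       # cost of one recovery server while it runs


def annual_cloud_dr_cost(pricing, stored_gb, servers, test_hours, disaster_hours=0.0):
    """Storage is billed all year; servers only while spun up for a test or disaster."""
    storage = pricing.storage_per_gb_month * stored_gb * 12
    compute = pricing.server_per_hour * servers * (test_hours + disaster_hours)
    return storage + compute


def annual_standby_cost(servers, cost_per_server_year):
    """Dedicated standby servers are paid for whether or not they are ever used."""
    return servers * cost_per_server_year


# Assumed inputs: 5 TB of replicated data, 20 recovery servers, 48 hours of testing.
pricing = CloudDrPricing(storage_per_gb_month=0.10, server_per_hour=0.50)
cloud = annual_cloud_dr_cost(pricing, stored_gb=5_000, servers=20, test_hours=48)
standby = annual_standby_cost(servers=20, cost_per_server_year=3_000)
print(f"cloud DR: ${cloud:,.0f}/year, dedicated standby: ${standby:,.0f}/year")
```

The point of the sketch is the shape of the two cost curves rather than the totals: the cloud figure grows with hours actually used, while the standby figure is fixed. A real comparison would also need to account for network transfer charges, licensing and staff time, which are left out here.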
In a 2012 report from Forrester, the firm says cloud-based DR threatens to shake legacy approaches and offer a viable alternative to organisations that previously couldn't afford to implement disaster recovery or found it to be a burdensome task.
Perhaps the biggest downside to the cloud from the standpoint of DR is concern about security and privacy management.
"You still see with some major events, such as the lightning strike in Dublin [in 2011] that took out the cloud services of Amazon and Microsoft, that there can be some temporary loss of service," says John Morency, research vice president at research firm Gartner Inc. in Stamford, Conn. "The cloud shouldn't be considered 100% foolproof. If organizations do need that 100% availability guaranteed they need to put some serious thought into what they need to develop for contingencies."
A growing number of larger companies with complex IT infrastructures are putting in private clouds and using these as part of their disaster recovery strategies, rather than relying on public cloud services, Morency says. "They worry about being left out in the cold during a disaster" if service providers are not able to provide service, he says.
Morency notes that this is only true in the case of DR subscription services that provide floor space and actual equipment at a specific geographical location. "Given the more distributed and virtual nature of public clouds, this is much less of an issue," he says.
What the cloud has done for traditional disaster recovery service providers is make testing of their backup capabilities more flexible and less costly, Morency says.
Virtualisation
For many organisations, server virtualisation has become a key component of the DR strategy, because it enables greater flexibility with computing resources.
"Virtualisation has the potential to speed up the implementation of a disaster recovery strategy and the actual recovery in case of a disaster," says Ariel Silverstone, an independent information security consultant and former CISO of Expedia, who blogs at www.ArielSilverstone.com.
"It also has the ability to make disaster recovery more of an IT function rather than a corporate audit-type function," Silverstone says. "If you have the right policies and processes in place, [with virtualisation] disaster recovery can become part of automatically deploying any server."
Virtualisation enables companies to create an image of an entire data center that can be quickly activated - in part or in whole - when needed, at a relatively low cost, Silverstone says.
For Teradyne Inc., a North Reading, Mass., supplier of test equipment for electronic systems, virtualisation has been an enabler for a much improved DR capability, says Chuck Ciali, CIO.
"We have leveraged virtualization for DR significantly," Ciali says. Using virtualization technology from VMware, Teradyne can seamlessly fail over to redundant blade servers in the case of hardware problems. It can also use the technology to move workloads from its commercial data center to its research and development data center in case of disasters.
"This has taken our recovery time from weeks [or] days under our former tape-based model to hours for critical workloads," and saves $300,000 per year in DR contract services, Ciali says.
Marist College has deployed virtualization, and one of the benefits is avoiding systems unavailability. "We do all we can to avoid any event that would cause users dissatisfaction, loss of access or loss of functionality," Thirsk says. "To do so, we utilise massive virtualisation of our processors, our network topology and our storage."
Because Marist IT can now provide a virtual server, virtual network and spin out storage, "our systems assurance activities move along at a very rapid rate," Thirsk says.
"If at any point of testing something goes horribly wrong, we can decide to trash it and start over or continue forward, all without much trouble at all on the system side."
On the whole, server virtualization has made DR a lot easier, Dines says. "Because virtual machines are much more portable than physical machines and they can be easily booted on disparate hardware, a lot of companies are using virtualization as a critical piece of their recovery efforts," she says.
There are lots of offerings in the market that can perform tasks such as automating rapid virtual machine rebooting, replicating virtual machines at the hypervisor layer with heterogeneous storage, and turning backups of physical or virtual machines into bootable virtual machines, Dines says.
"Ultimately, virtualisation means companies can get a faster RTO [recovery time objective] for less money," she says.
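An RTO is only useful if recovery tests are checked against it. As a minimal sketch, assuming a team records how long each workload took to recover in its latest DR test, the comparison could look like the following. The workload names, targets and measured times are all hypothetical.

```python
from datetime import timedelta

# Hypothetical RTO targets per workload; not figures from the article.
rto_targets = {
    "order-entry": timedelta(hours=4),
    "email": timedelta(hours=8),
    "reporting": timedelta(days=2),
}

# Recovery times measured in the most recent DR test (also placeholders).
measured = {
    "order-entry": timedelta(hours=3),
    "email": timedelta(hours=11),
}


def rto_report(targets, results):
    """Classify each workload as 'met', 'missed' or 'untested' against its RTO."""
    report = {}
    for name, target in targets.items():
        if name not in results:
            report[name] = "untested"
        elif results[name] <= target:
            report[name] = "met"
        else:
            report[name] = "missed"
    return report


for workload, status in rto_report(rto_targets, measured).items():
    print(f"{workload}: {status}")
```

Flagging untested workloads separately is a deliberate choice in this sketch: a workload that has never been recovered in a test has no evidence behind its RTO at all.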
On the downside, the popularity of virtualisation has led to virtual machine sprawl at many organizations, which can make DR more complex. "Companies have the [virtualization] structure in place that gives them the ability to create many more images, including some they do not even know about or plan for," Silverstone says. "And they can do so very quickly."
Another potential negative is that virtualisation might give organizations a false sense of security. "People may fail to plan properly for disaster recovery, assuming that everything will be handled by virtualization," Silverstone says. "There are certain machines that for various reasons are not likely to be virtualized, so using virtualization does not replace the need for proper disaster recovery planning and testing."
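Both downsides Silverstone describes come down to a gap between what is actually running and what the DR plan covers. One simple way to surface that gap, sketched here under the assumption that a list of machines can be exported from a hypervisor or asset inventory, is to compare the two sets. The machine names and the `coverage_gaps` helper are hypothetical.

```python
def coverage_gaps(discovered, planned):
    """Compare what is actually running with what the DR plan lists.

    Returns (unplanned, stale): machines that exist but are not in the plan,
    and plan entries that no longer match any running machine.
    """
    discovered, planned = set(discovered), set(planned)
    return sorted(discovered - planned), sorted(planned - discovered)


# Hypothetical names. In practice `discovered` would come from a hypervisor
# or asset-inventory export, and should include physical hosts too.
discovered = ["erp-db", "erp-app", "web-01", "test-clone-17", "legacy-fax"]
planned = ["erp-db", "erp-app", "web-01", "old-mail"]

unplanned, stale = coverage_gaps(discovered, planned)
print("running but not in the DR plan:", unplanned)
print("in the DR plan but not found:", stale)
```

A check like this does not decide whether an unplanned machine matters; it only makes sure someone is asked the question before a disaster rather than after.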
Mobile devices in the workforce
From a disaster recovery standpoint, the growing use of mobile devices such as smartphones and tablets helps IT operations and business processes continue even after a disaster strikes.
"People will carry their mobile devices with them," says George Muller, vice president, sales planning, supply chain & IT at Imperial Sugar Co. in Sugar Land, Texas, a processor and marketer of refined sugar.
"I might not carry my laptop wherever I go, but if all of a sudden we've got a disaster I've probably got my BlackBerry in my shirt pocket. Anything that facilitates connectivity in a ubiquitous way is a plus."
One of the positive impacts of the prevalence of mobile devices is that it gives people a greater ability to work remotely and communicate using their devices in an emergency, says Malcolm Harkins, vice president of the IT group and CISO at microprocessor manufacturer Intel Corp. in Santa Clara, Calif.
But mobile device proliferation has also made disaster recovery slightly more complex, Dines says. "Along with mobile devices comes more data center infrastructure, such as mobile device management and [products] such as the BlackBerry Enterprise Server, which are often very critical," she says. "This becomes one more system that must be planned for and properly protected."
Another possible negative with mobility in a disaster recovery scenario is that some critical enterprise applications, such as payroll, might not be available for mobile devices, Silverstone says.
Harkins notes that there are potential security risks, such as non-encrypted mobile devices being lost or stolen, and unauthorised access to corporate networks from these devices. But these risks can be overcome by the ability to wipe out data on devices remotely over the internet.
Social networking
Like mobile devices, social networking gives people another way to stay in contact during or after a disaster.
"We've seen instances such as a couple of years ago when we had major snow storms on the East Coast and a lot of businesses shut down and employees kept in touch with each other via Facebook and Twitter vs. email," Morency says.
In some cases it might take days or weeks for a corporate data center to recover after a disaster. If the company relies on internal email systems, that might put email service out of commission for the duration, Morency says.
"Assuming that either public or wireless networks are still available you can now be using social media to communicate, as an alternative to in-house email which may not be available," Morency says.
"If you're using a service like Gmail then it's less of an issue. But if you're using an Exchange-based internal email or directory services, then social media may be a more available alternative."
During a recent disaster test that Marist College performed, "we were curious to see how social networking would be used in case of an actual event," Thirsk says. Early one morning the IT department launched an unannounced disaster drill. "While we had warned staff we would be doing this, they had no idea how real we were going to make it," he says.
First, Thirsk sent a message that the college was experiencing a massive system failure. Due to building conditions, staffers could not report to their work place or to the data center. "We shut down our enterprise communications systems and then watched how the staff responded," Thirsk says.
Managers quickly began communicating to their staff via outside email accounts, chat rooms, Facebook and Twitter. "They even found my personal email account off campus and began messaging me," Thirsk says.
In a matter of 20 minutes, all staff had reported to a command center in the campus library, where they were tasked with performing a number of system checks, verifications and processes. "All of this activity occurred using alternate communications methods," Thirsk says. "We documented this exercise and now use it as part of our plan."
Forrester says there are several reasons why social networking should play a role in an emergency communications strategy. For one thing, social technology adoption is increasing, and a greater portion of employees and customers have continuous access to social sites such as Twitter and Facebook.
In addition, social channels are essentially free. It costs very little to set up a Facebook, Twitter or Yammer profile, recruit followers, and send out status updates.
Social media sites can also facilitate mass communication with external parties, the firm says. Typically, during a crisis immediate communication is limited to internal staff. However, companies should also plan for situations that call for communication with partners, customers, public officials and the public at large. Social media sites make it easy to establish these external connections.
Finally, the environment of social discussions provides mass mobilization and situational awareness. Social networking sites offer unique advantages in the crisis communications arena, Forrester says.
One downside of social networks for disaster recovery is that social networking "by its very nature has the ability to increase FUD - fear, uncertainty and doubt," Silverstone says. "So I would advise companies that they need to have a policy in place [on how to use social networks] long before a disaster happens, and think about many different possibilities and manage all access and data sharing on social networks like any other communication effort."