Backup and recovery: Solving the storage puzzle
An EMC consultant discusses the differences between backup and archiving
By Iain Anderson, CIO Canada | Published: 18:00, 21 March 2006
Backup and archival storage continue to be key issues for enterprises large and small, and the reasons for this are fairly simple. First, information growth continues to explode. In fact, industry experts estimate information to be growing at 60 per cent per year for most businesses. Second, regulatory and legal issues are having an impact on how information is kept and protected. Generally, companies have responded by either keeping data forever or deleting it as fast as possible, both of which are problematic. And third, most IT professionals are using outdated backup and archive practices, making their lives more difficult.
For instance, how many times does the same file get backed up in your infrastructure even though it hasn't changed? How many copies of that same file are clogging up your backups? Are your tier-one systems burdened with infrequently accessed fixed content (information that is no longer actively changing)? If so, you are not alone. IT professionals are wrestling with this issue as costs continue to escalate and recoverability requirements increase.
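One rough way to answer the "how many copies" question for a single file tree is to group files by a hash of their contents. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, it scans one local directory rather than a real backup catalogue, and it treats two files as the same only when their SHA-256 digests match.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> Dict[str, List[Path]]:
    """Group regular files under root by content; keep groups of 2 or more."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {d: paths for d, paths in groups.items() if len(paths) > 1}


def redundant_bytes(duplicates: Dict[str, List[Path]]) -> int:
    """Bytes a file-level backup would copy again for content it already holds."""
    return sum(
        paths[0].stat().st_size * (len(paths) - 1)
        for paths in duplicates.values()
    )
```

Running `redundant_bytes(find_duplicates(Path("/some/share")))` against a file share (the path here is a placeholder) gives a first estimate of how much capacity identical copies are consuming before any backup product is involved.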
Integrating and aligning your backup, recovery and archive environment offers tangible benefits and cost savings by optimizing information recovery and retrieval as well as production performance.
Backup at the forefront
Let's review two primary traps most companies fall into. The first is focusing on backup speed instead of recovery or information retrieval. The second is using backup images as archive copies.
Backup windows are daily problems everyone deals with and they generally receive the most attention. Businesses are often faced with the prospect of not completing a backup in time to start production, and wrestle with the tradeoffs of degraded production performance or ending the backup job. Neither option is attractive.
Organizations frequently expedite backups by using techniques such as multiplexing multiple servers to a tape system, or using differential or incremental backups. And, while it's not an easy thing to admit, IT professionals sometimes skip data or simply do not back up as frequently as they would like. These shortcuts often help increase backup speeds; however, recovery time should be the primary focus. The processes used to speed backups, like incremental backups and multiplexing, often worsen recovery time, and the ability to recover quickly is the reason for backing up in the first place.
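To see why incremental backups can lengthen recovery, it helps to count how many backup sets a restore has to read. The following is a minimal sketch, not a model of any particular backup product. It assumes the common definitions: a differential holds everything changed since the last full backup, and an incremental holds only what changed since the previous backup of any kind. The `Backup` class and `restore_chain` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    day: int
    kind: str  # "full", "differential" or "incremental"


def restore_chain(history: List[Backup]) -> List[Backup]:
    """Return the backups needed, in order, to restore the most recent state.

    history must be sorted by day and contain at least one full backup.
    """
    # Every restore starts from the most recent full backup.
    last_full = max(i for i, b in enumerate(history) if b.kind == "full")
    chain = [history[last_full]]
    for b in history[last_full + 1:]:
        if b.kind == "differential":
            # Covers all changes since the full backup, so it replaces any
            # differentials and incrementals collected so far.
            chain = [chain[0], b]
        elif b.kind == "incremental":
            # Covers only changes since the previous backup, so each one
            # must be replayed in sequence.
            chain.append(b)
    return chain
```

With a full backup on day 0 followed by six daily incrementals, a restore on day 6 reads seven backup sets; with six daily differentials instead, it reads two. The nightly incremental job is shorter, but that saving is paid back at recovery time.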
Consider the healthcare industry, where recovery speed is essential. "Nowadays, information management is an integral part of healthcare and as such, it has a huge impact on the delivery of that care," explains Roy Southby, Director of Technology Services for British Columbia's Interior Health Authority. "The ability to quickly recover is immense. It could even save lives."
Since many IT professionals have ready access to backup images, they rely upon these for long-term data retention and archive needs. Backup images were never meant for this purpose. A backup image does nothing to address redundancies in the data, treats all information the same even if it hasn't changed, gets very bloated over time through data growth and accumulation, and offers little long-term protection from media corruption.
Archiving vs. Backup
The first step in moving to a new solution is to understand the difference between backup and archive. Backups are copies of active production information that are used when a problem arises within a production environment and a recovery copy is needed to get the business back up and running. Since backups are focused on constantly changing business information, a newer and known good copy is always preferred to an older copy. Thus backups are generally short-term and often overwritten.
Archives do not focus on recovering an application or business, but allow for information retrieval, usually at the level of a file, e-mail or other individual piece of content. So, archives are not copies of production data but rather the primary version of a piece of data, often inactive or non-changing data. In fact, when data stops changing or being frequently used, it is often best to move it to an archive, where it lives outside the daily backup window and can still be accessed.
When we move fixed content into an online active archive and out of the production environment, we can garner numerous benefits. Tier-one production capacity requirements are reduced. Backups now run on smaller active data sets, so backup windows are shorter. Recoveries have less static content in them, so they are much faster and easier to manage. Information retrievals come from the searchable online archive, not out of a backup image. Application performance improves and is more consistent since there is less data in the system.
Since the backup image will now be smaller and more manageable over time, many companies use this time to move to new, affordable disk-based backup and recovery options. Disk-based backup and recovery delivers significant benefits relative to recovery time, manageability and reliability. When you couple disk-based recovery with active archiving, you experience a significant increase in your recovery and retrieval capability.
Archiving technology: There are many new technologies to help organizations integrate and align backup, recovery and archiving. For archiving needs, there are software tools available for a variety of applications. These software tools work with applications to find and move static data, based on policies you determine, from the primary system to a second tier of storage. This leaves the information online to the application, out of the daily backup window and on a lower-cost storage medium. Many customers are seeing significant value from archiving e-mail and file-serving data. Database archiving and enterprise content management solutions also can help move content out of the production environment for organizations that have pain points in those areas.
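The policy-driven "find and move" step can be illustrated with a small script. This is only a sketch under stated assumptions, not how any commercial archiving tool works: the policy is simply "not modified for a given number of days", the function names are hypothetical, the archive directory is assumed to sit outside the primary tree, and the script moves files outright rather than leaving a stub or link that keeps them visible to the application.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def select_static_files(
    root: Path, min_age_days: float, now: Optional[float] = None
) -> List[Path]:
    """Return files under root whose last modification is older than the policy age."""
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # seconds per day
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_mtime < cutoff
    )


def move_to_archive(files: List[Path], root: Path, archive_root: Path) -> List[Path]:
    """Move files to archive_root, preserving their path relative to root."""
    moved = []
    for src in files:
        dest = archive_root / src.relative_to(root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

Calling `select_static_files` on its own first, without moving anything, is a low-risk way to estimate how much of a primary file system a given age policy would take out of the daily backup window.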
Backup technology: One of the hottest topics in IT today is the movement toward affordable disk-based backup and recovery.
"Tape is the traditional medium for backup, recovery and archiving and is the low-cost solution," explains Warren Shiau, lead analyst for IT research at The Strategic Counsel in Toronto. "Something that could be delaying people from looking at disk-based solutions is that a lot of processes are already in place around tape. The staff knows the processes and knows the system, and tape is very easy and low-cost to scale. This in part is leading to somewhat of a delay in the adoption of disk-based systems, which really are, for backup and recovery purposes, a superior solution."
In fact, there are many backup-to-disk software options that can improve performance and maximize the benefits of disk over tape. The best solutions will have high performance and proven functionality for recovering from mixed disk/tape environments.
The introduction of lower-cost (ATA) disk drives has led to the creation of cost-effective, disk-based backup platforms of multiple flavours. Standard RAID systems with ATA drives can offer cost-effective solutions for backup and the flexibility to support other applications on the less-expensive drives. These systems work with advanced backup software that can utilize disk-based capacity for backup and recovery. Another popular type of backup-to-disk architecture is an appliance that emulates tape libraries. These appliances allow you to maintain the processes and software you have today and generally offer advanced embedded functionality for compression, replication and simplified management.
Making backup, recovery and archive work for you
To begin, organizations must first understand what information is within their environments and where the biggest challenges are.
"Looking at data and categorizing it can often help companies choose the technology that will work best," notes Shiau. "For instance, the requirements for active data (having to do a lot of write-overs, constantly backing up, a lot of recovery cycles) don't match up to tape. Tape has physical limitations: as it starts to wear out, that can corrupt data. It is also difficult to do recoveries compared to disk. For this type of data, disk makes the most sense."
For static data, however, Shiau says tape is still the best solution because of its cost advantage and because it's portable. Companies can physically move tape to a storage facility easily, and retrieve it as required.
Here are some questions to help get the process started:
- What applications are most problematic in terms of backup, recovery and archive?
- How much of your data in primary applications is static content that could be moved to an active archive?
- For which applications do you need to retain data for long periods of time? Are there specific regulations to meet?
- Which applications need the fastest recovery? How much downtime is allowable?
- How much do you spend on tape media? Could you spend some of that on upgrading to an integrated disk-based environment for backup, recovery and archive?
The advantages of deploying an integrated and aligned backup, recovery and archival solution are significant; however, it can be a complex undertaking. The best thing you can do is to start thinking about it today.
Iain Anderson is the Client Director for EMC Consulting in Canada, which provides consulting services on the strategic, architectural, operational and managerial best practices for sustainable information infrastructure. EMC Canada (www.emc2.ca) is headquartered in Toronto.