Get SMART about disk failures

Can you predict when drives will fail?

Disk drives are mechanical, and so inherently unreliable. ATA-class drives are less reliable than SCSI or Fibre Channel drives, being built for non-24x7 duty cycles. But the actual duty cycles they can withstand are unknown. All that you can know is that they will fail. As Stuart Gilkes, systems engineering director, Northern Europe, for Network Appliance, says, "Disk drives break. So deal with it."

What can you do to deal with it? Since a crashed drive cannot be repaired, 'dealing with it' means acting before the drive fails: detecting a failing drive and copying the data off it while it still works.

If a drive develops bad blocks, areas from which data cannot be read, then as long as the drive has RAID protection the data can be recovered. However, Gilkes explains that RAID reconstruction can take a fair amount of time, particularly with high-capacity drives. As we head towards 500GB ATA drives, a second drive failure while RAID reconstruction from an initial drive failure is still under way becomes more likely.
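A rough sketch can show why larger drives widen the window for a second failure. Everything numeric here is an assumption chosen for illustration, not a vendor figure: the sustained rebuild rate, the MTBF, the number of surviving drives, and the simple model of independent failures at a constant rate.

```python
import math


def rebuild_hours(capacity_gb: float, rebuild_mb_per_s: float) -> float:
    """Hours needed to rewrite a whole drive at a sustained rebuild rate."""
    return (capacity_gb * 1000.0) / rebuild_mb_per_s / 3600.0


def second_failure_probability(surviving_drives: int, hours: float,
                               mtbf_hours: float) -> float:
    """Chance that at least one surviving drive fails during the rebuild.

    Assumes independent failures at a constant rate (exponential model),
    which is a simplification of real drive behaviour.
    """
    return 1.0 - math.exp(-surviving_drives * hours / mtbf_hours)


# Assumed, illustrative figures only.
for capacity_gb in (100, 250, 500):
    hours = rebuild_hours(capacity_gb, rebuild_mb_per_s=20.0)
    risk = second_failure_probability(surviving_drives=7, hours=hours,
                                      mtbf_hours=500_000)
    print(f"{capacity_gb} GB: rebuild {hours:.1f} h, "
          f"second-failure risk {risk:.4%}")
```

Under this model the risk grows roughly in proportion to drive capacity, because the rebuild window grows with the amount of data to be rewritten.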

Network Appliance's double-parity RAID scheme can cope with a second drive failing during reconstruction, whereas RAID 5 cannot.

We are heading towards a situation where bigger drives mean more data loss from drive failures. The good news is that drive failure can be predicted.

Predicting drive failure
Disk manufacturers have added S.M.A.R.T. features to their drives. The acronym stands for Self-Monitoring, Analysis and Reporting Technology. It lets drives monitor aspects of their own status and report on them. It has become an industry standard. Whether it is used in your direct-attached storage or drive arrays is another matter.

While the drive is spinning, diagnostic checks are made. Anything that is non-standard is noted and, if it persists, the diagnostic system is triggered into sending an alert. Things that are monitored include disk spin speed, sector-level faults, a need for recalibration, spin-up time, head-to-disk distance, the temperature of the drive, and various aspects of the drive motor, the media and the servo mechanisms.

The occurrence of errors can be noted and compared to standard performance parameters encoded in the diagnostic system. Suppose the drive begins to take longer to reach spin speed, and needs more retries to attain full rpm. That can indicate that the drive's bearings or motor are likely to fail.
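The idea of noting an out-of-range reading and alerting only if it persists can be sketched as follows. This is a minimal illustration, not how any drive firmware implements it; the class name, the limit, the persistence count and the spin-up readings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PersistentThresholdMonitor:
    """Alert only when readings stay beyond a limit for several checks."""
    limit: float        # hypothetical "standard performance" limit
    persistence: int    # consecutive out-of-range readings needed to alert
    _streak: int = 0    # current run of out-of-range readings

    def observe(self, reading: float) -> bool:
        """Record one reading; return True when an alert should be sent."""
        if reading > self.limit:
            self._streak += 1
        else:
            self._streak = 0  # a normal reading clears the count
        return self._streak >= self.persistence


# Hypothetical spin-up times in seconds; the limit and readings are invented.
monitor = PersistentThresholdMonitor(limit=8.0, persistence=3)
spin_up_times = [6.1, 8.4, 6.3, 8.6, 8.9, 9.3]
alerts = [monitor.observe(t) for t in spin_up_times]
print(alerts)
```

A single slow spin-up is ignored here; only the third consecutive slow reading triggers an alert, which keeps one-off anomalies from raising false alarms.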

Another example would be an increased need for error correction on read data. This could indicate platter surface contamination (restricted to a few disk blocks) or a read head problem (applies to all blocks on a platter).
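The distinction between a localised media problem and a head problem can be sketched as a simple heuristic on where corrected reads occur. The function name, the 1 percent cut-off and the block counts are assumptions made for illustration; real diagnostics use vendor-specific logic.

```python
def classify_read_errors(corrected_blocks, blocks_per_surface,
                         localized_fraction=0.01):
    """Guess a likely cause from where corrected read errors occur.

    corrected_blocks: block numbers on one platter surface that needed
    error correction on read.
    Returns "none", "surface contamination" or "read head".
    """
    if not corrected_blocks:
        return "none"
    distinct = len(set(corrected_blocks))
    # Only a few distinct blocks affected: the problem sits on the media.
    if distinct / blocks_per_surface <= localized_fraction:
        return "surface contamination"
    # Errors spread across the whole surface: suspect the head.
    return "read head"


# Hypothetical inputs: a handful of repeat offenders versus widespread errors.
print(classify_read_errors([10, 10, 11, 12], blocks_per_surface=10_000))
print(classify_read_errors(list(range(0, 10_000, 10)), blocks_per_surface=10_000))
```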

The S.M.A.R.T. system can only detect about 70 percent of likely failures. That might not seem that smart, but it is far better than detecting none.

Smart disk vendors
- Hitachi GST has a Drive Fitness Test which uses S.M.A.R.T. diagnostics built into Hitachi drives.
- Maxtor drives have S.M.A.R.T. technology.
- Seagate's SeaTools is its diagnostic suite for Seagate drives. It comes in desktop and enterprise versions and there is even a SeaTools Online version.
- Western Digital has its Data Lifeguard which is S.M.A.R.T.-enabled.

What about drive array vendors? All good arrays will come with diagnostic monitoring that uses S.M.A.R.T. facilities. Some examples:

- EMC has its CLARalert suite for Clariion arrays,
- Dell's Fibre Channel PowerVault 660F is S.M.A.R.T.-enabled,
- HP's Smart Array 6400 Controller has S.M.A.R.T. technology features,
- LSI Logic's Global Array Manager for RAID arrays has it too.

Often the technology is several layers down inside an overall diagnostic suite.

What can you do if you have a JBOD or small server with internal drives?
SANtools provides an application, claiming it's the only S.M.A.R.T. disk monitoring software that supports SCSI, Fibre Channel, IDE, and SSA peripherals on UNIX and Windows platforms. Techworld has reviewed it.

Shareware S.M.A.R.T. application software is available (search Google for Ariolic) but is probably not of interest to enterprise users.

With the increasing use of SATA and ATA drives in nearline or secondary storage applications, the use of S.M.A.R.T. monitoring to help prevent disk crashes and subsequent data loss becomes more important.

