Why disk statistics can fail to satisfy

Vendor hard drive failure rates: Myth or metric?

The statistics of mean time between failures (MTBF) and annualised failure rate (AFR) have received a lot of attention lately in the storage world, especially with the release of three much-discussed studies devoted to the topic in the last year. And for good reason: Vendor-stated MTBFs have risen into the 1 million-to-1.5 million-hour range, equal to roughly 114 to 171 years of continuous operation, a life span that no one is seeing in the real world.
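The hours-to-years conversion is plain division by the number of hours in a year, and the same figure can be turned around into a nominal yearly failure rate. The sketch below assumes drives run around the clock and uses the simple approximation AFR ≈ hours per year / MTBF; the function names are illustrative and do not come from any vendor's method.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored for simplicity


def mtbf_hours_to_years(mtbf_hours: float) -> float:
    """Express an MTBF given in hours as years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def nominal_afr(mtbf_hours: float) -> float:
    """Approximate annualised failure rate for a drive powered on 24x7.

    Assumption: AFR ~= hours per year / MTBF, which is reasonable only
    when the MTBF is much larger than one year.
    """
    return HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    for mtbf in (300_000, 1_000_000, 1_500_000):
        years = mtbf_hours_to_years(mtbf)
        afr = nominal_afr(mtbf)
        print(f"{mtbf:>9,} h MTBF = {years:6.1f} years, nominal AFR = {afr:.2%}")
```

Read this way, a 1 million-hour MTBF is a claim that fewer than 1 percent of drives fail per year, which is the figure the studies and NetApp's comments call into question.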

The three studies on MTBF published over the past year are:

-- Google's "Failure Trends in a Large Disk Drive Population"
-- Carnegie Mellon University's "Disk Failures in the Real World"
-- University of Illinois' "Are Disks the Dominant Contributor for Storage Failures?"

"MTBF is a term that's in growing disrepute inside the industry because people don't understand what the numbers mean," says Robin Harris, an analyst at Data Mobility Group who also runs the StorageMojo blog. "Your average consumer and a lot of server administrators don't really get why vendors say a disk has a 1 million-hour MTBF, and yet it doesn't last that long."

Indeed, "how do these numbers help a person who wants to evaluate drives?" says Steve Smith, a former EMC employee and an independent management consultant. "I don't think they can."

Even storage system maker NetApp acknowledges in a response to an open letter on the StorageMojo blog that failure rates are several times higher than reported. "Most experienced storage array customers have learned to equate the accuracy of quoted drive-failure specs to the miles-per-gallon estimates reported by car manufacturers," the company says. "It's a classic case of 'Your mileage may vary' - and often will - if you deploy these disks in anything but the mildest of evaluation/demo lab environments."

Study results

The upshot of the recent studies can be summarised this way: Users and vendors live in very different worlds when it comes to disk reliability and failure rates.

Consider that MTBF is a figure that's reached through stress-testing and statistical extrapolation, Harris says. "When the vendor specs a 300,000-hour MTBF - which is common for consumer-level SATA drives - they're saying that for a large population of drives, half will fail in the first 300,000 hours of operation," he says on his blog. "MTBF, therefore, says nothing about how long any particular drive will last." In other words, MTBF does a very poor job of communicating what the actual failure profile looks like, he says.
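Harris's point that MTBF describes a population rather than any one drive can be made concrete with a small sketch. It assumes a constant failure rate (an exponential lifetime model), which is a common textbook simplification and not something the article or the vendors state; the function names and the 1,000-drive fleet are illustrative only. Under that assumption the share of drives failed by the MTBF mark works out to about 63 percent rather than exactly half; the precise figure depends on the lifetime distribution assumed.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored


def fraction_failed(hours: float, mtbf_hours: float) -> float:
    """Share of a large population expected to have failed after `hours`.

    Assumption: constant failure rate, i.e. exponentially distributed
    lifetimes with mean `mtbf_hours`.
    """
    return 1.0 - math.exp(-hours / mtbf_hours)


def expected_failures_per_year(drive_count: int, mtbf_hours: float) -> float:
    """Expected failures per year in a fleet kept at `drive_count` drives.

    Assumption: drives run 24x7 and each failed drive is replaced at once.
    """
    return drive_count * HOURS_PER_YEAR / mtbf_hours


if __name__ == "__main__":
    mtbf = 300_000  # consumer-level SATA figure quoted in the article
    print(f"Failed within 5 years: {fraction_failed(5 * HOURS_PER_YEAR, mtbf):.1%}")
    print(f"Failed by the MTBF mark: {fraction_failed(mtbf, mtbf):.1%}")
    print(f"Failures per year in 1,000 drives: "
          f"{expected_failures_per_year(1000, mtbf):.1f}")
```

The fleet figure is the more practical reading of the spec: it estimates how many replacements to expect in a year, not how long a given drive will last.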

It's like providing the average woman's height in the US but without showing the numbers used to derive that average, Smith says. "MTBF became the standard because it was perceived as a simpler answer to the question of reliability than showing the data of how they arrived at it," Smith says. "It's an honest-to-God simplification."

Stan Zaffos, an analyst at Gartner, agrees. While he believes MTBF is an accurate representation of what the vendors are experiencing with the technology they're shipping, it's also difficult to translate into something meaningful to end users. "It's a very complex and tortuous route to undertake, requiring a lot of solid engineering experience and an understanding of probability and statistics," he says.

