The best way to maximise server uptime
The elusive goal of keeping things going
By John Edwards | Computerworld US | Published: 09:20, 11 November 2010
Look at hardware quality
Acquiring quality servers rather than cut-rate boxes or blades is an obvious way to enhance long-term server reliability. "There's a decided difference in the longevity of hardware as you move to mid-grade or high-grade servers," says Jeffrey Driscoll, director of operations at E-N Computers, an IT services provider.
Yet in the real world, budget-strapped managers often face a painful choice between meeting their server needs with low-cost products and acquiring better, more reliable systems that meet established performance criteria. What to do?
Driscoll advises shopping intelligently, looking for bargains and, whenever possible, working with management to get a budget that reflects real-world operational needs. It's also not a bad idea to show management the financial damage that can be caused by unreliable servers. "It's a point that can be easily proved with simple figures and projections," Driscoll says.
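As a rough illustration of the kind of "simple figures and projections" Driscoll has in mind, the short Python sketch below estimates yearly downtime cost from an availability figure and works out how long a more reliable server would take to pay for itself. The function names and every number passed in are hypothetical placeholders, not figures from Driscoll or E-N Computers; substitute your own measured availability and cost per hour of outage.

```python
# Illustrative downtime-cost projection. All inputs are assumptions
# supplied by the reader, not data from the article.

HOURS_PER_YEAR = 24 * 365  # ignores leap years for simplicity


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for an availability between 0 and 1."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


def annual_downtime_cost(availability: float, cost_per_hour: float) -> float:
    """Projected yearly cost of outages at a given cost per hour of downtime."""
    return annual_downtime_hours(availability) * cost_per_hour


def payback_years(extra_hardware_cost: float,
                  cheap_availability: float,
                  better_availability: float,
                  cost_per_hour: float) -> float:
    """Years until the extra spend on better hardware is recovered.

    Returns infinity if the better hardware does not reduce downtime cost.
    """
    savings = (annual_downtime_cost(cheap_availability, cost_per_hour)
               - annual_downtime_cost(better_availability, cost_per_hour))
    if savings <= 0:
        return float("inf")
    return extra_hardware_cost / savings


if __name__ == "__main__":
    # Hypothetical inputs: 99% vs 99.9% availability, 1,000 per hour of outage,
    # and 5,000 extra for the higher-grade server.
    print(f"Cut-rate server downtime: {annual_downtime_hours(0.99):.1f} h/year")
    print(f"Better server downtime:   {annual_downtime_hours(0.999):.1f} h/year")
    print(f"Payback period: {payback_years(5000, 0.99, 0.999, 1000):.2f} years")
```

A projection like this is only as good as its inputs, so it is worth basing the availability figures on your own outage logs rather than vendor claims.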
Know when it's time to cut your losses
Simple common sense may be the best way of ensuring maximum server uptime without breaking the budget. "Hardware is hardware. At some point, something will break," Gabiam says. "It's important to learn from whatever happened and to be ready with a plan if it ever happens again."
Using common sense also means knowing when it's time to cut your losses and move on to something new, regardless of your replacement cycle's current stage. "If your IT staff is spending 25% of its time fighting fires and supporting out-of-date systems, who wouldn't see that as a huge waste of time?" Beddoe asks.
While maximising server uptime creates some extra work, most managers feel that the final rewards far outweigh the added exertion. "It's hard to say that any effort is wasted when it applies to uptime," Luludis says. "Anything you do can help."
Beddoe feels that striving for the most uptime almost guarantees the creation of a more reliable data centre. He contends that an "active environment", one that continually encourages staff members to identify and squelch potential problems before they can cause any damage, is key to maximising uptime. "In 17 years, we have not had a major outage that has impacted our clients."