The best way to maximise server uptime
The elusive goal of keeping things going
By John Edwards | Computerworld US | Published: 09:20, 11 November 2010
Don't let hackers steal your uptime
Security also plays an important role in ensuring server uptime. Not surprisingly, servers that are compromised by malware or exposed through unsecured network paths are more likely to go down than their well-protected counterparts. "You start off with physical security, your data centre building, and making sure that it's physically secure," Beddoe says.
Next, it's important to have server access rules that are known and enforced, along with secure shelves, antivirus programs, firewalls and disciplined administrators, he says. "They all play an equally important role in server security and promoting uptime."
John Luludis, who supervises server operations for Superior Technology Solutions, an IT consulting firm and custom software developer, says that to truly maximise server uptime, it's important to move beyond basic security practices. Luludis is a strong believer in regular independent security audits. "I have my network go through penetration tests on a regular basis, and I do that because as much as I may think that my network is secure, it's also important to have an outside point of view," he says.
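As a rough illustration of one small slice of what such an outside probe automates, the Python sketch below checks which TCP ports on a host answer from the network. The host address and port list are hypothetical examples; a real penetration test goes far beyond this kind of reachability check.

    # Minimal sketch: probe a host you own to see which TCP ports accept
    # connections. The address below is a documentation-only example.
    import socket

    def open_ports(host: str, ports: list[int], timeout: float = 1.0) -> list[int]:
        """Return the subset of `ports` that accept a TCP connection."""
        found = []
        for port in ports:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                if sock.connect_ex((host, port)) == 0:  # 0 means connected
                    found.append(port)
        return found

    if __name__ == "__main__":
        # Scan a handful of common service ports.
        print(open_ports("203.0.113.10", [22, 80, 443, 3389]))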
Protect your data
While Princeton Radiology's Howard is also a strong believer in regular server maintenance, he notes that some amount of failure is inevitable despite the best efforts of both managers and employees. To guard against any data losses caused by server failure, Howard recommends developing a data protection plan that's tied into the enterprise's comprehensive business continuity strategy.
Princeton uses an off-site storage solution from Compellent Technologies to replicate all of its stored data. "Even though it's a disaster recovery data centre, we actually run some servers primarily from that site, so we replicate in both directions," Howard says.
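To make "replicate in both directions" concrete, here is a toy Python sketch that keeps two directory trees in sync with a crude newest-file-wins rule. The paths are hypothetical, and SAN-level replication such as Compellent's works at the block layer and handles conflicts, deletions and consistency in ways this sketch deliberately ignores.

    # Toy two-way sync: whichever site has the newer copy of a file wins.
    import os
    import shutil

    def sync_newest(site_a: str, site_b: str) -> None:
        """Copy each file to the other site if the local copy is newer."""
        for src_root, dst_root in ((site_a, site_b), (site_b, site_a)):
            for dirpath, _dirnames, filenames in os.walk(src_root):
                rel = os.path.relpath(dirpath, src_root)
                for name in filenames:
                    src = os.path.join(dirpath, name)
                    dst = os.path.join(dst_root, rel, name)
                    if (not os.path.exists(dst)
                            or os.path.getmtime(src) > os.path.getmtime(dst)):
                        os.makedirs(os.path.dirname(dst), exist_ok=True)
                        shutil.copy2(src, dst)  # copy2 preserves mtimes

    sync_newest("/mnt/site_a", "/mnt/site_b")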
Gabiam, meanwhile, relies on the load-balancing technology built into his network infrastructure to protect against sudden server failure. "If one server crashes or one application becomes unresponsive, that traffic is redirected to other, similar servers that can handle the load," he says.
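The Python sketch below shows the shape of that failover behaviour, assuming a round-robin balancer with active health checks. The backend addresses and /health endpoint are hypothetical; in Gabiam's setup this logic lives in the network infrastructure itself rather than in application code.

    # Minimal sketch: rotate across backends, skipping any that fail a probe.
    import itertools
    import urllib.request

    BACKENDS = ["http://10.0.0.11:8080", "http://10.0.0.12:8080"]
    _rotation = itertools.cycle(BACKENDS)

    def healthy(base_url: str, timeout: float = 1.0) -> bool:
        """Probe a backend's health endpoint; any error marks it down."""
        try:
            with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
                return resp.status == 200
        except OSError:
            return False

    def pick_backend() -> str:
        """Round-robin over backends, redirecting traffic away from dead ones."""
        for _ in range(len(BACKENDS)):
            candidate = next(_rotation)
            if healthy(candidate):
                return candidate
        raise RuntimeError("no healthy backend available")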
Unlike Princeton's Howard, Gabiam is a fan of clustering and uses Novell Cluster Services to provide an additional layer of redundancy. If one of the cluster nodes fails, or needs downtime for maintenance, the clustered application or service component running on that node can run seamlessly on another node in the cluster, he explains.
The migration process can be configured for manual or automatic failover. "Usually, you would want the application to automatically fail over to the next preferred node in the event of a hardware or software failure," Gabiam says, but administrators can also initiate a migration to another node when they need to perform maintenance on a specific one.
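The following Python sketch illustrates the automatic case under heavy assumptions: a toy two-node "cluster" where a monitor probes the active node and promotes the standby after repeated failures. Node addresses, probe port and retry counts are all hypothetical; Novell Cluster Services handles this internally, along with fencing, quorum and resource migration that this sketch omits.

    # Minimal sketch: promote the standby node after several failed probes.
    import socket
    import time

    NODES = {"node-a": ("10.0.0.21", 8080), "node-b": ("10.0.0.22", 8080)}
    FAILURES_BEFORE_FAILOVER = 3  # tolerate brief blips before migrating

    def node_alive(addr: tuple[str, int], timeout: float = 1.0) -> bool:
        """A node counts as alive if its service port accepts connections."""
        try:
            with socket.create_connection(addr, timeout=timeout):
                return True
        except OSError:
            return False

    def monitor(active: str, standby: str) -> None:
        strikes = 0
        while True:
            if node_alive(NODES[active]):
                strikes = 0
            else:
                strikes += 1
                if strikes >= FAILURES_BEFORE_FAILOVER:
                    print(f"failing over from {active} to {standby}")
                    active, standby = standby, active
                    strikes = 0
            time.sleep(5)  # probe interval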