Avoid VM stall, the scourge of virtualisation projects
Why do so many enterprises get stuck at phase one?
By Kevin Fogarty | CIO US | Published: 14:00, 12 May 2011
Virtualising and consolidating data centre servers provides such a clear financial benefit that there are few companies of any size, in any industry, that shouldn't virtualise at least some of their servers and applications, industry analysts say.
But companies that start virtualisation projects looking for cost savings, without planning for a second phase of migration that requires spending more on new tools than the project saves in short-term costs, will get stuck in phase one. They save money on hardware, but get only a fraction of the benefit of the virtualisation products they've bought, analysts add.
The cost benefit of getting as many as 10 or 20 virtual servers for the price of one physical box drove many companies to migrations that covered as much as 25 percent to 35 percent of all the servers targeted for conversion, before hitting "VM stall," a virtual halt in migrations caused by the more subtle cost and organisational issues that affect virtualisation projects directly, according to James Staten, principal analyst at Forrester Research.
"Companies can get close to the 50 percent point [in a P2V migration] still using the same thinking they did in the physical world," Staten says. "Obvious costs like licences, how many machines you can take out of an environment, how many VMs you can put on a host all make one cost picture. Beyond that you get into issues about performance and capacity management, and the amount of effort needed for support. A lot of companies don't take those fully into account."
Planning to virtualise every workload on every server without modifying the way IT plans capacity requirements, or the way it allocates computing resources and IT staff support time, leaves IT departments with a lot of duplicated processes and a steadily dropping return on investment as a P2V migration expands, says Chris Wolf, research VP at Gartner.
"Trying to replicate the same structures you had been using, with virtual servers, gets into a cycle of diminishing returns pretty quickly," Wolf says. Keeping virtualisation projects on track requires changes in both organisation and technology, and keeping the two coordinated according to the particular stage of migration, Staten says. Here's some advice for avoiding stalls during four key phases of a virtualisation project.
Phase 1: Technical efficiency and consolidation
The first, ecstatic wave of virtualisation saves far more money, far more quickly than at any other time during the migration to or operation of a virtual infrastructure, according to Gary Chen, research analyst at IDC.
The cost benefit of eliminating 10 physical servers and replacing them with one larger, more automated box often gives both IT and business unit managers a false sense of success and unrealistic expectations for the future, he says.
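The arithmetic behind that first-wave euphoria is simple enough to sketch. The figures below are hypothetical, not drawn from any of the studies cited here; they just show why a 10:1 consolidation looks so good on a phase-one budget.

```python
# Illustrative only: hypothetical costs for a 10:1 consolidation,
# showing why the first phase of a P2V migration looks so good on paper.
def consolidation_savings(old_servers, host_cost, server_cost,
                          power_per_server, power_per_host):
    """Annualised hardware-plus-power delta for replacing old_servers
    physical boxes with a single virtualisation host."""
    before = old_servers * (server_cost + power_per_server)
    after = host_cost + power_per_host
    return before - after

# Hypothetical figures: 10 ageing servers at $3,000 each plus $500/yr
# in power, versus one $12,000 host drawing $1,200/yr.
saving = consolidation_savings(10, 12_000, 3_000, 500, 1_200)
print(saving)  # 21800
```

Note what the model leaves out: licences, management tools, training and staff time, which is exactly the omission the analysts say stalls phase two.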
Many IT groups stick with the same set of cost metrics to estimate success, which usually means focusing only on how densely virtual machines can be packed into physical hosts, not investing in management tools or training that give IT managers a better idea of how to allocate virtualised resources in new ways, Chen says.
"People have to move their thinking away from something a lot of them are proud of, their physical server-to-virtual server ratio, or how many machines they can take out of an environment," Staten says. "That's an interesting thing to brag about, but completely irrelevant. The real need is to shift to the point that you can deliver greater efficiency: higher sustained utilisation and peak utilisation of their whole pool of computing resources."
Phase 2: Picking targets, simplifying administration
The next phase of a migration, and its cost justification, requires more specific knowledge of what individual VMs are doing, for what business unit, and what resources they require, Staten says.
That requires more than high VM density to keep the ROI positive. It requires changes in IT administration and support to improve processes like change management, provisioning and incident management that don't work effectively within older organisational silos, Staten says. Without the ability to build resource inventory lists that are more detailed than just the number of physical servers available, IT managers can't intelligently distribute particular VMs or workloads across the available servers, let alone to other data centers in companies that have very far reaching virtualised infrastructures, Wolf says.
"You start to look at all the resources, CPU, memory, storage, as a pool you can allocate," he says. "You can't do that without visibility into all the resources or within existing management silos."
Getting to the point of even automating the provisioning of VMs and putting limits on their resource use, mobility and lifespan requires new management tools that are often limited in scope to just one vendor's software, Staten says.
Getting beyond the first big opportunity for VM stall, the reorganisation of administration and the allocation of computing resources, means giving sysadmins responsibility for a set of VMs according to the business unit that uses them, the pertinent applications or other factors, not the physical location of the servers.
Failure to allocate human resources efficiently causes duplicated effort, extra work and gaps in responsibility, all adding up to a huge waste of resources when VMs float around without anyone clearly responsible for them.
"Sprawl is the typical problem there for companies that are not doing lifecycle management or automating any of the procedures involved in systems administration or support," Staten says. "That's where the change in thinking needs to happen or progress typically starts to slow down." At the most basic level, it's necessary to know what all those virtual machines are doing, or whether they're doing anything at all.
Data centre administrators consistently report that about 15 percent of the servers they maintain aren't doing anything useful. That is, they're running and being properly maintained, but are not being used by any end users or applications in an average month, according to Sumir Karayi, CEO of 1E software, an asset management vendor that sponsors regular studies of resource utilisation efficiency.
"IT is typically asked only for uptime. Their job is to keep things running, not ask why it's running, so they look at utilisation of resources, not workloads," Karayi says. "If a server is being backed up, patched and rebooted to install patches, it can look quite busy just doing housekeeping work even when no one is using it. With virtual servers it's even easier, because there's not as much of a perceived cost to running them without doing any real work."
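The distinction Karayi draws, resource utilisation versus actual workload, can be made concrete. The sketch below uses a hypothetical data model (the field names are assumptions, not any vendor's API): a VM is flagged idle only when it served no users and no application requests in the month, regardless of how busy backups and patching make its CPU counters look.

```python
# Minimal sketch of idle-VM detection by workload rather than CPU.
# The data model is hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class VMMonth:
    name: str
    avg_cpu_pct: float   # includes housekeeping: backups, patching, reboots
    user_sessions: int   # real user logins during the month
    app_requests: int    # requests served by hosted applications

def is_idle(vm: VMMonth) -> bool:
    """Flag VMs with no user or application activity, however busy
    housekeeping work makes their resource metrics look."""
    return vm.user_sessions == 0 and vm.app_requests == 0

fleet = [
    VMMonth("build-07", avg_cpu_pct=22.0, user_sessions=0, app_requests=0),
    VMMonth("crm-db", avg_cpu_pct=35.0, user_sessions=140, app_requests=90_000),
]
print([vm.name for vm in fleet if is_idle(vm)])  # ['build-07']
```

The build server looks moderately busy on CPU, yet by workload it is one of the roughly 15 percent of machines doing nothing useful.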
Phase 3: Process automation
Restricting sprawl alone isn't enough to keep most companies moving ahead with ambitious migrations, Wolf says.
The real advantage of virtual infrastructures is flexibility, and to ensure it, IT has to be able to use VM mobility, detailed resource management, automated provisioning and change management, or the whole infrastructure won't keep working efficiently, he says.
The measure here should no longer be how high the utilisation of a single server or group of VMs is in running one application, but how consistently high the utilisation of the whole data centre has become, Staten says.
That requires almost real-time awareness and management of data centre resources, which means being able to instrument, monitor and allocate resources to optimise performance for each workload, each virtual server, each data centre location and each physical server, according to Gartner's 2009 virtualisation best practices analysis.
Performance optimisation is just one part of the equation, though, Chen says. Costs rise dramatically with overuse or unsupervised use of licences, not just the wasteful launch of VMs, he says.
Many companies are re-negotiating their enterprise licence agreements specifically for that reason, Staten says. It's too easy for end users to launch a server or application instance that takes up licences for the OS, application and database, use it for a day, then leave it running and launch another the next morning.
Unfortunately, according to a 2010 IDC study, 25 percent of IT organisations manage servers and storage manually, only 30 percent consider data centre operational costs to be a priority, and only 25 percent are consistently concerned about software licence costs. Fewer than a third of IT managers worldwide, 31 percent, consider integrating server, storage and network management for virtualised infrastructures a prime concern.
Many organisations recognise the potential benefits of granular, consolidated management of virtual infrastructures, but haven't been able to accomplish it themselves, partly because of the limited availability of tools that can handle the requirement, partly because their organisations haven't advanced their thinking enough to be confident in their ability to accomplish it, according to Galen Schreck, VP and principal analyst at Forrester.
Without granular resource management and high level policy based management, however, most organisations are either going to stall below the 50 percent mark in virtual migrations, or waste far more money and effort than they should getting slowly beyond that barrier, Schreck says.
Phase 4: Cost efficiency, chargeback
Despite the lack of granular resource allocation, sophisticated policy based management or even a long range capacity management plan at most companies, 36 of every 100 dollars spent on physical servers in 2014 will go to hardware intended to host virtual servers, according to a December study from IDC. The 2.2 million physical servers that figure represents will actually run 18.4 million virtual servers, at an average of 8.5 VMs per host by 2014, the study predicted.
The number of those servers will drive changes in the way IT does its job, but the $19 billion cost will change the way it reports its spending to the rest of the business, and how it justifies the work it does for business units, Chen says.
"If people were looking primarily at cost, [Microsoft's free] Hyper-V would be selling a lot more," Chen says. "There are a lot of costs, but translating it [so business unit managers can understand] is complicated."
An extra virtual machine looks free because it requires no capital costs to launch, Staten says. Licensing costs, resource use, administration, storage and all the other costs still exist, but are usually not translated clearly in budget analyses to the business side, he says. That failure to understand real costs, as much as any other single factor, can cause even a technically successful virtualisation project to stall.
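Staten's point about the "free" VM can be illustrated with a hedged cost sketch. All of the figures below are hypothetical annual costs, invented for the example, not study data; the point is only that the recurring items add up even when the capital cost is zero.

```python
# Hedged sketch of the hidden recurring cost of "one more VM".
# Every figure is a hypothetical annual cost, for illustration only.
def annual_vm_cost(os_licence, app_licence, storage_gb, gb_cost,
                   admin_hours, hourly_rate, host_share):
    """Sum the recurring costs that survive even when the capital
    cost of launching an extra VM is zero."""
    return (os_licence + app_licence
            + storage_gb * gb_cost
            + admin_hours * hourly_rate
            + host_share)

cost = annual_vm_cost(os_licence=800, app_licence=1_200,
                      storage_gb=100, gb_cost=2,
                      admin_hours=10, hourly_rate=60,
                      host_share=400)
print(cost)  # 3200
```

A chargeback model that bills business units something like this figure per VM, rather than treating launches as free, is what closes the gap Staten describes between technical success and understood cost.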