ILM causes virtualisation difficulties
Multiple tiers means multiple virtual pools
By Chris Mellor | Published: 10:00, 14 October 2005
Disk virtualisation aims to combine the space of many drives into a single logical pool. Then servers are allocated space from the pool instead of being allocated individual drives or space on or across them. The idea is to drive up utilisation rates and ease drive replacement.
Servers connect to logical drives rather than physical ones, and drives can be added to arrays, or moved, or replaced, without upsetting servers' logical connection to their data.
Disk utilisation can rise because servers or applications can be given the right amount of storage without pre-allocating large amounts of space that sit unused. Logically they have all the space they need, but physical disk space is only used up as needed. 3PAR makes great play of the ability of such 'thin provisioning' to drive up disk utilisation and defer drive purchases.
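The thin provisioning idea can be shown with a small sketch. This is a toy model, not any vendor's implementation: the ThinPool class, its method names and the capacities are all hypothetical, and it only models new data being written.

```python
class ThinPool:
    """Toy model of a thin-provisioned pool (hypothetical, for illustration)."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0
        self.volumes = {}  # volume name -> [logical size in GB, GB written]

    def create_volume(self, name, logical_gb):
        # Creating a volume reserves no physical space at all.
        self.volumes[name] = [logical_gb, 0]

    def write(self, name, gb):
        logical_gb, written_gb = self.volumes[name]
        if written_gb + gb > logical_gb:
            raise ValueError("write exceeds the volume's logical size")
        if self.used_gb + gb > self.physical_gb:
            raise RuntimeError("pool is out of physical space; add drives")
        # Physical space is consumed only when data is actually written.
        self.volumes[name][1] = written_gb + gb
        self.used_gb += gb

    def utilisation(self):
        return self.used_gb / self.physical_gb


pool = ThinPool(physical_gb=1000)
pool.create_volume("db", logical_gb=800)
pool.create_volume("mail", logical_gb=800)  # 1,600GB promised on 1,000GB of disk
pool.write("db", 200)
pool.write("mail", 100)
print(f"physical space used: {pool.used_gb}GB of {pool.physical_gb}GB")
```

The servers see 1,600GB of logical space between them, while the pool only has to supply disk for what has been written. More drives need buying only when the writes approach the physical limit.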
EMC and HDS stress the increased ability customers have to migrate data from array to array and to refresh drives using their virtualisation products.
Drive array virtualisation is happening now. It is not a coming technology anymore. IBM has 1,400 customers for its SVC. DataCore has users like Moorfields Eye Hospital for its virtualisation product.
Information Lifecycle Management - ILM - is a coming technology. StorageTek, now part of Sun's Data Management Group (DMG), pioneered the term and EMC is focussing on it strongly. A fundamental part of ILM is having multiple tiers of storage and apportioning data to the right tier for it.
There could be fast Fibre Channel or SCSI disk with relatively low capacity drives but fast access. This is used for storing transaction data. There could be slower serial ATA (SATA) drives holding much more data at a lower cost/GB. This tier of storage could be used for less frequently accessed data such as reference data. A term for this tier might be near line storage.
Pillar Data has four tiers of disk storage. StorageTek has five or six with tape providing the final tier.
Virtualisation plus ILM
Let's combine drive array virtualisation and ILM. We start off with a bunch of arrays. Let's give them Fibre Channel and SATA drives. We virtualise them into one pool of disk storage. Now let's set up a 2-tier ILM system. In order to allocate reference data specifically to the SATA drives we need to separate them out in our virtual pool.
It can't be done. There is only one single logical pool of storage.
To have both the multiple tiers of disk storage that ILM requires and drive array virtualisation, you need as many virtualised pools of storage as there are tiers of disk in your ILM scheme.
Pillar Data has four virtualised pools of storage in its system. That's the only way it can offer both facilities: virtualised disk, and ILM with different qualities of service.
The same *must* be true for any other supplier of virtualised disk space and ILM. If you have three tiers of disk in an IBM-based ILM scheme and you wish to use IBM's SAN Volume Controller then you must have three pools of SVC storage.
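The one-pool-per-tier argument can be sketched in a few lines. This is an illustration only: the inventory, the capacities and the allocate helper are hypothetical and do not represent how SVC or any other product works.

```python
from collections import defaultdict

# Hypothetical inventory: (array name, drive type, usable capacity in TB).
shelves = [
    ("array-a", "FC", 2),
    ("array-a", "SATA", 4),
    ("array-b", "SATA", 4),
]

# One logical pool: capacity is summed and the drive type is no longer visible.
single_pool_tb = sum(tb for _, _, tb in shelves)

# One pool per tier: capacity is grouped by drive type, so placement is possible.
tier_pools = defaultdict(int)
for _, drive_type, tb in shelves:
    tier_pools[drive_type] += tb


def allocate(pools, tier, tb):
    """Take tb of free space from the named tier's pool, or raise if it cannot."""
    if tier not in pools:
        raise KeyError(f"no pool for tier {tier!r}")
    if tb > pools[tier]:
        raise ValueError(f"tier {tier!r} has only {pools[tier]}TB free")
    pools[tier] -= tb
    return pools[tier]


remaining_sata = allocate(tier_pools, "SATA", 3)  # reference data goes to SATA
print(f"single pool: {single_pool_tb}TB; SATA pool left: {remaining_sata}TB")
```

With the single pool there is nothing to pass as a tier: a request can only ask for capacity. Keeping one pool per drive type is what lets the request say where the data should land.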
ILM with virtualised disk lowers utilisation
What does this mean in terms of utilisation? One argument is that it means lower utilisation. Here's a hypothetical example, ignoring thin provisioning for now.
If you have one tier of disk then let's say you could drive utilisation up to 70 percent. So a 10TB virtual pool would have 3TB of empty space.
Now let's implement a 3-tier ILM system on this array: 2TB of fast disk; 4TB of medium disk; 4TB of slow disk. You need to have space in each tier to accept data as it moves across the tiers in its own lifecycle. That's what ILM also means. Data's value changes over time. Initial often-accessed transaction data becomes consolidated into medium-access reference data which, in turn, becomes even less accessed data destined ultimately for deletion or an archive.
You must have space in the tiers to accept data as it's moved into a tier. Clearly existing data on a tier will be moved off and space is made available that way. So let's assume you can have 60 percent utilisation of your tiers now rather than the 70 percent utilisation theoretically achievable before.
That means you have more empty space: 800GB of fast disk + 1.6TB of medium disk + 1.6TB of slow disk = 4TB, compared with 3TB before. You have just gained 1TB of empty space.
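The arithmetic of this hypothetical example can be checked with a short calculation. The 70 and 60 percent figures are the assumptions stated above, not measured values, and the function name is made up for the sketch.

```python
def empty_space_tb(capacity_tb, utilisation):
    """Return the unused capacity in TB of a pool at the given utilisation."""
    return capacity_tb * (1 - utilisation)


# Single virtual pool: 10TB at an assumed 70 percent utilisation.
single_pool_empty = empty_space_tb(10, 0.70)

# Three tiers carved from the same 10TB, each at an assumed 60 percent.
tiers_tb = {"fast": 2, "medium": 4, "slow": 4}
tier_empty = {name: empty_space_tb(cap, 0.60) for name, cap in tiers_tb.items()}
total_tier_empty = sum(tier_empty.values())

print(f"single pool empty space: {single_pool_empty:.1f}TB")
for name, tb in tier_empty.items():
    print(f"  {name} tier empty space: {tb:.1f}TB")
print(f"tiered empty space in total: {total_tier_empty:.1f}TB")
print(f"extra empty space: {total_tier_empty - single_pool_empty:.1f}TB")
```

Changing the two utilisation figures shows how sensitive the result is to them: the tiered scheme only leaves more space empty if each tier really has to run at a lower utilisation than the single pool did.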
Is this example valid in principle? I don't know, not having found customers who have implemented both a multi-tiered ILM system and drive array virtualisation.
It could be, though, that the advantages of drive array virtualisation in a multi-tiered ILM environment will centre more on simpler drive space provisioning and on migration and drive refresh than on better disk utilisation.