Quantum lets you decide when to do it
By Chris Mellor, Techworld | Techworld | Published: 06:00, 17 August 2007
Quantum announced its high-end, enterprise datacentre-class de-duping product, the 240TB capacity DXi7500, in June.
De-dupe suppliers are split into two camps: the inline, ingest-time de-dupers, such as Diligent and Data Domain, and the post-process de-dupers, such as Sepaton and FalconStor, which optimise for backup speed. The DXi7500 is the first de-dupe product to offer users a choice, through a policy-driven de-dupe engine that can do its work either as data is ingested or after the backup run has finished. In other words, you can optimise the DXi7500 for data capacity or for backup speed.
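The policy choice described above boils down to picking which resource is the constraint. A hypothetical sketch, in Python; the names and the decision rule are illustrative, not Quantum's actual interface:

```python
from enum import Enum

class DedupePolicy(Enum):
    INLINE = "inline"        # de-dupe at ingest: maximise stored history
    POST_PROCESS = "post"    # de-dupe after the run: maximise backup speed

def choose_policy(backup_window_tight: bool) -> DedupePolicy:
    """Illustrative rule: post-process when the backup window is the
    constraint, inline when retention capacity is the constraint."""
    if backup_window_tight:
        return DedupePolicy.POST_PROCESS
    return DedupePolicy.INLINE
```

A policy-driven engine would presumably evaluate something like this per backup job rather than system-wide.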
It also has several other interesting features which Techworld discussed with Quantum's UK sales director, Steve Mackey.
Mackey said that the DXi7500 has hardware compression as well as de-dupe. The compression can halve the size of a file and is useful for files where de-dupe is not practicable, for example, write-once data sets, or when exporting backup data to physical tape.
The DXi7500 can present itself to servers as a network-attached storage (NAS) device, via a CIFS/NFS mount point, or as a virtual tape library (VTL). It can have physical tape libraries or drives connected to it so that it can export files to tape, and Quantum is working with Symantec so that NetBackup 6.5 can direct the DXi7500 to export backup data to tape directly. The DXi's hardware compression works faster than software compression and doesn't use up DXi CPU cycles.
One of three
The DXi7500 sits at the top of a three-product range. The smaller DXi3500 is for branch offices or smaller businesses, while the mid-range DXi5500 is for mid-range datacentres or businesses. De-duped data can be replicated from either of the smaller DXis to the 7500, providing a disaster recovery facility and paving the way to tapeless remote offices, and possibly even tapeless mid-range datacentres. Mackey said: "Such data goes straight into the de-duped block pool in the DXi7500."
The DXi3500 and 5500 do inline de-duping, that is, at data ingest time, so that they use all their disk space for de-duped data and so maximise the amount of historical data which they can store. Post-process de-dupe requires that gigabytes of disk be used to hold raw backup data. Mackey said: "It's due to the number of spindles. Post-process de-dupe needs staging spindles."
Suppose you back up 20GB a day and de-dupe works at a 20:1 ratio, so each day's backup shrinks to 1GB. The 20GB of disk space needed to stage the raw data could instead hold 20 days' worth of de-duped backup data.
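The arithmetic above can be sketched in a few lines. The figures (20GB/day, 20:1) come from the example; the helper function is purely illustrative:

```python
def deduped_days(raw_gb_per_day: float, ratio: float, disk_gb: float) -> float:
    """Days of de-duped backup history that fit in disk_gb of space."""
    gb_per_day_after_dedupe = raw_gb_per_day / ratio
    return disk_gb / gb_per_day_after_dedupe

# One day's staging space (20GB of raw backup) could instead hold
# 20 days of de-duped history at a 20:1 reduction ratio.
print(deduped_days(raw_gb_per_day=20, ratio=20, disk_gb=20))  # → 20.0
```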
The DXi7500 can have so much disk storage, from 24 to 240 terabytes, that it can do post-process de-duping without significantly affecting the amount of backup data it can hold. When a system can hold five months of de-duped data with post processing and five and a half months with inline processing there is not that much relative difference.
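A rough calculation shows why the staging overhead matters so little at this scale. The 240TB total is from the article; the daily backup volume and 20:1 ratio are assumptions chosen to land near the five-and-a-half-month figure quoted:

```python
TOTAL_TB = 240       # DXi7500 maximum capacity (from the article)
DAILY_RAW_TB = 29    # assumed daily backup volume
RATIO = 20           # assumed de-dupe ratio

deduped_per_day_tb = DAILY_RAW_TB / RATIO

# Inline: every terabyte stores de-duped data.
inline_days = TOTAL_TB / deduped_per_day_tb

# Post-process: one day's raw backup must be staged before de-duping.
post_days = (TOTAL_TB - DAILY_RAW_TB) / deduped_per_day_tb

# Roughly the five-and-a-half and five months quoted in the article.
print(inline_days / 30, post_days / 30)
```

On a small box the staging space would eat a large fraction of total capacity; on a 240TB system it costs only around a tenth of the retention window.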
Mackey said the 7500 uses 750GB drives: "In time we'll offer 1TB drives. It's restricted by when LSI will use them in their arrays." Were 1TB drives available now, the 240TB upper capacity limit would move up to 320TB.
The DXi7500 has different PowerPC CPU hardware from the other two, with dual CPUs and high availability. Performance-wise, Mackey said: "Inline de-duplication is about as good as the 5500. Realistically we were hoping it would be better."
De-duplication is CPU-intensive. Perhaps adding CPU cores will speed it up.