
Can de-dupe data be used for compliance storage?

What if document integrity has to be proved?

Proving originality

The position taken by Martin Baldock, operations manager at electronic discovery firm Kroll Ontrack, is relevant here.

He thinks the format in which electronic data is stored is not what matters. In effect, every electronic representation has to reconstruct a file before it can be viewed or printed. What matters is that the content is original, not that the representation is in WORM format.

He said: "We look at the hash value of the file's contents compared to what we know was the original value. We are told, for example, that file A on disk is the original file and we compute its hash value and compare it to other copies of the file to see if it has changed." A mismatch does not necessarily tell him what has changed, only that something has.

The hash value is the deciding factor, and even a change as small as an extra space between words will alter it.
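
As a rough illustration of that sensitivity, the sketch below hashes two byte strings that differ by a single space. SHA-256 and the sample text are assumptions made for the example; the article does not say which hash algorithm Kroll Ontrack uses.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"The quick brown fox"
altered = b"The quick  brown fox"  # one extra space between two words

original_digest = content_hash(original)
altered_digest = content_hash(altered)

print(original_digest)
print(altered_digest)
print("unchanged" if original_digest == altered_digest else "changed")
```

The two digests bear no obvious relation to each other, which is why a comparison can show that something changed but not what.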

His concern with sub-file-level de-duplication is with the reconstruction of the file when it is needed. "If you are recomputing the file from the components how confident are you that a bit pattern is exactly the same and so will compute exactly the same hash value? It would be a huge burden of concern to me."
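
The check Baldock describes can be sketched against a toy de-duplicating store. Everything here is hypothetical: the fixed 8-byte chunk size, the in-memory dictionary and the function names are chosen for readability and do not reflect how any particular product works.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical and deliberately tiny


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def dedupe(data: bytes, store: dict) -> list:
    """Split data into fixed-size chunks, keep each unique chunk once,
    and return the ordered list of chunk keys (the file's "recipe")."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        key = sha256_hex(chunk)
        store.setdefault(key, chunk)  # only the first copy of a chunk is stored
        recipe.append(key)
    return recipe


def reconstruct(recipe: list, store: dict) -> bytes:
    """Rebuild the file by concatenating its chunks in recipe order."""
    return b"".join(store[key] for key in recipe)


store = {}
original = b"AAAAAAAABBBBBBBBAAAAAAAAtail"
original_digest = sha256_hex(original)  # taken before de-duplication

recipe = dedupe(original, store)
rebuilt = reconstruct(recipe, store)

print(len(recipe), "chunks referenced,", len(store), "chunks stored")
print("hash matches:", sha256_hex(rebuilt) == original_digest)
```

In this toy case the rebuilt file hashes to the original value. That says nothing about a production system; it only shows the kind of check a system would have to pass for every file, every time.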

Nexsan's Gary Watson is also a strong proponent of hashing as well as other measures to ensure file integrity: "Assureon is highly obsessed with data integrity – files are serialised, stored at least twice on separate RAIDs, and possibly stored on two RAIDs at a DR site, and in all cases are protected with two different hash algorithms which are checked every time the file is touched (plus a dozen other integrity features I won’t bore you with here)."

Referring to de-dupe, he said: "In contrast, a given sub-block (say, of zeros) might be referenced by a million files, and the corruption of this single sub-block could have wide-ranging impact through a wide swath of files. It’s like a failure 'amplifier'. I’m not saying this is an impossible challenge to overcome, but an enterprise-class solution to the problem is non-trivial."
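
Watson's "amplifier" effect can be shown with a small, hypothetical model: 1,000 files that each consist of a unique header plus the same block of zeros, held in an in-memory chunk store. The file count, block size and names are assumptions made for the sketch.

```python
import hashlib

ZERO_BLOCK = bytes(16)  # a sub-block of zeros shared by every file


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def put(store: dict, chunk: bytes) -> str:
    """Store a chunk under its hash and return the key."""
    key = sha256_hex(chunk)
    store[key] = chunk
    return key


store = {}
recipes = {}   # file id -> ordered chunk keys
expected = {}  # file id -> hash recorded before de-duplication

for i in range(1000):
    header = f"file-{i}".encode()
    recipes[i] = [put(store, header), put(store, ZERO_BLOCK)]
    expected[i] = sha256_hex(header + ZERO_BLOCK)

# Corrupt one byte of the single shared chunk.
zero_key = sha256_hex(ZERO_BLOCK)
store[zero_key] = b"\x01" + ZERO_BLOCK[1:]

damaged = sum(
    1
    for i, recipe in recipes.items()
    if sha256_hex(b"".join(store[key] for key in recipe)) != expected[i]
)

print(damaged, "of", len(recipes), "files fail verification")
```

One damaged byte in one stored chunk is enough to make every file that references it fail its hash check.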

The legal holy grail

This seems to be the key point. Whatever form the electronic document is stored in, whether RAIDed and striped or de-duplicated, as long as it can be provably reconstructed in unaltered form it should, in principle, be acceptable in a court of law.

One way to do that is by computing the file's hash value before electronically altering its representation and then re-computing the hash value when the file is to be used for compliance or legal purposes.

If they are the same then the file is good. If they are not then it isn't.
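
A minimal sketch of that before-and-after check on a file follows. SHA-256, the file name and the helper names are assumptions; a temporary file stands in for the stored document, and an appended space simulates an alteration.

```python
import hashlib
import os
import tempfile


def file_sha256(path: str, buffer_size: int = 65536) -> str:
    """Hash a file's contents in blocks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(buffer_size), b""):
            digest.update(block)
    return digest.hexdigest()


def is_unaltered(path: str, recorded_digest: str) -> bool:
    """Re-compute the hash and compare it with the value recorded earlier."""
    return file_sha256(path) == recorded_digest


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "contract.txt")  # hypothetical file name
    with open(path, "wb") as handle:
        handle.write(b"Signed on 1 March.\n")

    recorded = file_sha256(path)  # taken before the representation changes
    check_before = is_unaltered(path, recorded)

    with open(path, "ab") as handle:  # simulate an alteration: one extra space
        handle.write(b" ")

    check_after = is_unaltered(path, recorded)

print("before alteration:", check_before)
print("after alteration:", check_after)
```

The recorded hash has to be kept somewhere the storage system cannot silently rewrite it, otherwise the comparison proves little.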

Will they be the same after the file has gone through a de-duplication process?

No-one knows for sure and until it can be proved that they are the same, de-dupe doubters have a point.


