Missing Metadata Needed for Effective Storage Management

Ever since time, or at least computing time, began, most of the focus has centred on the processing of data. Until recently it would have been straightforward to argue that too little attention had been given to the long-term storage of that information. But as the volume of data being generated soars and the costs of storing it escalate dramatically, it is clear that something has to be done.

Over the course of the last ten days I have visited the EMCWorld show in Las Vegas, been present at the opening of the IBM Global Archival Centre in Guadalajara and spoken with both HP and Fujitsu Siemens about various aspects of storage management in modern business. And there is one thing on which all agree – it’s time for change.

Traditionally there have been two approaches to the long-term storage of data. One was to leave it alone until the “system”, usually a business application, dies. The other, when there is just too much data to leave at rest, was to move some of it to tape. The decision about which data to move off the spinning disks was usually based on how old it was or, in sophisticated cases, on when it was last accessed or modified.
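The age-based selection described above can be sketched in a few lines of Python. This is only an illustration: the record structure, the function name select_for_tape, the one-year threshold, the paths and the dates are all made up for the example and do not come from any particular product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    # Hypothetical record; a real system would read these from the file system.
    path: str
    last_modified: datetime
    last_accessed: datetime


def select_for_tape(records, now, max_idle_days=365):
    """Return the records not modified or accessed within max_idle_days."""
    cutoff = now - timedelta(days=max_idle_days)
    return [
        r for r in records
        # A file is idle only if BOTH timestamps are older than the cutoff.
        if max(r.last_modified, r.last_accessed) < cutoff
    ]


# Illustrative data only.
now = datetime(2007, 6, 1)
records = [
    FileRecord("/data/orders_2003.csv", datetime(2003, 12, 31), datetime(2004, 2, 1)),
    FileRecord("/data/orders_2007.csv", datetime(2007, 5, 30), datetime(2007, 5, 31)),
]
to_tape = select_for_tape(records, now)
```

Note that nothing in this rule knows what the data is or why it matters; it only knows how long the data has been idle.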

Now this might be better than nothing. But in today’s high-pressure, litigious business world, where attention is grabbed by anything that saves money or can help generate new value, such basic methods of data archiving are simply not tenable in sophisticated tiered storage architectures where information may need to be retrieved with great speed. Enter Content and Document Management systems coupled with clever archiving management software.

Until recently such systems have required considerable effort to get up and running and, as a consequence, have usually been deployed only in key situations. But now some vendors have started to deliver software that helps to automate the discovery and categorisation processes that form the foundation of ECM / EDM systems. Step one is to “discover” just what data is out there in the enterprise; step two is to “categorise” it in terms of its importance and of the management policies by which it should be controlled.

It then becomes possible to define policies that describe just how and on which platforms different classes of data should be held. Then everything else is relatively straightforward (if you can ignore the internal politics associated with questions of data classification and importance ranking). Modern software and the experience of best practices obtained in the real world are now offering the chance for widespread data classification to happen. And it is this metadata and classification that hold real promise to help in the effective administration of data over long periods of time. Automation is the key, as is not attempting to do everything at once.
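To make the idea of categorisation feeding policy a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Item fields, the class names, the rules, the tiers and the retention periods are invented, and real ECM / EDM products will express these things in their own ways.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical metadata captured during the "discover" step.
    path: str
    owner_dept: str
    content_type: str


# Hypothetical categorisation rules: (predicate, class name); first match wins.
RULES = [
    (lambda i: i.content_type == "contract", "legal-hold"),
    (lambda i: i.owner_dept == "finance", "business-critical"),
]
DEFAULT_CLASS = "general"

# Hypothetical policies: where each class is held and for how long.
POLICIES = {
    "legal-hold": {"tier": "fast-disk", "retain_years": 10},
    "business-critical": {"tier": "fast-disk", "retain_years": 7},
    "general": {"tier": "tape", "retain_years": 1},
}


def categorise(item):
    """Return the class of the first matching rule, or the default class."""
    for predicate, data_class in RULES:
        if predicate(item):
            return data_class
    return DEFAULT_CLASS


def placement(item):
    """Return (class, policy) for an item, driven by its metadata."""
    data_class = categorise(item)
    return data_class, POLICIES[data_class]
```

For example, placement(Item("/share/memo.doc", "hr", "memo")) falls through both rules and is assigned the "general" policy, while a finance contract matches the first rule and is classed as "legal-hold". The ordering of the rules is exactly where the internal politics mentioned above tends to show up.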