Published Articles
 

May 8

Four steps to deal with data copies
Most companies have no mechanism to ensure that data access rights are consistent
Computerworld Opinion by GlassHouse CTO, Jim Damoulakis

http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9018886

We love having lots of copies of data. We must, or why would we create so many? Techniques such as mirroring, snapshots and replication, coupled with distribution mechanisms like e-mail and file transfer, have made duplication and sharing of information irresistible.

Many of these copies are purely redundant, intended primarily for data protection and recovery: split mirrors, replication and backups. Others are intended to support particular business requirements such as analysis or regulatory mandates. Sometimes data undergoes transformation as it is copied, such as data denormalization when populating a data warehouse, but often, it is simply duplicated for functions like development, testing and quality assurance (QA).

Unfortunately, we can easily lose track of these copies -- how many exist, where they are all stored and who can actually access them. This has given birth to a range of tools designed to help us track down, categorize and manage all of this scattered data. The fact that this problem exists at all highlights one of the fundamental limitations of current operating environments and data management paradigms.
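To make the tracking problem concrete, here is a minimal Python sketch of one way to locate byte-identical copies under a directory tree: hash each file's contents and group the paths that share a digest. It illustrates the general idea only and does not describe how any commercial product works. The function names `file_digest` and `find_duplicate_files` are hypothetical, and the sketch ignores transformed copies, access rights and anything stored outside a file system.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_files(root):
    """Group regular files under root by content digest.

    Returns a dict mapping digest -> sorted list of paths, keeping only
    digests that appear more than once (that is, actual duplicates).
    """
    by_digest = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Skip symbolic links and anything that is not a regular file.
            if os.path.isfile(path) and not os.path.islink(path):
                by_digest[file_digest(path)].append(path)
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicate_files("/some/share")` (the path is a placeholder) returns one entry per set of identical files. A real inventory would also need to record who can read each copy.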

This inability to track information creates security challenges -- most environments have no mechanism in place to ensure that data-access rights persist across data instances and permutations. It also leads to serious legal exposure -- one can't help but recall the infamous $1.45 billion Morgan Stanley backup-tapes case, among other cases of rediscovered data. Furthermore, given currently in-vogue initiatives to reduce the storage footprint within data centers, maintaining copies of large data stores is also just plain inefficient, consuming gobs of storage and potentially congesting data-protection mechanisms such as backup and disaster recovery.

As a result, our love affair with data has become something of a love/hate relationship. Today, unfortunately, there is no Holy Grail to address this issue. Initial steps toward remediation call for a multipronged approach that is more tactical than strategic.

Here are some places to start:

  • Revisit backup cycles and retention policies. This has an obvious cost impact, particularly with disk-based backup, but it is also critical as both a cost and liability factor in tape-based environments.

  • Reduce development, test and QA data stores. Tools from companies such as Solix Technologies Inc. and Princeton Softech Inc. can mask sensitive information and transfer reduced subsets to improve security and save space.

  • Manage database dumps and copies. Database administrators are highly risk-averse (often with justification) and are prone to making lots of copies. Efficiency can be improved by establishing standard policies and coordination practices to ensure that extraneous copies aren't also replicated and backed up.

  • Get serious about archiving and deleting data. This means setting real policies and getting selective about what needs to be retained -- much easier said than done, but it's critical to begin the effort.
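As a rough illustration of the second step above (reduced, masked data stores for development, test and QA), the following Python sketch keeps every Nth record and replaces sensitive fields with tokens derived from a salted hash. It is a toy example under stated assumptions, not a description of how the products named above work. The names `mask_value` and `build_test_subset`, the field names and the salt are all hypothetical, and a salted hash over a small value space is not strong protection on its own.

```python
import hashlib


def mask_value(value, salt="demo-salt"):
    """Replace a value with a stable token derived from a salted SHA-256 hash.

    The same input always yields the same token, so joins between masked
    tables still line up. The salt here is a placeholder assumption.
    """
    token = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return "masked-" + token[:12]


def build_test_subset(records, sensitive_fields, keep_every=10):
    """Return every Nth record with the named sensitive fields masked.

    The input records are copied, never modified in place.
    """
    subset = []
    for index, record in enumerate(records):
        if index % keep_every != 0:
            continue
        masked = dict(record)
        for field in sensitive_fields:
            if field in masked:
                masked[field] = mask_value(masked[field])
        subset.append(masked)
    return subset


# Usage with made-up records: 100 production rows shrink to 10 masked test rows.
production = [
    {"id": i, "name": f"Customer {i}", "ssn": f"000-00-{i:04d}"}
    for i in range(100)
]
test_rows = build_test_subset(production, ["name", "ssn"])
```

Sampling every Nth row is the simplest possible subsetting rule; real test data usually has to preserve referential integrity across tables, which this sketch does not attempt.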

Jim Damoulakis is chief technology officer of GlassHouse Technologies Inc., a leading provider of independent storage services. He can be reached at jimd@glasshouse.com.

 

 

  © Copyright 2001 - 2008 GlassHouse Technologies, Inc. All Rights Reserved.
