Protecting your data is essential. Our lives are filled with digital data these days, at work and at home. Whether you’re new to data protection or a seasoned pro, three basic concepts are key to developing a comprehensive data protection strategy. These concepts are usually discussed in a business or enterprise context, but they are worth considering for home computing as well.
The key concepts to keep in mind are:
- Backup
- Archive
- Replication
So where do all the other methods fit, such as high availability, snapshots/imaging, mirroring, clustering, and vaulting? They are just different implementations of the three base concepts.
Backups
Backups are not archives. A backup is a copy of files from a point in time. This can be accomplished with backup software that takes brick-level backups (where each file can be restored individually), by taking an image of the hard drive, and so on. However, the restored data is only as recent as the backup. If you back up once a day, you risk restoring to a state that is nearly 24 hours old.
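To make the point-in-time idea concrete, here is a minimal Python sketch. It is not how any particular backup product works: the "files" are just an in-memory dictionary, and the names `take_backup`, `restore`, and the sample file names are made up for illustration.

```python
import copy
from datetime import datetime, timedelta


def take_backup(files, taken_at):
    """Return a point-in-time copy of the files (hypothetical helper)."""
    return {"taken_at": taken_at, "files": copy.deepcopy(files)}


def restore(backup):
    """Return the files exactly as they were when the backup was taken."""
    return copy.deepcopy(backup["files"])


# Nightly backup at 2:00 AM (illustrative date and time).
files = {"report.txt": "draft 1"}
backup = take_backup(files, datetime(2008, 1, 1, 2, 0))

# Work done after the backup is not part of it.
files["report.txt"] = "draft 2"
files["notes.txt"] = "written after the backup"

# Assume the disk fails 23 hours later, just before the next backup.
failed_at = backup["taken_at"] + timedelta(hours=23)
restored = restore(backup)
data_age = failed_at - backup["taken_at"]

print(restored)   # only the state captured by the backup
print(data_age)   # how stale the restored data is
```

The restored copy contains only what existed at backup time, so the second draft and the new notes file are gone. The longer the gap between backups, the more work sits in that unprotected window.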
Archives
Archives are simple, but the everyday meaning of the word often gets confused with archiving as a data protection concept. Keeping your backup tapes for a really long time does not mean you’ve created an archive. True archiving means keeping a copy of every version and piece of information that ever existed, whether it is email, files, or something else. That means every email that came in or went out gets saved, or “archived”. It is much more than a single point in time: it is every version or piece that ever existed, retained for a set amount of time as defined by your legal, compliance, or operations department(s) or by other obligations. Often these archives must not be modified, because doing so would violate a compliance policy or law.
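A rough way to picture an archive is as an append-only store with a retention period. The Python sketch below is only an illustration under that assumption: the `Archive` class, its method names, and the one-year retention are invented for the example, and real retention periods come from your legal or compliance requirements.

```python
from datetime import datetime, timedelta


class Archive:
    """Append-only store: every version is kept and none is overwritten."""

    def __init__(self, retention):
        self.retention = retention
        self._records = []  # list of (stored_at, key, content) tuples

    def store(self, key, content, stored_at):
        # Always append; an existing version is never replaced or edited.
        self._records.append((stored_at, key, content))

    def versions(self, key):
        """Return every retained version of a key, oldest first."""
        return [content for _, k, content in self._records if k == key]

    def expire(self, now):
        """Drop only the records that are older than the retention period."""
        cutoff = now - self.retention
        self._records = [r for r in self._records if r[0] >= cutoff]


# Illustrative one-year retention and made-up document versions.
archive = Archive(retention=timedelta(days=365))
archive.store("policy.doc", "v1", datetime(2008, 1, 1))
archive.store("policy.doc", "v2", datetime(2008, 6, 1))

all_versions = archive.versions("policy.doc")   # both versions are kept

archive.expire(now=datetime(2009, 3, 1))
remaining = archive.versions("policy.doc")      # v1 has aged out, v2 remains
```

Notice there is no update or delete method for individual records. The only way data leaves this archive is by reaching the end of its retention period.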
Replication
Replication is a whole other animal. Replication will “copy” the current version of files/data over to another location (local or remote), either at frequent intervals or in near real time, as limited by the network speed. It does not retain old versions of files, and it will replicate deletions and corruption as well.
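The behavior described above can be sketched in a few lines of Python. This is a simplified model, not a real replication protocol: the `replicate` function and the dictionaries standing in for the primary and replica locations are assumptions made for the example.

```python
def replicate(source, replica):
    """Make the replica match the source's current state (hypothetical helper)."""
    # Anything deleted on the source is deleted on the replica too.
    for name in list(replica):
        if name not in source:
            del replica[name]
    # Current contents overwrite the replica, whether they are good or bad.
    for name, content in source.items():
        replica[name] = content


primary = {"a.txt": "good data", "b.txt": "good data"}
replica = {}
replicate(primary, replica)          # replica now mirrors the primary

# An accidental deletion and a corrupted file on the primary...
del primary["a.txt"]
primary["b.txt"] = "corrupted"

# ...are faithfully carried over on the next replication pass.
replicate(primary, replica)
print(replica)
```

After the second pass the replica holds only the corrupted `b.txt`, and there is no older version to fall back on.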
So you can see that, depending on your or your company’s needs, you may very well need all three methods of data protection. Replication is important for quickly recovering a system (high availability, or HA) and for protecting against a disaster as part of business continuity planning (BCP). It does not protect against a folder or pile of data being deleted, which is where backups come in. Archives could be used in this case too, but a developer would rather restore all of their code from one “point in time” than sort through every version that ever existed, with modules that may not work together.
Data Protection Tree
To extend the “data protection tree” we have something like this:
- Backup
  - Snapshot Images
  - Volume Shadow Copy Services
- Archive
  - Vaulting
  - SharePoint versioning
  - Email Journaling
- Replication
  - Mirroring
  - Clustering
  - High-Availability
  - Continuous Protection Servers
Remember, the data protection design is driven by the business, and the needs of the business will dictate your policies. This will help you develop SLAs (Service Level Agreements), RTOs (Recovery Time Objectives), and RPOs (Recovery Point Objectives), which I’ll talk about soon.
-Mike
(originally published in 2008)




