June 15, 2017
Backup and disaster recovery is a speculative business. We ask ourselves, “What if a fire breaks out in our offsite datacenter? What if the one guy who really understands the BDR solution is unreachable when a server fails? What if one of our employees falls for a phishing email and lets in an attack?”
In other words, most IT teams speculate about disasters. Yet it’s hard to spot our downfall in advance – and often our real troubles begin when the backups we’re counting on don’t work. Anyone who’s discovered a malfunctioning system or corrupt backups knows that plenty can go wrong with even a “good” backup and disaster recovery system. Sometimes it’s human error, sometimes an equipment issue, and sometimes just bad planning.
Here are five true backup horror stories that involve all three.
Making only one set of backups
One company we know was very diligent about creating regular backups. Good, right? The problem was that they created just one copy, stored in a datacenter basement that suffered a burst pipe. The flood wiped out all their archived data. This is why redundancy is so crucial: keep more than one set of backups, in more than one place. (We recommend pairing one local set of backups with cloud copies.)
Not testing the system
Pixar's movie Toy Story 2 almost didn't happen because of a backup issue. One of the employees accidentally ran a "remove all" command and deleted the film. When the team turned to the backup system, they found it hadn't worked in months. The only thing that saved the movie? The technical director had kept a separate set of backups.
Forgetting to make backups
Automation doesn’t just make workflows easier – it steps in when memory doesn’t. At one insurance company, a vital server died, taking down the organization's system. The IT director realized the person in charge of manually switching out backup tapes hadn’t done so in weeks. Unfortunately, the backup server had been failing to make data backups as well, which left the company stranded.
Not backing up often enough
When one company’s servers went down, including the Exchange server, they weren’t too worried. They had a top-of-the-line BDR solution, after all. But when they booted up their backups, they realized they’d been snapshotting critical servers far too infrequently. They wound up losing a significant amount of email.
Relying on tape
An almost identical server failure happened to a small organization, but with a different outcome. The team headed to their offsite backup datacenter to retrieve their tapes. Their confidence vanished when they realized some of the tapes were corrupt, while others threw errors when read. It became clear that the backups had not been managed properly: tapes had been stored near magnetic fields and occasionally placed under heavy objects. The backups that did work were slow and difficult to restore from.
If you’re a Quorum customer (or just like to read our blog), we hope you know the backup basics like redundancy, automated testing and scheduling, and snapshots timed to reflect the importance of your data. Because your backups are the one factor that can change a crisis into a manageable problem – and while a disaster may not be in your control, your backups definitely are.