Disaster recovery and “high availability” are not the same thing. But when you plan infrastructure improvements and maintenance, it can be worthwhile to think of them together. Why? Mostly because both require the same mindset, and also because the two work together and often share infrastructure.
Both disaster recovery planning and high availability planning require that you confront the possibility of bad things happening. You can’t just hope that bad things won’t happen and call it a day. Rather, you need to think about these possible calamities calmly and dispassionately. Of course, you can’t prepare for every type of problem that could ever happen, so you need to figure out what is most likely to happen and what it could mean for you. Then you need to explore how best to reduce the impact of those events. And you need to recognize that this is not necessarily a neat and tidy process with perfect results.
Sometimes planning for high availability reduces the likelihood of needing to go into disaster recovery mode. Other times, disaster recovery mode can reduce the effect of lacking high availability infrastructure. And sometimes the line between the two is blurred. Ultimately, it doesn’t really matter. What matters is whether you have the infrastructure to deal with the curveballs and negative events life throws your way.
A good starting point for such planning is to ask yourself what types of bad things could happen and how they would affect your ability to operate, absent any mitigation plan. Loss of power might make your building effectively inaccessible, and might also severely impair your ability to operate in and outside of your building. Loss of access to your building could keep you from operating as well. For some companies, losing their phone system would effectively shut them down, while for others, losing their computer system will create major issues even with the phone system up.
Stick to general types of issues, because otherwise you’ll never get through the list and you’ll never be able to come up with all of the possibilities. After all, how many of us really worry about “weasel shorting out our power” (Large Hadron Collider, April 2016) or “Ants ate through our junction box” (Network World Editor Paul McNamara, twice)? But we all know that power outages and telecom failures happen. Likewise, servers and phone systems can go down for a large number of reasons, and things might happen that could keep us out of our building, for reasons mundane, bizarre, or possibly unknown.
When looking at what could go wrong, relate items to other items on your list, while separating out items that could be their own issue. For instance, losing your computer system could be a major show-stopping event. It’s also one of the things that is going to happen if you lose power to your building. On the other hand, losing power to your building is not the only way to lose your servers. So, put the loss of your computer system on your list as its own item, and note that it’s also one thing that would happen if you lost power to your building.
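If you keep such a list in a script rather than on paper, one way to record these relationships is sketched below in Python. This is only an illustration: the `Risk` class, the `consequences` function, and the sample entries are hypothetical names and data chosen to mirror the examples above, not a prescribed tool or format.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    name: str
    # Names of other items on the list that this event would also cause.
    triggers: list[str] = field(default_factory=list)


def consequences(risks: dict[str, Risk], start: str) -> list[str]:
    """Return every listed item that would follow from `start`,
    directly or indirectly, in alphabetical order."""
    seen: set[str] = set()
    stack = list(risks[start].triggers)
    while stack:
        name = stack.pop()
        if name not in seen and name != start:
            seen.add(name)
            stack.extend(risks[name].triggers)
    return sorted(seen)


# Illustrative entries only; your own list will differ.
risks = {
    r.name: r
    for r in [
        Risk("power outage", triggers=["server outage", "building inaccessible"]),
        Risk("server outage"),          # also its own item: servers can fail for other reasons
        Risk("building inaccessible"),  # also its own item
        Risk("phone system outage"),
    ]
}

for item in consequences(risks, "power outage"):
    print("power outage would also mean:", item)
```

Each item stays on the list in its own right, and the `triggers` field simply notes which other items it would bring along, so a server outage can be planned for both on its own and as a side effect of losing power.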