One of the things we’re learning this decade is that we aren’t at all prepared for the many manmade and environmental disasters emerging worldwide. It has become very clear that for critical systems that must remain viable during these disasters, DIY (Do It Yourself) just doesn’t work. As is often the case, the DIY workarounds aren’t even less expensive. Such efforts don’t usually build on prior experience, so the firm doing the implementation learns on the job, which is arguably the most expensive way to learn about any complex thing you are creating.
IBM just released a comprehensive blog post on its Power platform by Tom McPherson (GM for IBM Power), in which he indicated that IBM has recognized that the DIY approach just wasn’t working for critical HA/DR (High Availability/Disaster Recovery) events. In those cases, you need a more prescriptive and automated solution that works out of the box and keeps working even if you can’t get staff into the site (thus the need for automation). There is a bit of old IBM in this change, and that is a good thing.
Let’s talk about that this week.
The HA/DR Problem
The big problem with High Availability/Disaster Recovery systems is that they need to be bulletproof and able to operate for extended periods without human intervention. The last thing you need when you are already dealing with some kind of localized or global disaster (like a pandemic) is to have to go to the office to make a service call, or to keep people on premises who should have been evacuated.
You need systems consistent with that old Timex wristwatch slogan: they can “take a licking and keep on ticking.” These systems will be used to keep companies and governments running during a disaster and will be critical to providing the services that businesses and people need while it lasts.
Many, if not most, of the existing systems were built from parts by people who wanted to save money and do it themselves. But in most cases, these people are just making assumptions and guesses about what is needed. They don’t have the depth of experience and understanding of a global vendor that has run into problems like this before and can come to the table with compelling solutions based on what it learned.
Given its mainframe and AS/400 background, IBM understands high availability. A few years back there was a site in which construction workers opened up a wall and found an AS/400. It hadn’t been touched in years, but it was still running. These machines had several usability issues when compared to current-generation hardware, but one thing they did extremely well was keep running while requiring less maintenance than their alternatives.
IBM has long been an expert on what is needed to create an HA/DR solution that is more purpose-built and automated, and the company is showcasing that experience. IBM knows that the only way to create a true solution in this unique space is to draw on all of the company’s experience supporting other firms, and itself, through disasters to create a solution that can survive those disasters even if people are not coming into work.
In his recent blog post, Tom McPherson pointed to IBM’s pivot from DIY approaches to far more reliable, high-performing, appliance-like plug-and-play alternatives. By making this move, IBM is showcasing its decades of understanding of products like this and its unmatched experience in providing related solutions.
This is why banks favor IBM. Financial institutions can lose millions in mere seconds of downtime. IBM’s reliability is even more critical for infrastructure and communications during disasters and for efforts related to national defense, two areas in which IBM is also well known for providing uniquely robust solutions.
In the end, I doubt there is any company (outside of perhaps a specialist in the segment) that would better understand the requirements of an HA/DR solution. As this solution rolls out, I expect the firms that use it will be praised for their prowess while the firms that don’t will be criticized for being down at those critical times.