Downtime

Published on 24 September 2010

During my freshman year at Georgia Tech, Tina gave me a coffee mug depicting green bar paper bearing the words "While the Computer is Down..." repeating in a Westminster font. That old phrase registered with me at the time because I did most of my programming assignments on an overworked and frequently crashing CDC Cyber (I needed at least sophomore status to use the Xerox Stars, HP 9000s, and other more modern machines). Restarts usually took 20 minutes or more, and there was nothing we could do but wait. So, at the very least, "Cyber is down" meant we could enjoy a good coffee break.

My first real job put me alongside seasoned mainframe programmers who knew well how to handle outages: step away from the green screen, take a walk, visit the cafeteria, catch up on techniques, chat with friends, and so on. These guys would often swing by my office while I was pecking away at my LAN-connected PCs and servers, totally unaware of the outage. And they were usually pretty good at convincing me to join them in their forced break.

The advent of the personal computing era meant that any outage was under my own control, and there was always a suitable backup nearby. So "always on" became a way of life: if my PC, laptop, PDA, pager, or cell phone died, I typically had several alternatives to turn to. This removed all excuses for ever being disconnected, and we grew to expect 24x7 access to everything. The unexpected result was that, in making these redundant devices slaves to us, we became slaves to them.

Now, of course, cloud computing is chipping away at that and pushing us back into a bygone era when folks were at the mercy of whatever was at the other end of a dumb terminal. As more of what I do gets pushed out into the cloud, I lose more control. Remote VPSes, VMs, and EC2 instances replace local servers, Gmail and GCal replace Outlook, Mint replaces Quicken, and so on. I have multiple ways to access all these things, but there's still just one of each of these services out there. And along with that, we get back the shared computing pecking order: paying customers with SLAs get all the redundancy, fail-over, and guaranteed uptime, while smaller and free users rank somewhere below computer science freshmen.

So downtime has again become a part of life. Just recently, I was affected by extended outages at Visa, Chase, and Intuit, and shorter outages at less critical sites: Google, 1and1, MapMyRun, Facebook, Weather.gov, etc. What can one do when such outages occur? Not much, other than find something else to do, like take a coffee break. And that's not always such a bad thing.