GMail Growing Pains

Note: This article was originally posted on the WisdomGroup blog.

WisdomGroup is both a proponent and a user of GMail. Of course, the service is not perfect, as proven during yesterday’s GMail outage. “It’s only affecting a small number of users” is hardly consolation when you’re one of the affected users.

Sys Admin Anger

One angry systems administrator posted this message in the Google Apps Discussion Group:

Since yesterday around at least 4:00pm my CEO cannot access his mail. He gets a 502 temporary error. Support keeps telling me it is affecting a small number of users. This is not a temporary problem if it lasts this long. It is frustrating to not be able to expedite these issues. I have to speak with the boss again and he’s po’d. This is considered a mission critical issue here. We may have to make other arrangements. Apparently Google mail is not very reliable. I think I would have pushed for something else before we switched if I had known the level of unreliability.

Achieving 100% Uptime

In a perfect world, I could complete this sentence:

Achieving 100% uptime is easy. All you have to do is…

Unfortunately, I don’t know of a 100% uptime solution. Systems are imperfect because they’re designed by imperfect humans. Good engineers design systems with multiple redundancies. Sometimes those redundancies fail.
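To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the redundant components fail independently of one another, which real systems rarely manage, and the 99% figure is purely illustrative, not a measurement of GMail or any other service. The function names are made up for this example.

```python
def combined_availability(availabilities):
    """Availability of redundant components running side by side.

    Assumes failures are independent: the system is down only when
    every component is down at the same time.
    """
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down


def downtime_hours_per_year(availability):
    """Expected hours of downtime in a 365-day year."""
    return (1.0 - availability) * 365 * 24


if __name__ == "__main__":
    single = 0.99  # illustrative figure only
    for copies in (1, 2, 3):
        total = combined_availability([single] * copies)
        hours = downtime_hours_per_year(total)
        print(f"{copies} component(s): {total:.6f} available, "
              f"about {hours:.2f} hours down per year")
```

Each extra component shrinks the expected downtime, but the result never reaches exactly 100% as long as each component can fail. And if the failures are correlated (a shared network, a shared bug), the real number is worse than this estimate.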

What Sys Admins Can Do

The GMail web client is outstanding. But when it fails, users can sometimes send/receive email through an old fat client like Microsoft Outlook or Apple Mail. Just point the fat client at GMail’s IMAP (or POP) and SMTP servers, and you’re in business.
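Before walking a user through fat-client setup, it can help to confirm that the mail servers are reachable even though the web client is not. Below is a minimal sketch in Python, assuming IMAP access is enabled on the account and that the account accepts a password login (many accounts require an app-specific password or OAuth instead). The host names and ports are GMail's standard IMAP and SMTP endpoints; MailEndpoints and check_imap are names invented for this example, and the credentials are placeholders. The timeout argument needs Python 3.9 or later.

```python
import imaplib
from dataclasses import dataclass


@dataclass(frozen=True)
class MailEndpoints:
    """Server settings a fat client needs (hypothetical helper type)."""
    imap_host: str
    imap_port: int
    smtp_host: str
    smtp_port: int


# GMail's standard endpoints: IMAP over SSL, SMTP with STARTTLS.
GMAIL = MailEndpoints("imap.gmail.com", 993, "smtp.gmail.com", 587)


def check_imap(endpoints, user, password, timeout=10.0):
    """Return True if an IMAP-over-SSL login succeeds, False otherwise."""
    try:
        conn = imaplib.IMAP4_SSL(
            endpoints.imap_host, endpoints.imap_port, timeout=timeout
        )
    except OSError:
        # Covers refused connections, timeouts, DNS and TLS failures.
        return False
    try:
        conn.login(user, password)
        return True
    except imaplib.IMAP4.error:
        # Server was reachable but rejected the login.
        return False
    finally:
        try:
            conn.logout()
        except (OSError, imaplib.IMAP4.error):
            pass


if __name__ == "__main__":
    # Placeholder credentials: replace with a real account to try it.
    ok = check_imap(GMAIL, "user@example.com", "app-password-here")
    print("IMAP login succeeded" if ok else "IMAP login failed")
```

If the check succeeds while the web client is still returning errors, a fat client configured with the same settings has a reasonable chance of working. If it fails, the outage probably reaches beyond the web interface.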

What Google Can Do

The best thing a company can do in a time of failure: Communicate. Solve the problem, and make sure you’re sharing updates with users on a regular basis.

Bring Email In House?

When GMail fails, it’s tempting to wish for the good old days of the in-house mail server. “What moron moved us to external email?” is a common thought during an outage. But it’s important to balance our thinking:

Sometimes in-house email makes sense. In other cases, the time and energy spent on email management is better applied to business issues. It’s important to weigh the options rationally, especially in a time of crisis.

By the way, in-house email systems are subject to the same “less than 100% uptime” rules as all other systems.