Gmail Outage Was Server Maintenance’s Fault, Says Google
By Jared Newman
Google was trying to upgrade its mail servers when Gmail went down yesterday, according to Gmail’s engineering vice president Ben Treynor. In a lengthy explanation that began with an apology, Treynor wrote on the Official Gmail Blog that when a fraction of Gmail’s servers were taken offline for maintenance, the request routers that had to redirect traffic elsewhere became overloaded. Treynor said this was due to recent changes to the request routers, some of which, ironically, were meant to improve reliability. As a result, the overloaded routers dumped their incoming requests onto other routers, which in turn became overwhelmed and passed the requests along to still other routers, and so on. IMAP and POP users could still access Gmail because they don’t use the same routers.
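To see how a cascade like this can unfold, here is a toy simulation in Python. It is only a sketch, not Google's actual routing logic: the function name `simulate_cascade`, the capacity figures, and the assumption that traffic splits evenly across the routers still running are all made up for illustration.

```python
def simulate_cascade(capacities, total_load):
    """Toy model of a routing cascade (illustrative assumptions only).

    capacities: max load each active router can handle (hypothetical units).
    total_load: total traffic, split evenly across the routers still active.
    Returns (routers_left, rounds_of_failure).
    """
    active = list(capacities)
    rounds = 0
    while active:
        share = total_load / len(active)
        # A router stays up only if it can handle its even share of traffic.
        survivors = [c for c in active if c >= share]
        if len(survivors) == len(active):
            break  # everyone can cope, so the system is stable
        # Overloaded routers drop out; their traffic lands on the rest.
        active = survivors
        rounds += 1
    return len(active), rounds


# All six routers running: each takes 90 units and nothing fails.
print(simulate_cascade([100, 110, 120, 130, 140, 150], 540))  # (6, 0)

# One router offline for maintenance: failures ripple through the rest.
print(simulate_cascade([100, 110, 120, 130, 140], 540))  # (0, 3)
```

In this made-up scenario, removing a single router pushes the weakest one over its limit, and each failure raises the load on the survivors until none are left.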
The lesson? Don’t let request routers shut down in times of heavy traffic. Instead, let an overloaded router merely slow the service while more capacity is brought in to set things back to normal. Treynor says Google is working on these things, so hopefully we won’t see another colossal Gmail Fail anytime soon. [via Official Google Blog]