http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyId=16&articleId=307368&intsrc=hm_topic

November 12, 2007 (Computerworld) — A cascading series of problems with a data center consolidation project at Web hosting company NaviSite Inc. left about 165,000 Web sites offline last week, with some of the sites remaining unreachable for six or more days.

The problems began on Nov. 3, when NaviSite tried to shift processing done on 850 servers at a data center in Baltimore to its headquarters in Andover, Mass. NaviSite acquired the Baltimore facility in August, when it bought Alabanza Corp., another hosting vendor.

Rathin Sinha, NaviSite’s chief marketing officer, said the company decided to move 200 of the 850 servers to Andover and migrate the data from the rest of the systems to new machines at the Massachusetts data center.

Consolidating a data center can be a time-consuming and complicated activity, not just from a technical standpoint but also in terms of arranging downtime, testing application functionality, and so on.

What matters in these kinds of situations is planning and communication. It’s easy for one small misstep to derail the end goal, regardless of how much planning you do. Planning means deciding what we’re going to do, who’s responsible for which actions, and what the escalation path is (all the usual things). Communication means presenting a unified front: understanding which elements of the transition have been successful, which components haven’t, what the impact is, who it is affecting, what we are doing to resolve it, and who the users and client base can contact for updates.

As with anything, users can accept failure. They can accept a network outage if you keep them informed: what’s happened, what it means for their functionality, and when you think it will be fixed.

There’s nothing more irritating than finding the issue myself, having to call to find out what’s going on, and only being told, “Oh yes, the switch has failed. It’ll be fixed in a while.” People shouldn’t be scared of communicating with users. Communication allows me to highlight weak points in the infrastructure: where we need to invest, and where we need to limit our risk and our liability.



