Title: 12 years of outages - summarised by Stuart Kendrick
-Tags: english, nuug, standard
+Tags: english, nuug, standard, usenix
Date: 2012-10-26 14:20
<p>I work at the <a href="http://www.uio.no/">University of Oslo</a>
article by <a href="http://www.skendric.com/">Stuart Kendrick</a> from
Fred Hutchinson Cancer Research Center titled
"<a href="https://www.usenix.org/publications/login/october-2012-volume-37-number-5/what-takes-us-down">What
-Takes Us Down</a>" (also
+Takes Us Down</a>" (longer version also
<a href="http://www.skendric.com/problem/incident-analysis/2012-06-30/What-Takes-Us-Down.pdf">available
from his own site</a>), where he reports what he found when he
processed the outage reports (both planned and unplanned) from the
what kind of problems affect a data centre, but what really inspired
me was the kind of reporting they had put in place since 2000.</p>
-<p>The centre set up a mailing list, and send fairly standardised
-messages to this list when a outage was planned or when it already
-occurred. Here is the two example from the article: First the
-unplanned outage:
+<p>The centre set up a mailing list, and started to send fairly
+standardised messages to this list when an outage was planned or when
+it had already occurred, to announce the plan and get feedback on the
+assumptions about scope and user impact. Here are the two examples from
+the article: First the unplanned outage:
<blockquote><pre>
Subject: Exchange 2003 Cluster Issues