New post.

diff --git a/blog/data/2012-10-26-system-downtime.txt b/blog/data/2012-10-26-system-downtime.txt
index bce5c87b0e345f158883257f02b4be0efdf5cbcd..fa25a3e772f77819d66780cb2abfe16220aa05f1 100644
--- a/blog/data/2012-10-26-system-downtime.txt
+++ b/blog/data/2012-10-26-system-downtime.txt
@@ -1,5 +1,5 @@
  Title: 12 years of outages - summarised by Stuart Kendrick
-Tags: english, nuug, standard
+Tags: english, nuug, standard, usenix
  Date: 2012-10-26 14:20
  
  <p>I work at the <a href="http://www.uio.no/">University of Oslo</a>
@@ -19,7 +19,7 @@ it every time.</p>
  article by <a href="http://www.skendric.com/">Stuart Kendrick</a> from
  Fred Hutchinson Cancer Research Center titled
  "<a href="https://www.usenix.org/publications/login/october-2012-volume-37-number-5/what-takes-us-down">What
-Takes Us Down</a>" (also
+Takes Us Down</a>" (longer version also
  <a href="http://www.skendric.com/problem/incident-analysis/2012-06-30/What-Takes-Us-Down.pdf">available
  from his own site</a>), where he reports what he found when he
  processed the outage reports (both planned and unplanned) from the
@@ -28,10 +28,11 @@ etc etc.  The article is a good read to get some empirical data on
  what kind of problems affect a data centre, but what really inspired
  me was the kind of reporting they had put in place since 2000.</p>
  
-<p>The centre set up a mailing list, and send fairly standardised
-messages to this list when a outage was planned or when it already
-occurred.  Here is the two example from the article: First the
-unplanned outage:
+<p>The centre set up a mailing list, and started to send fairly
+standardised messages to this list when an outage was planned or when
+it had already occurred, to announce the plan and get feedback on the
+assumptions on scope and user impact.  Here are the two examples from the
+article: First the unplanned outage:
  
  <blockquote><pre>
  Subject:     Exchange 2003 Cluster Issues