<TITLE>Administrating the CIIPS Network - the easy way</TITLE>
Petter Reinholdtsen &lt;pere@td.org.uit.no&gt;, 2000-04-17
<H1>Administrating the CIIPS Network - the easy way</H1>
<BLOCKQUOTE><EM>Some suggestions to make it easier to maintain and
improve the CIIPS network.</EM></BLOCKQUOTE>
To maintain a multi-user network it is vital to reduce the possibility
of human error as much as possible.  It is also necessary to make
sure any error has as little impact as possible.
 
<P>To achieve this, I believe the following must be done:
 
 <DT><STRONG>Keep things consistent</STRONG></DT>
 <DD>Make sure knowledge is generic and not tied to individual
 computers.  Make sure every supported program is available on all
 platforms and operating systems, with the same version and the same
 configuration. (<em>cfengine, Store</em>)</DD>
 <DT><STRONG>Automate as much as possible.</STRONG></DT>
 <DD>By leaving the repetitive tasks to the computer, one ensures they
 are done the same way everywhere, every time. (<em>cfengine</em>)</DD>
 <DT><STRONG>Make it possible to back out changes.</STRONG></DT>
 <DD>When problems arise, make sure it is easy to get back to an
 earlier state when things were working.  Use version control systems
 to keep track of all human-edited configuration files, and make sure
 it is easy to remove a newly upgraded software package if problems
 are discovered. (<em>CVS, Store</em>)</DD>
 <DT><STRONG>Detect problems as early as possible.</STRONG></DT>
 <DD>If potential problems are detected and fixed before they become
 real problems, less work is required than to fix the resulting domino
 effect when one system fails.  (Example: a full disk can stop the backup
 system from working properly, or a failing NFS server can hang
 processes on other servers and eventually bring other servers to
 <BR>Make sure to monitor all hosts and services, and warn when
 something is about to go wrong.  Keep statistics to find recurring
 problems. (<em>Palantir, mon</em>)</DD>
 <DT><STRONG>Keep the users informed</STRONG></DT>
 <DD>Make sure the users know where to find information on current
 problems, and the status of their reported problems.  This is most
 easily solved with a database-backed, web-based problem tracking
 system. (<em>Bugzilla</em>)</DD>
 <DT><STRONG>Prepare for disaster</STRONG></DT>
  <LI>Hard disks fail after a few years.  Keep only new HDs in the
  servers, and reuse the old server HDs in workstations until they
 
  <LI>Back up everything, and back up often.  Make it easy to get
  files back from backup, and let individual users get their files
  back on their own.  It saves support personnel the hassle, and makes
  it easier for the users to recover from their personal disasters.
 
  <LI>Make it easy to replace workstations.  Use an HD mirroring system to
  make it trivial to reinstall a workstation and easy to replace a
  broken computer. (<em>Autosetup</em>)
 
 <LI><A HREF="http://www.iu.hioslo.no/cfengine/">cfengine</A>
 <LI><A HREF="http://www.pvv.org/~arnej/store/storedoc.html">Store</A>
 <LI><A HREF="http://www.sourcegear.com/CVS/Dev/resource">CVS</A>
 <LI><A HREF="http://www.palantir.uio.no/cgi-bin/nodelist">palantir</A>
 <LI><A HREF="http://www.kernel.org/software/mon/">mon</A>
 <LI><A HREF="http://www.mozilla.org/bugs/">Bugzilla</A>
 <LI>Autosetup is not publicly available, but I can get it from the
 author.  More expensive alternatives are also available.
 <A HREF="http://jacal.sourceforge.net/summary.php">jacal</A> might be an
 
I've already installed Store in the robotics lab, and it will take about
15 minutes to set it up on other hosts as well.  With proper setup, this
will make most hosts more independent of the NFS servers.
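<P>As a small illustration of detecting problems early, the full-disk
example above can be caught with a few lines of script run regularly
on each host.  This is only a minimal sketch in Python, not part of
Palantir or mon: the function name <CODE>disk_warning</CODE>, the 90%
threshold and the list of mount points are assumptions made for the
example.

```python
import shutil


def disk_warning(path, warn_fraction=0.9, usage=None):
    """Return a warning string if the filesystem holding `path` is at
    least `warn_fraction` full, otherwise None.

    `usage` is a (total, used, free) tuple in bytes. It defaults to the
    live values from shutil.disk_usage() and can be passed explicitly
    to test the logic without touching a real disk.
    """
    if usage is None:
        usage = shutil.disk_usage(path)
    total, used = usage[0], usage[1]
    if total == 0:
        # Some pseudo filesystems report a size of zero; nothing to check.
        return None
    fraction = used / total
    if fraction >= warn_fraction:
        return f"WARNING: {path} is {fraction:.0%} full"
    return None


if __name__ == "__main__":
    # Hypothetical list of mount points; adjust for the actual hosts.
    for mount in ["/"]:
        message = disk_warning(mount)
        if message:
            print(message)
```

<P>Warning at a threshold well below 100% leaves time to clean up
before the backup system or other services are affected.  The printed
message could just as well be mailed to the administrators or handed
to an existing monitoring system.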