Title: How to figure out which RAID disk to replace when it fails
Tags: english, raid
Date: 2012-02-14 21:25

<p>Once in a while my home server has disk problems. Thanks to Linux
Software RAID, I have not lost data yet (but
<a href="http://comments.gmane.org/gmane.linux.raid/34532">I was
close</a> this summer :). But once a disk starts to behave
funny, a practical problem presents itself: how to get from the Linux
device name (like /dev/sdd) to something that can be used to identify
the disk when the computer is turned off? In my case I have SATA
disks with a unique ID printed on the label. All I need is a way to
query the disk to get the ID out.</p>

<p>After fumbling a bit, I
<a href="http://www.cyberciti.biz/faq/linux-getting-scsi-ide-harddisk-information/">found
that hdparm -I</a> will report the disk serial number, which is
printed on the disk label. The following (almost) one-liner can be
used to look up the ID of all the failed disks:</p>

<blockquote><pre>
for d in $(grep '(F)' /proc/mdstat | tr ' ' "\n" | grep '(F)' | cut -d\[ -f1 | sort -u)
do
    # Print the device name, then the serial number reported by the disk
    printf 'Failed disk %s: ' "$d"
    hdparm -I "/dev/$d" | grep 'Serial Num'
done
</pre></blockquote>
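<p>The extraction step can also be pulled out into a small function,
which makes it possible to try it on a saved copy of /proc/mdstat
without a broken array at hand. This is only a sketch: the name
failed_md_members is made up for illustration, it assumes failed
members show up as name[N](F) with plain alphanumeric names such as
sdd1, and it relies on grep -o, which GNU and BSD grep provide but
POSIX does not require.</p>

```shell
#!/bin/sh
# Hypothetical helper: read mdstat-style text on stdin and print the
# names of members marked (F), one per line, without duplicates.
# Assumes entries of the form name[N](F), e.g. sdd1[3](F).
failed_md_members() {
    grep -o '[[:alnum:]]*\[[0-9]*\](F)' | cut -d'[' -f1 | sort -u
}

# Usage sketch: report the serial number of every failed member.
# hdparm needs root and real hardware, so this only runs on request.
if [ "${1:-}" = "--report" ]; then
    for d in $(failed_md_members < /proc/mdstat); do
        printf 'Failed disk %s: ' "$d"
        hdparm -I "/dev/$d" | grep 'Serial Num'
    done
fi
```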

<p>Putting it here to make sure I do not have to search for it the
next time, and in case others find it useful.</p>

<p>At the moment I have two failing disks. :(</p>

<blockquote><pre>
Failed disk sdd1: Serial Number: WD-WCASJ1860823
Failed disk sdd2: Serial Number: WD-WCASJ1860823
Failed disk sde2: Serial Number: WD-WCASJ1840589
</pre></blockquote>
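<p>Since sdd1 and sdd2 sit on the same physical disk, the list can be
boiled down to one entry per disk. A minimal sketch, assuming plain
sdX-style names where the partition number is just the trailing
digits (this would not hold for names like nvme0n1p1); the function
name whole_disks is made up for illustration:</p>

```shell
#!/bin/sh
# Hypothetical helper: strip the trailing partition number from each
# name read on stdin and print every whole-disk name once.
# Assumes sdX-style naming; not valid for nvme0n1p1-style names.
whole_disks() {
    sed 's/[0-9]*$//' | sort -u
}

# With the three failed partitions listed above, this prints sdd and sde.
printf '%s\n' sdd1 sdd2 sde2 | whole_disks
```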

<p>The last time I had failing disks, I printed labels with the serial
number and stuck them on the short side of each disk, to be able
to figure out which disk to take out of the box without having to
remove each disk to look at the physical vendor label. The vendor
label is on top of the disk, where it is hidden when the disks are
mounted inside my box.</p>

<p>I really wish the check_linux_raid Nagios plugin for checking Linux
Software RAID in the
<a href="http://packages.qa.debian.org/n/nagios-plugins.html">nagios-plugins-standard</a>
Debian package would look up this value automatically, as it would
make the plugin a lot more useful when my disks fail. At the moment
it only reports a failure when there are no more spares left (it really
should warn as soon as a disk is failing), and it does not tell me which
disk(s) are failing when the RAID is running short on disks.</p>
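<p>To make the wish concrete, here is a rough sketch of a check that
warns as soon as any member is marked (F) and names the members in the
status line. It is not the real check_linux_raid plugin and does not
reflect how that plugin works internally. The function name
check_md_failed is hypothetical, and it assumes the same name[N](F)
notation in /proc/mdstat as the one-liner above. The return values
follow the usual Nagios plugin convention (0 OK, 1 WARNING, 3
UNKNOWN).</p>

```shell
#!/bin/sh
# Sketch of a Nagios-style check, NOT the real check_linux_raid plugin.
# check_md_failed is a hypothetical name. It reads mdstat-style text
# from the file given as $1 (default /proc/mdstat).
check_md_failed() {
    mdstat=${1:-/proc/mdstat}
    if [ ! -r "$mdstat" ]; then
        echo "UNKNOWN: cannot read $mdstat"
        return 3    # Nagios UNKNOWN
    fi
    # Collect the names of members marked (F) on a single line.
    failed=$(grep -o '[[:alnum:]]*\[[0-9]*\](F)' "$mdstat" |
        cut -d'[' -f1 | sort -u | tr '\n' ' ')
    if [ -n "$failed" ]; then
        echo "WARNING: failed RAID members: ${failed% }"
        return 1    # Nagios WARNING
    fi
    echo "OK: no failed RAID members"
    return 0        # Nagios OK
}

# As a plugin, the script would end with:  check_md_failed; exit $?
```

<p>The serial number lookup is left out on purpose to keep the sketch
testable without hardware; the hdparm -I call shown earlier could be
run for each name in the list.</p>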