From: Petter Reinholdtsen
Date: Tue, 14 Feb 2012 20:20:57 +0000 (+0000)
Subject: New post.
X-Git-Url: https://pere.pagekite.me/gitweb/homepage.git/commitdiff_plain/8a0435554b762e08714894223b1c3314438fac63?ds=sidebyside

New post.
---

diff --git a/blog/data/2012-02-14-raid-disk.txt b/blog/data/2012-02-14-raid-disk.txt
new file mode 100644
index 0000000000..eeaa5e2946
--- /dev/null
+++ b/blog/data/2012-02-14-raid-disk.txt
@@ -0,0 +1,54 @@
+Title: How to figure out which RAID disk to replace when it fails
+Tags: english, raid
+Date: 2012-02-14 21:25
+ +

Once in a while my home server has disk problems. Thanks to Linux Software RAID, I have not lost data yet (but I was close this summer :). But once a disk starts to behave funny, a practical problem presents itself: how to get from the Linux device name (like /dev/sdd) to something that can be used to identify the disk when the computer is turned off? In my case I have SATA disks with a unique ID printed on the label. All I need is a way to query the disk to get the ID out.

+ +

After fumbling a bit, I found that hdparm -I will report the disk serial number, which is printed on the disk label. The following (almost) one-liner can be used to look up the ID of all the failed disks:

+ +
+# Print the serial number of every RAID member marked (F) in /proc/mdstat.
+for d in $(grep '(F)' /proc/mdstat | tr ' ' "\n" | grep '(F)' | cut -d\[ -f1 | sort -u);
+do
+    printf 'Failed disk %s: ' "$d"
+    hdparm -I "/dev/$d" | grep 'Serial Num'
+done
+
+ +

Putting it here to make sure I do not have to search for it the next time, and in case others find it useful.
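
To see what the parsing part of the one-liner picks out without waiting for a disk to fail, it can be run against sample text. This is only a sketch: the function name failed_members and the sample below are made up for illustration, and the sample is not real /proc/mdstat output.

```shell
# failed_members (hypothetical helper): read mdstat-formatted text on stdin
# and print each member marked (F), one per line, sorted and de-duplicated.
failed_members() {
    tr ' ' '\n' | grep '(F)' | cut -d'[' -f1 | sort -u
}

# Made-up sample lines in the style of /proc/mdstat.
sample='md0 : active raid1 sdd1[1](F) sdc1[0]
md1 : active raid5 sde2[3](F) sdd2[2](F) sdc2[0] sdb2[1]'

# Prints the three members flagged (F) in the sample.
printf '%s\n' "$sample" | failed_members
```

On a live system the same function could be fed with `failed_members < /proc/mdstat`.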

+ +

At the moment I have two failing disks. :(

+ +
+Failed disk sdd1:       Serial Number:      WD-WCASJ1860823
+Failed disk sdd2:       Serial Number:      WD-WCASJ1860823
+Failed disk sde2:       Serial Number:      WD-WCASJ1840589
+
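
The output lists the same serial number once per failed partition. A small filter could collapse the partition names to their parent disk first, so each physical disk shows up only once. This is a sketch under the assumption that the disks use classic sdX style names; the function name whole_disks is made up here.

```shell
# whole_disks (hypothetical helper): map partition names on stdin to their
# parent disk by stripping the trailing digits, then drop duplicates.
# Assumption: names look like sdd1 or sde2. Names such as nvme0n1p1 would
# be mangled by this simple rule and need separate handling.
whole_disks() {
    sed 's/[0-9]*$//' | sort -u
}

# The three failed members from the output above collapse to two disks.
printf 'sdd1\nsdd2\nsde2\n' | whole_disks
```

The resulting names could then be passed to hdparm -I in the same loop as before.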
+ +

The last time I had failing disks, I added the serial number on labels I printed and stuck on the short sides of each disk, to be able to figure out which disk to take out of the box without having to remove each disk to look at the physical vendor label. The vendor label is at the top of the disk, which is hidden when the disks are mounted inside my box.

+ +

I really wish the check_linux_raid Nagios plugin for checking Linux Software RAID in the nagios-plugins-standard Debian package would look up this value automatically, as it would make the plugin a lot more useful when my disks fail. At the moment it only reports a failure when there are no more spares left (it really should warn as soon as a disk is failing), and it does not tell me which disk(s) are failing when the RAID is running short on disks.
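
The behaviour wished for here could look roughly like the sketch below. It is not how check_linux_raid works and not part of any package; check_failed_members is a made-up name. It only relies on the usual Nagios plugin convention that exit status 0 means OK and 1 means WARNING.

```shell
# check_failed_members (hypothetical): read mdstat-formatted text on stdin,
# print one status line naming the failed members, and return a
# Nagios-style status: 0 for OK, 1 for WARNING.
check_failed_members() {
    failed=$(tr ' ' '\n' | grep '(F)' | cut -d'[' -f1 | sort -u | tr '\n' ' ')
    if [ -n "$failed" ]; then
        # Warn as soon as any member is marked (F), even if spares remain.
        echo "WARNING: failed RAID members: ${failed% }"
        return 1
    fi
    echo "OK: no RAID members marked as failed"
    return 0
}
```

It would be called as `check_failed_members < /proc/mdstat`. A fuller version could add the serial number reported by hdparm -I for each name to the status line.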