Once in a while my home server has disk problems. Thanks to Linux Software RAID, I have not lost data yet (but I was close this summer :). But once a disk starts to behave funny, a practical problem presents itself: how do I get from the Linux device name (like /dev/sdd) to something that can be used to identify the disk when the computer is turned off? In my case I have SATA disks with a unique ID printed on the label. All I need is a way to query the disk to get the ID out.
After fumbling a bit, I found that hdparm -I will report the disk serial number, which is printed on the disk label. The following (almost) one-liner can be used to look up the ID of all the failed disks:

    # List every RAID member marked (F) in /proc/mdstat and print its serial number.
    for d in $(grep '(F)' /proc/mdstat | tr ' ' '\n' | grep '(F)' | cut -d'[' -f1 | sort -u); do
        printf 'Failed disk %s: ' "$d"
        hdparm -I "/dev/$d" | grep 'Serial Num'
    done

Putting it here to make sure I do not have to search for it the next time, and in case others find it useful.
At the moment I have two failing disks. :(

    Failed disk sdd1: Serial Number: WD-WCASJ1860823
    Failed disk sdd2: Serial Number: WD-WCASJ1860823
    Failed disk sde2: Serial Number: WD-WCASJ1840589

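Since sdd1 and sdd2 sit on the same physical disk, the lookup could report each disk only once. Here is a minimal sketch of that idea. The function name failed_disks is made up for illustration, and it assumes sdX-style device names where the partition number is a trailing run of digits (this would not hold for names like nvme0n1p1).

```shell
#!/bin/sh
# Print the whole-disk names of all members marked (F) in mdstat-formatted
# text read from stdin. failed_disks is a hypothetical helper name.
# Assumption: sdX-style names, so stripping trailing digits yields the disk.
failed_disks() {
    tr ' ' '\n' | grep '(F)' | cut -d'[' -f1 | sed 's/[0-9]*$//' | sort -u
}

# Possible usage (as root, on a machine with Software RAID):
#   for d in $(failed_disks < /proc/mdstat); do
#       printf 'Failed disk %s: ' "$d"
#       hdparm -I "/dev/$d" | grep 'Serial Num'
#   done
```

Reading from stdin instead of opening /proc/mdstat directly makes it possible to try the function on a saved copy of the file.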
The last time I had failing disks, I printed labels with the serial number and stuck them on the short side of each disk, to be able to figure out which disk to take out of the box without having to remove each disk to look at the physical vendor label. The vendor label is on the top of the disk, which is hidden when the disks are mounted inside my box.
I really wish the check_linux_raid Nagios plugin for checking Linux Software RAID in the nagios-plugins-standard Debian package would look up this value automatically, as it would make the plugin a lot more useful when my disks fail. At the moment it only reports a failure when there are no more spares left (it really should warn as soon as a disk is failing), and it does not tell me which disk(s) are failing when the RAID is running short on disks.
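To illustrate the behaviour I am wishing for, here is a rough sketch of a check that warns as soon as any member is marked (F) and names the members involved. It is not how check_linux_raid works; check_md_failed is a made-up name, and the only thing taken from Nagios is the plugin exit code convention (0 for OK, 1 for WARNING).

```shell
#!/bin/sh
# Nagios-style check sketch: WARNING as soon as any RAID member is marked (F).
# check_md_failed is a hypothetical name; it reads mdstat-formatted text on
# stdin and returns 0 (OK) or 1 (WARNING).
check_md_failed() {
    # Collect the failed member names on one line, separated by spaces.
    failed=$(tr ' ' '\n' | grep '(F)' | cut -d'[' -f1 | sort -u | tr '\n' ' ')
    if [ -n "$failed" ]; then
        echo "WARNING: failed RAID members: ${failed% }"
        return 1
    fi
    echo "OK: no failed RAID members"
    return 0
}

# Possible usage: check_md_failed < /proc/mdstat
```

The serial number from hdparm -I could be appended to the warning line in the same way as in the one-liner above, provided the check is allowed to run hdparm as root.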