unraided Posted June 13, 2012

I've always wanted to create an easy, well, lazier way to monitor SMART reports. I've put together a script that runs every morning using at via the go file and reports on the overall health of all my attached disks:

echo "Current status for /dev/sdb" > /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
smartctl -H /dev/sdb |grep "SMART overall-health self-assessment test result" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
echo "" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
echo "Current status for /dev/sdc" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
smartctl -H /dev/sdc |grep "SMART overall-health self-assessment test result" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
echo "" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
echo "Current status for /dev/sdd" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
smartctl -H /dev/sdd |grep "SMART overall-health self-assessment test result" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
echo "" >> /mnt/user/share/_Logs/_smartctl/smartctl_20`date +%y%m%d`.log
" " "

which produces the following output:

Current status for /dev/sdb
SMART overall-health self-assessment test result: PASSED

Current status for /dev/sdc
SMART overall-health self-assessment test result: PASSED

Current status for /dev/sdd
SMART overall-health self-assessment test result: PASSED
" " "

You can write this log to the path where unRAID keeps the syslogs (/boot/logs), but I put it in a directory on my main share for easy reading on a client machine.
I have no doubt there are many other, probably better, ways to do this via some addon or other (I haven't been keeping up to date with v5.0), but IMO this is a convenient way to know if the disk(s) are on the blink without opening a terminal session to your box to examine the syslogs, installing an addon if you'd rather not, logging onto the web interface, etc.

Of course, if a disk fails, you'll probably know about it when you look for something and notice it's missing.

Cheers
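The per-disk blocks in the script above differ only in the device name, so they can be collapsed into a loop. What follows is a minimal sketch, assuming a bash-like shell and the same three devices. The function name smart_health_report and the SMARTCTL variable are made up for illustration and are not part of the original script.

```shell
# Command used to query the disk; overridable so the function can be
# exercised without real hardware (this variable is an assumption).
SMARTCTL="${SMARTCTL:-smartctl}"

# Print the overall-health line for every device passed as an argument.
smart_health_report() {
    local dev
    for dev in "$@"; do
        echo "Current status for $dev"
        "$SMARTCTL" -H "$dev" | grep "SMART overall-health self-assessment test result"
        echo ""
    done
}

# Example call, left commented out so that loading this file writes nothing:
# smart_health_report /dev/sdb /dev/sdc /dev/sdd \
#     > "/mnt/user/share/_Logs/_smartctl/smartctl_$(date +%Y%m%d).log"
```

Writing the whole report with a single redirect avoids repeating the log path on every line, and date +%Y%m%d yields the same four-digit-year stamp as the 20`date +%y%m%d` form used in the post.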
Joe L. Posted June 13, 2012

"at" will only execute a process once. What are you doing so it executes again the following morning?
vca Posted June 13, 2012

I'm not sure I'd put much faith in the "PASSED" overall status. I've had a number of drives with issues where the overall status was PASSED but some of the individual counters were showing problems. And once I tested them with the WD Diagnostic, the diagnostic utility reported them as failed but the SMART report still said PASSED.

I generally watch the following counters, which track the appearance of bad sectors. If I see changes in any of these I get ready to replace the drive (or at least make sure my backups are current and then test it further):

  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0

and this counter, which tracks SATA cable issues (so it might give you a hint that a cable is working loose):

199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0

and this one, which seems to be associated with errors reading the surface on Western Digital drives (but this is from my own observations; I don't think there is much published about this one):

200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

Plus the drive temperature, as an indicator that there might be a filter that needs cleaning or a fan that has died:

194 Temperature_Celsius     0x0022   113   110   000    Old_age   Always       -       37

Regards,
Stephen
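The counters listed above can be checked automatically by filtering the attribute table that smartctl -A prints. This is a rough sketch rather than a tested monitoring tool: the function name flag_nonzero_counters is hypothetical, and it assumes the raw value is the tenth whitespace-separated column, as in the rows shown in the post.

```shell
# Read `smartctl -A` output on stdin and print any watched attribute
# (IDs 5, 196, 197, 198, 199, 200) whose raw value is not zero.
flag_nonzero_counters() {
    awk '$1 ~ /^(5|196|197|198|199|200)$/ && $10 + 0 != 0 {
        print $1, $2, "raw=" $10
    }'
}

# Example call, left commented out because it needs a real disk:
# smartctl -A /dev/sdb | flag_nonzero_counters
```

This only reports a non-zero value on a single run. Spotting a change over time, which is what the post actually recommends watching for, would need the previous values stored somewhere for comparison.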
unraided (Author) Posted June 14, 2012

Joe L. wrote: "at" will only execute a process once. What are you doing so it executes again the following morning?

In my case the server powers on in the morning, just before 6am, on selected days to do an rsync to another server just after 6am, which is why I use at. You could use cron to schedule it better, I guess.
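For a server that stays powered on, a cron entry repeats every day, whereas at runs a job once per invocation. A small sketch of what such an entry might look like follows. The script path /boot/custom/smart_report.sh and the names REPORT_SCRIPT and cron_line are placeholders invented for this example, and the commented install line assumes the local crontab command accepts "-" to read from standard input.

```shell
# Placeholder location of the report script (an assumption, not from the thread).
REPORT_SCRIPT="${REPORT_SCRIPT:-/boot/custom/smart_report.sh}"

# Emit one crontab line. The five fields are:
# minute, hour, day of month, month, day of week.
# "0 6 * * *" therefore means 06:00 every day.
cron_line() {
    printf '0 6 * * * %s\n' "$REPORT_SCRIPT"
}

# To install it, append the line to the existing crontab. Left commented
# out because it rewrites the current user's crontab:
# ( crontab -l 2>/dev/null; cron_line ) | crontab -
```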
mbryanr Posted June 14, 2012

vca wrote: I'm not sure I'd put much faith in the "PASSED" overall status, I've had a number of drives with issues that the overall status was PASSED but were showing problems in some of the individual counters.

That is why SmartHistory was/is a nice addon for unMENU, although a scheduled SMART test would be a nice addition to the trend data.

http://lime-technology.com/wiki/index.php/UnRAID_Add_Ons#SmartHistory