johnje Posted March 12, 2015
Hello,
Just got a notification email this morning indicating the following:
Event: unRAID Disk 2 SMART health [187]
Subject: Warning [TOWER] - reported uncorrect is 1
Description: ST3000DM001-1CH166_W1F2VFMM (sdc)
Importance: warning
How concerned should I be at this point? Should I go purchase a new disk in case of failure? Are there any checks / troubleshooting steps I can perform? Thanks in advance!
Squid Posted March 12, 2015
Post the output of the disk attributes screen (click on the disk in Main, then hit disk attributes).
johnje Posted March 12, 2015 (Author)
I have attached a screenshot of the attribute screen for that disk. I imagine Raw Read Error Rate and Seek Error Rate being so high is not a good thing. I am curious why these two attributes did not trigger a notification, since both are 0 for all the other disks in this setup.
bubbaQ Posted March 12, 2015
johnje wrote: "I imagine Raw Read Error Rate & Seek Error Rate being so high is not a good thing. Curious why these 2 things would not get reported, both of these are 0 for all my other disks in this setup."
Seagate drives report garbage for those two raw values, so they are ignored.
WeeboTech Posted March 12, 2015
If attribute 187 continues to increase, I would question the drive's integrity. I would suggest capturing md5 hashes of all files with md5sum or md5deep so you can validate their integrity going forward. There are a number of tools on the forum (bitrot and bunker come to mind). I would also suggest implementing a backup procedure for this drive and/or running the manufacturer's validation software.
http://en.wikipedia.org/wiki/S.M.A.R.T.
http://www.extremetech.com/computing/194059-using-smart-to-accurately-predict-when-a-hard-drive-is-about-to-die
https://www.backblaze.com/blog/hard-drive-smart-stats/
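For anyone who would rather script the checksum idea than use md5sum, md5deep, or the forum tools, here is a minimal Python sketch of the same approach. It is not how bitrot or bunker work internally. The function names (`md5_of_file`, `build_manifest`, `compare_manifests`) are made up for illustration, and the `/mnt/disk2` path in the usage note is an assumption about where the disk is mounted.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 of one file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk `root` and map each file's path (relative to root) to its MD5."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            manifest[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return manifest


def compare_manifests(old, new):
    """Return (changed, missing, added) lists of relative paths."""
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    missing = sorted(p for p in old if p not in new)
    added = sorted(p for p in new if p not in old)
    return changed, missing, added
```

Typical use would be to call `build_manifest("/mnt/disk2")` once, save the result (for example with `json.dump`), then rebuild it later and pass both to `compare_manifests`. A changed hash only means the bytes differ. A file you edited on purpose will show up as changed too, so only unexpected changes point to a drive problem.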
johnje Posted March 12, 2015 (Author)
Thank you for the quick & concise replies! All mission-critical files were moved the moment I noticed this flag pop up. I will begin capturing / comparing checksums of files on this disk. I am replacing a few smaller disks in my setup anyway, so I may pick up an extra drive in case this one dies.
lionelhutz Posted March 13, 2015
If you have a GOOD extra disk of the same size (or bigger), then I would immediately let unRAID rebuild that disk onto the extra. Then you can run tests on the original disk to validate its health. The reported uncorrect could be an indication of a bad sector, which is a bad thing to have. If the drive checks out OK, you can put it back into the array where the new one was going to go.
SSD Posted March 13, 2015
According to Wikipedia, the meaning of that attribute is: "The count of errors that could not be recovered using hardware ECC."
I can only suspect that a second read was attempted and was successful. If not, it would have generated a pending sector, because the SMART system would not just pass a known-flawed read to the host OS without an error.
I would not freak out over this incident. Comparing MD5s is a good idea. Running a NON-CORRECTING parity check or two would show whether there are signs of the condition getting worse.
None of the ideas presented are bad ideas, but I personally would not pull the drive from the array based on a single reported uncorrect with no apparent data corruption or further problems. I have a Seagate drive with a runtime_bad_block of 1 that has been rock solid ever since. I would put this in a similar category.
If you don't like these onesy and twosy SMART attribute issues, don't buy Seagate. I have never seen these kinds of things with HGST, and they outnumber Seagate drives in my array 10 to 1. YMMV
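If you want to watch for the signs discussed above (attribute 187 climbing, or a pending sector appearing), something like the Python sketch below could compare two readings taken some time apart. It assumes smartmontools is installed and that `smartctl -A` prints its usual ten-column attribute table with the raw value in the last column. The helper names and the particular set of watched IDs are my own choices for illustration, not anything unRAID does internally.

```python
import subprocess

# Attribute IDs worth watching here (an illustrative selection, not an official list).
WATCHED = {
    5: "Reallocated_Sector_Ct",
    187: "Reported_Uncorrect",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def parse_smart_attributes(text):
    """Map attribute ID -> integer raw value from `smartctl -A` style text."""
    raw_values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            raw_values[int(fields[0])] = int(fields[9])
        except ValueError:
            continue  # skip rows whose raw value is not a plain integer
    return raw_values


def read_attributes(device):
    """Run `smartctl -A` on a device such as /dev/sdc (normally needs root)."""
    # smartctl uses a bitmask exit status, so a non-zero code is not treated as fatal.
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True, check=False
    )
    return parse_smart_attributes(result.stdout)


def grown(previous, current, watched=WATCHED):
    """Return {id: (old, new)} for watched attributes whose raw value increased."""
    return {
        attr_id: (previous.get(attr_id, 0), current[attr_id])
        for attr_id in watched
        if attr_id in current and current[attr_id] > previous.get(attr_id, 0)
    }
```

Save the result of `read_attributes("/dev/sdc")` today, call it again after a parity check, and pass both to `grown`. An empty result means none of the watched counters moved, which fits the "single incident, don't panic" reading.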