zzgus Posted September 19, 2018
The server had been working with no apparent problems for some months. A month ago I replaced a hard disk, or rather I shrank the array by one disk, because that disk was giving me a lot of read errors. Now, one month later, I am getting read errors again on another disk. I haven't touched the server during this month (cables, etc.). Can someone take a look at my diagnostics file to see if anything is wrong? It would be greatly appreciated. Thank you, Gus
unraid-media-diagnostics-20180919-1323.zip
JorgeB Posted September 19, 2018
Looks more like a connection/power problem. Swap both cables (or backplane) with another disk and re-sync parity; then, if it happens again, see whether the problem follows the disk or stays with the cables/backplane.
JorgeB Posted September 19, 2018
P.S.: disk2 is showing some SMART warnings and will likely fail soon, if it isn't failing already. Run an extended SMART test.
zzgus Posted September 19, 2018
3 hours ago, johnnie.black said: Looks more like a connection/power problem, swap both cables (or backplane) with another disk and re-sync parity, then if it happens again see if the problem follows the disk or stays with the cables/backplane.
OK, will try ASAP! Thank you @johnnie.black
zzgus Posted September 19, 2018
3 hours ago, johnnie.black said: P.S.: disk2 is showing some SMART warning, it will likely fail soon, if it's not already, run an extended SMART test.
Which values should I look at to tell? Thank you @johnnie.black
xman111 Posted September 19, 2018
That is the type of info I am looking for. I have some errors as well.
JorgeB Posted September 19, 2018
On healthy WD disks this attribute should be 0, or at least a very low value; usually up to the low double digits can be OK:

    ID# ATTRIBUTE_NAME       FLAGS   VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate  POSR-K  200   200   051    -    89

This, together with the recent UNC at LBA errors, means the disk had read errors recently and will likely have more in the near future. This is just the last one:

    Error 9 [8] occurred at disk power-on lifetime: 35417 hours (1475 days + 17 hours)
      When the command that caused the error occurred, the device was active or idle.

      After command completion occurred, registers were:
      ER -- ST COUNT  LBA_48  LH LM LL DV DC
      -- -- -- == -- == == == -- -- -- -- --
      40 -- 51 03 30 00 01 6c c3 8e 20 e0 00  Error: UNC 816 sectors at LBA = 0x16cc38e20 = 6119722528

Note that the current power-on hours are 36397, so these are recent errors. You'll find out if it's OK during the parity sync.
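For anyone who wants to check the arithmetic behind "these are recent errors", here is a minimal Python sketch using only the numbers quoted in the post above. The function names are made up for illustration, and the register layout (two COUNT bytes followed by six LBA bytes) is an assumption read off the header row of the error log, not taken from any smartctl API.

```python
# Numbers copied from the SMART output quoted above.
ERROR_POWER_ON_HOURS = 35417    # "occurred at disk power-on lifetime"
CURRENT_POWER_ON_HOURS = 36397  # current Power_On_Hours reported in the post
REGISTERS = "40 -- 51 03 30 00 01 6c c3 8e 20 e0 00"


def hours_since_error(error_hours: int, current_hours: int) -> int:
    """Power-on hours elapsed between the logged error and now."""
    return current_hours - error_hours


def decode_registers(reg_line: str) -> tuple[int, int]:
    """Return (sector_count, lba) from one register line of the error log.

    Assumed layout, based on the header row in the post:
    ER, --, ST, COUNT (2 bytes), LBA_48/LH/LM/LL (6 bytes), DV, DC.
    """
    fields = reg_line.split()
    sector_count = int("".join(fields[3:5]), 16)  # "03" + "30" -> 0x0330
    lba = int("".join(fields[5:11]), 16)          # "00016cc38e20"
    return sector_count, lba


elapsed = hours_since_error(ERROR_POWER_ON_HOURS, CURRENT_POWER_ON_HOURS)
days, hours = divmod(elapsed, 24)
count, lba = decode_registers(REGISTERS)

print(f"Last error was {elapsed} power-on hours ago ({days} days + {hours} hours)")
print(f"UNC {count} sectors at LBA = {hex(lba)} = {lba}")
```

The decoded count and LBA match the "Error: UNC 816 sectors at LBA = 0x16cc38e20 = 6119722528" text, and the gap works out to 980 power-on hours, i.e. roughly 41 days of run time on an always-on server.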
zzgus Posted April 29, 2019
After some time with no apparent problems, I got some new errors. Time to take an in-depth look for good/bad disks. Some help would be greatly appreciated.

If we look at the SMART table in Unraid, we have the following attributes:

#1 - Raw read error rate
#3 - Spin up time
#4 - Start stop count
#5 - Reallocated sector count
#7 - Seek error rate
#9 - Power on hours
#10 - Spin retry count
#11 - Calibration retry count
#12 - Power cycle count
#192 - Power-off retract count
#193 - Load cycle count
#194 - Temperature celsius
#196 - Reallocated event count
#197 - Current pending sector
#198 - Offline uncorrectable
#199 - UDMA CRC error count
#200 - Multi zone error rate

and for every attribute we have these columns: Flag, Value, Worst, Threshold, Type, Updated, Failed, Raw Value.

Which attributes are the most important for judging whether a disk is reliable, and which values should I look for? What's the difference between "Value" and "Raw Value"? I have seen that they differ from disk to disk.

As @johnnie.black said, one of the values to be aware of is #1 - Raw read error rate, where the raw value should be 0, or at most in the low double digits. My disks have raw values of 0, 45 and 228.

Thank you, Gus
Edited April 29, 2019 by zzgus
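On the "Value" versus "Raw Value" question, a rough Python sketch may help separate the two columns. It parses the single attribute row quoted earlier in the thread. As smartctl documents it, VALUE is a normalized figure that the drive firmware derives from the raw counter, and an attribute counts as failed once VALUE drops to or below THRESH; RAW_VALUE is the vendor-specific raw count, which is why it differs so much between disks and models. The class and function names here are hypothetical, and the row format is assumed to match the one in the post (ID#, name, flags, value, worst, thresh, fail, raw value).

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    flags: str
    value: int      # normalized value reported by the drive
    worst: int      # lowest normalized value seen so far
    thresh: int     # failure threshold for the normalized value
    fail: str       # "-" when the attribute has never failed
    raw_value: str  # vendor-specific raw counter, kept as text


def parse_attribute_row(row: str) -> SmartAttribute:
    """Parse one row laid out like the example quoted in this thread."""
    parts = row.split(None, 7)
    if len(parts) != 8:
        raise ValueError(f"unexpected attribute row: {row!r}")
    attr_id, name, flags, value, worst, thresh, fail, raw = parts
    return SmartAttribute(int(attr_id), name, flags, int(value),
                          int(worst), int(thresh), fail, raw.strip())


def has_failed(attr: SmartAttribute) -> bool:
    """An attribute is failing when its normalized value is at or below THRESH."""
    return attr.value <= attr.thresh


ROW = "1 Raw_Read_Error_Rate POSR-K 200 200 051 - 89"
attr = parse_attribute_row(ROW)
print(f"{attr.name}: normalized {attr.value} (threshold {attr.thresh}), "
      f"raw {attr.raw_value}, failed: {has_failed(attr)}")
```

Note what this shows for the disk discussed above: the normalized value is still 200 against a threshold of 51, so the attribute has not formally failed, yet the raw count of 89 is already above the "low double digits" guideline given for WD disks. That is why the raw value is the one worth watching for this attribute.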