dheg Posted September 17, 2011

Hi guys,

This is my first post here. I've been running unRAID without much trouble for about 3-4 months. The forum and a quick Google search have got me out of a few problems, but now I'm completely puzzled.

I bought a Western Digital drive (WD20EARX) about 1.5 months ago. I ran preclear (v1.11) five times (this is the standard I use after a lot of reading on the forums) and got a PASSED result. Please find the report here: http://pastebin.com/jm7HsqLp

Then about 2 weeks ago I started having problems on my XBMC box. The image froze and a "buffering" message appeared. I looked at the SMART view of the mymain plugin and started seeing errors on this drive, though the overall status was PASSED, so I didn't pay much attention.

Today I got up determined to find out what was really going on with my drive and ran a long SMART test from the console. I have attached the report.

So now I'm completely lost. From reading the SMART report I would say the drive is dying: 172 errors on a drive with only 700 hours on it (I think that is too many). Besides, all the errors in the log are UNC (a friend told me these are unrecoverable). The Raw_Read_Error_Rate is well over 95,000, but the overall result is PASSED???

Although I'm quite good with computers, I know very little about Linux, and even less about hardware diagnostics. I thought this would be the right time to get to know you guys.

BTW: I'm currently running another preclear on a new drive (also a WD20EARX) I bought a couple of days ago. Results are good so far, but how much should I trust the results?

Attachment: smart.txt
mbryanr Posted September 17, 2011

Many attributes are manufacturer-specific and cannot be interpreted reliably, but the two below can be an indicator of failure. A drive may have more than 1000 spare sectors, but this one is dying.

    197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 300
    198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 148

Also, the raw read errors are high, as you noted, due to the unreadable sectors:

    1 Raw_Read_Error_Rate 0x002f 174 174 051 Pre-fail Always - 95952

And the SMART test aborts because sectors cannot be read:

    #11 Extended offline Completed: read failure 90% 114 1572928

From the above, it looks like this drive was dying early. I noticed that it threw up a single reallocated sector during preclear cycle 2. That is not a problem in itself, just an indicator to monitor the drive closely, which you did.
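For anyone who wants to automate the check described above, here is a minimal Python sketch. It assumes the attribute rows use the usual ten-column smartctl layout (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw value) and reuses only the three rows quoted in this post. The function names and the set of watched IDs are illustrative choices, not part of any tool.

```python
# Attribute rows copied from the post above. Assumed column layout:
# ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
SAMPLE = """\
  1 Raw_Read_Error_Rate    0x002f 174 174 051 Pre-fail Always  - 95952
197 Current_Pending_Sector 0x0032 200 200 000 Old_age  Always  - 300
198 Offline_Uncorrectable  0x0030 200 200 000 Old_age  Offline - 148
"""

# Assumption: a nonzero raw count on any of these IDs deserves attention
# (5 = reallocated sectors, 197 = pending sectors, 198 = offline uncorrectable).
WATCHED_IDS = {5, 197, 198}


def parse_attributes(text):
    """Turn attribute rows into dictionaries, skipping anything else."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # not an attribute row
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "raw": parts[9],  # first token of the raw value column
        })
    return rows


def sector_warnings(rows):
    """Return (name, raw_count) for watched attributes with a nonzero raw value."""
    warnings = []
    for row in rows:
        if row["id"] in WATCHED_IDS and row["raw"].isdigit() and int(row["raw"]) > 0:
            warnings.append((row["name"], int(row["raw"])))
    return warnings


if __name__ == "__main__":
    for name, count in sector_warnings(parse_attributes(SAMPLE)):
        print(f"WARNING: {name} raw value is {count}")
```

On the rows quoted here, this flags Current_Pending_Sector (300) and Offline_Uncorrectable (148), which are the two attributes singled out above.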
dheg (Author) Posted September 18, 2011

OK, that's what I thought. Why then do the tests give me a PASSED result? Will Western Digital accept the drive if it passes the tests?
mbryanr Posted September 19, 2011

Because the SMART test compares each attribute's value to its threshold value, i.e. everything passes up to the sector that cannot be read. WD will accept the RMA. I believe WD has utilities to test the drive.

I know it makes no sense. That is why it is important to trend your actual values over time and monitor the syslog. I send mine to a syslog server, which sends emails based on severity. I also have unRAID status reports where you can see drive errors. Note that these can still report OK, because a drive hasn't redballed.
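To make the threshold comparison concrete, here is a small Python sketch. It assumes, as smartmontools documents it, that an attribute counts as failed only when its normalized value is less than or equal to its threshold. The raw count plays no part in that comparison. The `raw_increases` helper is a hypothetical illustration of trending raw values between two readings, and the all-zero baseline in the usage example is made up purely for demonstration.

```python
def attribute_failed(value, thresh):
    """Assumed rule: failed when the normalized value is at or below the threshold."""
    return value <= thresh


def raw_increases(previous, current):
    """Return {name: (old, new)} for raw counters that grew between two snapshots.

    Counters missing from the earlier snapshot are treated as 0.
    """
    grown = {}
    for name, new in current.items():
        old = previous.get(name, 0)
        if new > old:
            grown[name] = (old, new)
    return grown


if __name__ == "__main__":
    # Normalized value and threshold as quoted earlier in the thread.
    print(attribute_failed(174, 51))  # Raw_Read_Error_Rate: False, so it "passes"
    print(attribute_failed(200, 0))   # Current_Pending_Sector: False, so it "passes"

    # Hypothetical baseline (NOT from the thread) against the raw values
    # reported in the thread.
    baseline = {"Current_Pending_Sector": 0, "Offline_Uncorrectable": 0}
    latest = {"Current_Pending_Sector": 300, "Offline_Uncorrectable": 148}
    for name, (old, new) in raw_increases(baseline, latest).items():
        print(f"{name}: {old} -> {new}")
```

With a threshold of 000, the pending-sector attribute cannot fail this comparison for any normalized value above zero, which is one way a drive with hundreds of unreadable sectors can still show PASSED. Watching the raw counts grow over time catches what the pass/fail flag misses.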
lionelhutz Posted September 19, 2011

Send it back. Do not run a correcting parity check on the server until you have replaced and rebuilt the drive.

Peter
dheg (Author) Posted September 23, 2011

Thank you guys, I think I have it clear. I think I can read a SMART report now.
This topic is now archived and is closed to further replies.