SheepContoller Posted July 24, 2018

Hi there,

yesterday I was greeted with the red X of despair: one of my WD Red 3TB drives had been disabled. I went and bought a replacement, but since that wasn't precleared and the SMART report looked okay-ish to me, I re-activated the drive and the rebuild went flawlessly. So is the drive really bad, or could it have been something else?

Reports attached. I run two of these HDDs and have marked the reports accordingly. I know the drives are collecting age and rust, but is it reasonable to keep them running? Obviously there is a parity drive, and that's pretty new at 1200+ hours. Preclear on the replacement is running right now.

Thanks

failed - WDC_WD30EFRX-68EUZN0_WD-WMC4N0816310-20180725-0033.txt
other - WDC_WD30EFRX-68AX9N0_WD-WMC1T0041512-20180725-0034.txt
SSD Posted July 24, 2018

The drives are not failing. The red X is often due to bad or loose cabling. This is especially common when you open a server to add or replace a drive and touch the delicate wiring of some other drive(s), nudging a cable just enough to cause a marginal connection.

These are not spring chickens. The one called "failed" has been powered on for 3.7 years. The one called "other" has been powered on for 4.7 years.
JorgeB Posted July 25, 2018

The "failed" one might be OK, but the "other" is definitely failing.
SheepContoller Posted July 26, 2018

Well, thanks guys for taking a look, but what makes the "other" a soon-to-fail? "failed" has come up a few times on unRAID checks, with corrected errors and the like, while "other" has always been the green one. I know "other" has seen more runtime, but is 4.7 years a critical value? (TBH, if it was just me I'd ditch all drives for 2+1 10TB Helium drives, but there's the issue of money.)

Which values other than "uncorrectable" are critical enough to deserve an extra look, beyond the ones unRAID monitors anyway?
JorgeB Posted July 27, 2018

SheepContoller said: but what makes the "other" a soon-to-fail?

    ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
      1 Raw_Read_Error_Rate     POSR-K   198   195   051    -    27569

You want this to be 0 or a very low number. It's also showing recent (judging by the power-on hours) UNC errors (read errors):

    Error 5116 [3] occurred at disk power-on lifetime: 40112 hours (1671 days + 8 hours)
      When the command that caused the error occurred, the device was active or idle.

      After command completion occurred, registers were:
      ER -- ST COUNT  LBA_48  LH LM LL DV DC
      -- -- -- == -- == == == -- -- -- -- --
      40 -- 51 03 48 00 01 35 48 6f b0 e0 00  Error: UNC 840 sectors at LBA = 0x135486fb0 = 5188906928

      Commands leading to the command that caused the error were:
      CR FEATR COUNT  LBA_48  LH LM LL DV DC  Powered_Up_Time  Command/Feature_Name
      -- == -- == -- == == == -- -- -- -- --  ---------------  --------------------
      25 00 00 03 48 00 01 35 48 6c b0 e0 08     07:17:30.379  READ DMA EXT
      25 00 00 05 40 00 01 35 48 67 70 e0 08     07:17:30.371  READ DMA EXT
      25 00 00 03 50 00 01 35 48 64 20 e0 08     07:17:30.365  READ DMA EXT
      25 00 00 05 40 00 01 35 48 5e e0 e0 08     07:17:30.355  READ DMA EXT
      25 00 00 03 40 00 01 35 48 5b a0 e0 08     07:17:30.351  READ DMA EXT
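For anyone who wants to check numbers like these by script rather than by eye, here is a minimal Python sketch that pulls apart the attribute row and the error-register row quoted in the post above. The function names are made up for illustration, the field layout is assumed from this one report rather than taken from any smartctl specification, and the "raw value should be 0" check simply encodes the rule of thumb given in the post for this drive.

```python
def parse_attribute_row(row: str) -> dict:
    """Split one SMART attribute row into named fields.

    Assumed column order (taken from the report in this thread):
    ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
    """
    f = row.split()
    return {
        "id": int(f[0]),
        "name": f[1],
        "flags": f[2],
        "value": int(f[3]),
        "worst": int(f[4]),
        "thresh": int(f[5]),
        "fail": f[6],
        "raw": int(f[7]),
    }


def decode_error_registers(row: str) -> dict:
    """Decode the register dump printed under 'registers were:'.

    Assumed layout after dropping the '--' placeholder:
    ER, ST, COUNT (2 bytes), LBA_48 (6 bytes), DV, DC.
    """
    t = [tok for tok in row.split() if tok != "--"]
    error_reg = int(t[0], 16)
    return {
        # Bit 6 (0x40) of the ATA error register flags an uncorrectable read.
        "uncorrectable": bool(error_reg & 0x40),
        "sector_count": int("".join(t[2:4]), 16),
        "lba": int("".join(t[4:10]), 16),
    }


def power_on_days(hours: int) -> tuple:
    """Convert power-on hours to (whole days, leftover hours)."""
    return divmod(hours, 24)


attr = parse_attribute_row("1 Raw_Read_Error_Rate POSR-K 198 195 051 - 27569")
err = decode_error_registers("40 -- 51 03 48 00 01 35 48 6f b0 e0 00")

# Rule of thumb from the post: the raw value should be 0 or very low.
if attr["raw"] > 0:
    print(f"{attr['name']}: raw value {attr['raw']} is worth a closer look")

print(err)
print(power_on_days(40112))
```

The decoded sector count and LBA agree with the "UNC 840 sectors at LBA = 0x135486fb0 = 5188906928" line in the report, and 40112 hours works out to the same 1671 days + 8 hours that the log prints.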
SheepContoller Posted July 27, 2018

I see, and I've added the value to the ones unRAID should monitor closely. Well, the replacement drive has been through 2 preclear runs; I guess I'll just keep watching how things go and either replace whichever one fails first or, if possible, get a second HDD and replace them both.

Anyway, thanks to both of you for sparing a moment to look at my problem, most appreciated.