PandaCheese Posted August 30, 2022 (edited)

A single-parity array of 8x6TB WD Red disks had been running fine for over 5 years, then suddenly I got this warning:

Disk 1 - WDC_WD60EFRX-68L0BN1_WD-WX11D76EPA6A (sdc) (errors 3)
Disk 3 - WDC_WD60EFRX-68L0BN1_WD-WX11D76EPLE3 (sde) (errors 256)
Disk 5 - WDC_WD60EFRX-68L0BN1_WD-WX11D668X8Z9 (sdg) (errors 2)
Disk 6 - WDC_WD60EFRX-68L0BN1_WD-WX11D76EP0C8 (sdh) (errors 961)

Shortly after, disk 5 just went dead (not detectable even in WD's Data Lifeguard). I bought some 10TB replacements, replaced the totally dead disk 5 by doing a parity swap, then replaced the former parity disk, and lastly disk 6. Disk 6 continued to generate a lot of errors during the rebuild. None of the replaced disks would pass Data Lifeguard's extended test, including the former parity disk, even though it never reported an error.

The SMART tests and disk attributes on the remaining 6TB disks look OK for now, but I wonder if I should be proactive and swap out more of them, since they were all purchased around the same time as the bad disks?

Diagnostics attached, appreciate any input!

gnosis-diagnostics-20220829-2013.zip
itimpi Posted August 30, 2022

It looked to me as if you only ran the short SMART tests on the drives. Drives can easily pass the short test and still not be reliable. You should run the Extended SMART test on any suspect drives, and replace any drive that does not pass it. Note that the Extended test can take many hours (with progress only reported in 10% increments), and you should temporarily disable spindown on the drives being tested.
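For anyone who prefers the command line to the GUI, the extended test described above can be sketched roughly as follows. This is a minimal sketch, not the method used in this thread: it assumes smartmontools is installed and the script runs as root, the helper names (`extended_test_command`, `percent_remaining`, `start_extended_test`) are hypothetical, and the sample status text in the usage note is illustrative rather than taken from the posted diagnostics.

```python
import re
import subprocess


def extended_test_command(device):
    """Build the smartctl command that starts an extended (long) self-test."""
    return ["smartctl", "-t", "long", device]


def percent_remaining(smartctl_output):
    """Extract the 'NN% of test remaining' figure from smartctl output.

    Returns None when no self-test progress line is present.
    """
    match = re.search(r"(\d+)% of test remaining", smartctl_output)
    return int(match.group(1)) if match else None


def start_extended_test(device):
    """Start the test on a real drive (needs root; assumption: smartctl is on PATH)."""
    result = subprocess.run(
        extended_test_command(device),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

A call such as `start_extended_test("/dev/sdc")` would kick off the test on Disk 1 from the warning above. Progress can then be polled by passing the text printed by `smartctl -c /dev/sdc` to `percent_remaining`, which typically moves in steps of 10, matching the coarse progress reporting mentioned in the post.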
ChatNoir Posted August 30, 2022

When the tests are done, post new diagnostics.
JorgeB Posted August 30, 2022

The diags were taken after rebooting, so we can't see what caused the errors or the rebuild. Errors on multiple disks at once likely point to a controller or power issue. You also need to check the filesystem on disk6; any problems there are possibly the result of the errors during the rebuild.
PandaCheese (Author) Posted August 30, 2022

4 hours ago, JorgeB said: Diags are after rebooting so we can't see what caused the errors or the rebuild, likely some controller/power issue to cause them in multiple disks, you also need to check filesystem on disk6, possibly the result of the errors during the rebuild.

Yes, I already did this after finding some folders missing.
PandaCheese (Author) Posted August 30, 2022

6 hours ago, itimpi said: It looked to me as if you only did the short SMART tests on the drives. Drives can easily pass this test and still not be reliable. You should carry out the Extended SMART test on any suspect drives and any that do not pass this should be replaced. Note that when carrying out the Extended test it can take many hours (with progress only reported in 10% increments) and you should temporarily disable any spindown on the drives being tested.

I'll do that. Thanks for the tip!