Troussdesoin Posted August 30, 2021 (edited)
Hello guys,
This weekend I installed a new cache drive. In the process, the cables and the M.2 controller card that 3 of my HDDs were connected to got bent. When I started Unraid, the drives did not show up; after some fiddling, they did. But the connection was not stable, and one drive disconnected in the middle of a write operation. One of the drives became disabled (red cross on disk 1).
I ran a SMART short test on the three drives and they seemed fine, so I rebuilt the array with disk 1 in place. It finished this morning, but there were 504 errors, mainly because of disk 2 and a current pending sector.
I then ran a SMART extended test on all three drives. Disk 3 came back fine, but there were errors on disks 1 and 2 ("errors occurred - check SMART report"). I just launched a parity check (without correction). I ordered a new M.2 controller in case there are more errors (which could mean that the controller was damaged).
What do I do about disks 1 and 2? They only hold media, so it would not be the end of the world if I lost some of it. Are the errors really bad? Or, if there are no more errors, should I be fine? Right now, the server as a whole is working fine.
Thanks for the help.
Attachments:
WDC_WD40EFRX-68N32N0_WD-WCC7K6SKLNHN-20210830-1651.txt
WDC_WD40EFRX-68N32N0_WD-WCC7K5VY4XY7-20210830-1604.txt
WDC_WD40EFRX-68N32N0_WD-WCC7K5VY42E9-20210830-1727.txt
unraidchoupi-diagnostics-20210830-1727.zip
Edited August 30, 2021 by Troussdesoin
ChatNoir Posted August 30, 2021
Can you just attach the zip file of the diagnostics? I doubt people will bother downloading every file one by one.
Troussdesoin Posted August 30, 2021
You're right, thanks for that. I edited the first post.
JorgeB Posted August 30, 2021
1 hour ago, Troussdesoin said: "What do i do for disk 1 and 2?"
Unless none of your data is important, they should be replaced. Don't forget that Unraid requires all other disks to be OK in order to rebuild a failed disk.
Troussdesoin Posted August 30, 2021 (edited)
Yes, I think you're right. I got read errors on disk 1 during the parity check as well.
Edited August 30, 2021 by Troussdesoin
trurl Posted August 30, 2021
You should set up Unraid to monitor attributes 1 and 200 on your WD disks. Click on a disk to get to its settings page.
Troussdesoin Posted August 30, 2021 (edited)
OK, I just did it. I got a raw read error rate of 0 on disks 2 and 3, and 5 on disk 1. Actually, after reading up on it, this does not seem to matter much, because the worst value is 199, which is far above the threshold. The multi-zone error rate on disk 1 is also at 1.
I was wondering if it could be that only the controller/cable is bad? Could that cause all those errors?
Should I replace the drives now, or wait for the new controller/cables to arrive?
Edited August 30, 2021 by Troussdesoin
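For readers who want to repeat the check described in this post (normalized and worst values against the threshold, plus the raw counts), here is a minimal Python sketch. It is not part of Unraid. It assumes the usual attribute table layout printed by `smartctl -A`; the function names `parse_attributes` and `review` are hypothetical, and the sample rows are illustrative values, not copied from the attached reports.

```python
import re

# Attribute IDs discussed in this thread: 1 (Raw_Read_Error_Rate),
# 197 (Current_Pending_Sector) and 200 (Multi_Zone_Error_Rate).
WATCHED_IDS = {1, 197, 200}

# One row of the attribute table, assuming the usual `smartctl -A` columns:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)

def parse_attributes(text):
    """Return {attribute id: fields} for every row that matches ROW."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[int(m.group("id"))] = {
                "name": m.group("name"),
                "value": int(m.group("value")),
                "worst": int(m.group("worst")),
                "thresh": int(m.group("thresh")),
                "raw": int(m.group("raw")),
            }
    return rows

def review(rows):
    """Describe watched attributes that have failed or show a nonzero raw count."""
    notes = []
    for attr_id in sorted(rows):
        if attr_id not in WATCHED_IDS:
            continue
        a = rows[attr_id]
        if a["worst"] <= a["thresh"]:
            # Normalized values count down; at or below threshold is a failure.
            notes.append(f"{attr_id} {a['name']}: at or below threshold")
        elif a["raw"] > 0:
            notes.append(f"{attr_id} {a['name']}: raw count {a['raw']}, keep watching")
    return notes

# Illustrative sample only (hypothetical values).
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x002f   200   199   051    Pre-fail  Always       -       5
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1
"""

for note in review(parse_attributes(SAMPLE)):
    print(note)
```

A normalized value well above the threshold only means the drive has not crossed the vendor's failure line; a raw count that keeps climbing is still worth watching, which is why the sketch reports both.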
JorgeB Posted August 30, 2021
39 minutes ago, Troussdesoin said: "I was wondering if it could be that only the controller /Cable is bad?"
The extended SMART test failed on both disks with a read error. That is a disk problem; it can't be controller/cable related.
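The reasoning here is that a SMART self-test runs inside the drive itself, so a read failure it reports does not depend on the cable or controller. As a rough sketch, the self-test log can be scanned for such failures in Python. This assumes the usual line layout of `smartctl -l selftest`; the function name `failed_read_tests` is hypothetical and the sample log lines are made up for illustration.

```python
import re

# One entry of the self-test log, assuming the usual columns:
# Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
ENTRY = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<desc>.+?)\s{2,}(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\S+)"
)

def failed_read_tests(log_text):
    """Return the log entries whose status reports a read failure."""
    failures = []
    for line in log_text.splitlines():
        m = ENTRY.match(line.strip())
        if m and "read failure" in m.group("status"):
            failures.append({
                "test": m.group("desc"),
                "hours": int(m.group("hours")),
                "lba": m.group("lba"),
            })
    return failures

# Illustrative sample only (hypothetical values).
SAMPLE_LOG = """\
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%     23456         123456789
# 2  Short offline       Completed without error       00%     23450         -
"""

for failure in failed_read_tests(SAMPLE_LOG):
    print(failure)
```

Note that in the sample the short test passes while the extended test fails, which mirrors what happened in this thread: a short test only samples part of the disk.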
Troussdesoin Posted August 30, 2021 (edited)
All right, it's settled then: I'll replace both of them.
And how can I test the cable/controller, so that I know whether I need to replace them as well?
Edited August 30, 2021 by Troussdesoin
JorgeB Posted August 31, 2021
8 hours ago, Troussdesoin said: "And how can i test the Cable/controller?"
A non-correcting parity check is a good test.
Troussdesoin Posted August 31, 2021
I ran a (non-correcting) parity check. It came back with 0 errors, with the two failing drives still in the array. I am now rebuilding the array with a new drive. After that I'll run some preclears on the old drive to see whether it was really failing or whether the errors came from something else.
JorgeB Posted August 31, 2021
52 minutes ago, Troussdesoin said: "with the two failing drives"
This type of read error can be intermittent, i.e., the disks can work today and fail tomorrow.
Troussdesoin Posted September 2, 2021
Well, I replaced disk 1. It rebuilt with a few errors (most likely a few seconds of a film were lost). I ran a few preclears on the old disk, and it really was dying. I'm waiting for a second disk to arrive to replace the other failing drive. I'm also waiting for a new controller, just to be sure.
Overall it was a good learning experience. Thanks, guys, for the help.