Hi everybody,
I think I fucked up...
Short version:
Last successful parity check was on August 3rd
Changed MB, CPU, RAM on August 5th, no parity check afterwards
Unraid crashed during reboot (August 19th)
automated parity check after crash resulted in lots of errors, due to read errors on disk 2 (which had a pending sector for a while, but never acted up before)
replaced cabling, restarted parity check, then disk 3 (on the Marvell controller card) was marked as failed
Removed Marvell card, rebuilt disk 3 from parity, but had lots of read errors from disk 2 again (August 20), Unraid showed health as "green"
After restart the filesystems of disks 1 and 3 were corrupted. Repaired both.
Disk 2 doesn't complete an extended SMART test due to read errors
parity check now shows 27729 errors (finished last night, August 22nd)
Don't know what to do now, as I have to replace disk 2, but probably parity information is fucked up as far as I understand...
Please help
Diagnostic files are attached.
Pictures, documents etc. are all backed up offsite, so no problem. Media files are expendable and can be re-ripped from BDs.
Details:
My last successful parity check was on August 3rd. On August 5th I replaced the MB, CPU and RAM of the server with a Supermicro X10, a Xeon E3-1275 v3 and 32GB ECC RAM. Before, I was running an old ASRock board with an i5 and 8GB non-ECC RAM. Initially I also wanted to replace my old two-port Marvell SATA card with a Dell H310 card. I didn't have enough time, so I thought let's do this in the next few days, because the Marvell had worked perfectly so far (I still had to flash the IT firmware onto the H310 card).
Unfortunately, after the replacement it never occurred to me to run a non-correcting parity check on the new build.
The server was running fine until Thursday, August 19th, when during a manual reboot the array would not stop (took forever) and then Unraid froze completely. I waited a few hours, but got no response, so I did a hard reset. When the server was back up, all Docker containers were gone. Apparently the Docker image was corrupted and reset. No biggie, I reinstalled everything and all worked fine.
After the system was back up, an automatic parity check was started (with correcting errors) and showed quite a lot of errors (633) after ~2h30, at which point I also saw that disk 2 had read errors. So I stopped the parity check and read up on the forum. Then I understood that automatic parity checks with correcting errors are not such a good idea, because of the risk when you have a broken disk.
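To make sure I understood the risk correctly, here is a minimal Python sketch of how I picture it. It assumes plain single XOR parity and a disk that silently returns wrong data; both are simplifications on my part and not how Unraid actually handles reported read errors. All names and byte values are made up for illustration.

```python
from functools import reduce

def xor_parity(blocks):
    """Single parity for one stripe: XOR of the byte from every data disk."""
    return reduce(lambda a, b: a ^ b, blocks)

# Hypothetical one-byte "stripe" on three data disks (assumed values).
data_on_disk = [0x5A, 0x3C, 0xF0]
parity = xor_parity(data_on_disk)   # parity written while all disks were healthy

# Assumption: disk 2 (index 1) now returns garbage instead of its real byte.
data_as_read = [0x5A, 0x00, 0xF0]

# A non-correcting check would only report this mismatch and leave parity alone.
mismatch = xor_parity(data_as_read) != parity

# A correcting check overwrites parity with the value computed from the bad read.
corrected_parity = xor_parity(data_as_read)

# If disk 3 later has to be rebuilt from that "corrected" parity and the real
# contents of disks 1 and 2, the result is no longer the original byte.
rebuilt_disk3 = corrected_parity ^ data_on_disk[0] ^ data_on_disk[1]
print(mismatch, hex(rebuilt_disk3), hex(data_on_disk[2]))
```

In this toy model the non-correcting check leaves the good parity untouched, which is why I switched to it for the later checks.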
In a few threads it was suggested that read errors could be cable problems, so I replaced the cable of this drive and started another parity check (non-correcting). Then disk 3, which was connected to the Marvell card, started to act up and was marked as failed by Unraid.
I removed the Marvell controller and attached disk 3 to the MB controller.
At this point I still had hope that the problems on disk 2 were due to a bad cable, so I rebuilt disk 3 from parity (big mistake, I guess). The rebuild completed with lots of errors (1405) on Friday evening, August 20. Disk 2 still had read errors.
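Why I think the rebuild was a mistake, as a small Python sketch. Again this assumes simple single XOR parity and made-up byte values, and `rebuild_missing` is a hypothetical helper of mine, not anything from Unraid. With one parity disk, a rebuild of disk 3 needs every other disk to read correctly, so any position where disk 2 misreads ends up wrong on the rebuilt disk.

```python
def rebuild_missing(parity, surviving):
    """Rebuild one missing disk: XOR parity with every surviving disk, byte by byte."""
    rebuilt = bytearray(parity)
    for disk in surviving:
        for i, byte in enumerate(disk):
            rebuilt[i] ^= byte
    return bytes(rebuilt)

# Hypothetical four-byte disks (assumed values, for illustration only).
disk1 = bytes([0x11, 0x22, 0x33, 0x44])
disk2 = bytes([0xA0, 0xB0, 0xC0, 0xD0])
disk3 = bytes([0x01, 0x02, 0x03, 0x04])

# Parity is the XOR of all data disks (XOR into an all-zero block).
parity = rebuild_missing(bytes(4), [disk1, disk2, disk3])

# With every surviving disk readable, disk 3 comes back intact.
good = rebuild_missing(parity, [disk1, disk2])

# Assumption: disk 2 misreads the byte at index 2 during the rebuild.
disk2_bad = bytes([0xA0, 0xB0, 0x00, 0xD0])
bad = rebuild_missing(parity, [disk1, disk2_bad])

# Positions on the rebuilt disk 3 that no longer match the original.
wrong_positions = [i for i in range(4) if bad[i] != disk3[i]]
print(good == disk3, wrong_positions)
```

If that picture is right, the rebuilt disk 3 may contain bad data wherever disk 2 could not be read, which might explain the filesystem corruption I saw afterwards.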
The extended SMART test on disk 2 does not complete due to read errors.
The parity check now shows 27729 errors (finished last night, August 22nd).
Don't know what to do now, as I have to replace disk 2, but probably parity information is fucked up as far as I understand... Parity drive is 12TB, I plan to buy one or two new 16TB. Can I replace disk 2 with the 16TB and have only 12TB usable?
Please help. Diagnostic files are attached. Pictures, documents etc. are all backed up offsite, so no problem. Media files are expendable and can be re-ripped from BDs.
PS: I hope all is clear and all info is included in the diagnostic files. If not, please tell me.
bigfoot-diagnostics-20210819-2216.zip bigfoot-diagnostics-20210822-1257.zip bigfoot-diagnostics-20210820-1821.zip