rh535 Posted May 16
Two weeks ago, I noticed what I thought was a bad drive. I replaced it with a brand new 18TB WD Gold drive (it pre-cleared fine) and rebuilt the array. This morning, after the array had been rebuilt for only a day or so, I woke up to over 390,000 errors on every drive but one. Do you have any idea what is going on and how to fix it? I have attached my diagnostics and am thankful for any assistance provided. I have never had something like this happen. The screenshot below is something that now shows up for every drive except the one drive that doesn't have any errors (a super old 1TB WD Black drive).
Attachment: klauss-diagnostics-20240516-0813.zip
JorgeB Posted May 16
The syslog has already rotated, so we cannot see the beginning of the problem, but it looks like a controller issue. Reboot and post new diags after array start.
rh535 Posted May 16 (edited)
I had to shut down via the power button on the chassis because nothing else would shut it down or reboot it. I posted the diagnostics right after it came back online from the shutdown. The array is now doing a parity check - should I cancel that?
Attachment: klauss-diagnostics-20240516-0945.zip
Edited May 16 by rh535
JorgeB Posted May 16
Everything looks OK for now. You may want to enable the syslog server to make sure you catch the beginning of the problem in case it happens again.
rh535 Posted May 16
I had actually already enabled the syslog server, writing to the flash drive, before the last rebuild, because the first rebuild stalled at around 50%. I have posted the logs below - hopefully that captured it. If not, I have it enabled again.
Attachment: syslog.zip
JorgeB Posted May 16
May 15 07:00:11 Klauss root: Capture diagnostics to /boot/logs
This is the last entry, and it was before the problem.
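
Editorial sketch: the check JorgeB describes (finding the last entry in a saved syslog and comparing its time with when the errors began) could be done roughly as follows. This is a minimal illustration, not an Unraid tool. The function name `last_syslog_entry` is hypothetical, the default year is an assumption taken from the diagnostics file names (classic syslog timestamps carry no year), and month names are assumed to be in English.

```python
from datetime import datetime


def last_syslog_entry(lines, year=2024):
    """Return (timestamp, line) for the last parseable syslog line, or None.

    Assumes the classic "Mon DD HH:MM:SS host ..." layout. The year is not
    stored in that format, so it is supplied by the caller (assumption).
    """
    for raw in reversed(list(lines)):
        line = raw.strip()
        if not line:
            continue  # skip blank lines at the end of the file
        stamp = " ".join(line.split()[:3])  # e.g. "May 15 07:00:11"
        try:
            when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line; keep looking backwards
        return when, line
    return None


# The line quoted in the post above, used as sample input.
sample = ["May 15 07:00:11 Klauss root: Capture diagnostics to /boot/logs"]
entry = last_syslog_entry(sample)
```

If the returned timestamp is earlier than the time the errors started, the saved log stopped before the problem and cannot show its cause, which is the situation described in this post.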
rh535 Posted May 16
Dang. Okay - let's see what happens now. Thanks for your help!
rh535 Posted May 18
I have finished the parity check and have hopefully posted the logs showing what is happening. I stopped the array around 3 am because I saw it filling with errors again. I also noticed that the parity check said it had corrected 46 errors at around 96.5%, but when it finished, it said it had 0 errors. Is this normal?
Attachments: klauss-diagnostics-20240518-0257.zip, syslog
JorgeB Posted May 19
No disk errors, but some sync errors. I would recommend running another check to confirm 0 sync errors.
rh535 said: "but when it finished, it said it had 0 errors"
I assume that was in the notifications? This is a known issue; look at the result on Main, which will be correct.
rh535 Posted May 23
I did another parity check, and about halfway through, the webgui became inaccessible. I could still give it commands over the terminal, but it would not shut down or reboot; it would just hang. It showed 163 errors. I have posted the diagnostic files I got. Thanks as always for your help!
Attachments: klauss-diagnostics-20240523-0758.zip, klauss-diagnostics-20240522-1842.zip
JorgeB Posted May 23
The syslog is filled with call traces, so this looks more like a hardware issue. RAM speed is not visible in the diags; check here to make sure it's in spec for the config: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173
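
Editorial sketch: one rough way to confirm that a saved syslog is "filled with call traces" is to count the `Call Trace:` header lines the Linux kernel prints at the start of each trace and note where the first one appears. The function name `summarize_call_traces` is hypothetical, and the snippet is only an illustration of the idea, not part of the Unraid diagnostics.

```python
def summarize_call_traces(lines):
    """Count kernel call traces and return (count, first_matching_line).

    Each trace printed by the Linux kernel begins with a "Call Trace:" line,
    so counting those headers approximates the number of traces in the log.
    """
    count = 0
    first = None
    for line in lines:
        if "Call Trace:" in line:
            count += 1
            if first is None:
                first = line.rstrip("\n")  # remember where the traces start
    return count, first
```

To try it on an extracted log, pass an open file object, for example `summarize_call_traces(open("syslog"))`, where `syslog` is a placeholder path. The timestamp on the first matching line shows roughly when the traces began, which helps when lining them up with the start of a parity check.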