trussell34 Posted August 6, 2019

Hello all,

I'm having an issue: when a parity check is running and gets more than 50% done, the server becomes unreachable. I can plug in a monitor and keyboard, but I can only view the screen if I do a hard reboot of the server. A hard reboot also makes the server reachable again, but I need to cancel the parity check for it to stay working.

Here is where I may have gone wrong. I replaced my 4TB parity drive with a new 10TB drive, and I've been having issues ever since. These are the steps I used; I think I may have messed something up:

1. Stopped array
2. Shutdown
3. Replaced parity drive with new 10TB
4. Turned on
5. Started array
6. Parity rebuild

After some research, the issue I see is that I never set my old 4TB drive as "unassigned" and then stopped the array. Was this step absolutely necessary? If so, where can I go from here?

Thanks in advance!
Vr2Io Posted August 6, 2019
20 minutes ago, trussell34 said: Was this step absolutely necessary?
No, it isn't needed and won't be the cause. What is the current Unraid / array status?
trussell34 Posted August 6, 2019
1 minute ago, Benson said: No need and won't be the cause. What is current unraid / array status ?
I figured it wouldn't have been that easy. The array is online and passes the array health check. I have a few disks with a small number of reallocated sectors (<5) and plan on replacing them ASAP, but I would like to get this figured out first. Not being able to complete a parity check scares me a little bit.
trussell34 Posted August 6, 2019
1 minute ago, Benson said: As 50% done, does largest array disk more than 5TB ?
FYI, the 50% was purely a guess. I haven't been able to determine exactly when the server becomes unreachable. But to answer your question: no, my array disks are all 4TB and smaller.
Vr2Io Posted August 6, 2019 (edited)
Sorry for removing the reply. It seems you already rebooted and the parity sync completed. (All show green?)
trussell34 Posted August 6, 2019
1 minute ago, Benson said: Sorry for remove the reply. Seems you already reboot it and parity sync completed. (All show green ?)
No worries! Yes, all drives show green. I also see messages saying "parity is valid". But if I were to re-run the parity check (with write corrections turned on), the whole Unraid device would become unreachable (dockers, shares, GUI, etc.). Should I try running the parity check without writing corrections?
Vr2Io Posted August 6, 2019 (edited)
I suggest trying again with no correction. This is just to confirm whether the parity check itself causes the hang. But if it finds errors, I think you will have no choice but to correct them.
trussell34 Posted August 6, 2019
I will try that and let you know how it goes. Appreciate the help!
JorgeB Posted August 6, 2019
You should also enable the syslog server, so that if it crashes there are logs to check.
trussell34 Posted August 6, 2019
Just now, johnnie.black said: You should also enable syslog server, so if it crashes there are logs to check.
Apologies, I just stumbled upon this setting this morning and enabled it immediately. If it does crash again, the logs should be on my flash drive and I will upload if/when it occurs. Cheers!
trussell34 Posted August 7, 2019
I ran the parity check without writing corrections and it did the same thing again. I woke up this morning to find that the server was unreachable and would only respond to a hard reset. I did turn on the syslog settings and have attached the log to this post. Please let me know if you can find anything relevant in there. I'm not entirely sure what I'm looking for. Thanks in advance!
syslog
JorgeB Posted August 7, 2019
There are several call traces during the check. These can usually be fixed by lowering the md_sync_thresh tunable (Settings -> Disk Settings).
trussell34 Posted August 7, 2019
5 minutes ago, johnnie.black said: There are several call traces during the check, these can usually be fixed by lowering the md_sync_thresh tunable (Settings -> Disk Settings)
Thanks for the advice. I've lowered the md_sync_thresh setting from 192 to 140. I kicked off the parity check again as well.
JorgeB Posted August 7, 2019
To expand a little more: check the log during the parity check and lower the value little by little until the call traces stop. You can do this while the check is running.
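The advice to watch the log for call traces can be partly automated. The following is a minimal Python sketch, not an Unraid tool: it assumes the syslog has been copied somewhere readable and that kernel call traces are introduced by the string "Call Trace:", as Linux kernels normally print them. The function name `find_call_traces` is made up for illustration.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: each kernel call trace in the log begins with a line
# containing "Call Trace:" (the usual Linux kernel marker).
CALL_TRACE = re.compile(r"Call Trace:")


def find_call_traces(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line that starts a call trace."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if CALL_TRACE.search(line):
            # Strip the trailing newline so the text prints cleanly.
            hits.append((number, line.rstrip("\n")))
    return hits
```

Passing it an open file, for example `find_call_traces(open("syslog"))` with a hypothetical local copy named `syslog`, returns the matching lines. An empty list after lowering md_sync_thresh would suggest that the call traces have stopped for that run.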
trussell34 Posted August 7, 2019
Appreciate the clarification! Looking more closely at my syslog, I can see that the call traces didn't start until 12+ hours after the parity check began. I'll keep an eye on it throughout the day though.
trussell34 Posted August 9, 2019
I lowered that setting (from 192 to 140) and I still had issues. Please see the attached syslog for more details. Do I lower the number further? Is this a hardware issue (RAM, HDDs, etc.)? Thanks!
syslog
JorgeB Posted August 9, 2019
40 minutes ago, trussell34 said: Do I lower the number further?
Yes, keep lowering to see if the call traces stop.
trussell34 Posted August 9, 2019
Ok, I will "rinse and repeat". Any guidance on how low to take it?
JorgeB Posted August 9, 2019
Start with 100. Alternatively, set a higher md_sync_window.
trussell34 Posted August 9, 2019
Appreciate the advice!
trussell34 Posted August 11, 2019
Hey everyone, I just wanted to thank you all for the amazing advice and help! The most recent changes allowed the parity check to complete successfully. Thanks so much, everyone!