Server unreachable after about 50% Parity Check



Hello all, 

 

I'm having an issue where the server becomes unreachable once a parity check gets past about 50%. If I plug in a monitor and keyboard, I can only see anything on the screen after a hard reboot. A hard reboot makes the server reachable again, but I have to cancel the parity check for it to stay up.

 

Here is where I may have gone wrong. I replaced my 4TB parity drive with a new 10TB drive, and I've been having issues ever since. These are the steps I followed, and I think I may have messed something up:

  • Stopped array
  • Shutdown
  • Replaced parity drive with new 10TB
  • Turned on
  • Started array
  • Parity rebuild

 

After some research, the one thing I can see is that I skipped a step: I never set my old 4TB drive to "unassigned" and then stopped the array. Was this step absolutely necessary? If so, where can I go from here?

 

Thanks in advance!

1 minute ago, Benson said:

No need, and it won't be the cause. What is the current Unraid / array status?

I figured it wouldn't have been that easy. 

 

The array is online and passes the array health check. I have a few disks with a small number of reallocated sectors (<5) and plan on replacing them ASAP, but I would like to get this figured out first. Not being able to complete a parity check scares me a little.
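
For keeping an eye on those reallocated-sector counts, here is a minimal Python sketch. It assumes the text comes from `smartctl -A` (smartmontools) in its usual ATA attribute-table layout, where attribute ID 5 is `Reallocated_Sector_Ct` and the raw value is the last column. The function name `reallocated_sectors` and the sample text are made up for illustration and are not taken from this server.

```python
from typing import Optional


def reallocated_sectors(smart_output: str) -> Optional[int]:
    """Return the raw Reallocated_Sector_Ct value, or None if it is not found."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; ID 5 is Reallocated_Sector_Ct.
        if (
            len(fields) >= 10
            and fields[0] == "5"
            and fields[1] == "Reallocated_Sector_Ct"
        ):
            try:
                return int(fields[9])  # RAW_VALUE column
            except ValueError:
                return None
    return None


# Illustrative sample only (invented values), shaped like `smartctl -A` output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       3
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
"""

if __name__ == "__main__":
    print(reallocated_sectors(SAMPLE))
```

Running this against saved output from each disk over time would show whether the counts are still climbing, which matters more than the absolute number.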

1 minute ago, Benson said:

Sorry for removing the reply. It seems you already rebooted it and the parity sync completed. (All showing green?)

No worries! 

 

Yes, all drives show green. I also see "parity is valid" messages. But if I re-run the parity check (with write corrections turned on), the whole Unraid server becomes unreachable (dockers, shares, GUI, etc.). Should I try running the parity check without writing corrections?

Just now, johnnie.black said:

You should also enable the syslog server, so if it crashes there are logs to check.

Apologies, I only stumbled upon this setting this morning and enabled it immediately. If it crashes again, the logs should be on my flash drive and I will upload them if/when that happens.

 

Cheers!


So I ran the parity check without writing corrections and it did the same thing again. I woke up this morning to find that it was unreachable and would only respond to a hard reset. 

 

I did turn on the syslog setting and have attached the resulting log to this post. Please let me know if you can find anything relevant in there. I'm not entirely sure what I'm looking for.

 

Thanks in advance!

syslog

5 minutes ago, johnnie.black said:

There are several call traces during the check; these can usually be fixed by lowering the md_sync_thresh tunable (Settings -> Disk Settings).

Thanks for the advice. I've lowered the md_sync_thresh setting from 192 to 140. I kicked off the parity check again as well.
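
To check whether the call traces go away after the change, something like the following Python sketch could be run against the saved syslog. It assumes kernel call traces show up as lines containing the text `Call Trace:`, which is how the Linux kernel normally prints them. The function name `find_call_traces`, the file path passed on the command line, and the sample log lines are all made up for illustration.

```python
import sys
from pathlib import Path
from typing import List


def find_call_traces(syslog_text: str) -> List[int]:
    """Return the 1-based line numbers of lines that contain 'Call Trace:'."""
    return [
        number
        for number, line in enumerate(syslog_text.splitlines(), start=1)
        if "Call Trace:" in line
    ]


# Invented sample lines for illustration; not taken from the attached syslog.
SAMPLE_LOG = """\
Jan  1 00:00:01 Tower kernel: md: sync started
Jan  1 00:05:00 Tower kernel: Call Trace:
Jan  1 00:05:00 Tower kernel:  example_function+0x10/0x20
Jan  1 00:09:30 Tower kernel: Call Trace:
"""

if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Path to a saved syslog file, supplied by the user.
        text = Path(sys.argv[1]).read_text(errors="replace")
    else:
        text = SAMPLE_LOG
    hits = find_call_traces(text)
    print(f"{len(hits)} call trace(s) at lines: {hits}")
```

Comparing the count from a log taken before lowering md_sync_thresh with one taken after gives a quick indication of whether the new value helped.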

