My system is a mess / Parity is not looking good.


Sentie

So I have something seriously strange going on. Let me start by saying that I had another issue at around the same time, related to motherboard settings, and I'm not sure whether the two are connected. That issue can be found here

The reboot that led me to find that problem was the one to finish the upgrade to 6.8.2. Side note: while troubleshooting that problem, I rebuilt my flash drive with a clean install of Unraid and pasted my config file over it.

 

After I got it booting again I was having trouble with kernel panics, so I uninstalled a bunch of plugins and turned off all my dockers. That helped for a boot or two but was definitely not stable. I found a suggestion to rebuild my flash drive and copy over my config file; I tried that but had no luck, though I think I got it up a time or two in safe mode. Finally I reverted to a backup of my flash drive from 6.8.0 and the problem was fixed. I upgraded to 6.8.2 and again got kernel panics. So I reverted, uninstalled plugins, and upgraded again... still no luck.

So now I'm back on 6.8.0 and I'm not sure what to do about that, but the more pressing problem at this point is that I have a lot of parity errors. I'm still really new to Unraid, so I'm not sure what my next step should be. I am sure there is some kind of corruption somewhere in either my OS backup or one of my dockers, because when I was trying to stop the dockers, three of my processor cores pegged at 100% and got stuck there. The system became unresponsive when trying to load the dashboard or the Docker tab, and the CPU started heating up beyond its normal operating temperature (not overheating, just hotter than it normally runs on Unraid). I saw parity errors in that run and had to perform a dirty shutdown to get any kind of system responsiveness again. Now the parity check is running again; it is not even as far along as it was before the lockup, and it has already found as many errors as it did in the first pass.

 

Attached is my diagnostics report. I hope someone might be able to give me some advice on what to do next. Sorry for the novel; I wasn't sure what would be relevant and what wouldn't.

skynet-diagnostics-20200217-0326.zip

Link to comment

Okay, I will do that. My RAM isn't overclocked. I also don't know if it is relevant, but it seems that all or most of the parity errors were in the first 10% or so of the drive: when I went to bed it was around 10% done with just shy of 3000 errors, and now, just after waking up, it is at 3500 errors and 86% done. Does it correct parity errors by default? If so, then it is (hopefully) going to be fixed. Could the repeated failures to boot cause this kind of problem?

 

I will definitely run a few more non-correcting checks though, and let you know what comes up.

Edited by Sentie
not quite awake yet and didn't actually respond to the suggestion
Link to comment

So the correcting run that was going when I last posted has finished. I'm running a non-correcting check now. It looks like it has a comparable number of errors to before the correcting run. But I saw a post suggesting that the first run might have misread the data, so it might now be finding errors that were introduced by the correcting run? I will run another non-correcting check after this one and see whether the two report the same results. If they do, I will run another correcting check and start over. Will let you know how it turns out.

Link to comment
13 minutes ago, Sentie said:

Memtest has been running through the night and has found errors

From the looks of it, I'm pretty sure you are overclocking the RAM controller, but I could be wrong. Make sure you are obeying the max speeds for the amount and type of RAM on that CPU.

 

Any errors at all are unacceptable; you need to change either the hardware or the settings until you get zero errors over a 24-hour run of memtest.

Link to comment

I can't find anything about XMP in my BIOS; I don't think my server board supports it (ASRockRack X470D4U2-2T). That isn't something that would ship enabled by default though, is it? I did find that there is a BIOS update available, so I guess that is actually my next step.

 

I got an email from someone responding to this thread stating my RAM is overclocked, but I don't see the comment here. The motherboard wouldn't do this by default, would it? I did finally find the DRAM Timing config tab in my BIOS, but it has a scary "you might break your hardware if you mess with these settings, do you wish to proceed?" button. Since I have no idea what I'm doing when it comes to system clocks, I really don't want to go in there if I can avoid it.

 

Thanks for your help so far everyone.

 

 

Link to comment
19 minutes ago, Sentie said:

I can't find anything about XMP in my BIOS; I don't think my server board supports it (ASRockRack X470D4U2-2T).

That's a server board, so it's unlikely to support overclocking, but it should tell you what frequency the RAM is running at.

 

19 minutes ago, Sentie said:

I got an email from someone responding to this thread stating my RAM is overclocked

Yep, that was my bad; it was the CPU clock I saw, not the RAM, so I immediately deleted the reply.

 

Link to comment

Got it. I will dig deeper into the RAM speed sometime in the next week or so. All of my data is really well backed up, and I'm back to the world of work now, so my tinker time is much more limited, but I will work on those RAM tests and on trying to find the RAM speed. I'll let you guys know what I find.

Link to comment
  • 2 weeks later...

I know this has been a while. It took quite a bit of testing to narrow down the problem. My system is stable as long as I only have 2 sticks of RAM installed; all the sticks are fine as long as I don't have more than 2 in the system. I have tried two different CPUs and get the same behavior with both. A BIOS update was released, so I updated and tried again, but didn't see any change. I still have not been able to find the RAM speed, so I have reached out to the motherboard manufacturer for assistance. For the moment it is running okay with 16 GB of RAM. Hopefully I will be able to get the upgrade taken care of at some point.

 

I also swapped out the stock AMD cooler, so now I can get to RAM slot B2 without removing the CPU, which should make testing much quicker. Let me know if you have any other ideas.

 

Thanks so far

Link to comment
