
(Solved) Parity-Sync running extremely slowly...currently at 8MB/sec Unable to find out why.


Solved by FQs19


2 minutes ago, Vr2Io said:

That's certainly abnormal.

 

I suggest testing with only the parity disk: connect it to the onboard FCH or ASMedia SATA controller and run the same test. Otherwise you really need to try a different BIOS / Unraid OS version to figure out what's going wrong.

 

No point in trying to find what SATA port to connect my parity disks to since I can clearly see multiple issues. I'm throwing in the towel.

I'll reach back out to Asus and see what they can do for me. I doubt they'll do anything, really. They sure aren't going to give me my $850 back for the board. I won't accept a used motherboard in exchange for mine either.

 

Really appreciate the help. Thank you.

Link to comment
11 minutes ago, FQs19 said:

No change. I did see 212MB/sec. for the first second, but it quickly fell off to ~80-100MB/sec. 

I'm definitely going to contact Asus and see how they can compensate me. 

Ha ha... sometimes we face strange issues during HW / SW changes.

 

Less than a month ago, I upgraded the mobo on one of my builds and the NVMe had a file-copy corruption problem, but the problem suddenly disappeared without reason within a day. Since I had already asked Amazon to send me another mobo, I swapped it in anyway and the problem never happened again. That also cost me a lot of troubleshooting time.

Edited by Vr2Io
Link to comment
6 hours ago, FQs19 said:

This still isn't anywhere near the 180MB/sec. I should be seeing. 

What are your thoughts on this??

 

CPU usage is still very high for just two disks going at those speeds:

 

USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          2  0.0  0.0      0     0 ?        S    22:22   0:00 [kthreadd]
root      92867 94.9  0.0      0     0 ?        R    22:31   3:35  \_ [unraidd0]
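
For anyone who wants to spot this kind of single-thread saturation without eyeballing the list, here is a minimal Python sketch that scans `ps`-style output like the paste above and returns the rows at or above a CPU threshold. The function name `busy_processes` and the 90% default are assumptions chosen for illustration; nothing here is part of Unraid.

```python
def busy_processes(ps_text, threshold=90.0):
    """Return (pid, cpu_percent, command) for rows at or above threshold.

    Expects the column layout of `ps aux` / `ps auxf`:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    hits = []
    for line in ps_text.splitlines():
        fields = line.split(None, 10)  # COMMAND may contain spaces
        if len(fields) < 11 or fields[0] == "USER":
            continue  # skip blank lines and the header row
        try:
            cpu = float(fields[2])
        except ValueError:
            continue  # not a process row
        if cpu >= threshold:
            # Drop the tree-drawing prefix that the forest view adds.
            command = fields[10].lstrip("\\_| ").strip()
            hits.append((int(fields[1]), cpu, command))
    return hits


# Sample copied from the output posted in this thread.
SAMPLE = r"""
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          2  0.0  0.0      0     0 ?        S    22:22   0:00 [kthreadd]
root      92867 94.9  0.0      0     0 ?        R    22:31   3:35  \_ [unraidd0]
"""

print(busy_processes(SAMPLE))
```

With the sample above, only the `[unraidd0]` kernel thread is reported, which matches the observation that one thread is pegged while the disks run slowly.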

 

Link to comment
4 hours ago, JorgeB said:

 

CPU usage is still very high for just two disks going at those speeds:

 

USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root          2  0.0  0.0      0     0 ?        S    22:22   0:00 [kthreadd]
root      92867 94.9  0.0      0     0 ?        R    22:31   3:35  \_ [unraidd0]

 

 

Are you thinking that maybe the CPU is bad?

Do you know of anything else I can try?

Link to comment

@JorgeB

 

I'm going to move my unraid to a Z390 motherboard.

What's the easiest way for me to move my NVMEcache pool to my Arraycache pool, since I will only be able to use two M.2 NVME drives on that motherboard?

I was just going to change the shares that use the NVMEcache to use the Array, then run Mover. After everything is on the array I'll change those same shares to use the Arraycache and run Mover again, reconfigure my settings to match the new assignments. I was also just considering using Midnight Commander, but I haven't used that very much. I'm sure I can figure it out, but just wanted to ask an expert like yourself before I try. 

 

I might have to reduce the number of disks I have as well, but that's easy to do with the Unbalance plugin.

Link to comment
18 minutes ago, FQs19 said:

I was just going to change the shares that use the NVMEcache to use the Array, then run Mover. After everything is on the array I'll change those same shares to use the Arraycache and run Mover again, reconfigure my settings to match the new assignments.

That works, just make sure files are not in use.

Link to comment

@JorgeB @Vr2Io

 

So I decided to go back to basics and assume I messed up installing my components when I switched cases. 

I pulled my graphics card.

I pulled the AIO cooler off and cleaned it. Previous thermal paste application was good.

I pulled all the memory sticks.

I pulled my Dimm.2 stick.

I pulled the 24 pin power cable out.

I made sure the two 8 pin power cables were seated all the way. They were.

I pulled the 3960X CPU out of the carriage, after cleaning the thermal paste off. I verified all the pads were still in good shape. They were.

I then took video of the motherboard CPU pins in the socket to verify nothing was damaged. No damage seen.

I then re-inserted the CPU using the supplied torque wrench with the proper sequence of 1>2>3.

I applied a liberal amount of thermal paste to the AIO heatsink. 

I then re-inserted my four memory sticks, my Dimm.2 stick, the 24 pin power cable, and my graphics card. 

I then plugged my PSU in. It is only 1000 watts and several years old, and I am replacing it with a new Seasonic PRIME 1300W 80+ Platinum this week. My server could hit 1000 watts under high workloads, so given the PSU's age a higher-wattage unit makes sense. I don't believe it was causing my errors, but if I'm going back to basics, it is something that should be replaced.

I cleared the CMOS.

I configured the BIOS, changing only the TPM to Firmware, the CPU speed to Auto from Default, the boot mode to CSM and Legacy USB boot, then saved and rebooted. 

 

Once in Unraid, I had to reconfigure the network settings since I left the Intel LAN controller enabled, then rebooted Unraid to save the changes. 

After getting back into Unraid and verifying the network changes worked, I added my two parity disks and started the array. 

I immediately saw normal performance, 165MB~185MB/sec!!!

 

[Attached image: Screenshot 2022-07-12 164138.jpg]

 

I don't see any high single thread CPU usage either!!!

 

[Attached image: Screenshot 2022-07-12 170519.jpg]

 

Every 1.0s: grep MHz /proc/cpuinfo                                                               Threadripper19: Tue Jul 12 17:16:26 2022

cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 3800.000
cpu MHz         : 3800.000
cpu MHz         : 3800.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 3800.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 4335.127
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2387.163
cpu MHz         : 2200.000
cpu MHz         : 2800.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 3800.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
cpu MHz         : 2200.000
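
A long column of clock speeds like this is easier to compare between runs when it is reduced to a few numbers. The following is a minimal Python sketch that does so for `/proc/cpuinfo`-style text; the function name `summarize_mhz` and the shape of the returned dictionary are assumptions made for this example.

```python
from collections import Counter


def summarize_mhz(cpuinfo_text):
    """Summarize the 'cpu MHz' lines of /proc/cpuinfo-style text."""
    speeds = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "cpu MHz":
            speeds.append(float(value))
    if not speeds:
        return None  # no matching lines found
    return {
        "threads": len(speeds),
        "min": min(speeds),
        "max": max(speeds),
        "most_common": Counter(speeds).most_common(1)[0][0],
    }


# A few lines in the same format as the listing above.
SAMPLE = """\
cpu MHz         : 2200.000
cpu MHz         : 3800.000
cpu MHz         : 4335.127
cpu MHz         : 2200.000
"""

print(summarize_mhz(SAMPLE))
```

On a live Linux system the same function can be fed the contents of `/proc/cpuinfo` directly, for example `summarize_mhz(open("/proc/cpuinfo").read())`.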

 

After a while, the Fix Common Problems plugin threw a warning saying it was unable to communicate with GitHub.

I had to go into my router, change the DNS provider to 8.8.8.8, then change it back to 1.1.1.1. Once I did that, the DNS error on Fix Common Problems cleared. There might be an issue with my SFP+ module. I'll deal with that annoyance later. 

 

I'm attaching my diagnostics to see if you can spot any difference after reseating my CPU, cooler, memory, and power cables. 

I'm hoping that my problems were just because I didn't have the CPU, cooler, or power cable seated properly after switching cases.

It might also be that my PSU is just at its limit and with its age it couldn't keep up. 

 

I have two new TRX40 motherboards coming as well in case this Asus ROG Zenith II Extreme Alpha motherboard has a problem. 

I'm going to cancel this Parity-Sync, reboot the server, and go into my BIOS settings to change the following:

-Global C-States to Disabled

-IOMMU to Enabled

-turn off the WiFi controller

-set Fan speeds to what I want them at

-make sure virtualization is enabled

 

So what are your thoughts on all of this? 

Spot anything in the diagnostics that's different?

 

Thank you both for the help along the way to my Unraid recovery. haha

 

 

Screenshot 2022-07-12 163709.jpg

threadripper19-syslog-20220712-2109.zip

Link to comment

Just rebooted and started Parity-Sync after making those BIOS changes and what do you know, 10-14MB/sec writes!

So there's obviously something wrong with the BIOS. 
I'm going to revert the BIOS changes I made and see if the performance comes back. Hopefully it does. 

 

I also just realized I attached the syslog, not the diagnostics to my previous post. Sorry.

 

[Attached image: Screenshot 2022-07-12 175833.jpg]

threadripper19-syslog-20220712-2109.zip

Link to comment
21 minutes ago, FQs19 said:

-Global C-States to Auto from Enabled

-IOMMU to Auto from Enabled

Could you identify which one (or both) actually causes the problem, and whether it is always reproducible?

 

I had an IOMMU issue on a TR 1920X that caused half of the PCIe devices (those attached to the other NUMA node) to be missing from the device page; the problem was solved by changing the BIOS setting from Auto to Enabled.

 

But this never affected parity / array performance, and I never had a slow parity problem on any of my builds.

 

As for C-states, I only had an issue on my 1st-gen Ryzen: if enabled, the system would halt under light load or at idle. I generally disable them on all AMD builds.

Edited by Vr2Io
Link to comment

I don't know which one is causing the issue.

8 minutes ago, Vr2Io said:

Could you identify which one (or both) actually causes the problem, and whether it is always reproducible?

 

I had an IOMMU issue on a TR 1920X that caused half of the PCIe devices (those attached to the other NUMA node) to be missing from the device page; the problem was solved by changing the BIOS setting from Auto to Enabled.

 

But this never affected parity / array performance, and I never had this problem on any of my builds.

I'll go through and test each config and let you know.

I just changed both to Auto because at this point I just want it to work. HAHA
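
To isolate which setting is responsible, each combination needs its own reboot and Parity-Sync test. Below is a minimal Python sketch that simply enumerates the combinations to work through. The setting names and values are the ones mentioned in this thread; the `SETTINGS` table and the `bios_combinations` helper are hypothetical names for this example, and other boards may expose different options.

```python
from itertools import product

# Setting names and values as mentioned in this thread; other boards may differ.
SETTINGS = {
    "Global C-States": ["Auto", "Disabled"],
    "IOMMU": ["Auto", "Enabled"],
}


def bios_combinations(settings):
    """Return every combination of the given settings as a list of dicts."""
    names = list(settings)
    return [dict(zip(names, combo)) for combo in product(*settings.values())]


for number, config in enumerate(bios_combinations(SETTINGS), start=1):
    print(number, config)
```

With two settings of two values each, that is four test runs; noting the sync speed next to each one shows whether a single setting or only the combination triggers the slowdown.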

Link to comment
  • Solution

@Vr2Io

 

So it looks like I have to keep Global C-State control set to Auto for proper performance. 

I had it set that way for the prior two years until I had issues this year with UDMA CRC errors and was told to change that to Disabled. 

Does this setting really need to be Disabled for Unraid?

Link to comment
7 hours ago, FQs19 said:

So it looks like I have to keep Global C-State control set to Auto for proper performance. 

Good find, but still looks to me like a BIOS bug. 

 

7 hours ago, FQs19 said:

Does this setting really need to be Disabled for Unraid?

No, but look for "Power Supply Idle Control" (or similar) and set it to "typical current idle" (or similar).

 

Link to comment

@Vr2Io @JorgeB

 

My parity-sync finally finished after 25hrs 32mins 13secs with an average speed of 130.5 MB/s.
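
As a sanity check on those figures, duration multiplied by average speed gives the amount of data processed. The sketch below does that arithmetic in Python, assuming the reported MB/s uses decimal megabytes (1 TB = 1,000,000 MB); the function name `data_processed_tb` is made up for this example.

```python
def data_processed_tb(hours, minutes, seconds, avg_mb_per_s):
    """Duration times average speed, in decimal terabytes (1 TB = 1e6 MB)."""
    elapsed_s = hours * 3600 + minutes * 60 + seconds
    return elapsed_s * avg_mb_per_s / 1_000_000


# 25 h 32 min 13 s = 91,933 s at 130.5 MB/s
print(round(data_processed_tb(25, 32, 13, 130.5), 2))  # 12.0
```

The result of about 12 TB would be consistent with a parity disk of roughly that size, although the actual drive capacity is not stated in this thread.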

 

[Attached image: ScreenShot2022-07-13at10_06_01PM.png]

 

I believe my previous parity checks were around 26~28 hours. I forget. 

Would you say that's an average speed for two WD Red Pro drives as parity disks? 

I do have a couple 5400 rpm disks in the array. 

 

@JorgeB

I'll look for that "Power Supply Idle Control" setting in my BIOS now that my parity-sync completed. 

Link to comment
