[SOLVED] parity check slow, like 1/3 speed



TL;DR: why is my parity check speed down to ~40 MB/s from the 100+ MB/s range?

 

Had 8 data disks (12TB and 6TB) previously. Added 16 4TB drives and everything seemed fine, but when the monthly parity check came it took 3.5 days instead of about 1. The duration correlates with the speed staying around 40 MB/s, although it's currently at 1.48TB (12.4%) and getting 70, which is still way less than before. Does the parity calculation get exponentially more difficult with more disks and slow things down, or is something else going on? The DiskSpeed docker test does not show the 4TB disks capped at 40. A 6TB Seagate Barracuda (shucked) was added after the first slow check. Since everything else seemed fine I hadn't worried about it, figuring that if the next monthly check did the same thing I'd post then. I had to perform an unclean shutdown earlier, which triggered the check, but I canceled it to add the 6TB disk and then restarted the check.

 

[Attachment: diskspeeds.PNG]

 

[Attachment: Capture.PNG]

 

Note: the 89.7 one is when I was doing a lot of IO.


When mover ran (< 1 gig of files) reads dipped a bit but bounced back and have stayed at 75 to 88. Still quite a bit slower than it used to be. Position is 2.14TB (17.8%). After it passes the 4TB mark I'll update, as that's the only main variable left at that point. I'm going to script something to grab the speeds as it goes, so I can see a graph over time and correlate it with the disk speed graph.

 

[Attachment: no writes.PNG]


Excel was not cooperating with how I was trying to graph, so 3 graphs (left out time remaining). Stats are grabbed every minute and parsed afterwards from a text file, since this was thrown together quickly. I'll have it fixed and grab the whole run when the monthly check runs on the 1st, and then add it to the linked post. The old 8-disk array did at least 1 or 2 parity checks after updating to 6.8.3. Given there was a check on 5-20 and 5-28, 3 or 4+ checks have happened on 6.8.3, plus many others in the 6.8.x range (I tend to update 1 to 3 months after a release is out, since that requires a restart). Currently at 6.03TB (50.2%) and only reading from the 5 12TB array disks (plus 2 parity), it is still at ~80 (definitely better than 30/40). Guess I crossed some threshold of disk count that causes slowness in 6.8, probably.
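(For anyone wanting to do similar logging without the quick-and-dirty text-file parsing, here is a minimal sketch in Python. It assumes Unraid's emhttp state file lives at /var/local/emhttp/var.ini and exposes the mdResync* keys - both assumptions to verify on your release - and appends one row per minute to a CSV that can be graphed afterwards.)

```python
#!/usr/bin/env python3
# Minimal parity-check progress logger -- a sketch, not the script used above.
# Assumes Unraid exposes array state in /var/local/emhttp/var.ini with
# mdResync* keys; verify the path and key names on your version.
import csv
import datetime
import re
import time

VAR_INI = "/var/local/emhttp/var.ini"    # assumed location of emhttp state
OUT_CSV = "/boot/logs/parity_speed.csv"  # adjust; /boot survives reboots
KEYS = ("mdResyncPos", "mdResyncSize", "mdResyncDt", "mdResyncDb")

def read_md_state(path=VAR_INI):
    """Return the mdResync* key/value pairs from emhttp's state file."""
    state = {}
    with open(path) as fh:
        for line in fh:
            m = re.match(r'(\w+)="?([^"\n]*)"?', line.strip())
            if m and m.group(1) in KEYS:
                state[m.group(1)] = m.group(2)
    return state

def main(interval=60):
    with open(OUT_CSV, "a", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(("timestamp",) + KEYS)  # header per run, for simplicity
        while True:
            state = read_md_state()
            writer.writerow([datetime.datetime.now().isoformat()]
                            + [state.get(k, "") for k in KEYS])
            out.flush()
            time.sleep(interval)

if __name__ == "__main__":
    main()
```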

 

[Attachment: update.PNG]

  • 5 months later...
8 hours ago, kysdaddy said:

Should the parity check run at 11 MB/s if sabnzbd is downloading files?

Anything reading or writing to the array disks will slow down a parity check. Quite how severely is difficult to determine - it depends on how much activity there is and how far the drive heads have to move each time they switch between the position required for the file access and the position currently reached in the parity check.

 

The interaction between normal file access and parity checks was one of the drivers behind developing the Parity Check Tuning plugin so that parity checks can be run in increments outside prime time to minimise the impact on users in their daily use of their Unraid server.

  • 9 months later...

When you added 16 4TB drives I'm guessing you had them on the same controller, so hitting all drives at once like a parity check does will slow it down. The controller becomes a bottleneck at that point, basically. 8 drives didn't max it out but 24 sure will. I'd recommend either adding a 2nd HBA, going with an HBA that has the bandwidth to support 24 drives at full speed, or moving some drives to SATA ports to lower the bandwidth hit. I'm sure you've already resolved your issue but wanted to add this for anyone who stumbles on this in the future.

 

You could also simply live with it. It's not often that every drive is active simultaneously in my experience - mainly during parity operations. Writing to and reading from a few drives at once won't show any slowdowns until you have enough going on to use all the available bandwidth on the bus or card.
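(Rough arithmetic for the point above - these are generic example numbers, not measurements from this server. A PCIe 2.0 x8 link is good for roughly 4000 MB/s, and a parity check streams every drive behind the controller at once, so the per-drive share shrinks as the drive count grows.)

```python
# Back-of-the-envelope for a shared controller link; example numbers only.
def per_drive_mb_s(link_mb_s: float, active_drives: int) -> float:
    """Bandwidth each drive gets when all drives stream at once (parity check)."""
    return link_mb_s / active_drives

pcie2_x8 = 8 * 500  # PCIe 2.0: ~500 MB/s usable per lane, times 8 lanes
print(per_drive_mb_s(pcie2_x8, 8))   # 500.0  -> the drives themselves are the limit
print(per_drive_mb_s(pcie2_x8, 24))  # ~166.7 -> the controller link starts to matter
```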

2 hours ago, david11129 said:

...The controller becomes a bottleneck at that point basically... I'd recommend either adding a 2nd HBA, going with an HBA that has the bandwidth to support 24 drives at full speed, or moving some drives to SATA ports...

 

The same way you generally assume a gigabit switch is not a bottleneck, I didn't even think about the LSI HBA cards being one (the gigabit switch comment is about common "normal" network activity, not mass file transfers). Looks like the PCIe riser card has each of its PCIe slots at x8, based on online images of the R720xd riser card. Plenty when that riser just had the 2 9207-8i cards that connect to the 14 R720xd bays. I would assume the riser gets 24 PCIe lanes for the 3 x8 slots, but even with that, my 2 DS4246 shelves are connected to a 9202-16e that sits in that 3rd slot with only 8 lanes. The DS4246s have 36 drives in them now, but 24 of those are pool drives now instead of array and the other 12 are part of the main array (moved them to pools for multiple reasons). Since NetApp designed it to use 1 cable, I'm assuming that is not restricting anything, but now I'm thinking I should map it all out and calculate each step to see if there are other places of accidental bottleneck, like putting a x16 card in an x8 slot. When I reorganized the layout of the drives I did make sure the SSDs and parity drives are in the R720xd bays, as well as the 16TB disks and as many of the 12TB as could still fit... really glad I did that (especially if the bottleneck was big).
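(A sketch of what "map it all out and calculate each step" could look like: list each hop with its link bandwidth and how many drives sit behind it, then take the minimum per-drive share. The bandwidth figures and drive counts below are placeholders, not the actual topology of this server.)

```python
# Model each hop in the path as (name, link bandwidth in MB/s, drives behind it)
# and find the tightest one. All numbers below are placeholders.
links = [
    ("PCIe 2.0 x8 riser slot -> external HBA", 4000, 36),
    ("one SAS2 wide port (4 x 6 Gb/s) -> disk shelf", 2200, 24),
    ("per-drive sequential platter speed", 180, 1),
]

def tightest_link(links):
    """Return the hop giving the least bandwidth per drive."""
    return min(links, key=lambda l: l[1] / l[2])

name, bw, n = tightest_link(links)
print(f"bottleneck: {name} -> ~{bw / n:.0f} MB/s per drive")
```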

 

Parity check takes 2 to 3.5 days now (or 5.625 when a bunch of random automatic rsyncs happen during it, pulling hundreds of gigs from a remote server for hours). But next time I shut the server down, since the GPU x16 slot is open again, I'll try to remember to move the 9202-16e HBA over to it (well, if it came with a full-height plate). Guess I've gone from near the limit of my server to pretty much outgrowing it... but that upgrade has to come later down the line, and 2 to 4 more 16TB drives are needed sooner than the upgrade.

 

Thanks for pointing that out - it could help future readers.

  • 1 month later...
Quote

Didn't realize parity check speed was only as fast as the slowest HDD. After I learned that, I removed the 2 smallest and slowest drives from the array and my check speed tripled. Just upgraded hardware in part to optimize that, and the check is now running around 240 MB/s - over twice as fast as it has ever run! A 14TB check will be done in 15 hrs. It used to take anywhere from 1.5 days to over 4 days to complete.
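(To sanity-check figures like those: the check reads every drive end to end, so the expected duration is roughly parity size divided by average speed. A quick back-of-the-envelope using the numbers from the quote above:)

```python
# Expected parity check duration ~= parity size / average check speed.
size_tb = 14      # parity drive size from the quote
avg_mb_s = 240    # average check speed from the quote
hours = size_tb * 1e12 / (avg_mb_s * 1e6) / 3600
print(f"~{hours:.1f} hours")  # ~16.2 h, in line with the quoted ~15 hrs
```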

 

 

  • 8 months later...

For those reading later: I had mine somehow cancel or stop, even though it was running overnight and so on.

Mine was estimating 20+ days at 2 MB/s, which was a bit crazy, so what fixed it for me was to reboot in "safe" mode and start the parity check.

It went from the typical 95-105 MB/s up to 198-203 MB/s, which is a huge improvement. The estimate went from 25+ days, to 1 day 8 hours at 98-105 MB/s, to 15 hrs at 200 MB/s.

Much improved, and I hope it finishes. I will also change the parity check schedule to monthly, or whenever hardware changes, instead of weekly on Monday.

 

  • 3 months later...

I've had some time recently to look into this on my system as well. I have all newer drives, 8-14TB, that individually test at around 200 MB/s in the DiskSpeed utility. The DiskSpeed utility also does a test reading all drives at one time, and it showed nearly the same read speeds (only a 2% loss).

 

Yet, I never see my parity check go above 120MB/s and it probably averages around 100MB/s.

 

Based on my parity check history I can also verify, like others did here, that it slowed down significantly when I upgraded from 6.7.x to 6.8.x (I am now on 6.9.2).

 

Has anyone made any discoveries as to WHY the parity check is this slow???

On 10/26/2021 at 10:08 PM, david11129 said:

...controller becomes a bottleneck...

Forgot to come update this: when I tried moving the 9202-16e to the x16 slot, across multiple reboots neither the server's hardware scan on boot nor Unraid saw those drives, so I ended up just moving it back to the riser that has the 2 9207-8i cards that connect to the 14 server drive bays. Due to a different server issue (a web UI problem a little while after booting up) I can't look up parity history, but IIRC most checks have been just under 2 days this year. In the past the GPU that was in that slot was allocated to a VM, or at a different point a network card in that slot was allocated to a different VM, but that dealt with blocking off a PCIe device ID (at least on the NIC), not the PCIe slot. One of those things on the "possibly investigate in the future" list... or when it is needed for some other reason.

 

10 minutes ago, tk40 said:

Based on my parity check history I can also verify, like others did here, that it slowed down significantly when I upgraded from 6.7.x to 6.8.x (I am now on 6.9.2).

 

Has anyone made any discoveries as to WHY the parity check is this slow???

The notification of your comment reminded me to reply above. If there was an OS change, I would be curious what it was, since my speed has gone from ~120-140 down to ~80-100 (more like 80). Or maybe the subset of us with the issue are just having confirmation bias 😜

