
Extremely long parity checks


Menaan


Hello all,

I wanted to see if anyone had any recommendations on how to deal with some extremely long parity checks I'm seeing on my unraid server.

 

First off for my server, I've got 5 x 16 TB disks that I'm using ZFS plugins to run a raidz2 config on.  These are not the issue.

I have 3 x 8 TB disks in the default unraid pool with 1 set as a parity drive.  I also have a 1 TB Sabrent NVME drive as a cache drive.  This is the pool that I'm having issues with.

 

I currently have parity checks set to run once a week, and they have been getting slower and slower, to the point where they now take upwards of 3 days to run.  And I have less than 600 GB of total data on this pool (I don't use this pool for any primary storage, only for VM storage and as an ingest pool for things going to my larger zfs pool)....  So I'm a little baffled as to why the parity check would be taking so long.  I'm also not sure why the Cache drive gets hammered so hard during a parity check?  While the parity check is running I can see its read and write counts going up just as fast as the pool disks.  Again, this is showing my limited knowledge of how this process actually works, but I would think the Cache drive wouldn't be part of the parity check as it should only have data on it that has not yet been written out to the primary disks in the pool.  At least I think that is how it works in Unraid?  I don't think Unraid does anything with frequently accessed data being moved to the Cache drive does it?

 

This system has a Core i5-11400 CPU, and during the parity check it averages around 25% load.  128 GB of ram, a lot of which is used up by ZFS, but also there are a couple of VMs running on the system.  Nothing seems to be maxed out though.

 

Right now, the parity check is running and the Main page is reporting it processing at 7 MB/s.  I honestly don't know if that speed is normal, seems pretty slow to me though given that even the 8 TB drives are capable of > 100 MB/s read and write speeds.

 

So, I'm just not sure where to go from here to diagnose the slowness further.  Or are these normal, expected speeds?  If they are, I'm going to have to decrease the frequency of parity checks =/

 

Thanks

Edited by Menaan
Link to comment
11 minutes ago, JorgeB said:

Once a month is enough, and please post the diags during a parity check.

I've pulled a diag since the parity check is currently running.  Are there certain files it would be helpful for you to see? I'd rather not upload the entire zip as even though I had anonymize diagnostics selected, I glanced through and there is still some data that I'd consider sensitive in this zip.

Link to comment

VM / docker names are what caught my attention.  I have some named in a way for organization purposes that even disclosing that VM name could be considered a violation of NDAs I have with some clients =/

 

I see those names in a few places, ps.txt, and in the xml and qemu folders.  Not sure if they are in other logs that I just didn't spot them.

 

I'm also not thrilled about user names showing up in places, but that at least doesn't violate any NDAs I have with clients lol.
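For anyone else who needs to scrub names before sharing diagnostics, here is a minimal sketch of the idea in Python. It only shows the text replacement step; the names used are made up, and walking the unzipped diagnostics folder (ps.txt, the xml and qemu folders) is left out. It is an illustration under those assumptions, not how Unraid's own anonymize option works.

```python
import re

def redact_names(text, names, prefix="VM"):
    """Replace each sensitive name in text with a stable placeholder (VM1, VM2, ...)."""
    if not names:
        return text
    # Placeholders are assigned in alphabetical order so the mapping is repeatable.
    mapping = {name: f"{prefix}{i}" for i, name in enumerate(sorted(names), start=1)}
    # Longest names go first in the alternation so a name that contains
    # another name is matched whole instead of being partially replaced.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in ordered))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# Hypothetical VM names, purely for illustration.
sample = "qemu: starting acme-prod and acme-prod-db"
print(redact_names(sample, ["acme-prod", "acme-prod-db"]))
```

Using the same placeholder for every occurrence of a name keeps the logs readable for whoever reviews them, since they can still tell which lines refer to the same VM.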

Link to comment
6 hours ago, Squid said:

I don't see where any user names are showing up anywhere.  But I put in a feature req for anonymizing VM names

 

/config/smb-extra.conf is one place where I saw them.  In the read and write lists for permissions on the shares.

 

I'm attaching a modified zip of the diagnostics with the information removed that I don't want to share that I don't think will be pertinent for review.

 

I'm going to remove the file after it's been reviewed just in case I missed something somewhere.

 

Edited by Menaan
Link to comment

Logs are basically full because it appears that you have the ssh port forwarded to the internet at large (or the server in the router's DMZ).  Script kiddies from around the world  are constantly attempting to log in, so we can't really even see what's going on.

 

If you need remote access use MyServers or a VPN solution

Link to comment

Yeah, port 22 is open for offsite backup pulling purposes using rsync over ssh (for various reasons I have to pull from the remote server, not push from this server).  I'm not too concerned with the script kiddies because password login is disabled and a cert is required to connect.  I've seen the attempts in the log, but considering the rotated logs go back to the 14th, before the parity check started, and it only affects the system.log, I assume anything relevant to the parity check would still be in the logs.

 

If you need me to I can easily strip all the failed connection lines out of the logs.  I didn't realize that log was singularly important since all the files were needed.
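Stripping those lines could look something like the sketch below. It assumes the usual syslog layout where the process tag appears as ` sshd[PID]:` after the hostname; the sample lines are invented for illustration, and the filter drops every sshd line, not only the failed attempts.

```python
def strip_sshd_noise(lines):
    """Return (kept_lines, dropped_count), dropping every line logged by sshd."""
    kept = [line for line in lines if " sshd[" not in line]
    return kept, len(lines) - len(kept)

# Invented sample lines in a typical syslog format.
sample = [
    "Jan 14 03:12:01 Tower sshd[1234]: Connection closed by 203.0.113.5 port 4242 [preauth]",
    "Jan 14 03:12:05 Tower kernel: md: recovery thread: check P ...",
    "Jan 14 03:12:09 Tower sshd[1240]: Invalid user admin from 203.0.113.9 port 5151",
]
kept, dropped = strip_sshd_noise(sample)
print(f"dropped {dropped} sshd lines, kept {len(kept)}")
```

Reading the real log with `open(path).read().splitlines()` and writing `kept` back out to a copy would leave the original file untouched.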

Link to comment
1 hour ago, trurl said:

Wireguard VPN is built-in to Unraid.

 

 

 

I'm aware of wireguard vpn and I do use it for some things.  However I'm not really sure how that pertains to my parity check taking a really long time issue?  If this is just about the ssh traffic, that isn't really the point of this support ticket.  But, also not sure of the relevancy even then because does it really matter which port is open if there is going to be a port open for a direct ssh connection, or a vpn connection?  They are both going to get picked up and hammered by script kiddies running port scanners.  I'm not going to claim to be a networking expert, but I don't see any security advantage in my use case for connecting the ssh connection through a vpn vs connecting it directly using certificate based authentication only.  VPN is going to use more system resources on both ends for the same end result. But, again, not the point of this thread, so that is a topic for another thread.

 

I'd really like to get this parity check thing figured out.  It looks like this most recent run is going to take even longer than 3 days, which seems crazy to me for so little data on this pool.  It's currently at 2 days 7 hours and is only 61% done.
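A rough back-of-the-envelope check of those numbers, as a sketch. It assumes the percentage refers to progress across an 8 TB (decimal) disk and that a parity check reads each array disk end to end regardless of how much data is stored on it, which is why 600 GB of data does not make the check short. The function names are just for this example.

```python
def average_speed_mb_s(disk_bytes, fraction_done, elapsed_hours):
    """Average check speed so far, in MB/s (1 MB = 1e6 bytes)."""
    return disk_bytes * fraction_done / (elapsed_hours * 3600) / 1e6

def hours_remaining(disk_bytes, fraction_done, speed_mb_s):
    """Hours left to cover the rest of the disk at a given speed."""
    return disk_bytes * (1 - fraction_done) / (speed_mb_s * 1e6) / 3600

DISK = 8e12          # 8 TB, decimal (assumption)
ELAPSED = 2 * 24 + 7 # 2 days 7 hours = 55 hours

avg = average_speed_mb_s(DISK, 0.61, ELAPSED)
print(f"average so far: {avg:.1f} MB/s")
print(f"remaining at that pace: {hours_remaining(DISK, 0.61, avg):.1f} h")
print(f"remaining at 100 MB/s: {hours_remaining(DISK, 0.61, 100):.1f} h")
```

So the average over the run works out to roughly 25 MB/s, well above the 7 MB/s shown at one moment but still far below what the drives can sustain.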

Link to comment
39 minutes ago, JorgeB said:
22 hours ago, JorgeB said:

and please post the diags during a parity check.

 

16 hours ago, Menaan said:

 

I'm attaching a modified zip of the diagnostics with the information removed that I don't want to share that I don't think will be pertinent for review.

 

I'm going to remove the file after it's been reviewed just in case I missed something somewhere.

 

 

I did post them, and subsequently removed them after they were viewed as I said I would.

 

Here they are again if you specifically want to review them, but I will remove them again once they have been reviewed.

 

 

Edited by Menaan
Link to comment

 

EDIT:

For anyone else that may reference this thread, don't do what I did below :P  Apparently the shfs is supposed to be running haha.  Had to reboot my server because of some docker and vm errors after killing that command.  I acted a little hasty there, not researching the command before killing it hehe.  Everything seems to be fine after rebooting at least.

 

 

36 minutes ago, JorgeB said:

there's something writing to disk1 in those diags, that will slow down the check by a lot.

 

Interesting, thank you for pointing that out.

I looked and I did find an shfs command apparently stuck running doing something with writing a file to disk1, not sure exactly what it was doing.  The files it had open have already moved to my main zfs storage pool, so I double checked their integrity really quick and they were fine, so I killed the shfs command.
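For spotting this kind of write activity without going through the diagnostics, one option is to compare two snapshots of `/proc/diskstats`. The sketch below parses the "sectors written" column (the tenth field, counted in 512-byte sectors) and turns the difference into MB/s per device. Which `sdX` device corresponds to disk1 is something you would have to look up on the Main page; the function names are just for this example.

```python
def parse_diskstats(text):
    """Map device name -> total sectors written, from /proc/diskstats content."""
    written = {}
    for line in text.splitlines():
        fields = line.split()
        # Fields: major, minor, name, then read stats (4), then write stats;
        # index 9 is the cumulative count of sectors written.
        if len(fields) >= 10:
            written[fields[2]] = int(fields[9])
    return written

def write_rates(before, after, seconds):
    """MB/s written per device between two snapshots taken `seconds` apart."""
    a, b = parse_diskstats(before), parse_diskstats(after)
    return {dev: (b[dev] - a[dev]) * 512 / seconds / 1e6 for dev in a if dev in b}

# Usage on the server itself (Linux only), sampling 10 seconds apart:
#   import time
#   first = open("/proc/diskstats").read()
#   time.sleep(10)
#   second = open("/proc/diskstats").read()
#   for dev, rate in sorted(write_rates(first, second, 10).items()):
#       if rate > 0:
#           print(f"{dev}: {rate:.2f} MB/s written")
```

During a plain parity check the data disks should mostly be read, so a data disk showing sustained writes points at something else using the array at the same time.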

 

Now the parity check speed shot up to 100 MB/s.  So I suspect it should get done a lot quicker.

 

So, now I guess I'll have to figure out what that shfs command was actually trying to do, and see if there is either a way to not let it do it during a parity check or something...  Not really sure what the solution is but it at least gives me a point to research more from.

 

Edited by Menaan
Link to comment
