Copying large files makes server unresponsive



For some reason, since upgrading to 6.8.x my server becomes somewhat unresponsive if I copy large files (10GB+) across my network to it. I have tried copying to a share that has cache enabled and also to a share with no cache enabled. I see the same symptoms in both cases: for about the first 20-30 seconds it copies at ~110MB/sec, maxing out a gigabit connection, but then the speed drops massively, sometimes down to single-digit MB/sec. While this is happening the server becomes almost completely unresponsive at times. Docker containers become inaccessible, or just take a very long time to respond.

I know I need to attach my syslog, but there are literally zero errors in it.


Doing some more testing, it seems that while copying to a non-cache-enabled share still slows down significantly, it doesn't cripple the server like copying to a cache-enabled share does. It still suffers exactly the same speed drop, from ~110MB/sec to anywhere between 10 and 50MB/sec, and it fluctuates constantly.
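
As a sanity check on the figures quoted so far, here is a rough bit of arithmetic in Python. It assumes decimal units (1 GB = 1000 MB) and a steady transfer speed, which is exactly what the copies above do not have, so treat the times as best-case estimates. The function names are made up for illustration.

```python
# Rough arithmetic only; the 10 GB file size and the speeds are the
# figures quoted in this thread, using decimal units (1 GB = 1000 MB).

LINK_MBIT_PER_S = 1000  # gigabit Ethernet line rate, in megabits per second


def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8


def copy_seconds(size_gb: float, speed_mbyte_per_s: float) -> float:
    """Time in seconds to copy size_gb gigabytes at a steady speed in MB/s."""
    return size_gb * 1000 / speed_mbyte_per_s


if __name__ == "__main__":
    ceiling = mbit_to_mbyte(LINK_MBIT_PER_S)
    print(f"raw gigabit ceiling: {ceiling:.0f} MB/s")
    for speed in (110, 50, 10):
        print(f"10 GB at {speed:>3} MB/s: {copy_seconds(10, speed):.0f} s")
```

The raw ceiling of a gigabit link works out to 125 MB/s before protocol overhead, so ~110MB/sec really is close to saturating it, and a drop to 10MB/sec turns a copy of about a minute and a half into one of well over a quarter of an hour.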


Interestingly, copying from the same client machine to my Windows 10 VM on the server runs at ~110MB/sec consistently until the copy finishes. The VM lives on a btrfs pool created outside of the array and mounted using the Unassigned Devices plugin, so I know that there is no issue with the network cabling etc. Copying from the VM to the array then exhibits the same issue, with the slowdown followed by inconsistent speed fluctuations.


I can understand that when copying to a share without cache enabled I'd expect to see 40-50MB/sec, but would it not be consistent rather than fluctuating so much? Also, when writing directly to cache I see exactly the same symptoms, but it's worse in that it basically cripples my server until some time after the copy finishes! I don't recall seeing these issues prior to 6.8.x.

I also see similar behaviour when backing up to my other server, which is also now running 6.8.x, on entirely different hardware with no cache drives specified.


@johnnie.black Switching from "Auto" to "reconstruct write" seems to solve the issue when copying directly to the array, though obviously it doesn't fix the issue with cache. It also seems to solve the slow/intermittent speeds when backing up to my other server.
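
For anyone wondering why the write mode matters, here is a minimal sketch of the general single-parity XOR idea behind the two modes. This is not Unraid's actual code: the function names, the three-disk layout and the 4-byte blocks are all made up for illustration. Both approaches end up with the same parity; what differs is which disks have to be read before the write can happen.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def read_modify_write(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """New parity from the old contents of the target disk and the parity disk.

    Only two disks are involved, but each must be read and then rewritten.
    """
    return xor_blocks(old_parity, old_data, new_data)


def reconstruct_write(other_disks: list[bytes], new_data: bytes) -> bytes:
    """New parity recomputed from every data disk.

    All the other data disks are read, but nothing has to be read back
    from the target disk or the parity disk before writing.
    """
    return xor_blocks(new_data, *other_disks)


if __name__ == "__main__":
    # Hypothetical array with three data disks, one 4-byte block each.
    disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
    parity = xor_blocks(*disks)
    new_block = b"\xff\x00\xff\x00"  # overwrite the block on disk 0
    rmw = read_modify_write(disks[0], parity, new_block)
    rec = reconstruct_write(disks[1:], new_block)
    print("same parity either way:", rmw == rec)
```

The trade-off is that read/modify/write has to read and then rewrite the same sectors on two disks, while reconstruct write keeps every disk in the array spinning but only streams reads from some and writes to others.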

 

Is there a possibility that my cache drives have issues even though nothing is reported as being wrong? I have 2 drives that aren't the same make but are the same size.

Edited by allanp81
51 minutes ago, allanp81 said:

Switching from "Auto" to "reconstruct write" seems to solve the issue when copying directly to the array, though obviously it doesn't fix the issue with cache.

This suggests, as mentioned, that it's the device(s) that can't keep up with writes. Not all SSDs are super fast. Also make sure they are trimmed regularly.
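
One way to check whether a device can keep up with sustained writes, with the network taken out of the picture, is to write a large file locally on the server and log the speed as it goes. Below is a minimal sketch, assuming Python 3 is available on the server. The function name, the default sizes and the script name are all hypothetical. Each chunk is synced to disk, so the numbers will be somewhat lower than a normal buffered copy, but a device that falls off a cliff after the first few seconds should still show up clearly.

```python
import os
import sys
import time


def sample_write_throughput(path, total_mb=1024, chunk_mb=4, interval_s=1.0):
    """Write about total_mb of random data to path; return MB/s samples.

    One sample is taken roughly every interval_s seconds. Each chunk is
    fsync'ed so the page cache does not hide a slow device. If total_mb
    is not a multiple of chunk_mb, slightly more than total_mb is written.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)  # random, so it won't compress
    samples = []
    written_mb = 0
    window_mb = 0
    window_start = time.monotonic()
    with open(path, "wb") as f:
        while written_mb < total_mb:
            f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
            written_mb += chunk_mb
            window_mb += chunk_mb
            elapsed = time.monotonic() - window_start
            if elapsed >= interval_s:
                samples.append(window_mb / max(elapsed, 1e-9))
                window_mb = 0
                window_start = time.monotonic()
    # Record whatever is left in the final partial window.
    if window_mb:
        elapsed = time.monotonic() - window_start
        samples.append(window_mb / max(elapsed, 1e-9))
    return samples


if __name__ == "__main__":
    if len(sys.argv) == 2:
        rates = sample_write_throughput(sys.argv[1])
        for number, rate in enumerate(rates, start=1):
            print(f"sample {number:>3}: {rate:7.1f} MB/s")
    else:
        print("usage: write_test.py TARGET_FILE")
```

Point it at a file on the pool you want to test (for example something under the cache mount, whatever that path is on your system), use a `total_mb` large enough to get past the first 20-30 seconds, and delete the test file afterwards.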


I have a mix of disks. Could that be the problem? One is a Samsung SSD 850 PRO and the other is a SanDisk SD8SB8U512G1001.

Running TRIM didn't help either, so I'll have to look into maybe replacing the drives. I don't have anything spare though, so if replacing them doesn't fix it I've wasted £100 or so.

Edited by allanp81
