2.5Gb networking and the CPU


TSM
Solved by MAM59


Hello Folks,

 

I've been playing around with it for a few days, and I finally got 2.5Gb networking working between my PC and my Unraid server.

 

Let me preface my question by saying that the 2.5Gb networking is just the first step in several upgrades I'm planning for my home LAN setup. So this question may end up being moot, but I'm curious nonetheless.

 

In my Unraid server I have a Celeron 3920, a dual-core unit running at 2.9GHz. I am not currently using a cache drive, but I will be after all of my upgrades are complete. I will also be using a much faster processor, but I'm not there yet. So, dual-parity calculations take place in real time as files are moved to the server. It's a slow CPU, yes, but at 1 gigabit it was never a problem. In a setup like this, would the CPU be a bottleneck for 2.5Gb networking? Is the overhead of doing dual-parity calculations at that speed going to kill this CPU?

 

Both the PC and the Unraid server report network adapters running at 2.5Gb, and with small single-file copies I see speeds consistent with what you'd expect from 2.5Gb going to a mechanical drive: between 140 and 180 megabytes per second. But large multi-file copies with TeraCopy were choppy and inconsistent. Sometimes the copy seemed to be going very fast, and then right around the 4th or 5th gigabyte it would stop completely and start again, several times until it finished, depending on the overall size of the copy. Looking at the dashboard while it was stopping, either one of the two CPU cores was pegged at 100%, or both cores were up in the 90% range. If I turned TeraCopy off, the regular Windows Explorer copy behaved similarly but not identically: after 4 or 5 gigabytes the CPU utilization climbed again, maybe into the 90s for a second, but instead of the copy stopping altogether, Explorer throttled the copy speed down considerably, to 30 or 40 megabytes per second. It never seemed to stop entirely, and the Unraid server's CPU never got pegged at 100%. I'm guessing some sort of Windows QoS kicks in: it detects a potential problem coming and changes its behavior so the copy doesn't stop completely.

 

I have another computer on my network that still has only a 1 gigabit adapter, and file copies between it and my Unraid server, and between it and my primary computer, seem completely normal and work fine. So I don't think the networking components are faulty.

 

Ultimately, after all of my upgrades are done, I'll be using a much faster CPU in my primary Unraid server, plus a cache drive. But I'm several weeks to a month out from completing everything I want to do, and I was planning on moving a lot of files around between now and then, which is why I decided to do the 2.5 gigabit networking first, so the file moves would be faster. Is there any setting I could change in the interim that would make file moves and copies more consistent while still getting some benefit from the 2.5Gb networking? For example, is there any way to manually throttle the speed so it's faster than 1 gigabit but isn't throwing work at the CPU at full 2.5 gigabit speed? I've also looked at the drive tuning settings, but the only one I'm really familiar with is the tunable md write method, and changing that didn't do anything of value.

 

I might just change back to 1 gigabit until my upgrades are done. I don't want to do that, but I'd really like file management behavior to be more consistent.

Solution

What you see is the combined effect of low RAM + slow CPU + slow disks (no cache drive). It's a combination of all of them, not a single problem.

During a file copy to Unraid, incoming data is written out to disk (the parity calculation itself is almost nothing, but the extra parity write takes away 50% of the speed). If the disks can't keep up, data is put into RAM until it's full (by the way, Unraid shows 100% CPU usage during this "no free I/O slot" period. This is a false reading; look at the CPU temperature: if it stays low, the I/O is blocked, not the CPU overloaded). Finally it sends a "STOP PLEASE!" packet to the sender.

Then it takes some time to write the RAM contents out to disk; this is the "stalled" period you see. After that it sends a "YOU MAY CONTINUE" packet and the transfer starts again.
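To see why the stall only shows up after a few gigabytes, here is a rough back-of-envelope sketch of the fill/flush cycle described above. All the numbers (link speed, disk speed, buffer size) are illustrative assumptions, not measurements from this system:

```python
# Back-of-envelope model of the RAM fill/flush cycle (all numbers assumed).

LAN_SPEED = 2500 / 8        # 2.5 Gb/s link ~ 312.5 MB/s incoming
DISK_SPEED = 180 / 2        # ~180 MB/s HDD, roughly halved by the parity write
BUFFER_MB = 4 * 1024        # assume ~4 GB of RAM available for dirty pages

# While copying, RAM fills at the (incoming - outgoing) rate:
fill_rate = LAN_SPEED - DISK_SPEED              # MB/s accumulating in RAM
seconds_until_full = BUFFER_MB / fill_rate      # "fast" phase before the stall

# Once RAM is full, the sender is paused and the buffer drains at disk speed:
stall_seconds = BUFFER_MB / DISK_SPEED

print(f"fast phase ~ {seconds_until_full:.0f} s, stall ~ {stall_seconds:.0f} s")
```

With these assumed numbers the fast phase lasts under twenty seconds before the first long stall, after only a handful of gigabytes have been received, which lines up with the "stops around the 4th or 5th gigabyte" behavior described earlier in the thread.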

 

TeraCopy and the Windows copy each have their own approach to handling this situation. Windows seems to be more polite, but in the end it is usually much slower. Anyway, the stall is no bug; it's a must.

 

You can install the "Tips & Tweaks" plugin; there you'll find two settings to play with.

"vm.dirty_background_ratio" and "vm.dirty_ratio" control how much RAM is used to cache data before writeout starts, and how much RAM must be freed again before the blockade is released.

If you buffer too much data in RAM, the stalls will last longer (it takes more time to flush that much data and free the RAM again). If you set them to very low values, transfers will look slower, but stalls won't happen anymore, or will be really short. There is no universally "good" setting; it depends on your hardware and your LAN. You need to experiment a bit.
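For reference, outside the plugin these are ordinary Linux sysctls, so they can also be inspected and changed from the Unraid console. A sketch (the values here are just a starting point to experiment from):

```shell
# Inspect the current writeback thresholds (percent of total RAM)
sysctl vm.dirty_background_ratio vm.dirty_ratio

# Try small values first: less RAM buffered means shorter stalls
sysctl -w vm.dirty_background_ratio=2
sysctl -w vm.dirty_ratio=3
```

Note that changes made with `sysctl -w` do not survive a reboot, which is one reason to let the plugin manage them for you.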

 

The faster your LAN gets, the more visible this effect becomes. Even a cache drive (for writing only; I'd like to see a read cache too) does not fully compensate for it. For 2.5G LANs a SATA SSD might be sufficient; for higher speeds you need a PCIe NVMe drive. But even then stalls can happen if the cache can't hold that speed for sustained writes.
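The SATA-vs-NVMe point can be sanity-checked with a quick conversion, using a rough, typical sustained-write figure for SATA (an assumed value, not a benchmark):

```python
# Compare LAN line rates against a typical sustained drive write speed (assumed).

def line_rate_mb_per_s(gbit):
    """Convert a link speed in Gb/s to MB/s (decimal units)."""
    return gbit * 1000 / 8

SATA_SSD_SUSTAINED = 500            # MB/s, rough SATA III ceiling (assumption)

lan_25g = line_rate_mb_per_s(2.5)   # 312.5 MB/s
lan_10g = line_rate_mb_per_s(10)    # 1250.0 MB/s

# A SATA SSD cache can absorb a full 2.5G stream, but not a 10G one:
print(lan_25g, lan_10g)
```

So a 2.5G link tops out around 312 MB/s, comfortably under what a decent SATA SSD sustains, while a 10G link needs NVMe-class write speed.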

 

 


Thanks again MAM59. I tried "vm.dirty_background_ratio" and "vm.dirty_ratio", using the plugin's suggested settings of 2 and 3 respectively. It's not completely perfect, but performance on large file copies is definitely much better. So, fingers crossed that once I'm on better hardware, with NVMe planned for the cache, it will be a complete non-issue.


You will need to adjust those values again with almost every hardware change, at least a bit.

 

And (VERY IMPORTANT!) before buying an NVMe SSD, read tests carefully. There are SOME out there that are able to survive long writes at high speed, but not many.

Most of them give up, either because they run out of their own cache or because they simply overheat and slow down to recover.

So buy carefully!

(There was a recent magazine test here in Germany of 20 currently available NVMe drives; some were fast AND rather cheap. But usually you need to buy a brand's "better" line, like Samsung's "PRO" series. And believe me, if you don't follow this advice, you will get the same hiccups and regret it quite fast.)

 

