[SOLVED] High inward network traffic upsets Docker

Hi all,

I've been facing a problem with my Unraid server for a while whereby a large file transfer to the server will somehow upset Docker and cause problems, including:

  • The Docker portions of the webGUI become slow
  • Attempts to stop, remove, or kill Docker containers fail with error messages along the lines of "attempted to kill container but did not receive a stop event from the container"
    • This is regardless of whether I use the webGUI, the command line, or Compose
  • Connections to Docker containers from elsewhere on the network fail.
  • (particularly annoying) my ADS-B receiver Docker container stops working and needs to be manually restarted before it'll work again, even if the offending transfer has already finished.

 

Examples of big transfers in that can cause this problem:

  • A backup coming in to the server from offsite (through rsync directly on Unraid)
  • Windows File History backing up to the server (through Unraid's built-in SMB server functionality)
  • A large download through Lancache on the server (through the Lancache Docker container)

You'll notice that the first two of those should have nothing to do with Docker.

 

Frustratingly, the following do not cause the problem:

  • A backup going offsite from the server (rsync again)
  • macOS backing up (through a Time Machine Docker container)
  • Media being served out to other devices (built-in SMB)

 

Clearly, my server is running out of some resource that Docker needs, but I'm at a loss as to what it is. Despite the problem being associated with a lot of network traffic, I doubt it's bandwidth itself, since one of the triggers comes from offsite, and it's unlikely my 100 Mb/s connection (less VPN overheads) is saturating the server's gigabit Ethernet. While the CPU does run up during problematic transfers, it doesn't seem to be totally overloaded, and there's bags and bags of free memory. Let's take as a specimen what happened yesterday evening and see what Netdata recorded:

 

[Image: image.png]

As you can see, the ADS-B receiver fell over sometime around 17:00. I believe this was triggered by Windows backing up to Unraid's SMB server. The CPU runs up a bit for about 20 minutes at 16:30, then a bit more an hour later, but it never goes above 60%:

[Image: CPU]

There's some disk IO at the same times:

[Image: Disk IO]

Nonetheless, we have all of the memory:

[Image: System RAM]

And there's a network traffic spike, but nothing gigabit Ethernet shouldn't be able to handle:

[Image: Network]

 

Can anyone suggest other metrics I should examine? Or is there a way to decrease the general niceness of the Docker daemon so it will just take a bigger share of what resources there are?
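
One metric worth checking alongside those above is the share of CPU time spent waiting on I/O (iowait), which Linux exposes in the first line of /proc/stat. The following is only a minimal sketch, assuming a Linux host with Python available; the function names and the 5-second sampling interval are illustrative choices, not part of Unraid or Netdata.

```python
import time


def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line from /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in parts[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    # Field order: user, nice, system, idle, iowait, irq, softirq, steal, ...
    # guest and guest_nice are already counted in user and nice, so sum 8 fields.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate CPU line (the first line of /proc/stat)."""
    with open(path) as stat_file:
        return stat_file.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # illustrative sampling interval
    second = read_cpu_line()
    print(f"iowait over 5 s: {iowait_percent(first, second):.1f}%")
```

Running this during a problematic transfer and again while the server is idle gives two figures to compare; a much higher value during the transfer would point towards storage rather than CPU or memory.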

 

Vital statistics:

  • Unraid v6.12.6
  • Dell PowerEdge R720XD
  • Diagnostics are attached

 

Edited by ScottAS2

Solved by tpill90

  • 1 month later...
  • Solution

What drives (model number is helpful) are you running in the server? Do they have a DRAM cache?

 

This sounds like your drives simply can't handle the write workload. Once the SLC cache on the drive is exhausted, the remaining writes go at the drive's native speed, which is much slower. The huge drop in performance causes heavy contention between the applications writing to disk, amplifying the performance issues they are all having.

Reads aren't affected because they don't depend on the write cache. They will generally run at full speed.
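
To put rough numbers on that drop, here is a small back-of-the-envelope sketch. Every figure in it (cache size, fast and slow write speeds, transfer size) is a made-up assumption for illustration, not a measurement of any particular drive.

```python
def transfer_time_seconds(transfer_gb, cache_gb, fast_mb_s, slow_mb_s):
    """Seconds to write transfer_gb when the first cache_gb are written
    at fast_mb_s and everything after that at slow_mb_s."""
    fast_part_gb = min(transfer_gb, cache_gb)
    slow_part_gb = max(transfer_gb - cache_gb, 0)
    # 1 GB is treated as 1000 MB throughout.
    return fast_part_gb * 1000 / fast_mb_s + slow_part_gb * 1000 / slow_mb_s


def average_mb_s(transfer_gb, cache_gb, fast_mb_s, slow_mb_s):
    """Average write speed in MB/s over the whole transfer."""
    seconds = transfer_time_seconds(transfer_gb, cache_gb, fast_mb_s, slow_mb_s)
    return transfer_gb * 1000 / seconds


if __name__ == "__main__":
    # Hypothetical drive: 20 GB of fast cache at 500 MB/s, then 50 MB/s.
    seconds = transfer_time_seconds(100, 20, 500, 50)
    print(f"100 GB takes {seconds:.0f} s, "
          f"averaging {average_mb_s(100, 20, 500, 50):.1f} MB/s")
```

With these assumed figures, the first 20 GB take 40 seconds and the remaining 80 GB take 1600 seconds, so anything else writing to the same drive during that second phase is competing for a small fraction of the headline speed.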

 

 

  • Author

This chimes with a discovery I've made in the meantime: bypassing the cache and writing directly to the array seems to solve the problem. Both cache drives (BTRFS/RAID1) are Crucial BX500 1TB SATA SSDs; part number CT1000BX500SSD1. I'm not sure how to find out if they have a DRAM cache, although the data sheet makes no mention of it, which probably means "no".
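
A sustained-write probe is one way to see whether a drive's write speed collapses partway through a long transfer, whatever the data sheet says. The sketch below is only an illustration: the function name, the path /mnt/cache/write_probe.tmp, and the 50 GB total are assumptions to adjust. It writes a lot of data, which adds wear to an SSD and may itself trigger the slowdown, so it is best run while nothing important depends on the pool.

```python
import os
import time


def measure_write_throughput(path, total_bytes, chunk_bytes=64 * 1024 * 1024):
    """Write total_bytes to path in chunks and return the MB/s of each chunk.

    The temporary file is deleted afterwards, even if writing fails.
    """
    rates = []
    written = 0
    try:
        with open(path, "wb") as probe:
            while written < total_bytes:
                size = min(chunk_bytes, total_bytes - written)
                # Random data, generated before the timer starts, so that
                # compression cannot flatter the result.
                block = os.urandom(size)
                start = time.perf_counter()
                probe.write(block)
                probe.flush()
                os.fsync(probe.fileno())  # force the chunk out to the device
                elapsed = max(time.perf_counter() - start, 1e-9)
                rates.append(size / 1e6 / elapsed)
                written += size
    finally:
        if os.path.exists(path):
            os.remove(path)
    return rates


if __name__ == "__main__":
    # Hypothetical path and size; point this at the pool being investigated.
    results = measure_write_throughput("/mnt/cache/write_probe.tmp", 50 * 1000**3)
    for index, rate in enumerate(results):
        print(f"chunk {index}: {rate:.0f} MB/s")
```

A run where the per-chunk figures start high and then settle at a much lower level would be consistent with a write cache filling up.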

  • 4 months later...
  • Author

Ultimately, I "solved" this by adding two new SSDs and creating a second pool on them for the appdata and system shares. That suggests tpill90 was very much on the right track, if not correct.

  • ScottAS2 changed the title to [SOLVED] High inward network traffic upsets Docker
  • Community Expert
On 2/21/2024 at 10:08 AM, ScottAS2 said:

I'm not sure how to find out if they have a DRAM cache, although the data sheet makes no mention of it, which probably means "no".

Here is a review of the Crucial BX500 series of SSDs:

 

      https://www.pcworld.com/article/394337/crucial-bx500-sata-ssd-review.html

 

Read the section under Performance about what happens to the BX500 when the SLC cache is filled and the QLC is being written to in real time. This information, apparently, is not included on most spec sheets, so you have to look for reviews that actually test for it! (SLC memory is much more expensive per bit than QLC memory.)
