[6.8.3] docker image huge amount of unnecessary writes on cache

bonienl · May 11, 2020

I am not saying it is not a problem ...

KriS · May 11, 2020

Today found this post...

I run iotop for few hours:

Start -> Mon May 11 11:45:17 CEST 2020
End -> Mon May 11 18:49:54 CEST 2020

root@VDS:~# iotop -oa
Total DISK READ :       0.00 B/s | Total DISK WRITE :      24.44 M/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:      24.18 M/s
  TID  PRIO  USER     DISK READ DISK WRITE>  SWAPIN      IO    COMMAND
 6169 be/0 root         65.20 M    100.45 G  0.00 %  1.06 % [loop2]
18510 be/4 root         58.20 M   1852.94 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18491 be/4 root         59.39 M   1839.84 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
10221 be/4 root         59.02 M   1809.12 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18515 be/4 root         57.02 M   1775.67 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18601 be/4 root         58.44 M   1751.80 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18600 be/4 root         60.01 M   1746.48 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18492 be/4 root         59.45 M   1741.75 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18502 be/4 root         58.53 M   1730.81 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18451 be/4 root         58.43 M   1666.26 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
18505 be/4 root         60.94 M   1611.01 M  0.00 %  0.03 % shfs /mnt/user -disks 511 2048000000 -o noatime,allow_other -o remember=0
 6192 be/4 root        140.00 K   1516.05 M  0.00 %  0.11 % [btrfs-transacti]
 [...]

Everything after last line is less then 200M writes.

I've new SSD disk with 400 TBW and 5y warranty

PTRFRLL · May 11, 2020

Another affected user here. My SSD has 499.90 TB in writes just over a year in use!

Echoing what others have said, I have 2 x SSDs in a btrfs unencrypted RAID-0 setup for cache. I see anywhere from 25Mb/s to 80+Mb/s writes on the cache pool.

I have another Unraid server, set up the same way (6.8.3, 2xSSD btrfs) that does not seem to be affected. The main difference between the two systems is the dockers that are running.

In my case, the official Plex, MariaDB, and Zoneminder dockers seem to be the main offenders. If I disable those, I can drop the writes to ~ 200Kb/s. Plex I switched to linuxserver but unfortunately I can't simply disable my CCTV software...

Edited May 11, 2020 by PTRFRLL
Add more details

Niklas · May 11, 2020

I see the same regarding MariaDB. Lots of small writes to my databases but it generates huge amount of writes to the ssds. For now, I have my databases on one of my unassigned devices but that's not optimal. Just to save some ssd cache wear. I still see lots of writes to my cache from other dockers.

hovee · May 11, 2020

EDIT: Nevermind... still screwed.

I posted earlier that I'm affected by this issue as well. I am using a btrfs unencrypted cache pool with 2x 1TB Samsung 860 EVO SSDs. One thought occurred to me that I wanted to run by everyone. For your containers that you have running do you have them set to /mnt/cache/appdata or /mnt/user/appdata? I had all of mine set to /mnt/cache/appdata. Every time I would install a new container I would change the variable path to use cache. I just went through all of my containers that I have running and switched them to /mnt/user/appdata. Right now I'm at 2gb writes for loop2 writes after 10 minutes.

I need to do more testing and see if this is any better. I'm grasping at straws as my SSDs are trashed now.

I currently have 25 containers running:

1activegeek/airconnect

grafana/grafana

homeassistant/home-assistant

influxdb:latest

jlesage/filebot

jlesage/filezilla

jlesage/firefox

jlesage/handbrake

linuxserver/ddclient

linuxserver/duckdns

linuxserver/letsencrypt

linuxserver/mariadb

linuxserver/nextcloud

linuxserver/plex

linuxserver/unifi-controller

modenaf360/youtube-dl-nas

oznu/homebridge

portainer/portainer

saspus/duplicacy-web

spants/mqtt

telegraf:alpine

Edited May 11, 2020 by hovee

Niklas · May 11, 2020

I have tried that. Pointing to /mnt/cache/appdata or /mnt/user/appdata. Made no big difference. Just other names in iotop for the processes that eats away data.

Edit

shfs vs the real process name

Edited May 11, 2020 by Niklas

hovee · May 11, 2020

1 minute ago, Niklas said:

I have tried that. Pointing to /mnt/cache/appdata or /mnt/user/appdata. Made no big difference. Just other names in iotop for the processes that eats away data.

yep.. I was just doing some math from my post above and came to the conclusion it made no difference and I'm still screwed...

nas_nerd · May 12, 2020

Bad news on my end, my writes have crept back up again, and arguably just as bad as it was before I switched my cache to XFS.

This suggests it is a docker issue again.

I need to do more testing but at this stage I am suspecting the Nextcloud/MariaDB dockers are causing the excessive writes.

I would be interested if other people are using these dockers and can test by stopping them and tracking writes?

I need to back-track from blaming BTRFS now too, at least in my case, the file system does not appear to be a factor in excessive writes.

grigsby · May 12, 2020

24 minutes ago, nas_nerd said:

I would be interested if other people are using these dockers and can test by stopping them and tracking writes?

I am not using either nextcloud or mariadb.

I wonder if there's a faster way to test this. I have two SSDs in the server, since the original plan was to have them both in a (btrfs) cache pool. I had to remove the pool, and reformat one of the drives as xfs, which is now running as a single cache drive.

The second SSD is now just sitting idle, unformatted, not mounted, doing nothing. Is there an easy way to format the second SSD as btrfs, mirror the contents of the cache disk to the btrfs disk, and tell unraid to use SSD_2 as the cache? It might make it easier to switch back and forth a little faster between an xfs cache disk and a btrfs cache disk and watch what happens with I/O.

S1dney · May 12, 2020

20 hours ago, bonienl said:

The priority "Urgent" means something is seriously wrong and prevents the system from working normally.

This is not really the case here...

The priority "Minor" may sound as insignificant, but it does mean Limetech is looking into the issue and address it as appropriate.

You're right, which is why @itimpi's suggestion of having another category in between sounds like a good one.

8 hours ago, hovee said:

yep.. I was just doing some math from my post above and came to the conclusion it made no difference and I'm still screwed...

The unRAID file system is not the issue here. To get the writes down you have to take the loopdevice (loop2) out of the equation by mounting the docker directory in the filesystem (e.g. creating a symlink from /var/lib/docker to a location on disk/cache). The error seems to be docker in combination with the loopdevice, reading though some comments it's not entirely clear where (and if) btrfs (with or without encryption and/or pooling) has a relation with this problem also, my guess is yes.

There has been a lot of reports of certain docker containers (like the official Plex) writing a lot also, so it's easy to mistake that and this bug, also since it's possible you're affected by both of them

To have this solved you'd just need some devs on board. Someone that spins up a test machine and who is able to understand how this relates to page flushes etc (sadly this goes beyond my knowledge).

Then again I'm sure the devs have other issues also which might have more priority at the moment, although priority is usually driven by community calls right? So this might shift

bastl · May 12, 2020

3 hours ago, S1dney said:

reading though some comments it's not entirely clear where (and if) btrfs (with or without encryption and/or pooling) has a relation with this problem also, my guess is yes.

I use a single btrfs drive unencrypted and have the same issue.

3 hours ago, S1dney said:

There has been a lot of reports of certain docker containers (like the official Plex) writing a lot also, so it's easy to mistake that and this bug, also since it's possible you're affected by both of them

Earlier I already reported my findings. For me it doesn't matter which docker is up and running or if all dockers are stopped. As soon as I enable docker in Unraid I see that increased writes. Dissabling docker itself, Boom, problem disapears. Enabling it with no container running, tada, writes from loop2 are back with 2-5mb/s. Most docker containers people are reported, for myself I don't even use. No Plex, no download managers. Sure, you can reduce the amount of writes by disabling a docker, but it doesn't change the behaviour. Containers like unifi, netdata or nextcloud for example will always produce writes if some monitoring is enabled or mobile devices randomly connecting and checking for new files. Let's hope someone will figure this out. Maybe the next Unraid with a newer docker engine will already have a fix for this. Who knows.

S1dney · May 12, 2020

14 minutes ago, bastl said:

I use a single btrfs drive unencrypted and have the same issue.

Earlier I already reported my findings. For me it doesn't matter which docker is up and running or if all dockers are stopped. As soon as I enable docker in Unraid I see that increased writes. Dissabling docker itself, Boom, problem disapears. Enabling it with no container running, tada, writes from loop2 are back with 2-5mb/s. Most docker containers people are reported, for myself I don't even use. No Plex, no download managers. Sure, you can reduce the amount of writes by disabling a docker, but it doesn't change the behaviour. Containers like unifi, netdata or nextcloud for example will always produce writes if some monitoring is enabled or mobile devices randomly connecting and checking for new files. Let's hope someone will figure this out. Maybe the next Unraid with a newer docker engine will already have a fix for this. Who knows.

That seems to be exactly the issue I was facing indeed.

A container that wrote a lot, would just bump up writes a lot faster but in general every write docker makes seemed to just get multiplied by who knows what factor

So that could potentially rule out encryption and pooling of disks and would leave the combination between BTRFS and a loop device, I don't believe XFS was affected by this (I heard/read some users reporting XFS did not had these issues).

Edited May 12, 2020 by S1dney

chanrc · May 12, 2020

I'm 3 days now since the switch to a single un-encrypted XFS cache and consistently getting better results. loop2 is producing only ~9MB/min during idle for writes with all my dockers started (included binhex Plex, sonarr, radarr, sabnzbd, deluge, mariaDB, nextcloud, letsencrypt, cloudflare-ddns, pihole, ombi, grafana, teleconf, influxDB) compared to the ~8MB/s I was seeing before after stopping all my dockers but only having docker enabled on my un-encrypted BTRFS cache.

Not sure what the trigger for @nas_nerd's XFS issue but I can't repro it with mariaDB and nextcloud enabled (no user connects in the last 3 days though, maybe I should try and upload something).

over 10 minutes using iotop -oa -d 600

over four hours using iotop -oa -d 14400 with several small uploads to nextcloud and a couple of downloads.

Edited May 12, 2020 by chanrc
Adding four hour screenshot

pottlepaul · May 12, 2020

I am not sure if it would be helpful to anyone, but I moved the system share from the cache to the array as a temporary measure and it seems to have helped a lot. I went from an average of 160 GB / day of write to around 60 GB / day. Not perfect, but there is a good chance that is what my dockers are actually using as I have a lot running. Just wanted to share what I found to prevent beating on my SSDs a little.

S1dney · May 12, 2020

38 minutes ago, pottlepaul said:

I am not sure if it would be helpful to anyone, but I moved the system share from the cache to the array as a temporary measure and it seems to have helped a lot. I went from an average of 160 GB / day of write to around 60 GB / day. Not perfect, but there is a good chance that is what my dockers are actually using as I have a lot running. Just wanted to share what I found to prevent beating on my SSDs a little.

By this I think you’re essentially moving the docker image (and thus the mount on /var/lib/docker) onto the array. So these writes should not go to the cache anymore, I guess.

Docker will keep you array up non-stop though, which kind op defeats unRAID selling point in being able to spin down disks.

When you combine this with the unassigned disks plugin, you might be able to put it on a single disk for now (I think, haven’t used the plugin before) and have the array still fall asleep.

Good suggestion for some people that are not into making unsupported CLI/script tweaks, thanks for sharing!

Also as @chanrc is reporting, this really looks to be btrfs related, which is sad, cause it’s your only option if you want to have a redundant cache.

Edited May 12, 2020 by S1dney

hovee · May 12, 2020

On 3/21/2020 at 5:36 AM, S1dney said:

I've read though my posts real quick and noticed I have not yet provided my final solution, so let me share that.

Basically what I did was:

I tried following your tutorial, but after reboot it tells me I don't have any docker containers installed.

It shows me the docker service is running.

Checking the logs I can see it created the softlinks.

If I remove that entry to copy the rc.docker file in the /boot/config/go directory everything works again after a reboot. However, then it puts it back to original and uses loop2.

S1dney · May 12, 2020

19 minutes ago, hovee said:

I tried following your tutorial, but after reboot it tells me I don't have any docker containers installed.

It shows me the docker service is running.

Checking the logs I can see it created the softlinks.

If I remove that entry to copy the rc.docker file in the /boot/config/go directory everything works again after a reboot. However, then it puts it back to original and uses loop2.

It behaves as expected.

All your docker containers are downloaded into the docker image (located in the system folder somewhere on the cache, docker.img is the file I think).

After the changes you’ve made, you’re not mounting that anymore, but have docker targeted to a different directory.

Docker will create the needed working directories upon service start, meaning that all container are still inside the docker.img file.

I initially re-created them, using the templates from the dockerMan GUI this isn’t too much work and all persistent data should not reside in the docker.img anyways or you might lose it if the docker.img gets corrupted. I guess you could also copy all data over before implementing the script that mounts the cache directory but I would recreate the containers if I were you.

You should also recreate the docker.img image if you’re done with everything, so that when something changes in future unRAID versions which potentially breaks this, you’ll notice that you have no containers after a reboot and know the docker.img file is mounted or something else is wrong :-)

hovee · May 12, 2020

2 hours ago, S1dney said:

It behaves as expected.

All your docker containers are downloaded into the docker image (located in the system folder somewhere on the cache, docker.img is the file I think).

After the changes you’ve made, you’re not mounting that anymore, but have docker targeted to a different directory.

Docker will create the needed working directories upon service start, meaning that all container are still inside the docker.img file.

I initially re-created them, using the templates from the dockerMan GUI this isn’t too much work and all persistent data should not reside in the docker.img anyways or you might lose it if the docker.img gets corrupted. I guess you could also copy all data over before implementing the script that mounts the cache directory but I would recreate the containers if I were you.

You should also recreate the docker.img image if you’re done with everything, so that when something changes in future unRAID versions which potentially breaks this, you’ll notice that you have no containers after a reboot and know the docker.img file is mounted or something else is wrong 🙂

Thank you for the explanation. That makes sense. I will give that a try!

NewDisplayName · May 13, 2020

how to check if this is a problem for me?

one shows 934731420222 LBAS (2y6m)
the other 899487348326 LBAS (2y4m)

if im right its about 500gb a day, which seems much (?!)

sadly iotop is not working for me (was already installed)

root@Unraid-Server:~# iotop
libffi.so.7: cannot open shared object file: No such file or directory
To run an uninstalled copy of iotop,
launch iotop.py in the top directory

root@Unraid-Server:~# iotop -ao
libffi.so.7: cannot open shared object file: No such file or directory
To run an uninstalled copy of iotop,
launch iotop.py in the top directory
root@Unraid-Server:~# iotop.py
-bash: iotop.py: command not found
root@Unraid-Server:~# py iotop.py
-bash: py: command not found
root@Unraid-Server:~# python iotop.py
python: can't open file 'iotop.py': [Errno 2] No such file or directory

/dev/sdd Power_On_Hours 22510 hours / 937 days / 2.57 years
/dev/sdd Wear_Leveling_Count 44 (% health)
/dev/sdd Total_LBAs_Written 445842.14 gb / 435.39 tb
/dev/sdd mean writes per hour: 19.806 gb / 0.019 tb

/dev/sdc Power_On_Hours 20447 hours / 851 days / 2.33 years
/dev/sdc Wear_Leveling_Count 46 (% health)
/dev/sdc Total_LBAs_Written 428919.80 gb / 418.87 tb
/dev/sdc mean writes per hour: 20.977 gb / 0.020 tb

is that normal?

Idle windows 10 VM + the usual linux iso download dockers and some light plex...

I also wonder why limetech isnt posting anymore here, i mean its 6 months after he read it, was it addressed in some RC?

Edited May 13, 2020 by nuhll

nas_nerd · May 13, 2020

1 hour ago, nuhll said:

how to check if this is a problem for me?

one shows 934731420222 LBAS (2y6m)
the other 899487348326 LBAS (2y4m)

if im right its about 500gb a day, which seems much (?!)

sadly iotop is not working for me (was already installed)

root@Unraid-Server:~# iotop
libffi.so.7: cannot open shared object file: No such file or directory
To run an uninstalled copy of iotop,
launch iotop.py in the top directory

root@Unraid-Server:~# iotop -ao
libffi.so.7: cannot open shared object file: No such file or directory
To run an uninstalled copy of iotop,
launch iotop.py in the top directory
root@Unraid-Server:~# iotop.py
-bash: iotop.py: command not found
root@Unraid-Server:~# py iotop.py
-bash: py: command not found
root@Unraid-Server:~# python iotop.py
python: can't open file 'iotop.py': [Errno 2] No such file or directory

/dev/sdd Power_On_Hours 22510 hours / 937 days / 2.57 years
/dev/sdd Wear_Leveling_Count 44 (% health)
/dev/sdd Total_LBAs_Written 445842.14 gb / 435.39 tb
/dev/sdd mean writes per hour: 19.806 gb / 0.019 tb

/dev/sdc Power_On_Hours 20447 hours / 851 days / 2.33 years
/dev/sdc Wear_Leveling_Count 46 (% health)
/dev/sdc Total_LBAs_Written 428919.80 gb / 418.87 tb
/dev/sdc mean writes per hour: 20.977 gb / 0.020 tb

is that normal?

Idle windows 10 VM + the usual linux iso download dockers and some light plex...

I also wonder why limetech isnt posting anymore here, i mean its 6 months after he read it, was it addressed in some RC?

Make sure you have libffi installed and updated to get iotop to work.

I'm sure limetech are aware of this issue, they are no doubt busy on 6.9 at the moment, hopefully after it is released they can turn their focus to this issue.

goodGame · May 13, 2020

7 hours ago, nas_nerd said:

Make sure you have libffi installed and updated to get iotop to work.

I'm sure limetech are aware of this issue, they are no doubt busy on 6.9 at the moment, hopefully after it is released they can turn their focus to this issue.

LT have known about this since atleast December of last year, I really hope what you're saying isn't true.

woble · May 13, 2020

Switched to XFS, rebuilt the docker.img with the same dockers as I had before. Now `loop2` is around 9-30MB/min.

For those who are still on BTRFS perhaps it's worth trying removing the docker.img and re-adding all dockers. It might help. I should have tried that first.

My original post: https://forums.unraid.net/bug-reports/stable-releases/683-docker-image-huge-amount-of-unnecessary-writes-on-cache-r733/?do=findComment&comment=9019

Edit: after having it run for almost a day, the `loop2` write seems to be stable at around 1.7GB/h.

Edited May 15, 2020 by woble

grigsby · May 13, 2020

17 hours ago, nas_nerd said:

I'm sure limetech are aware of this issue, they are no doubt busy on 6.9 at the moment, hopefully after it is released they can turn their focus to this issue.

Maybe, but you said the same thing about them working on 6.8 when this bug was reported back in version 6.7 and we still haven't heard anything from them about it. This is not a minor issue -- I suspect that it's actually happening for a LOT of installs, but most people don't know it's happening because it requires actually looking for it.

This is a serious bug -- potentially costing users a lot of money in trashed SSDs. And this is commercial software -- we're paying for a license to use it, so it's not just a FOSS project where expectations should be low. In my opinion, LimeTech should be investigating this issue and prioritizing it before releasing the next version.

muwahhid · May 14, 2020

This is just a disaster.

I've been using Unraid for almost a month now. I put btrfs raid1 in the cache without encryption. Two brand new samsung ssd 860 2tb. Just yesterday I paid for the Unraid pro license for $ 89.
Today I accidentally learned from a person that a forum is discussing this problem. I decided to check my SSD. And now I ask you, are you kidding me guys? My two new SSDs already have 90tbw, and 28 percent of my life. THAT is 1 percent life ssd for every day?

And this problem has been known to you since the month of November, six months have passed and you still haven’t fixed it?
I'm just in shock. All impression spoiled about Unraid and Limetech.

Edited May 14, 2020 by muwahhid

nas_nerd · May 14, 2020

41 minutes ago, muwahhid said:

This is just a disaster.

I've been using Unraid for almost a month now. I put btrfs raid1 in the cache without encryption. Two brand new samsung ssd 860 2tb. Just yesterday I paid for the Unraid pro license for $ 89.
Today I accidentally learned from a person that a forum is discussing this problem. I decided to check my SSD. And now I ask you, are you kidding me guys? My two new SSDs already have 90tbw, and 28 percent of my life. THAT is 1 percent life ssd for every day?

And this problem has been known to you since the month of November, six months have passed and you still haven’t fixed it?
I'm just in shock. All impression spoiled about Unraid and Limetech.

How are you calculating 90TBW?

[6.8.3] docker image huge amount of unnecessary writes on cache

User Feedback

Recommended Comments

bonienl 1768

Link to comment

KriS 0

Link to comment

PTRFRLL 32

Link to comment

Niklas 57

Link to comment

hovee 4

Link to comment

Niklas 57

Link to comment

hovee 4

Link to comment

nas_nerd 0

Link to comment

grigsby 6

Link to comment

S1dney 49

Link to comment

bastl 208

Link to comment

S1dney 49

Link to comment

chanrc 0

Link to comment

pottlepaul 1

Link to comment

S1dney 49

Link to comment

hovee 4

Link to comment

S1dney 49

Link to comment

hovee 4

Link to comment

NewDisplayName 117

Link to comment

nas_nerd 0

Link to comment

goodGame 3

Link to comment

woble 0

Link to comment

grigsby 6

Link to comment

muwahhid 19

Link to comment

nas_nerd 0

Link to comment

Join the conversation