• [6.8.3] docker image huge amount of unnecessary writes on cache


    S1dney
    • Solved Urgent

    EDIT (March 9th 2021):

    Solved in 6.9 and up. Reformatting the cache to the new partition alignment and hosting docker directly on a cache-only directory brought writes down to a bare minimum.

     

    ###

     

    Hey Guys,

     

    First of all, I know that you're all very busy getting version 6.8 out there, something I'm very much waiting on as well. I'm seeing great progress, so thanks so much for that! I don't expect this to be at the top of the priority list, but I'm hoping someone on the developer team is willing to invest some time (perhaps after the release).

     

    Hardware and software involved:

    2 x 1TB Samsung EVO 860, set up with LUKS encryption in a BTRFS RAID1 pool.

     

    ###

    TLDR (but I'd suggest to read on anyway 😀)

    The image file mounted as a loop device is causing massive writes on the cache, potentially wearing out SSDs quite rapidly.

    This appears to happen only on encrypted caches formatted with BTRFS (maybe only in a RAID1 setup, but I'm not sure).

    Hosting the Docker files directory on /mnt/cache instead of using the loop device seems to fix this problem.

    A possible idea for an implementation is proposed at the bottom.

     

    Grateful for any help provided!

    ###

     

    I have written a topic in the general support section (see link below), but I have done a lot of research lately and think I have gathered enough evidence pointing to a bug. I was also able to build (kind of) a workaround for my situation. More details below.

     

    So to see what was actually hammering the cache I started with all the obvious things, like using a lot of find commands to trace files that were written to every few minutes, and I also used the fileactivity plugin. Neither was able to trace down any writes that would explain 400 GB worth of writes a day for just a few containers that aren't even that active.

     

    Digging further I moved the docker.img to /mnt/cache/system/docker/docker.img, so directly on the BTRFS RAID1 mountpoint. I wanted to check whether the unRAID FS layer was causing the loop2 device to write this heavily. No luck either.

    This gave me a situation I was able to reproduce on a virtual machine though, so I started with a recent Debian install (I know, it's not Slackware, but I had to start somewhere ☺️). I created some vDisks, encrypted them with LUKS, bundled them in a BTRFS RAID1 setup, created the loop device on the BTRFS mountpoint (same as /dev/cache) and mounted it on /var/lib/docker. I made sure I had the NoCOW flag set on the IMG file like unRAID does. Strangely this did not show any excessive writes; iotop showed really healthy values for the same workload (I migrated the docker content over to the VM).

     

    After my Debian troubleshooting I went back over to the unRAID server, wondering whether the loop device was being created weirdly, so I took the exact same steps to create a new image and pointed the settings from the GUI there. Still the same write issues.

     

    Finally I decided to put the whole image out of the equation and took the following steps:

    - Stopped docker from the WebGUI so unRAID would properly unmount the loop device.

    - Modified /etc/rc.d/rc.docker to not check whether /var/lib/docker was a mountpoint.

    - Created a share on the cache for the docker files.

    - Created a softlink at /var/lib/docker pointing to /mnt/cache/docker.

    - Started docker using "/etc/rc.d/rc.docker start".

    - Started my Bitwarden containers.

     

    Looking into the stats with "iotop -ao" I did not see any excessive writing taking place anymore.

    I had the containers running for about 3 hours and got maybe 1GB of writes in total (note that on the loop device this gave me 2.5GB every 10 minutes!)
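
    To put those two measurements on the same scale, they can be extrapolated to a full day. This is only a back-of-the-envelope sketch in Python: the function name is made up for illustration, and it assumes the write rate stays constant over 24 hours, which real workloads will not do exactly.

```python
def gb_per_day(gigabytes: float, minutes: float) -> float:
    """Scale a write volume observed over `minutes` to a 24-hour rate."""
    return gigabytes / minutes * 24 * 60


# Figures taken from the post: 2.5 GB per 10 minutes on the loop device,
# and roughly 1 GB in about 3 hours with docker on a plain directory.
loop_device = gb_per_day(2.5, 10)
direct_dir = gb_per_day(1.0, 3 * 60)

print(f"loop device:      {loop_device:.0f} GB/day")
print(f"direct directory: {direct_dir:.0f} GB/day")
print(f"ratio:            {loop_device / direct_dir:.0f}x")
```

    The loop device figure works out to about 360 GB a day, which is in the same range as the 300/400GB a day mentioned elsewhere in this post, against roughly 8 GB a day for the directory setup.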

     

    Now don't get me wrong, I understand why the loop device was implemented. Dockerd is started with options to make it run with the BTRFS driver, and since the image file is formatted with the BTRFS filesystem this works on every setup; it doesn't matter whether it runs on XFS, EXT4 or BTRFS, it will just work. In my case I had to point the softlink to /mnt/cache because pointing it to /mnt/user would not allow me to start docker using the BTRFS driver (obviously the unRAID filesystem isn't BTRFS). Also the WebGUI has commands to scrub the filesystem inside the container; it is all based on the assumption that everyone is using docker on BTRFS (which of course they are because of the container 😁)

    I must say that my approach also broke when I changed something in the shares; certain services get restarted, causing docker to be turned off for some reason. No big issue since it wasn't meant to be a long-term solution, just a way to see whether the loop device was causing the issue, which I think my tests did point out.

     

    Now I'm at the point where I definitely need some developer help. I'm currently keeping nearly all docker containers off all day because 300/400GB worth of writes a day is just a BIG waste of expensive flash storage, especially since I've shown that it's not needed at all. It does defeat the purpose of my NAS and SSD cache though, since its main purpose was hosting docker containers while allowing the HDs to spin down.

     

    Again, I'm hoping someone on the dev team acknowledges this problem and is willing to invest some time. I did get quite a few hits on the forums and Reddit, but without anyone actually pointing out the root cause of the issue.

     

    I'm missing the technical know-how to troubleshoot the loop device issues on a lower level, but I have been thinking about possible ways to implement a workaround, like adjusting the Docker Settings page to switch off the use of a vDisk and, if all requirements are met (pointing to /mnt/cache and BTRFS formatted), starting docker on a share on the /mnt/cache partition instead of using the vDisk.

    In this way you would still keep all advantages of the docker.img file (cross filesystem type) and users who don't care about writes could still use it, but you'd be massively helping out others that are concerned over these writes.
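
    To make the proposed requirement check a bit more concrete, here is a rough Python sketch of what "pointing to /mnt/cache and BTRFS formatted" could look like. It is not how unRAID implements anything: the function names and the sample mount table are invented for illustration, and the only thing it relies on is the usual /proc/mounts layout (device, mount point, filesystem type, options).

```python
from pathlib import PurePosixPath
from typing import Optional


def filesystem_type(path: str, mounts_text: str) -> Optional[str]:
    """Return the fstype of the deepest mount point that contains `path`."""
    target = PurePosixPath(path)
    best_depth, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = PurePosixPath(fields[1]), fields[2]
        inside = target == mount_point or mount_point in target.parents
        if inside and len(mount_point.parts) > best_depth:
            best_depth, best_type = len(mount_point.parts), fstype
    return best_type


def can_run_without_vdisk(path: str, mounts_text: str) -> bool:
    """Hypothetical check: path lives under /mnt/cache and sits on BTRFS."""
    target = PurePosixPath(path)
    cache = PurePosixPath("/mnt/cache")
    on_cache = target == cache or cache in target.parents
    return on_cache and filesystem_type(path, mounts_text) == "btrfs"


# Invented sample in /proc/mounts format; device names are placeholders.
SAMPLE_MOUNTS = """\
rootfs / rootfs rw 0 0
/dev/mapper/sdb1 /mnt/cache btrfs rw,noatime 0 0
shfs /mnt/user fuse.shfs rw,nosuid 0 0
"""

print(can_run_without_vdisk("/mnt/cache/docker", SAMPLE_MOUNTS))
print(can_run_without_vdisk("/mnt/user/docker", SAMPLE_MOUNTS))
```

    On a real system the mount table would be read from /proc/mounts instead of the sample string, and anything that fails the check would simply keep using the vDisk.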

     

    I'm not attaching diagnostic files since they would probably not show what's needed.

    Also if this should have been in feature requests, I'm sorry. But I feel that, since the solution is misbehaving in terms of writes, this could also be placed in the bugreport section.

     

    Thanks though for this great product, have been using it so far with a lot of joy! 

    I'm just hoping we can solve this one so I can keep all my dockers running without the cache wearing out quickly.

     

    Cheers!

     




    User Feedback

    Recommended Comments



    Niklas

    Posted (edited)

    1 hour ago, tjb_altf4 said:

    Is it as simple as a folder delete to destroy what was previously an image file ?

     

    I just changed to directory, added my docker containers back and deleted docker.img when all looked good.

    Edited by Niklas
    Link to comment
    1 hour ago, tjb_altf4 said:

    Is it as simple as a folder delete to destroy what was previously an image file ?

    I would guess, that once setup in the new 6.9 options, that it would be the same as rebuilding your docker.img - as in, resetting up your templates/downloading images again.

    Just this time it writes directly to the file system.  

    Once I've got all the other bugs with 6.9 RC2 worked out, I may give it a go at the weekend, setup a share just for the docker images and see how it goes.

    Link to comment

    OT, I just picked up an old server locally and it had a few Samsung 840DC Pro 400gb SSD's in it. I was not even aware these existed.

     

    They have a 7.3 PETABYTE write rating! Performance is pretty good on them as well for a SATA based drive, particularly considering they are from 2014.

     

    These were not even broken in with 13TB on some and 100TB on others written.

    Link to comment

    Docker Stop The Service, Delete the image, Switch to be a folder, Restart the service, Reinstall via Previous Apps

    Link to comment

    I'm wondering whether it's possible to copy the contents from inside the image to the new folder and avoid reinstalling the dockers?

    Link to comment
    29 minutes ago, Squid said:

    You'll spend more time and aggravation doing that than reinstalling.

    if your images are still available on docker hub..

    Link to comment
    52 minutes ago, danielb7390 said:

    I'm wondering whether it's possible to copy the contents from inside the image to the new folder and avoid reinstalling the dockers?


    Without proper tooling the layers from the BTRFS will likely inflate and the image data won't be usable.

    With proper tooling this may/could work I think but just wiping is way easier 😂

    Link to comment
    47 minutes ago, uldise said:

    if your images are still available on docker hub..

    Then probably time to switch to a different repository

    Link to comment
    14 minutes ago, Squid said:

    Then probably time to switch to a different repository

    huh, great tools are on free docker accounts - and according to new docker hub rules, all "unused" images are gone very soon..

    Link to comment
    56 minutes ago, Squid said:

    Then probably time to switch to a different repository

    Assuming an alternative exists.

    Link to comment
    1 hour ago, BRiT said:

    Edit: They do not enforce image retention and have moved over to resource pull limits instead:

    mid of 2021 will be there very soon :) 

    Link to comment
    20 minutes ago, elmetal said:

    wouldn't libvirt.img also benefit from this change?

     

    Since its contents are mostly static, probably not very much.

     

    Link to comment
    2 hours ago, John_M said:

     

    Since its contents are mostly static, probably not very much

    That's true.

     

    As someone who just upgraded from 6.8.3 to 6.9, how would you go about this? Move everything to array, remove cache pool, create new cache pool, move everything back, change docker location to folder in the docker settings, add each docker and point to the proper appdata location etc and everything should be good to go?

    Link to comment
    11 hours ago, elmetal said:

    That's true.

     

    As someone who just upgraded from 6.8.3 to 6.9, how would you go about this? Move everything to array, remove cache pool, create new cache pool, move everything back, change docker location to folder in the docker settings, add each docker and point to the proper appdata location etc and everything should be good to go?

    Yes, I just did that. I recreated all containers from templates. I will look at the number of entries. I usually had an average of 2.000.000 entries per day on the btrfs image. These are 74 containers.

    Edited by muwahhid
    Link to comment

    With this option, docker takes up twice as much space with the same containers. )) The btrfs image was 30 gigabytes.

     

     

     

    [Screenshot attachment]

    Link to comment
    Niklas

    Posted (edited)

    57 minutes ago, muwahhid said:

    With this option, docker takes up twice as much space with the same containers. )) The btrfs image was 30 gigabytes.

     

     

     

    [Screenshot attachment]


    My docker dir shows 72-79GB (using du) but my cache drive only shows 19GB used (df -h).
    Edit: Using "Compute..." in the GUI for the "system" share in Unraid shows 85,1 GB used. Cache only.

    Edited by Niklas
    Link to comment

    So coming back on this bug report.

    I upgraded to 6.9 on March 2nd and also wiped the cache to take advantage of the new partition alignment (I have Samsung EVOs and perhaps a portion of OCD 🤣).

    Waited a bit to get historic data.

     

    Pre 6.9

    TBW on 19-02-2021 23:57:01 --> 15.9 TB, which is 16313.7 GB.
    TBW on 20-02-2021 23:57:01 --> 16.0 TB, which is 16344.9 GB.
    TBW on 21-02-2021 23:57:01 --> 16.0 TB, which is 16382.8 GB.
    TBW on 22-02-2021 23:57:01 --> 16.0 TB, which is 16419.5 GB.

    -> Writes somewhere around 34/35 GB a day on average.

     

    6.9

    TBW on 05-03-2021 23:57:01 --> 16.6 TB, which is 16947.4 GB.
    TBW on 06-03-2021 23:57:01 --> 16.6 TB, which is 16960.2 GB.
    TBW on 07-03-2021 23:57:01 --> 16.6 TB, which is 16972.8 GB.
    TBW on 08-03-2021 23:57:01 --> 16.6 TB, which is 16985.3 GB.

    -> Writes around 12/13 GB a day on average.
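
    For anyone keeping a similar TBW log, the daily averages can be worked out with a few lines of Python. This is a minimal sketch, assuming the log lines look exactly like the ones above (day-first dates, one sample per day); the regular expression and the function name are just for illustration.

```python
import re
from datetime import datetime

# Matches lines like:
# "TBW on 19-02-2021 23:57:01 --> 15.9 TB, which is 16313.7 GB."
LINE = re.compile(r"TBW on (\d{2}-\d{2}-\d{4}) .* which is ([\d.]+) GB\.")


def average_daily_writes(log: str) -> float:
    """Average GB written per day between the first and last log sample."""
    samples = []
    for line in log.splitlines():
        match = LINE.search(line)
        if match:
            day = datetime.strptime(match.group(1), "%d-%m-%Y")
            samples.append((day, float(match.group(2))))
    samples.sort()
    (first_day, first_gb), (last_day, last_gb) = samples[0], samples[-1]
    return (last_gb - first_gb) / (last_day - first_day).days


PRE_69 = """\
TBW on 19-02-2021 23:57:01 --> 15.9 TB, which is 16313.7 GB.
TBW on 20-02-2021 23:57:01 --> 16.0 TB, which is 16344.9 GB.
TBW on 21-02-2021 23:57:01 --> 16.0 TB, which is 16382.8 GB.
TBW on 22-02-2021 23:57:01 --> 16.0 TB, which is 16419.5 GB.
"""

WITH_69 = """\
TBW on 05-03-2021 23:57:01 --> 16.6 TB, which is 16947.4 GB.
TBW on 06-03-2021 23:57:01 --> 16.6 TB, which is 16960.2 GB.
TBW on 07-03-2021 23:57:01 --> 16.6 TB, which is 16972.8 GB.
TBW on 08-03-2021 23:57:01 --> 16.6 TB, which is 16985.3 GB.
"""

print(f"pre 6.9: {average_daily_writes(PRE_69):.1f} GB/day")
print(f"6.9:     {average_daily_writes(WITH_69):.1f} GB/day")
```

    For the samples above this comes out at roughly 35 GB a day before the upgrade and just under 13 GB a day after it.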

     

    So I would say 6.9 (and reformatting) made a very big improvement.

    I think most of these savings are due to the new partition alignment, as I was already running docker directly on the cache and recall making a few tweaks suggested here (adding mount options, I cannot remember exactly which).

     

    Thanks @limetech and all other devs for the work put into this.

    This bug report is Closed 👍

    Link to comment
    On 3/1/2021 at 5:59 AM, Squid said:

    Docker Stop The Service, Delete the image, Switch to be a folder, Restart the service, Reinstall via Previous Apps

     

    I'm not sure if it's just my brain melting from reading too many pages of this thread or if I'm just being a dummy.  "Delete the image" I imagine means the file docker.img (/mnt/user/system/docker/docker.img).  "Switch to be a folder", yeah I'm lost haha.  When you reinstall after that through Previous Apps, are all settings and configurations saved?  Forgive my noob questions, but I've got everything running pretty smoothly now and don't want to break it.  Fingers crossed for a layman's step by step for relative newbies :)

    Link to comment
    13 hours ago, CBPXXIV said:

    Switch to be a folder, yeah I'm lost haha.

     

    If you go to Settings -> Docker and stop the docker service a check box labelled "Delete vDisk file" should appear. Tick it and click Delete then change Docker data-root from its default "btrfs vDisk" to "directory". Then you can choose a path for the docker directory. Most people would accept the default but if you want to move it to a different pool then now is the time to do that. When you reinstall the containers via Community Apps -> Previous Apps all your settings are preserved.

    Link to comment

    When using directory I see this:

     

    [Screenshot attachment]


    "Compute..." from the GUI. Stats show 10% usage for my cache drive (not correct if the system share is 83,4GB)

    [Screenshot attachment]


    My dockerdata-dir (on cache) shows 77G used. My cache drive shows 20G used.

    What's wrong here? My docker.img was 30GB with about 50% used.

    Could the cache drive become full without Unraid noticing it?

    Edited by Niklas
    Link to comment
    20 hours ago, John_M said:

     

    If you go to Settings -> Docker and stop the docker service a check box labelled "Delete vDisk file" should appear. Tick it and click Delete then change Docker data-root from its default "btrfs vDisk" to "directory". Then you can choose a path for the docker directory. Most people would accept the default but if you want to move it to a different pool then now is the time to do that. When you reinstall the containers via Community Apps -> Previous Apps all your settings are preserved.

    Thanks John_M.   For reference, if anyone does this and has a custom reverse proxy network set up: yeah, that gets deleted too.  Wish I had realized that beforehand lol.   Completely broke my setup for Emby, Ombi, etc.  swag and several other containers aren't getting IPs now.  hmmm

    Edit:  Figured it out, the network was still showing up as none after switching back to the proxy network.  Force updating each container fixed that.  Good to go.

    Edited by CBPXXIV
    Fixed it
    Link to comment

    Finally updated to 6.9.1. I'm either having great results or something else is causing these great results.

     

    My TBW log used to show ~500 GB/day. 2 days in with 6.9.1 it shows ~15 GB/day, which is like 33x less 🤔

     

    I swapped the SSD for one of the same model (Crucial MX500 1TB); the old one was already at 125TB TBW, this one is at 12.6TB TBW. Installed the exact same dockers. Formatted the new SSD cache as XFS and changed docker to use a folder instead of an image file.
     

     

    Link to comment




