Report Comments posted by S1dney

  1. 17 minutes ago, TexasUnraid said:

    So looking into the appdata writes now, I noticed that lancache was causing a fair amount of writes to its log files. I have never needed those logs, so I disabled the appdata bind for them, left the logs inside the containers instead, and set up a ramdisk for the log folders to stop those writes; they are still accessible inside the container if needed.

     

    After that, the worst offenders by far are the *arr's (aka sonarr etc). The constant updates and logging, which cannot be disabled, adjusted or tweaked in any way, cause quite a few unnecessary writes. I don't need the logs and wish I could disable them until needed, and I also wish I could adjust the update timing; it is way more often than I need. It also causes my drives to spin up in the middle of the night for no reason.

     

    As a workaround I am testing a setup like this:

     

    I added a ramdisk folder to appdata

    In my array startup script I mount a ramdisk in that folder

    In the array start script I rsync all the *arr's appdata folders into the ramdisk

    I changed all those dockers to use the ramdisk

    I set up a new script that will run every 1-3 hours (TBD) and rsync the ramdisk back to appdata, so that I won't lose data in those dockers (at least no more than a few hours' worth).
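
    Roughly, the idea looks like this (paths and the sonarr example are just placeholders, not the exact scripts):

    #!/bin/bash
    # Placeholder layout -- adjust paths to your own appdata share
    RAMDISK=/mnt/user/appdata/ramdisk

    # At array start: mount a 4GB tmpfs and seed it from the on-disk appdata
    mountpoint -q "$RAMDISK" || mount -t tmpfs -o size=4g tmpfs "$RAMDISK"
    rsync -a /mnt/user/appdata/sonarr/ "$RAMDISK/sonarr/"

    # Separate scheduled script, every 1-3 hours: flush the ramdisk back to appdata
    rsync -a --delete "$RAMDISK/sonarr/" /mnt/user/appdata/sonarr/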

     

    The only issue is that it has to copy the whole database back into appdata every time it rsyncs; between them all that works out to around ~150MB per sync.

     

    So still around 2GB/day in writes doing it every 2 hours, but it should have much less write amplification this way, so the only way to know the effect is to try it.

     

    It also requires 4GB of RAM for the ramdisk (I have 8GB allocated just in case), which is fine for me but would not be so practical on a system with limited memory.

     

    Gonna leave this for a day or 2 and see the results.

     

    If only there was a way to selectively change the cache on a per folder or drive basis.


    Hahahah you went down the rabbit hole on this one didn’t you.

    I've spent a fair amount of time on this as well, but at some point decided those few gigabytes weren't worth my time (I also became a father in the meantime, so less time to play 🤣👊🏻).

     

    That said, aren't you better off writing a guide on cutting down these writes and posting it in a separate thread? This thread has become so big that it will scare off anyone accidentally coming here via Google or something. It would be a waste of the knowledge you have gained (that others could benefit from) if you ask me.

     

    Cheers 🤙🏻

  2. So coming back on this bug report.

    I upgraded to 6.9 on March 2nd and also wiped the cache to take advantage of the new partition alignment (I have Samsung EVOs and perhaps a touch of OCD 🤣).

    Waited a bit to get historic data.

     

    Pre 6.9

    TBW on 19-02-2021 23:57:01 --> 15.9 TB, which is 16313.7 GB.
    TBW on 20-02-2021 23:57:01 --> 16.0 TB, which is 16344.9 GB.
    TBW on 21-02-2021 23:57:01 --> 16.0 TB, which is 16382.8 GB.
    TBW on 22-02-2021 23:57:01 --> 16.0 TB, which is 16419.5 GB.

    -> Writes averaging around 34-35 GB a day.

     

    6.9

    TBW on 05-03-2021 23:57:01 --> 16.6 TB, which is 16947.4 GB.
    TBW on 06-03-2021 23:57:01 --> 16.6 TB, which is 16960.2 GB.
    TBW on 07-03-2021 23:57:01 --> 16.6 TB, which is 16972.8 GB.
    TBW on 08-03-2021 23:57:01 --> 16.6 TB, which is 16985.3 GB.

    -> Writes averaging around 12-13 GB a day.
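
    (For reference, these averages are just the day-over-day deltas of the logged GB values; assuming the TBW_sdb.log format produced by the logging script quoted further down, something like this computes them:)

    awk '/^TBW/ { if (prev != "") printf "%s: +%.1f GB\n", $3, $10 - prev; prev = $10 }' TBW_sdb.log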

     

    So I would say 6.9 (and reformatting) made a very big improvement.

    I think most of these savings are due to the new partition alignment, as I was already running docker directly on the cache and recall making a few of the tweaks suggested here (adding mount options, I cannot remember which exactly).

     

    Thanks @limetech and all other devs for the work put into this.

    This bug report is Closed 👍

    • Like 2
    • Thanks 1
  3. 22 hours ago, hawihoney said:

    6.8.3 is a stable release and there's a prebuilt image with NVIDIA 450.66 available. Just copy a handful of files over to the stick and reboot. That's all.

     

     

    I know haha, but I'm waiting for the upgraded kernel ;)

    I'm still on the latest build that had the 5+ kernel included.

  4. 9 minutes ago, limetech said:

    Since you cross posted this in two places, I'll cross post my reply:

     

    The loopback approach is much better from the standpoint of data management.  Once you have a directory dedicated to Docker engine, it's almost impossible to move it to a different volume, especially for the casual user.

    You're right, sorry about that, went through here before I went through my bug report.

    I do think that the casual user will benefit more from the loopback approach indeed since it's less complex and requires less maintenance.

    Have a great day 👍

  5. 1 minute ago, limetech said:

    That works for most containers and we highly encourage not storing data in image layers for just that reason BUT if someone does store data in the image this is something to be aware of.

    Surely, you will lose it when you upgrade the containers too, so you'll find out soon enough.

    Wiping out the directory is essentially recreating the docker image so that's fine.

    Also, I understand that you're trying to warn people, and I agree that for most users the loopback approach will work better and cause less confusion.

    It's great that we can decide this ourselves though, unRAID is so flexible, which is something I like about it.

  6. 2 minutes ago, limetech said:

    The loopback approach is much better from the standpoint of data management.  Once you have a directory dedicated to Docker engine, it's almost impossible to move it to a different volume, especially for the casual user.

    Agreed, which is why having options for both the loopback image and the folder is best of both worlds.

    Also if I ever wanted to move the data I would just remove the entire folder and recreate it anyways since it's non-persistent data.

  7. On 7/13/2020 at 1:25 PM, Squid said:

     

    False.  BTRFS is the default file system for the cache drive because the system allows you to easily expand from a single cache drive to a multiple-device pool.  If you're only running a single cache drive (and have no immediate plans to upgrade to a multi-device pool), XFS is the "recommended" filesystem by many users (including myself)

    The docker image required CoW because docker required it.  Think of the image akin to mounting an ISO image on your Windows box.  The image was always formatted as BTRFS, regardless of the underlying filesystem.  IE: You can store that image file on XFS, BTRFS, ReiserFS, or via UD ZFS, NTFS etc.

     

    More or less true.  As said, you've always been able to have an XFS cache drive and the image stored on it.

     

     

    The reason for the slightly different mounting options for an image is to reduce the unnecessary amount of writes to the docker.img file.  There won't be a big difference (AFAIK) if you choose a docker image formatted as btrfs or XFS.

     

    But, as I understand it any write to a loopback (ie: image file) is always going to incur extra IO to the underlying filesystem by its very nature.  Using a folder instead of an image completely removes those excess writes.

     

    You can choose to store the folder on either a BTRFS device or an XFS device.  The system will consume the same amount of space on either, because docker via overlay2 will properly handle duplicated layers etc between containers when it's on an XFS device.

     

    BTRFS as the docker.img file does have some problems.  If it fills up to 100%, then it doesn't recover very gracefully, and usually requires deleting the image and then recreating it and reinstalling your containers (a quick and painless procedure)

     

    IMO, choosing a folder for the storage lowers my aggravation level in the forum because, by its nature, there is no real limit to the size that it takes (up to the size of the cache drive), so the recurring issues of "image filling up" for some users will disappear.   (And as a side note, this is how the system was originally designed in the very early 6.0 betas)

     

    There are just a couple of caveats with the folder method, which are detailed in the OP (my quoted text):

    1. Cache only share.  Simply referencing /mnt/cache/someShare/someFolder/ within the GUI isn't good enough.
    2. Ideally within its own separate share (not necessary, but decreases the possibility of ever running new perms against the share)
    3. The limitations of this first revision of the GUI supporting folders mean that how you do it isn't exactly intuitive.  It will get improved in the next rev though.
    4. Get over the fact that you can't view or modify any of the files (not that you ever need to) within the folder via SMB.  Just don't export it so that it doesn't drive your OCD nuts.

     

    There are also still some glitches in the GUI when you use the folder method.  Notably, while you can stop the docker service, you cannot re-enable it via the GUI (Settings - docker).  (You have to edit the docker.cfg file and re-enable the service there, and then stop / start the array)

    This is great!!

    I have been running docker on its own (non-exported) share on a btrfs partition since December last year, and I'm very happy with it.

    I thought that when the docker/btrfs write issue was solved I would have to revert to a docker image again, but being able to keep using my share in a supported way from the GUI is just perfect. I would take the folder approach over a loop device any day!

     

    I'll keep an eye on this when it makes it out of beta. For now, keep up the good work, it's very much appreciated 😄

  8. On 6/28/2020 at 10:33 AM, itimpi said:

    I found this research article to be of great interest as it indicates that a large amount of write amplification is inherent in using the BTRFS file system.

     

    I guess this raises a few questions worth thinking about:

    • Is there a specific advantage to having the docker image file formatted internally as BTRFS, or could an alternative such as XFS help reduce the write amplification without any noticeable change in capabilities?
    • This amplification is not specific to SSDs.
    • The amplification is worse for small files (as are typically found in the appdata share).
    • Are there any BTRFS settings that can be applied at the folder level to reduce write amplification? I am thinking here of the 'system' and 'appdata' folders.
    • If you have the CA Backup plugin to provide periodic automated backup of the appdata folder, is it worth having that share on a single-drive pool formatted as XFS to keep amplification to a minimum? The 6.9.0 support for multiple cache pools will help if you need to segregate by file format.

     

    Very interesting indeed.

    This got me thinking...

     

    I noticed that writing directly onto the BTRFS cache reduced writes by a factor of roughly 10.

    Now I did feel like this was still on the high side, as it was still writing 40GB a day.

    What if.... this is still amplified by a factor of 10 also.

    Could this mean that a BTRFS-formatted image on a BTRFS-formatted partition results in 10x10 = 100 times write amplification?

    If I recall correctly someone pointed out a 100x write amplification number earlier in the thread? 

     

    I think this is well suited for a test 🙂

    I've just recreated the loop image, formatted as XFS.

    I'll test my TB's written in a few minutes and check again after an hour.

     

    EDIT: 

    Just noticed your comment @tjb_altf4

     

    23 hours ago, tjb_altf4 said:

    XFS isn't a supported backend for Docker, overlay2 seems to be the other usual choice.

     

    The default seems to work already; the XFS image is formatted nicely with the correct options:

    root@Tower:/# docker info
    Client:
     Debug Mode: false
    
    Server:
     Containers: 21
      Running: 21
      Paused: 0
      Stopped: 0
     Images: 35
     Server Version: 19.03.5
     Storage Driver: overlay2
      Backing Filesystem: xfs
      Supports d_type: true

    According to the docker docs this should be fine, which xfs_info /var/lib/docker seems to confirm:

    root@Tower:/# xfs_info /var/lib/docker
    meta-data=/dev/loop2             isize=512    agcount=4, agsize=1310720 blks
             =                       sectsz=512   attr=2, projid32bit=1
             =                       crc=1        finobt=1, sparse=1, rmapbt=0
             =                       reflink=1
    data     =                       bsize=4096   blocks=5242880, imaxpct=25
             =                       sunit=0      swidth=0 blks
    naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
    log      =internal log           bsize=4096   blocks=2560, version=2
             =                       sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none                   extsz=4096   blocks=0, rtextents=0

    I'm just testing it out, because I'm curious whether it matters.

     

    EDIT2, after 1 hour of runtime:

    TBW on 28-06-2020 13:44:11 --> 11.8 TB, which is 12049.6 GB.
    TBW on 28-06-2020 14:44:38 --> 11.8 TB, which is 12057.8 GB.

    1 hour of running on an XFS-formatted loop device equals 8.2GB written, which would translate into 196.8GB a day.

    This would most likely be a bit more due to backup tasks at night.

    It's still on the high side, compared to running directly on the BTRFS filesystem, which results in 40GB a day.

     

    In December 2019 I was seeing 400GB a day though (running without modifications), and my docker count has increased a bit, so 200 is better. I haven't tried any other options, like the mount options specified. I expect these will bring the writes down regardless of whether the loop device is used, since they apply to the entire BTRFS mount, so the amplification from the loop device is likely to occur with them as well.

     

    Still kind of sad though; I would have expected to see very minor write amplification instead of 5x. Guess that theory of 10x10 doesn't check out then...

    Rolling back to my previous state, as I'd take 40GB over 200 any day 😄 

     

    EDIT3: 

    I decided to remount the cache with space_cache=v2 set; running directly on the cache, this gave me 9GB of writes in the last 9 hours.

    When the new unRAID version drops I'll reformat my cache with the new alignment settings. For now that space_cache=v2 setting does its magic :) 
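
    For anyone wanting to try the same, this is roughly all it takes (assuming your pool is mounted at /mnt/cache; on some kernels switching the space cache version may also need a clear_cache mount first):

    # Remount the btrfs cache with the v2 free-space cache
    mount -o remount,space_cache=v2 /mnt/cache

    # Verify the active mount options
    grep /mnt/cache /proc/mounts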

    • Like 1
  9. 1 hour ago, limetech said:

    Yes we are looking into this.

     

    15 minutes ago, jonp said:

    Hi everyone and thank you all for your continued patience on this issue.  I'm sure it can be frustrating that this has been going on for as long as it has for some of you and yet this one has been a bit elusive for us to track down as we haven't been able to replicate the issue, but we just ordered some more testing gear to see if we can and I will be dedicating some serious time to this in the weeks ahead.  Gear should arrive this weekend so I'll have some fun testing to do during the 4th of July holiday (and my birthday ;-).

    Thank you both for this, communication is very much appreciated, as well as your efforts!

     

    Most of us know how busy you all have been so don’t worry about it 🙂

     

    I have not seen anyone reporting this on HDDs (I have read all comments actively). @TexasUnraid has been shifting data around a lot; have you tried regular hard drives with btrfs by any chance?


    @jonp hope you’ll find a good challenge here, and also, happy birthday in advance! 🥳

  10. 14 minutes ago, Moz80 said:

    After applying the "fixes" of changing a couple of docker containers (i haven't reformatted my cache drive to another filesystem).

    You cannot fix this by changing a couple of docker containers, because docker containers are not the root cause of the problem; a busy container will just show the problem more.

     

    The only "fixes" that have been working for other were:

    1) Formatting to XFS (works always)

    2) Remounting the BTRFS cache with the nospace_cache option, see @johnnie.black's https://forums.unraid.net/bug-reports/stable-releases/683-docker-image-huge-amount-of-unnecessary-writes-on-cache-r733/?do=findComment&comment=9431 (seems to work for everyone so far)

    3) Putting docker directly onto the cache (some have reported no decrease in writes, although others have; this is the one I'm using and it's working for me)

     

    I may have missed one, but suggestion 2 is your quickest option here.
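
    For option 2, the remount itself is a one-liner (assuming your pool is mounted at /mnt/cache; see johnnie.black's linked post for the full details):

    # Disable the btrfs free-space cache on the running mount
    mount -o remount,nospace_cache /mnt/cache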

  11. 10 hours ago, Niklas said:

    And it is not only writes to docker.img, seeing amplified writes to appdata (on cache) too. If I put my MariaDB databases in appdata I will get several gigs written every hour. Even with really small changes to the databases. I have had to move the Mariadb dir with databases from appdata to one of my unassigned devices to save some wear and tear on my ssds. 

     

    I would like to move it back to appdata asap but I also want my hardware to live longer. My drives have 5 years of warranty OR 400+ in TBW. I really want to retain the warranty for at least 5 years but if this strange writes continues, people will have lots of drives out of warranty here because of reaching the warranty tbw limit, fast. My drives are dedicated nas drives, hence the 400+ tbw limit but I will guess that most users use more ordinary brands like Samsung where the limit could be as low as 140 TBW. With this black hole of writing, they could be out of warranty in under a year. That's why I switched from Samsung to Seagate IronWolf ssds. 

     

    10 hours ago, TexasUnraid said:

    Yep, I saw around 100X the writes to appdata when it is on a BTRFS drive vs XFS (500mb/hour vs 5mb/hour)

    I must say that I was reluctant to believe these statements; I tested writing stuff to the BTRFS cache devices in the beginning and could not notice the write amplification there.

     

    Now, going back to the fact that my SMART data still shows my drives writing 40GB a day, this does seem like quite a lot on second thought.

    TBW on 14-06-2020 23:57:01 --> 12.1 TB, which is 12370.2 GB.
    TBW on 15-06-2020 23:57:01 --> 12.1 TB, which is 12392.6 GB.
    TBW on 16-06-2020 23:57:01 --> 12.1 TB, which is 12431.4 GB.
    TBW on 17-06-2020 23:57:01 --> 12.2 TB, which is 12469.0 GB.
    TBW on 18-06-2020 23:57:01 --> 12.2 TB, which is 12507.4 GB.
    TBW on 19-06-2020 23:57:01 --> 12.3 TB, which is 12547.5 GB.

    I'm not really complaining though, because these writes are negligible on drives with a 300 TBW warranty.

    However... since docker lives directly on the BTRFS mountpoint, this could well be lower, as my containers aren't particularly busy ones.

     

    Still considerably lower, though, than the 300/400GB daily writes while I was still using the docker.img file.

    TBW on 11-11-2019 23:59:02 --> 3.8 TB, which is 3941.2 GB.
    TBW on 12-11-2019 23:59:01 --> 4.2 TB, which is 4272.1 GB.
    TBW on 13-11-2019 23:59:01 --> 4.5 TB, which is 4632.5 GB.
    TBW on 14-11-2019 23:59:01 --> 4.9 TB, which is 5044.0 GB.
    TBW on 15-11-2019 23:59:01 --> 5.2 TB, which is 5351.3 GB.
    TBW on 16-11-2019 23:59:01 --> 5.3 TB, which is 5468.8 GB.
    TBW on 17-11-2019 23:59:01 --> 5.5 TB, which is 5646.1 GB.

     

  12.   

    13 hours ago, limetech said:

    227 comments(!) in that topic.  is there a tldr?

    I think @tjb_altf4 sums it up quite well.

    9 hours ago, tjb_altf4 said:
    • write amplification in cache, most users are reporting a sizable uptick in writes across different cache configurations
    • exacerbated further by heavy write dockers such as plex
    • I believe it's related to the copy-on-write setting of the docker.img, or at least how it's mounted
    • @johnnie.black mentioned he was aware a fix was on its way for next release? 

     

    Not sure about the CoW setting though; I remember having played with that initially (it was a while ago).

    Every write is indeed amplified; it definitely seems related to the loop device implementation, because mounting directly on the btrfs mountpoint makes it stop. It also doesn't appear to be just a reporting issue, as the drive's lifespan decreases.

     

    Nevertheless a very very very big salute on the work here, I can only imagine the amount of work that went in! 🤯

    People tend to complain when a bug isn't fixed in 2 days, but forget the amount of work that is being done behind the scenes.

    I don't have a test server to test this on, so I'm waiting patiently on the RC or stable builds.

    🍺🍺🍺🍺

     

    EDIT:

    Just noticed @johnnie.black's comments on the thread, that it might be solved by something that has been changed in this build.

    As I'm skipping this one I'm curious for the results other people are seeing 🙂 

    • Like 1
  13. 14 hours ago, TexasUnraid said:

     

    Yeah, those are the ones I saw, I have done so much reading on this subject it all kind of runs together at this point lol. Is there another write up on the subject?

     

    Only issue I saw with that writeup was it seems like when an official fix is released, I will have to rebuild all of the dockers to go back to the official setup? This is my first time using dockers (been working with PC's since DOS 3.0 but only really messed with linux in VM's before this); I get them in theory, but is there any settings/information stored inside the containers themselves that would be lost?

     

    Would hate to get everything setup just right and then have to do it all over again when a patch is released.

     

    Course it would be easier to figure that out if we knew an approx timeframe for a fix.

     

    Does anyone know when "run at array start" in user scripts actually runs? Is it when you click the button or after the array has started? It would be cool if it could be scripted into a user script so it is a simple matter of enabling it or disabling it and everything is reverted.

     

    Is it possible to stop / start docker from a script so it could all be automated into a single script?

     

    There is no other write-up on the subject, as far as I know. I improvised in this one to find a solution that would not destroy my SSD's ;) 

     

    Well no docker container should contain persistent data.

    Persistent data should always be mapped to a location on the host.

    Updating a docker container destroys it too, and if set up correctly this doesn't cause you to lose data, by design.
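
    As a quick illustration (the container name and paths are just an example), the persistent bits live in a host path mapped into the container, so the container itself stays disposable:

    # The host path keeps the config; the container can be destroyed/updated freely
    docker run -d --name sonarr \
      -v /mnt/user/appdata/sonarr:/config \
      linuxserver/sonarr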

     

    You're right though: if this gets fixed in a newer release, you would have to redownload all image files, but because of docker's nature this will only take some time to download (depending on your line speed) and (again, if set up correctly) will not cause you to lose any data.

     

    The user scripts plugin is indeed able to run scripts when the array comes up; you don't have to press any button, it runs, like the name says, on array startup (I guess straight after).

     

    Taking the approach from page two makes this persistent though, and reverting back to default in a future upgrade would just require you to remove the added lines from the /boot/config/go file; docker will then mount its image file again.

  14. 23 hours ago, TexasUnraid said:

     

    Is there a way to extract the BTRFS image? Or will the docker settings be saved so it is just a 1 click reinstall using the instructions from page 2?

    Well the instructions from page 2 (you meant these right?) are meant to make the changes persistent by editing the go files and injecting a file from flash into the OS on boot.

     

    If you want a temporary solution, you should follow the steps to create a directory on the cache and edit the start_docker() function in the /etc/rc.d/rc.docker file.

    Then stop docker from the docker tab and start it again.

    Docker will now write directly into the btrfs mountpoint.
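
    The core of it (just a sketch of what the modified start_docker() ends up doing, not the literal file contents) mirrors the libvirt variant posted further down this page:

    # Make docker write straight to the cache instead of into the loop-mounted docker.img
    mkdir -p /mnt/cache/docker
    mountpoint -q /var/lib/docker && umount /var/lib/docker
    rm -rf /var/lib/docker
    ln -s /mnt/cache/docker /var/lib/docker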

     

    One thing here though is that you end up with no docker containers.

    To recreate every one of them, go to add container and select the containers from the user-templates one by one (assuming all your containers were created via the GUI).

    This downloads the images again and all the persistent mappings are restored.

     

    If you want to revert, simply reboot and voila.

     

    An easy way to find out how the investigation you did stacks up against this; kind of curious myself too 🙂

    Cheers.

     

  15. 7 minutes ago, TexasUnraid said:

    Very interesting, thanks for the info.

     

    So the btrfs-transacti is doing about the same amount of writes with the symlink as putting the docker on an xfs drive. Yet with docker disabled it does nothing at all. Not thrilled with the idea of my SSD's being eaten away when they should be doing nothing at all. They could be with me for a good long time.

     

    While it is a heck of a lot better than the gigs I was getting in 10 minutes, it is still over 3TB of writes per year for no reason and will really chew through SSD's. Also not a lot of reason to possibly mess up the unraid config for when a real fix is released vs just leaving the docker on the Array HDD.

     

    Really wish there was a time frame for when this would be fixed. I am still on the trial and don't really feel comfortable buying the license with this issue active.

    Totally understandable.

    3TB on my two Samsung Evo 860 1TB drives is negligible though, since their warranty only voids after 300TB written hahaha. They should last 100 years 😛 

     

    It would feel strange, however, if moving to an XFS volume still had btrfs-transacti "doing stuff".

    If I interpret my quick Google query correctly, this is snapshotting at work, which is a btrfs process, not an XFS one.

     

    Also, indeed, only mess with unRAID's config if you're confident about doing it 🙂 

    Great thing about unRAID in these cases is that a reboot sets you back to default with a default go file and no other scripts running, at least on the OS side.

  16. 28 minutes ago, TexasUnraid said:

    Anyone that has gone the path of mapping the docker image directly without loop 2, does btrfs-transacti still cause writes to the cache? I might have to go that route if the official fix is months away.

    There you go!

    10 minutes into "iotop -ao", btrfs-transacti produced nearly 60MB of writes.

    My /var/lib/docker is symlinked to /mnt/cache/docker, so all writes that would go into the image go straight to the btrfs mountpoint.
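
    If you want to reproduce the check (assuming the same symlink setup):

    # Confirm docker bypasses loop2: this should point at /mnt/cache/docker
    ls -l /var/lib/docker

    # Accumulate per-process writes for a while, then read off the totals
    iotop -ao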

     

    Note that running iotop initially gave me gigabytes of data.

    After several months I'm still one happy redundant btrfs camper 😁

     

    Total DISK READ :       0.00 B/s | Total DISK WRITE :       0.00 B/s
    Actual DISK READ:       0.00 B/s | Actual DISK WRITE:       0.00 B/s
      TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
    25615 be/4 root          0.00 B     13.98 M  0.00 %  0.25 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    25612 be/4 root          0.00 B     12.67 M  0.00 %  0.23 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    23465 be/4 root          0.00 B     12.51 M  0.00 %  0.21 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    24974 be/4 root          0.00 B     11.56 M  0.00 %  0.21 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    25613 be/4 root          0.00 B      8.93 M  0.00 %  0.16 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    23449 be/4 root          0.00 B      8.39 M  0.00 %  0.15 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    24682 be/4 root          0.00 B      7.35 M  0.00 %  0.12 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    25959 be/4 root          0.00 B      7.48 M  0.00 %  0.12 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    23468 be/4 root          0.00 B      6.84 M  0.00 %  0.12 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
     6192 be/4 root          0.00 B      6.37 M  0.00 %  0.12 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    25616 be/4 root          0.00 B      4.96 M  0.00 %  0.10 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    16693 be/4 65534         0.00 B      6.95 M  0.00 %  0.09 % sqlservr
    23464 be/4 root          0.00 B      4.44 M  0.00 %  0.09 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
    23441 be/4 root          0.00 B      2.51 M  0.00 %  0.05 % dockerd -p /var/run/dockerd.pid --log-opt max-size=50m --log-opt max-file=1 --storage-driver=btrfs --log-level=error
     5523 be/4 root          0.00 B     58.58 M  0.00 %  0.04 % [btrfs-transacti]
    26247 be/4 root          0.00 B      2.19 M  0.00 %  0.03 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
    27636 be/4 root          0.00 B   1952.00 K  0.00 %  0.03 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
    25702 be/4 root          0.00 B      2.00 M  0.00 %  0.03 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
    26773 be/4 root          0.00 B   1460.00 K  0.00 %  0.02 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
    29889 be/4 root          0.00 B   1272.00 K  0.00 %  0.02 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
     7181 be/4 root          0.00 B   1192.00 K  0.00 %  0.02 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
    30421 be/4 root          0.00 B    992.00 K  0.00 %  0.01 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
    14571 be/4 root          0.00 B      2.86 M  0.00 %  0.01 % [kworker/u8:1-btrfs-endio-write]
    26949 be/4 root          0.00 B      2.88 M  0.00 %  0.01 % [kworker/u8:4-btrfs-endio-write]
    12185 be/4 root          0.00 B   1600.00 K  0.00 %  0.01 % [kworker/u8:2-bond0]
    32373 be/4 root          0.00 B   2016.00 K  0.00 %  0.01 % [kworker/u8:5-btrfs-endio-write]
    24913 be/4 root          0.00 B    588.00 K  0.00 %  0.01 % containerd --config /var/run/docker/containerd/containerd.toml --log-level error
    30128 be/4 root          0.00 B      2.38 M  0.00 %  0.01 % [kworker/u8:3-btrfs-endio-write]

     

  17. 3 hours ago, johnnie.black said:

    Maybe it doesn't work with all dockers, like mentioned it's repeatable and works consistently for me, but I only had the single plex docker installed.

    Thanks for your work @johnnie.black; it would feel strange though if moving the image works for one docker container but not for another?

    Any ideas what changed after the mover kicked in?

     

    50 minutes ago, -Daedalus said:

    However, we shouldn't have to hear this from you. I'm sure Limetech are working on this, I'm sure there's some sort of fix coming at some point, but radio silence for something this severe really shouldn't be the norm, especially if Limetech is shooting for a more official, polished vibe, as a company.

    I think Johnnie is very closely affiliated with Limetech, so I'm sure this is coming first-hand. Do remember that the fact that they are not actively responding does not mean they're not working on it.

    Although I do kind of agree with you a bit: not hearing anything tends to let users believe that no work is being done. It might be a quick win for the devs to give some info once in a while. I have read some ideas somewhere on the forum about creating some kind of blog (or a newsletter by mail or something) about "What we're doing", which sounded appealing to me.

     

    Nevertheless you have to acknowledge the fact that they do a lot and we all profit from it.

    Keep up the good work guys 🍻

  18. Hey All,

     

    It is very much appreciated that you update your findings, but like many have said here, this bug simply amplifies writes and it affects BTRFS, not XFS. Reformatting to XFS or taking the loop device out of the equation fixes this bug, but in the case of XFS you'll lose redundancy, and modifying shell scripts is obviously unsupported.

    So please post containers that write like crazy in the accompanying thread for that specific container; also note that these writes are most likely amplified by a factor x, so the real number of writes might not be that bad.

     

    Cheers.

  19. 22 minutes ago, caplam said:

    i tried to copy /var/lib/docker but failed (out of space). There must be some volumes that are mounted in /var/lib/docker/btrfs because when looking at the size of btrfs directory it's 315GB (docker.img is 60GB with 50GB used)

    Interesting.  

    I see a lot of files in /mnt/user/docker/btrfs/subvolumes that correspond to containers I deleted quite a while ago, so I reckon docker doesn't really clean up after itself. I might wipe out the entire /mnt/user/docker directory at some point to save some space; I have a fast download link so I don't care about redownloading 🙂
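
    (If you'd rather not wipe the whole directory, docker's own prune commands can reclaim space from leftover layers of deleted containers; I haven't measured the savings here, but roughly:)

    # Remove stopped containers, dangling images, unused networks and build cache
    docker system prune

    # Optionally also remove images not referenced by any container
    docker image prune -a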

    However, I don't think this is the issue here.

    The /var/lib/docker/volumes folder contains persistent data, meaning it will most likely contain symlinks to your array.

    You could exclude certain directories from the copy action (like the volumes directory, since persistent data is persistent anyway), but I won't get into much detail here since I don't want people to wreck their systems when they mistype certain commands 🙂

     

    You have a PM!

  20. 10 hours ago, caplam said:

    I've dropped the use of docker compose.

    The 3 dockers I recreated were recreated with the dockerman GUI, and still only 2 appear in the GUI even when they were started.

    The docker template in unraid is very convenient. In my case there is a big downside: I have a poor connection, so redownloading 60+ gigs of images will take me around 2 days if I shut down netflix for the kids. 🥵

    I wish I could extract images from docker.img.

    Hahaaah I see. You could try to cheat (remember, at your own risk, but I don't think it could really hurt).

     

    - Put a # before the line that copies over the modified rc.docker script in the /boot/config/go file

    - Reboot, so that the changes revert and the docker image is mounted again.

    - Clear out the entire docker directory you've created (the new one, not the docker.img 🙂; for the example I use /mnt/user/docker) so that just an empty docker directory remains

    - Stop all docker containers

    - Run cp -r /var/lib/docker /mnt/user/docker (this recursively copies all files from the point where the docker image is mounted to the location the symlink will point to)

    - Remove the # again from the go file

    - Reboot.

     

    I “think” this should work; the only thing I can think of is that docker (since it's still running when you copy the files) holds a lock on some files, causing the copy to fail. If that is the case you would need to stop the docker service, but that would also unmount the image again, so keeping it mounted would require some temporary changes to the rc.docker file again.

     

    It's worth a try; you're only copying files out of the image and can always just empty the /mnt/user/docker directory again.

     

    EDIT:

    After some PMs with @caplam, we can safely advise NOT to go down the copy-the-data-over rabbit hole.

    It seems like copying the data over will “inflate” it, most likely due to btrfs features like deduplication and/or layering/CoW. I'm not into btrfs enough to fully explain what's going on here, but I imagine you would be required to use more specific tools to handle copying that btrfs-specific data over.
    Recreating the containers (and thus redownloading) creates all the needed btrfs subvolumes. If you're not a specialist I would not recommend messing with the btrfs data 🙂

     

  21. 8 hours ago, JTok said:

    For anyone else that needs it, I was having more issues with libvirt/loop3 than docker/loop2, so I adapted @S1dney's solution from here for libvirt.

     

    A little CYA: To reiterate what has already been said, this workaround is not ideal and comes with some big caveats, so be sure to read through the thread and ask questions before implementing.

     

    I'm not going to get into it here, but I used S1dney's same basic directions for the docker by making backups and copying files to folders in /boot/config/.

     

     

    Create a share called libvirt on the cache drive just like for the docker instructions.

     

    Edit rc.libvirt's start_libvirtd method as follows:

    
    start_libvirtd() {
      if [ -f $LIBVIRTD_PIDFILE ];then
        echo "libvirt is already running..."
        exit 1
      fi
      if mountpoint /etc/libvirt &> /dev/null ; then
         echo "Image is mounted, will attempt to unmount it next."
         umount /etc/libvirt 1>/dev/null 2>&1
         if [[ $? -ne 0 ]]; then
           echo "Image still mounted at /etc/libvirt, cancelling cause this needs to be a symlink!"
           exit 1
         else
           echo "Image unmounted succesfully."
         fi
      fi
      # In order to have a soft link created, we need to remove the /etc/libvirt directory or creating a soft link will fail
      if [[ -d /etc/libvirt ]]; then
        echo "libvirt directory still exists, removing it so we can use it for the soft link."
        rm -rf /etc/libvirt
        if [[ -d /etc/libvirt ]]; then
          echo "/etc/libvirt still exists! Creating a soft link will fail thus refusing to start libvirt."
          exit 1
        else
          echo "Removed /etc/libvirt. Moving on."
        fi
      fi
      # Now that we know that the libvirt image isn't mounted, we want to make sure the symlink is active
      if [[ -L /etc/libvirt && -d /etc/libvirt ]]; then
        echo "/etc/libvirt is a soft link, libvirt is allowed to start"
      else
        echo "/etc/libvirt is not a soft link, will try to create it."
        ln -s /mnt/cache/libvirt /etc/ 1>/dev/null 2>&1
        if [[ $? -ne 0 ]]; then
          echo "Soft link could not be created, refusing to start libvirt!"
          exit 1
        else
          echo "Soft link created."
        fi
      fi
      # convert libvirt 1.3.1 w/ eric's hyperv vendor id patch to how libvirt does it in libvirt 1.3.3+
      sed -i -e "s/<vendor id='none'\/>/<vendor_id state='on' value='none'\/>/g" /etc/libvirt/qemu/*.xml &> /dev/null
      # remove <locked/> from xml because libvirt + virlogd + virlockd has an issue with locked
      sed -i -e "s/<locked\/>//g" /etc/libvirt/qemu/*.xml &> /dev/null
      # copy any new conf files we dont currently have
      cp -n /etc/libvirt-/*.conf /etc/libvirt &> /dev/null
      # add missing tss user account if coming from an older version of unRAID
      if ! grep -q "^tss:" /etc/passwd ; then
        useradd -r -c "Account used by the trousers package to sandbox the tcsd daemon" -d / -u 59 -g tss -s /bin/false tss
      fi
      echo "Starting libvirtd..."
      mkdir -p $(dirname $LIBVIRTD_PIDFILE)
      check_processor
      /sbin/modprobe -a $MODULE $MODULES
      /usr/sbin/libvirtd -d -l $LIBVIRTD_OPTS
    }

     

    Add this code to the go file in addition to the code for the docker workaround:

    
    # Put the modified libvirt service file over the original one to make it not use the libvirt.img
    cp /boot/config/service-mods/libvirt-service-mod/rc.libvirt /etc/rc.d/rc.libvirt
    chmod +x /etc/rc.d/rc.libvirt

     

    That's interesting, thanks for the work!

    Do you have any metrics to share?

     

    I'm using a simple script via the User Scripts plugin to keep track of total TB written every day:

     

    #!/bin/bash
    
    #>)>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>)
    #>)
    #>) Simple script to check the TBW of the SSD cache drives on daily basis
    #>)
    #>)>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>)
    
    # Get the TBW of /dev/sdb
    TBWSDB_TB=$(/usr/sbin/smartctl -A /dev/sdb | awk '$0~/LBAs/{ printf "%.1f\n", $10 * 512 / 1024^4 }') 
    TBWSDB_GB=$(/usr/sbin/smartctl -A /dev/sdb | awk '$0~/LBAs/{ printf "%.1f\n", $10 * 512 / 1024^3 }') 
    
    echo "TBW on $(date +"%d-%m-%Y %H:%M:%S") --> $TBWSDB_TB TB, which is $TBWSDB_GB GB." >> /mnt/user/scripts/unraid/collect_ssd_tbw_daily/TBW_sdb.log
    
    
    # Get the TBW of /dev/sdg
    TBWSDG_TB=$(/usr/sbin/smartctl -A /dev/sdg | awk '$0~/LBAs/{ printf "%.1f\n", $10 * 512 / 1024^4 }')
    TBWSDG_GB=$(/usr/sbin/smartctl -A /dev/sdg | awk '$0~/LBAs/{ printf "%.1f\n", $10 * 512 / 1024^3 }')
    
    echo "TBW on $(date +"%d-%m-%Y %H:%M:%S") --> $TBWSDG_TB TB, which is $TBWSDG_GB GB." >> /mnt/user/scripts/unraid/collect_ssd_tbw_daily/TBW_sdg.log

    You would have to locate the correct devices in /dev though.

    I use it to look into the files once in a while to spot containers that write abnormally.

    If you would be able to roll back the fix once, have it run for a few days and then reapply it to see what savings you get, that would be great. Surely you could use iotop or the GUI to spot MB/s, but this gives a bit more grasp of the longer term.

     

    45 minutes ago, caplam said:

    i think problems with my vm were related to qcow2 image format. I converted it to raw img and now kb written to disk are coherent between inside the vm and unraid.

    Cache writes seem to stabilize around 800kb/s for that vm.

     

    edit : could be related to btrfs driver. I downgraded to 6.7.2 and for the vm the problem is still there. 

    I think now i have to upgrade to 6.8.3 and apply the workaround of @S1dney

    Do i have to delete docker.img or move it elsewhere ? Mine is 80GB so it takes lot of space.

     

    edit2 : I applied the workaround in 6.7.2. I have to redownload all docker images (with a 5Mb/s connection it's a pain in the ass). I've redownloaded 3 dockers but only 2 appear in the docker GUI while portainer sees them all.

    You should only remove the docker image once you're 100% sure that there's no data in there you still need, because recreating or deleting it means a permanent loss of that data.

    I'm not really familiar with portainer; I can only speak for docker-compose, as I use that alongside the DockerMan GUI. Docker-compose-created containers are only visible within the unRAID GUI once started, so this might be a similar thing?

     

    Containers that were created outside of the unRAID docker GUI (DockerMan) also don't get a template to easily recreate them. But if you're using docker compose or something, recreating should be easy, right?