Lignumaqua

Members
  • Content Count: 32
  • Joined
  • Last visited

Community Reputation

8 Neutral

About Lignumaqua

  • Rank: Advanced Member


  1. I had the same thought, and have written to the authors of the paper here in Austin TX to see if they have any knowledge of how the two BTRFS systems interact. One of the authors is a lecturer and researcher on containerization so may well have studied this. I’ll report back here if/when I get a reply. If this were a genuine concern then, with their reported worst case of 32x amplification, that could lead to 32x32 = 1024x! 😲
  2. Caveat - I am in no way any kind of expert on either Unraid or Linux file systems. I'm just trying to join the dots here! I could easily have done so incorrectly, so feel free to shoot this down. I won't be offended. You may already all know this, but I didn't, and I feel a little bit more knowledgeable now.

     The link below is to an interesting paper from 2017 that compares the write amplification of different file systems. BTRFS is by far the worst, with a factor of 32x for small-file overwrite and append when COW is enabled. With COW disabled this dropped to 18.6x, which is still pretty significant. That was three years ago, so things may have changed. In particular, space_cache V2 could be a reaction to this? BTRFS + writing or amending small files = very high write amplification. https://arxiv.org/abs/1707.08514

     This suggests that BTRFS is a great system for secure storage of data files, but not necessarily a good choice for writing multiple small temporary files, or for log files that are continually being amended. Looking at common uses of the cache in Unraid might lead to the following suppositions. A BTRFS cache using RAID 1 is a good place for downloaded files before they are moved into the array. It's also good for any static data files. However, it's likely not the best place for a Docker img file or any kind of temporary storage, particularly if redundant storage isn't needed. XFS might be a better choice there.

     Docker appdata is a tricky one. That's likely data you want stored redundantly, but it might also be changing rapidly. Likely to contain databases, for example. I can see that an SQLite or MySQL database could be a real issue with BTRFS write amplification (a sketch of disabling COW for such a folder appears after this list). The Docker img itself also being BTRFS is a further complication that makes my head hurt... The new cache pools will likely be a great way to help deal with this dilemma! 🙂
  3. There’s an elephant in this room which needs mentioning. We have a long thread marked urgent in the Bugs forum and Limetech state they have no knowledge of it? Were all the reports here a waste of time? After repeated requests for official acknowledgment of the issue we got posts from insiders telling us not to worry and that Limetech had it in hand. Were those posts untrue? As one of those who has had an SSD die extremely early with a huge number of writes that took it out of warranty, this is very, very disappointing. Please, please tell me that this was just a misunderstanding.
  4. Just as a data point. The average daily write volume to the cache drive when formatted btrfs, unencrypted, was just over 1.1 TB. Now that the drive is xfs, with the identical dockers and VMs, that has dropped to around 40 GB. That’s 4% of the previous amount.
  5. That was my solution as well. Used a new NVMe SSD (that I'd purchased as a spare as the current cache was being worn out so quickly) as an XFS-formatted cache. loop2 writes are now down to a sane level. btrfs and loop2 together definitely seem to be the culprit. I would also prefer a cache pool, but this, with daily backups of appdata, is better for now.
  6. As others have posted here, you can't blame Plex, or any single Docker. Something is taking normal writes and amplifying them massively. In my case stopping Plex makes a difference, but only reduces it by about 25%, and rampant writes continue. About 1 GB a minute as I look at it right now! I don't want anyone to think the problem has been solved and the cause was Plex. That isn't the case. It's much more fundamental than that.
  7. This thread explains a lot. I had a cache SSD die a couple of months ago after only two years of use. I just assumed it was a random failure and replaced it. Now, after checking with iotop, it looks like I'm seeing this problem, with over 40 GB an hour being written from loop2 to the cache. Overall, the new NVMe SSD has been running 1,660 hours and has had 70 TB written to it, for an average of 43 GB per hour (a sketch of checking these figures with iotop and smartctl appears after this list). Running a single cache drive, btrfs, unencrypted. Like everyone else here I'd really appreciate a fix for this. Replacing NVMe drives is expensive. The temporary fix posted by OP is clever, and I thank them for providing it, but it's not something I'd want to try. I have too many dockers in daily use to want to risk going that far off piste...
  8. Just an FYI for anyone running Poste.io email server or any other email system using Dovecot IMAP server. It seems that hard links are required for the Dovecot IMAP engine within Poste.io to operate correctly, as it uses hard links to copy messages. I had to re-enable hard links to get IMAP back. It looks like it is possible to configure Dovecot not to use hard links, but I haven't tried that yet (an example of the relevant setting appears after this list).
  9. All makes good sense, thank you. For those of us running 6.8.0-rc7 with no problems (I used a second NIC for VMs), what would you advise? Sticking with rc7 and waiting for 6.9.0-rc1? I don’t need v5 of the kernel for my hardware so I’m not sure which is the best decision?
  10. I have the same problem. Docker tab freezes and won’t scroll on an iPad. Exactly the same in Safari, Chrome, and Firefox. I suspect the page is just too complex, likely dependent on how many containers you have running - 47 in my case. iPad Pro running iPadOS 13.2. The workaround for basic Docker operations is using the Dashboard tab, but you can’t access full details that way.
  11. Thanks @bonienl for the diagnosis. Upgraded to RC1 after moving all my VMs to a separate NIC feeding br1 and everything seems good. As far as I can tell the disk write speed is also back to where it should be. Hopefully the move from 6.6.7 will be permanent this time! 🙂
  12. Dagnabbit. That's another show stopper that means I’ll have to remain on 6.6.7. Thanks for testing.
  13. Does disabling GSO and/or all NIC offloading avoid this problem, or does that slow things down too much? (An example ethtool invocation appears after this list.)
  14. Ah, thank you. I misinterpreted the 'incompatible' error message to mean that it had updated, but the new version was incompatible. Now I see that it didn’t actually update. My mistake, apologies.
  15. For those of us stuck on 6.6.7 until the slow array performance is fixed in 6.8, is there a way to revert FCP to the prior version?
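
Sketch for post 2: one way to cut copy-on-write overhead for a write-heavy appdata folder on a btrfs cache is to mark the directory NOCOW. This is only a sketch, not anything from the paper; the /mnt/cache/appdata path and the folder name are assumptions based on a typical Unraid layout, the attribute only affects files created after it is set, and it also disables btrfs checksumming for those files.

    # Assumed path for a database-heavy container's appdata; adjust to your setup.
    chattr +C /mnt/cache/appdata/mariadb
    lsattr -d /mnt/cache/appdata/mariadb    # the 'C' flag confirms NOCOW is set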
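
Sketch for post 7: the loop2 write rate and the lifetime totals quoted there can be checked with iotop and smartctl. A rough sketch, assuming an NVMe cache device at /dev/nvme0; device names and the exact SMART field labels vary by drive.

    iotop -ao                 # accumulate I/O per process; watch for the loop2 thread
    smartctl -A /dev/nvme0    # 'Data Units Written' and 'Power On Hours' give lifetime totals
    # e.g. 70 TB written over 1,660 power-on hours works out to roughly 43 GB per hour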
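
Sketch for post 8: Dovecot's maildir handling can be told not to copy messages with hard links. The setting below is standard Dovecot; whether and where Poste.io exposes it inside its container is an assumption I haven't verified.

    # In the Dovecot configuration (commonly conf.d/10-mail.conf on a stock install):
    maildir_copy_with_hardlinks = no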
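
Sketch for post 13: offloads can be toggled per interface with ethtool for testing. 'eth0' is a placeholder for whichever NIC feeds the bridge, and the change does not persist across reboots.

    ethtool -k eth0                           # show current offload settings
    ethtool -K eth0 gso off tso off gro off   # disable GSO/TSO/GRO for this session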