[6.8.3] docker image huge amount of unnecessary writes on cache

testdasi · August 11, 2020

1 hour ago, TexasUnraid said:

If you go back a ways in this thread, you will find a few pages of me testing every possible scenario.

While the docker image is the main culprit for sure, appdata was not far behind. With just appdata on the BTRFS I was still seeing around 800mb/hour IIRC. Vs both on the XFS and ~200mb/hour combined.

Have you redone the tests on 6.9.0 + partition align to 1MiB?

It makes a huge difference.

Also you probably missed my point a bit. There is a balance to be struck between the needs for endurance vs resiliency.

Docker image has the lowest need for resiliency (everything is reinstallable so recovering from complete lost is a mundane mouse-clicking affair) so the need to increase longevity for the SSD naturally floats to the top.
- Then you add the loop2 amplification, which is the consistently the highest and exclusively affects docker image. That builds the case for having docker image in the xfs disk.
Appdata does have some needs for resiliency because reconfiguring every app is a pain in the backside, if impossible in some cases. So one has to debate if the need to reduce SSD wear would trump the need to protect the appdata against failure.
- In an ideal scenario, you would have a backup to mitigate the risk but just as parity is not a backup, a backup isn't a parity either (note: a mirror i.e. RAID-1 is a special case of parity).

It's like the UK government misguided effort to promote diesel cars to reduce carbon emission. The end result was air quality went down the drain due to particulate matter and nitrogen oxides in diesel exhaust.

So people don't die 10 years down the road because of global warming. They die next year because of lung cancer.

TexasUnraid · August 11, 2020

Yeah, I tested it on 6.9 as well, and while writes were lower across the board (roughly half vs 6.8 IIRC) it was still many times higher then using an XFS cache.

I am aware of the risks with data loss. I am not worried about it personally for a few reasons.

1: I have never had an SSD I trust die on me (had one dead when I first plugged it in but that was DOA).

2: Appdata is backed up with the CA backup tool on a weekly basis. Now that everything is setup and working, having to fall back a week on the dockers is not something I am worried about. I can manually run a backup if I make a lot of changes. I would much rather have the reduced writes.

Worst case an hour of work would restore my XFS cache drive.

vakilando · August 11, 2020

9 hours ago, testdasi said:

(...) Are you using 6.9.0? Did you also align the parition to 1MiB? That requires wiping the pool so I would assume quite few people would do it.

No, I'm on 6.8.3 and I did not align the parition to 1MiB (its MBR: 4K-aligned).

What is the benefit of aligning it to 1MiB? I mus have missed this "tuning" advice...

TexasUnraid · August 11, 2020

4 minutes ago, vakilando said:

No, I'm on 6.8.3 and I did not align the parition to 1MiB (its MBR: 4K-aligned).

What is the benefit of aligning it to 1MiB? I mus have missed this "tuning" advice...

You can't align it until 6.9, it won't work with 6.8.

It was explained a few pages back but basically it ensures that each 4kb block of the drive is accessed individually. As it is it might need to access 2 4kb blocks for every write, possibly doubling the writes (which is almost what I saw).

testdasi · August 11, 2020

57 minutes ago, vakilando said:

No, I'm on 6.8.3 and I did not align the parition to 1MiB (its MBR: 4K-aligned).

What is the benefit of aligning it to 1MiB? I mus have missed this "tuning" advice...

Yep, 6.9.0 should bring improvement to your situation. But as I said, you need to wipe the drive in 6.9.0 to reformat it back to 1MiB alignment and needless to say it would make the drive incompatible with Unraid before 6.9.0.

Essentially back up, stop array, unassign, blkdiscard, assign back, start and format, restore backup. Beside backing up and restoring from backup, the middle process took 5 minutes.

I expect LT to provide more detailed guidance regarding this perhaps when 6.9.0 enters RC or at least when 6.9.0 becomes stable.

Not that 6.9.0-beta isn't stable. I did see some bugs report but I personally have only seen the virtio / virtio-net thingie which was fixed by using Q35-5.0 machine type (instead of 4.2). No need to use virtio-net which negatively affects network performance.

PS: been running iotop for 3 hours and still average about 345MB / hr. We'll see if my daily house-keeping affects it tonight.

Edited August 11, 2020 by testdasi

John_M · August 12, 2020

6 hours ago, testdasi said:

No need to use virtio-net which negatively affects network performance.

Doesn't the change from virtio to virtio-net happen automatically when you open and then save a template under the 6.9.0-betas, making it quite difficult to avoid, unless you're aware of it and revert manually?

From the release notes:

Quote

You need to edit each VM and change the model type for the Ethernet bridge from "virtio" to "virtio-net". In most cases this can be accomplished simply by clicking Update in "Form View" on the VM Edit page.

bonienl · August 12, 2020

In the next version there is a selection field in the GUI to chose between virtio and virtio-net.

testdasi · August 12, 2020

So repeated my test overnight for 15 hours

Unraid 6.9.0-beta25
2x Intel 750 1.2TB
BTRFS RAID-0 for data chunks, RAID-1 for metadata + system chunks
Both partitions aligned to 1MiB
35 dockers running in BAU pattern i.e. not trying to keep things idle

Still average about 350 MB/hr (or 8.5 GB/day) on loop2 so sounds like that's my best baseline.

Loop2 is 5th on the list, only about 2% the top one on the list (which I know for sure has written that much data). So basically negligible.

DerfMcDoogal · August 13, 2020

So I was planning to add 2x 860 EVO 1TB SSDs this weekend as a Raid-1 btrfs Cache Pool... I'm on 6.8.2. Is it just my best bet to wait for 6.9 to get released? Is there anything I can do on my current version to get this going now without having to wait?

testdasi · August 13, 2020

1 hour ago, DerfMcDoogal said:

So I was planning to add 2x 860 EVO 1TB SSDs this weekend as a Raid-1 btrfs Cache Pool... I'm on 6.8.2. Is it just my best bet to wait for 6.9 to get released? Is there anything I can do on my current version to get this going now without having to wait?

Update to 6.9.0-beta25? 😉

Longer answer: What you can do is to update to 6.9.0-beta25 now and test your server thoroughly (+ doing any necessary tweaks e.g. VM 5.0 / virtio-net etc.). As long as it's stable for you, there's no need to worry about the beta label. Then when you are ready, plop the 2 SSD in a new pool and format.

DerfMcDoogal · August 13, 2020

21 minutes ago, testdasi said:

Update to 6.9.0-beta25? 😉

Longer answer: What you can do is to update to 6.9.0-beta25 now and test your server thoroughly (+ doing any necessary tweaks e.g. VM 5.0 / virtio-net etc.). As long as it's stable for you, there's no need to worry about the beta label. Then when you are ready, plop the 2 SSD in a new pool and format.

Thanks. So the 1MB alignment should fix the excessive write issue? I'm good to just allow mover to move my docker image as it exists over to the newly created cache? I see a lot of posts of XFS vs btrfs. .img vs folder. It's all too confusing.

My goal is to get my appdata moved to a cache pool without destroying $300 worth of SSD. LOL.

vakilando · August 13, 2020

On 8/11/2020 at 9:57 PM, testdasi said:

Yep, 6.9.0 should bring improvement to your situation. But as I said, you need to wipe the drive in 6.9.0 to reformat it back to 1MiB alignment and needless to say it would make the drive incompatible with Unraid before 6.9.0.

Essentially back up, stop array, unassign, blkdiscard, assign back, start and format, restore backup. Beside backing up and restoring from backup, the middle process took 5 minutes.

I expect LT to provide more detailed guidance regarding this perhaps when 6.9.0 enters RC or at least when 6.9.0 becomes stable.

Not that 6.9.0-beta isn't stable. I did see some bugs report but I personally have only seen the virtio / virtio-net thingie which was fixed by using Q35-5.0 machine type (instead of 4.2). No need to use virtio-net which negatively affects network performance.

PS: been running iotop for 3 hours and still average about 345MB / hr. We'll see if my daily house-keeping affects it tonight.

Thanks!
The procedure "back up, stop array, unassign, blkdiscard, assign back, start and format, restore backup" is no problem and not new for me (except of blkdiscard) as I had to do it as my cache disks died because of those ugly unnecessary writes on btrfs-cache-pool...

As said before, I tend changing my cache to xfs with a singel disk an wait for the stable release 6.9.x

Meanwhile I'll think about a new concept managing my disks.

This is my configuration at the moment:

Array of two disks with one parity (4+4+4TB WD red)
1 btrfs cache pool (raid1) for cache, docker appdata, docker and folder redirection for my VMs (2 MX500 1 TB)
1 UD for my VMs (1 SanDisk plus 480 GB)
1 UD for Backup data (6 TB WD red)
1 UD for nvr/cams (old 2 TB WD green)

I still have two 1TB and one 480 GB SSDs lying around here..... I have to think about how I could use them with the new disk pools in 6.9

Crogge · August 16, 2020

I can confirm this problem on 6.8.3. I managed to minimize the amount of written data by disabling all non-critical logs on vms and docker images but it is still 1-2GB per hour even if barely any data is actually written.

I will wait for 6.9.0 RC1 but I'd love to see a hotfix for 6.8.x as well.

Alexstrasza · August 16, 2020

When these changes hit the release candidate, will we be automatically prompted to re-create our cache pools if needed? Or is this fix applied without needing to re-create the pool? I'm a little confused.

John_M · August 17, 2020

3 hours ago, Alexstrasza said:

Or is this fix applied without needing to re-create the pool?

The revised mount option will be applied automatically (it already is in the latest beta) but if you want to change the partition alignment of your SSD devices or the filesystem type or the docker image format you'll have to do it manually, because not everyone needs to change anything. I expect some guidance will be provided in the release notes.

Alexstrasza · August 19, 2020

On 8/17/2020 at 4:00 AM, John_M said:

The revised mount option will be applied automatically (it already is in the latest beta) but if you want to change the partition alignment of your SSD devices or the filesystem type or the docker image format you'll have to do it manually, because not everyone needs to change anything. I expect some guidance will be provided in the release notes.

Thanks for the information, that makes a lot more sense to me now.

Koenig · August 21, 2020

On 8/11/2020 at 9:57 PM, testdasi said:

Yep, 6.9.0 should bring improvement to your situation. But as I said, you need to wipe the drive in 6.9.0 to reformat it back to 1MiB alignment and needless to say it would make the drive incompatible with Unraid before 6.9.0.

Essentially back up, stop array, unassign, blkdiscard, assign back, start and format, restore backup. Beside backing up and restoring from backup, the middle process took 5 minutes.

I expect LT to provide more detailed guidance regarding this perhaps when 6.9.0 enters RC or at least when 6.9.0 becomes stable.

Not that 6.9.0-beta isn't stable. I did see some bugs report but I personally have only seen the virtio / virtio-net thingie which was fixed by using Q35-5.0 machine type (instead of 4.2). No need to use virtio-net which negatively affects network performance.

PS: been running iotop for 3 hours and still average about 345MB / hr. We'll see if my daily house-keeping affects it tonight.

Not that this is really the thread to adress this but anyway:

I just tried yesterday with 2 newly created VM Q35-5.0 machines (Windows 10), on Beta25 and I still get the "unexpected GSO type" flooded in my logs when i use "virto" so I don't see how using Q35-5.0 would be a solution.

Only way I get rid of that in the logs for me is to use "vitio-net" with the severely diminished performance.

Edit:

Just tried again and still same results, attaching my diagnostics if you wish to see for yor self.

unraid-diagnostics-20200821-0848.zip

Edited August 21, 2020 by Koenig

thecode · August 21, 2020

1 hour ago, Koenig said:

Not that this is really the thread to adress this but anyway:

I just tried yesterday with 2 newly created VM Q35-5.0 machines (Windows 10), on Beta25 and I still get the "unexpected GSO type" flooded in my logs when i use "virto" so I don't see how using Q35-5.0 would be a solution.

Only way I get rid of that in the logs for me is to use "vitio-net" with the severely diminished performance.

Edit:

Just tried again and still same results, attaching my diagnostics if you wish to see for yor self.

unraid-diagnostics-20200821-0848.zip 182.76 kB · 0 downloads

I suggest you take it in the Beta25 thread:

testdasi · August 21, 2020

2 hours ago, Koenig said:

Just tried again and still same results, attaching my diagnostics if you wish to see for yor self.

unraid-diagnostics-20200821-0848.zip 182.76 kB · 0 downloads

It looks like switching to 5.0 fixes it for some and not for others (it was suggested somewhere earlier in the topic).

The officially guaranteed method is to switch to virtio-net (or set up VLAN or use separate NICs for docker and VM).

LT said next release will allow user to pick between virtio vs virtio-net, which I think is better than defaulting to virtio-net in beta25 since there are other ways to guarantee no errors.

Edited August 21, 2020 by testdasi

mgutt · October 8, 2020

Found the source of the writes (they still exist, even if SSD has been reformatted to XFS):

Stiefmeister · October 31, 2020

Hi all,

by chance I stumbled upon this problem and immediately changed my cache drive (samsung evo 850 500gb) from btfrs encrypted to xfs encrypted. Constant writes dropped from ~30MB/s to ~700KB/s which of course is a great improvement and might just ensured me a few more months of life of my SSD (although based on SMART data stating 644631465019 lbas written it shouldnt even work anymore).

Now my questions:

Cache drive is XFS encrypted now, but file system within Docker image is still btfrs. Is this because I set up docker image when still using btfrs encrypted file system on my cache or is btfrs the standard docker image file system?

As ~700KB/s is still pretty high in my opinion (only loop writes according to iotop), maybe I could further reduce writes by switching file system within docker image to XFS as well!? Or are there any functional reasons why docker image uses btfrs?

If I were to recreate my docker image file on XFS encrypted cache drive, would the file system within docker image still be btfrs?

Is switching to 1 MiB with Unraid 6.9 also beneficial for XFS formatted drives or does btfrs only benefit from this?

Thanks a lot for your input!

Edited October 31, 2020 by Stiefmeister

TexasUnraid · October 31, 2020

Upgrading to the beta and reformatting should indeed help.

Formatting the docker image to XFS as well has had mixed results. The writes are reduced but not always enough to make the switch worth it.

Squid · October 31, 2020

2 hours ago, Stiefmeister said:

Or are there any functional reasons why docker image uses btfrs?

There used to be, but not for a while now.

Probably the least amount of excess writes is instead of using a docker image file, you utilize a docker folder instead. Avoids any overhead involved in the loopback. But you need the beta to utilize either an XFS formatted image or a directory system

mgutt · November 1, 2020

13 hours ago, Squid said:

utilize a docker folder instead

Guide? ^^

Stiefmeister · November 1, 2020

11 minutes ago, mgutt said:

Guide? ^^

Officially supported in 6.9 beta only, see following blog post for details:

https://unraid.net/blog/unraid-6-9-beta25

[6.8.3] docker image huge amount of unnecessary writes on cache

User Feedback

Recommended Comments

testdasi 500

Link to comment

TexasUnraid 113

Link to comment

vakilando 73

Link to comment

TexasUnraid 113

Link to comment

testdasi 500

Link to comment

John_M 413

Link to comment

bonienl 1764

Link to comment

testdasi 500

Link to comment

DerfMcDoogal 0

Link to comment

testdasi 500

Link to comment

DerfMcDoogal 0

Link to comment

vakilando 73

Link to comment

Crogge 0

Link to comment

Alexstrasza 8

Link to comment

John_M 413

Link to comment

Alexstrasza 8

Link to comment

Koenig 9

Link to comment

thecode 49

Link to comment

testdasi 500

Link to comment

mgutt 2528

Link to comment

Stiefmeister 1

Link to comment

TexasUnraid 113

Link to comment

Squid 4987

Link to comment

mgutt 2528

Link to comment

Stiefmeister 1

Link to comment

Join the conversation