Quite a lot of writes to the SSD cache, how to identify the source? (looks like writes to docker.img)



Hello,

 

My SSD cache has had about 2TB written in only two weeks. That feels like a lot for a 250GB SSD.

No mover is used; files are written directly to the array. The SSD is used for the docker.img and appdata.
~1 TB/week gives me about 2.8 years before hitting the 150 TBW warranty limit. The warranty is 5 years or 150 TBW, whichever comes first (Samsung SSD 860 Evo).

I want to dig deeper. How can I see what is doing all these writes to the SSD?

Edit:
I'll answer myself: "iotop", installed via the Nerd Pack plugin, should do the trick!

Edited by Niklas

So, I have done some testing. I moved appdata to the array and left only docker.img on the cache. Still lots of writing.

Is it possible to identify which docker container is doing lots of writes? I have monitored container sizes in the GUI and I'm not seeing any big differences there, just megabytes of difference. It feels like temporary data going somewhere, but I can't find any badly configured containers.

 

With my current setup, I see about 5TB(!) of data written every month. That's several times the size of the SSD (256GB).
Emby and other apps that need temporary storage for things like transcoding and downloading do that on one of my unassigned devices (an HDD, a spinner), but still: 5TB is A LOT.

NOT using mover. NO share with "Use cache: Yes".
The shares that use the cache are cache-only.

Mon Apr  1 00:00:02 CEST 2019 Cache 1 TBW: 11.8136 TB
Tue Apr 30 04:40:10 CEST 2019 Cache 1 TBW: 16.3354 TB

Edited by Niklas
  • 1 month later...

I see several gigabytes of data written to the cache SSD every hour. The server is almost idle.

 

"docker stats" looks good. BLOCK I/O shows low values, some megabytes written at most. The containers themselves are not doing the writing.
 

Running "iotop -ao" (install it via the Nerd Tools plugin if interested) for a while to get an aggregated summary of disk I/O shows "[loop2]" writing gigabytes in no time (running for 15-20 minutes shows loop2 writing almost 3GB). This seems to be where all the writing comes from. With df I can see:

/dev/loop2         40G   24G   15G  63% /var/lib/docker
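For anyone wanting to confirm what a loop device is backed by, a small sketch (the docker.img path is from this thread's setup and may differ on your system):

```shell
# List all loop devices together with their backing files
losetup -l

# Or ask the kernel directly for one device, e.g.:
cat /sys/block/loop2/loop/backing_file
# would typically print something like /mnt/cache/system/docker/docker.img

# df confirms where the loop device is mounted
df -h /var/lib/docker
```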


What is it writing? 🤨

[screenshot: terminal capture from the server, 2019-07-03]


Check your SMART value "Total_LBAs_Written" (calculator) if you are using SSDs. It could show quite high values for others here too. It will eat up the warranty of consumer SSDs fast. My Samsung 860 EVO has 5 years OR 150 TBW, whichever comes first. With 5-6 TB written every month, the warranty will be void after ~2 years.
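For reference, the raw Total_LBAs_Written value counts 512-byte sectors on these Samsung drives, so terabytes written can be computed like this (the sample value below is made up for illustration; read the real one with smartctl, and note that some other models report this attribute in different units):

```shell
# On a live system you would read the raw value with something like:
#   smartctl -A /dev/sdX | awk '/Total_LBAs_Written/{ print $10 }'
lbas=322639020911   # made-up sample value (512-byte sectors)

# Convert sectors -> bytes -> TiB
awk -v l="$lbas" 'BEGIN { printf "TBW: %.1f TB\n", l * 512 / 1024^4 }'
```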

Edit: 5.08 GB written in ~30-40 min, continuously. If it were usable data, the free space would drop by the same amount, but it does not. Where does all the written data go? Some form of temporary data that is never stored?

Edit2: ~90GB written in 24h.

Edited by Niklas
  • 1 month later...

@Niklas Did you manage to find the culprit of this nasty problem? I've got a 2-month-old Samsung 860 EVO 250GB that has almost 40TB written to it (god knows where the 40TB went... certainly not to the array). Stopping all docker containers seems to halt the continuous writing; the more containers run, the more write I/O I get. Several GBs every couple of minutes, and the SSD wear level is already down to 89%.

 

She sure is a silent SSD killer.

 

EDIT: I notice you are using "Cache: Encrypted btrfs. Array: Encrypted xfs.", just like me. Could the combo of encryption + btrfs be causing these massive writes?

Edited by hotio
38 minutes ago, hotio said:

@Niklas Did you manage to find the culprit of this nasty problem? I've got a 2-month-old Samsung 860 EVO 250GB that has almost 40TB written to it (god knows where the 40TB went... certainly not to the array). Stopping all docker containers seems to halt the continuous writing; the more containers run, the more write I/O I get. Several GBs every couple of minutes, and the SSD wear level is already down to 89%.

 

She sure is a silent SSD killer.

 

EDIT: I notice you are using "Cache: Encrypted btrfs. Array: Encrypted xfs.", just like me. Could the combo of encryption + btrfs be causing these massive writes?


40TB! That's crazy. Do you use mover? 20TB a month gives you a total of 7.5 months of warranty with the 860 EVO...
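That 7.5-month figure is just the TBW limit divided by the monthly write rate; a quick sanity check:

```shell
# months of warranty remaining = TBW limit (TB) / write rate (TB per month)
awk -v tbw=150 -v rate=20 'BEGIN { printf "%.1f months\n", tbw / rate }'
# prints: 7.5 months
```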

I have moved to an IronWolf 110 SSD: 5 years or 435 TBW. That gives me some peace of mind.
I have not found the culprit. "inotifywait -r -e modify,delete,create,close_write -m /var/lib/docker" gives no big clues.


I know, right... I do use mover; downloads go to the cache and are then moved to the array. But that's not the problem: I can see the GBs fly by on an idle system. And we are certainly not the only ones; lots of people seem to be having this issue, but every thread just stops dead in the water.

Just now, hotio said:

I know, right... I do use mover; downloads go to the cache and are then moved to the array. But that's not the problem: I can see the GBs fly by on an idle system. And we are certainly not the only ones; lots of people seem to be having this issue, but every thread just stops dead in the water.


I don't think the users know. You need to understand the SMART values and monitor them. Also, people need to understand the "limited" warranty. 3-5 years might sound like a lot, but I think people don't realize that TBW is a big factor and will void the warranty very early if lots of terabytes are written in a short amount of time.

I don't know if the encryption makes any difference. We need to hear from more users before drawing any conclusions...


This is the TBW for my SSD cache drive. Is this OKAY?

 

Samsung 850 EVO 500 GB
18 months Usage
Holds all Appdata permanently
Holds incoming data until 80% capacity, then mover starts
 

sudo /usr/sbin/smartctl -A /dev/sdb | awk '$0~/LBAs/{ printf "TBW %.1f\n", $10 * 512 / 1024^4 }'
TBW 47.7

 

  • 3 weeks later...

For anyone who cares... I've re-formatted the cache pool from encrypted btrfs to plain btrfs and the writes have dropped by almost 10x. Disk I/O for [loop2] has gone from several GBs in a few minutes to maybe 1GB in a couple of hours using iotop -a. Total writes for the SSD, measured via LBAs, are down to about 1GB/hour, and that is with appdata also on the same device. This will give a lifespan closer to 10 years instead of 1 year 🙂.

 

Anyone really smart in here who can explain the behaviour? I used to run LUKS encryption on the same drives with ext4 and /var/lib/docker straight on the disk; that gave no issues. The combo LUKS + btrfs + docker.img does, however.

  • 2 months later...
On 8/25/2019 at 1:44 PM, hotio said:

For anyone who cares... I've re-formatted the cache pool from encrypted btrfs to plain btrfs and the writes have dropped by almost 10x. [...]

+1 on this.

I'm seeing similar issues (high writes on the SSD cache).

However, since my docker containers host my Bitwarden vault, I'm not really keen on removing the encryption.

Bug in BTRFS? Or something we can tweak?


Yeah, I know.

This seems like a really nasty tradeoff: security vs. SSDs wearing out quickly.

I was quite happy with encryption on all disks, but it feels like a real waste of the SSDs to write TBs of data to them each month.

  • 3 months later...

Running the MariaDB docker pointed at /mnt/cache/appdata/mariadb or /mnt/user/appdata/mariadb (Use cache: Only) generates LOTS of writes to the cache drive(s), between 15-20GB/h. iotop shows almost all of that writing coming from mariadb. Moving the databases to the array brings the writes down considerably (measured using "iotop -ao"). I use MariaDB lightly, for Nextcloud and Home Assistant. Nothing else.

This is iotop for an hour with the mariadb databases on the cache drive (/mnt/cache/appdata or /mnt/user/appdata with cache only):
[screenshot: iotop capture, 2020-02-09 01:01]

When /mnt/cache/appdata is used directly, the writes show up under mysql(d?) instead of the shfs processes. Missing screenshot.

This is iotop for about an hour with the databases on the array (/mnt/user/arraydata):
[screenshot: iotop capture, 2020-02-09 02:00]
Still a bit much relative to my light usage, but nowhere near the writing when on cache.

I don't know if this is a bug in Unraid, btrfs or something else, but I will keep my databases on the array to save some SSD life. I will lose speed, but as I said, this is with very light use of mariadb.

I tested three different ways of entering the path for the database location (/config) and let each sit for an hour, with a freshly started iotop between the different paths. To calculate the data written, I checked and compared the SMART value "233 Lifetime wts to flsh GB" for the SSDs (running mirrored drives). I guess anything else writing to the cache drive, or to a share with cache set to Only, will show the same unnecessarily high writes.

Sorry for my rambling. I get like that when I'm interested in a specific area. ;)

Not a native English speaker. Please just ask if anything is unclear.


Edit:

 

My notes:

On /mnt/cache/appdata/mariadb (directly on the cache drive)
2020-02-08 kl 22:02-23:04: 15 (!) GB written.

On /mnt/user/arraydata/mariadb (user share on array only)
2020-02-08 kl 23:04-00:02: 2 GB written.

On /mnt/user/appdata/mariadb (Use cache: Only)
2020-02-09 kl 00:02-01:02: 22 (!) GB written.

Just ran this again to really see the difference, and just look at it. ;)

On /mnt/user/arraydata/mariadb (array only, spinning rust)
2020-02-09 kl 01:02-02:02: 4 GB written.
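The before/after comparisons above boil down to diffing a SMART lifetime-writes counter across an interval. A rough sketch of how that can be scripted (the awk pattern and /dev/sdb are placeholders; the exact attribute name is vendor-specific, e.g. "Lifetime wts to flsh GB" here):

```shell
# Sample the SSD's lifetime-writes counter (in GB), wait an hour,
# sample again, and print the difference.
read_gb() { smartctl -A /dev/sdb | awk '/Lifetime/{ print $10 }'; }

before=$(read_gb)
sleep 3600
after=$(read_gb)
awk -v a="$after" -v b="$before" 'BEGIN { printf "%d GB written in the interval\n", a - b }'
```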

 

 

Edited by Niklas
More about how I tested it.
13 hours ago, Niklas said:

Running the MariaDB docker pointed at /mnt/cache/appdata/mariadb or /mnt/user/appdata/mariadb (Use cache: Only) generates LOTS of writes to the cache drive(s), between 15-20GB/h. iotop shows almost all of that writing coming from mariadb. Moving the databases to the array brings the writes down considerably (measured using "iotop -ao"). I use MariaDB lightly, for Nextcloud and Home Assistant. Nothing else.

[...]

You're saying you're moving the mariadb location between the array, the cache share (/mnt/user/appdata) and the mount point directly (/mnt/cache).

The screenshots seem to point out that the loop device your docker image is using is still the source of a lot of writes, though (loop2)?

 

One way to avoid this behavior would be to have docker write to the cache directly, bypassing the loop device approach.

I had similar issues and I'm running directly on the BTRFS cache now for quite some time. Still really happy with it.

I wrote about my approach in the bug report I did here: 

 

Note that in order to have it work on boot automatically, I modified the start_docker() function and copied the entire /etc/rc.d/rc.docker file to /boot/config/docker-service-mod/rc.docker. My go file copies that back over the original rc.docker so that when the docker daemon is started, the script sets docker up to use the cache directly.

Haven't got any issues so far 🙂

 

My /boot/config/go file looks like this now (only the cp command for rc.docker is relevant here; the lines before it are for hardware acceleration on Plex):

#!/bin/bash

# Load the i915 driver and set the right permissions on the Quick Sync device so Plex can use it
modprobe i915
chmod -R 777 /dev/dri

# Place the modified docker rc.d script over the original one to make it not use the docker.img
cp /boot/config/docker-service-mod/rc.docker /etc/rc.d/rc.docker

# Start the Management Utility
/usr/local/sbin/emhttp &

 

Cheers.

Edited by S1dney
1 minute ago, S1dney said:

You're saying you're moving the mariadb location between the array, the cache share (/mnt/user/appdata) and the mount point directly (/mnt/cache).

The screenshots seem to point out that the loop device your docker image is using is still the source of a lot of writes, though (loop2)?

[...]


loop2 is the mounted docker.img, right? I have it pointed at /mnt/cache/system/docker/docker.img.
What did you change in your rc.docker?

10 minutes ago, Niklas said:


loop2 is the mounted docker.img, right? I have it pointed at /mnt/cache/system/docker/docker.img.
What did you change in your rc.docker?

Yeah, it should indeed.

 

You can check with:

df -hm /dev/loop2

This will probably show it mounted on /var/lib/docker.

 

Now, the fact that you're moving the data between the user share appdata and the mount point directly doesn't really help, since docker is still running inside an image file, mounted via a loop device on your cache.

There seems to be a bug with using docker in this kind of setup, although I wasn't able to reproduce it on another Linux distro. It might be a Slackware thing.

 

The only way (that I could come up with) to get the writes down is to create a symlink between /mnt/cache/docker (or whatever cache-only dir you create) and /var/lib/docker, and then start docker.

The start_docker() function I've modified inside the rc.docker script does that, plus some other things, like checking whether the docker image is already mounted and, if so, unmounting it.
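A minimal sketch of that idea (this is not S1dney's actual script; the cache directory is an example, and replacing /var/lib/docker on a live system needs care, with the docker service stopped first):

```shell
# Run docker out of a cache-only directory instead of the loop-mounted docker.img.
DOCKER_DIR=/mnt/cache/docker        # example cache-only directory

# Unmount the docker.img loop mount if it is still active
if mountpoint -q /var/lib/docker; then
    umount /var/lib/docker
fi

# Replace /var/lib/docker with a symlink pointing at the cache directory,
# so the docker daemon writes to the btrfs cache directly
rm -rf /var/lib/docker
ln -s "$DOCKER_DIR" /var/lib/docker
```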

Edited by S1dney
1 hour ago, S1dney said:

You're saying you're moving the mariadb location between the array, the cache location (/mnt/user/appdata) and directly onto the mountpoint (/mnt/cache).

The screenshots seem to point out that the loop device your docker image is using is still the source of a lot of writes though (loop2)?


Yes. I tried three different locations. /mnt/cache/appdata, /mnt/user/appdata (set to cache only) and /mnt/user/arraydata (array only).

The two first locations generate that crazy 12-20GB/h. On array, it was like 10x+ less writing.

The loop device also does some writing I find strange, yes. Wrote about this before noticing the high mariadb usage..

I will read your bug report and answers.

Edit: this is just by keeping an eye on mariadb specifically. Other containers writing to the appdata dir will probably also generate lots of waste data including the loopback docker.img.

Edited by Niklas
Link to post
  • 1 month later...
On 3/14/2020 at 11:47 AM, sdamaged said:

Old thread, but yes, this is indeed a problem. My 2-year-old Samsung 850 Pro SSD (which was used as cache for only 12 months!) has over 450TB written, which effectively means the warranty is now void...

There is a bug report for this now; see the report.

  • 2 months later...

Looks like I've not been impacted.

 

The cache drive is btrfs, with the following dockers: mariadb, kodi-server, duplicati.

# cat /etc/unraid-version; /usr/sbin/smartctl -A /dev/sdb | awk '$0~/Power_On_Hours/{ printf "Days: %.1f\n", $10 / 24} $0~/LBAs/{ printf "TBW: %.1f\n", $10 * 512 / 1024^4 }'
version="6.8.3"
Days: 646.9
TBW: 10.3

 

