50-60GB written to cache daily even with negligible activity.


pellen


Alright. I put a Samsung 850 EVO mSATA 250GB in my unRAID server as a cache drive a couple of weeks ago, as I got it dirt cheap. A few days ago I stumbled upon the SMART data and noticed it had already passed 3TBW. I use the cache for downloads that are then moved over to the array, and I also keep appdata and system on the cache to stop the array drives from spinning up all the time.

 

I made a simple script that extracts the SMART TBW value every night, and it's increasing by roughly 50-60GB every day. For the past week activity has been close to zero (i.e. no downloads or other stuff put on the array), so that leaves appdata or system.
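The script is nothing fancy, roughly along these lines (assuming the cache drive is /dev/sdb and that it reports SMART attribute 241, Total_LBAs_Written, in 512-byte units like the 850 EVO does; adjust the device and log path to your own setup):

#!/bin/bash
# Nightly TBW logger, run from cron or the User Scripts plugin.
# Total_LBAs_Written (SMART attribute 241) is in 512-byte units on the 850 EVO.
lbas=$(smartctl -A /dev/sdb | awk '/Total_LBAs_Written/ {print $NF}')
echo "$(date +%F) $(( lbas * 512 / 1024**3 )) GB written" >> /boot/tbw.log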

 

I tried digging deeper with iotop to find out what causes so many writes when nothing is happening.

This is a snippet after running iotop for ~20 minutes, and during this time there was no activity in unRAID apart from the Docker containers running as usual.

 

Total DISK READ :       0.00 B/s | Total DISK WRITE :      49.99 K/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:     178.53 K/s
  PID  PRIO  USER     DISK READ DISK WRITE>  SWAPIN      IO    COMMAND                                                               
 7144 be/4 root         84.00 K    552.05 M  0.00 %  0.01 % shfs /mnt/user -disks 31 2048000000 -o noatime,big_writes,allow_other -o remember=0
 7427 be/0 root          0.00 B    264.48 M  0.00 %  0.03 % [loop2]
 7450 be/4 root          0.00 B     52.22 M  0.00 %  0.05 % [btrfs-transacti]
16673 be/4 root          0.00 B      8.20 M  0.00 %  0.00 % [kworker/u32:15-bond0]
 3409 be/4 root          0.00 B      7.48 M  0.00 %  0.00 % [kworker/u32:0-btrfs-worker]
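
(For anyone wanting to reproduce this: the totals above are accumulated rather than live rates, which you get with something like

iotop -ao

where -a accumulates I/O since iotop started and -o only shows processes that are actually doing I/O.)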
 

Could someone help me out with what I'm looking at here? As I've understood it, loop2 is the docker image? But what is shfs doing? Is it the combined writes made by my various containers?
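
A quick way to confirm the loop2 mapping, I suppose, is losetup, which lists each loop device together with the file backing it:

# should show which file (e.g. the docker image) backs each /dev/loopN
losetup -l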

 

These are the containers currently running:

bazarr

letsencrypt

mariadb

mediawiki

nzbget

organizr

PlexMediaServer

radarr

sonarr

speedtest

tautulli

transmission

unifi

 

I've tried stopping them one by one and letting iotop run for a while to see if there was a significant difference in writes, but no single container really stood out. Stopping them all did make a big difference though, but that's not a good solution :D

 

So, is 50-60GB of writes a day considered normal, or is something off? Any tips or tricks for tweaking any of the containers?

I know you might think this is a non-problem, since the 850 EVO 250GB is rated for 75TBW (and probably handles a bit more according to SSD endurance test reviews), but I really think it's odd that a bunch of low-activity containers could write this amount of data every day... it can't just be logs, right?
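
(Quick math: 55GB/day is roughly 20TB a year, so the rated 75TBW would be used up in a bit under four years.)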

 

 

  • 2 weeks later...

So I've managed to get it down to ~10GB a day now. 

 

I checked how often files in appdata were modified using

find -printf "%TY-%Tm-%Td %TT %p\n" | sort -n | tail

and from there could see which containers were modifying files most often.

 

Apparently radarr had its log level set to trace, which caused a constant stream of logs, and unifi wrote a lot to its logs and database as well. bazarr was quite active too. Setting a normal log level in radarr and stopping unifi and bazarr brought the writes down to roughly 10GB a day.

 

I still don't understand how such small writes can accumulate to double-digit gigabytes a day. Is every log/database file rewritten as soon as it's modified?


Several of your docker containers contain SQLite databases, and what you're seeing is probably writes to a database/temp database.

 

Depending on what "mode" your database is using, these writes can cause heavy disk I/O. I had the same issue, so I started using the File Activity plugin to see what could be causing it, and I found that most of the writes were to a database or a database-related file.

 

Basically, there are two database files: the main database file and a second file, either a "rollback journal" or a "write-ahead log" (WAL) file.

I found that databases using the rollback journal caused heavy disk I/O. I think I had 30 million writes in 24 hours or something when I was testing. The container causing the most writes was the Ombi container in my case.

 

I found a way to reduce the writes, though. If your containers are using the rollback journal for their database, you can choose between a few journal modes. I won't go into detail right now, but I changed the mode to "truncate" and was able to reduce the writes drastically. Right at this moment I have 23 million writes in 6 days. I still think that is a bit much, so I will try to see what I can do about my containers that are using WAL mode (I know Radarr is using it). Maybe I can reduce the disk I/O even more.
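
For reference, you can check a database's current journal mode with the sqlite3 command-line tool (stop the container first so the file isn't locked; the path below is just an example, point it at the database in your own appdata):

# prints wal, delete, truncate, ...
sqlite3 /mnt/cache/appdata/ombi/Ombi.db 'PRAGMA journal_mode;'

One caveat per the SQLite pragma docs: only WAL mode persists in the database file itself. The rollback modes (delete/truncate/persist) apply per connection, so the application has to issue PRAGMA journal_mode=TRUNCATE itself or expose it as a setting.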

 

A few links:
https://www.sqlite.org/fileformat2.html#walindexformat
https://www.sqlite.org/wal.html
https://www.sqlite.org/pragma.html#pragma_journal_mode

On 12/14/2018 at 11:00 PM, strike said:

Several of your docker containers contain SQLite databases, and what you're seeing is probably writes to a database/temp database. [...] If your containers are using the rollback journal for their database, you can choose between a few journal modes. I changed the mode to "truncate" and was able to reduce the writes drastically.

Big thanks for the tip!! I will look into this :)

 

On 12/14/2018 at 11:35 PM, scubieman said:

Get the File Activity plugin by dlandon. It will show you which disk and which file. It does cover the cache too, but you have to enable that.

 

However, I only run it for 10 to 20 minutes at a time. Don't leave it running or it could fill up a log file.

I did try it out, but I never saw it show any file activity for the appdata folder, even though I had it enabled for the cache...

 

With 

find -printf "%TY-%Tm-%Td %TT %p\n" | sort -n | tail

I can clearly see that files are being modified several times a minute, but it doesn't show up in File Activity.
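
A more direct way to watch this in real time, assuming inotify-tools is available, would be something like

# print a line for every file modified or created under appdata, as it happens
inotifywait -m -r -e modify,create /mnt/cache/appdata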

 

root@Cradle:/mnt/cache/appdata# find -printf "%TY-%Tm-%Td %TT %p\n" | sort -n | tail
2018-12-17 08:57:14.9395287050 ./PlexMediaServer/Library/Application Support/Plex Media Server/Logs/Plex Media Scanner.log
2018-12-17 08:57:14.9425285960 ./PlexMediaServer/Library/Application Support/Plex Media Server/Plug-in Support/Databases/com.plexapp.plugins.library.db-shm
2018-12-17 08:57:14.9505283030 ./tautulli/logs/plex_websocket.log
2018-12-17 08:57:23.3892202060 ./sonarr/logs.db-shm
2018-12-17 08:57:48.9892855440 ./radarr/nzbdrone.db-shm
2018-12-17 08:57:52.8431448400 ./sonarr/nzbdrone.db-wal
2018-12-17 08:57:59.2369114070 ./letsencrypt/log/nginx/access.log
2018-12-17 08:57:59.4869022820 ./letsencrypt/fail2ban/fail2ban.sqlite3
2018-12-17 08:58:23.8050147460 ./sonarr/nzbdrone.db-shm
2018-12-17 08:58:24.7049819000 ./PlexMediaServer/Library/Application Support/Plex Media Server/Logs/Plex Media Server.log
 

