Rootfs file is getting full

February 10, 2025

Hey all,

The Fix Common Problems plugin gave me the error that "Rootfs file is getting full", so, as directed in this post, I am posting in General Support asking for help.

I took a look in various suspicious places (syslog, docker.log, nginx logs) but couldn't find anything obvious that stuck out to me. I also took a look around the forums, but none of the previous topics that came up seem to match my case. Any help y'all could give me is greatly appreciated!

I've attached my diagnostics, and here is the output of Squid's memorystorage script:

plugin: installing: memorystorage.plg
Executing hook script: pre_plugin_checks
plugin: downloading: memorystorage.plg ... done

Executing hook script: pre_plugin_checks


This script may take a few minutes to run, especially if you are manually mounting a remote share outside of /mnt/disks or /mnt/remotes

/usr/bin/du --exclude=/mnt/user --exclude=/mnt/user0 --exclude=/mnt/disks --exclude=/proc --exclude=/sys --exclude=/var/lib/docker --exclude=/boot --exclude=/mnt -h -d2 / 2>/dev/null | grep -v 0$' '
4.0K /root/.config
28K /root
2.3M /var/lib
960K /var/log
16K /var/spool
36K /var/tmp
16K /var/named
4.0M /var/cache
28K /var/state
4.0K /var/kerberos
112M /var/local
23M /var/sa
141M /var
12M /bin
116K /etc/X11
60K /etc/profile.d
8.0K /etc/acpi
80K /etc/default
176K /etc/bash_completion.d
4.0K /etc/sasl2
8.0K /etc/dbus-1
4.0K /etc/cron.weekly
4.0K /etc/cron.d
20K /etc/cron.daily
4.0K /etc/cron.hourly
4.0K /etc/cron.monthly
12K /etc/elogind
112K /etc/pam.d
24K /etc/modprobe.d
9.4M /etc/udev
4.0K /etc/sensors.d
48K /etc/logrotate.d
92K /etc/mc
48K /etc/mcelog
4.0K /etc/nvme
576K /etc/ssh
260K /etc/ssl
44K /etc/security
12K /etc/sysstat
44K /etc/apcupsd
24K /etc/avahi
2.7M /etc/file
40K /etc/nginx
24K /etc/php-fpm.d
8.0K /etc/php.d
20K /etc/samba
8.0K /etc/ssmtp
8.0K /etc/libnl
152K /etc/lvm
4.0K /etc/pkcs11
8.0K /etc/OpenCL
40K /etc/zfs
12K /etc/gtk-3.0
8.0K /etc/fonts
340K /etc/libvirt-
360K /etc/libvirt
8.0K /etc/rsyslog.d
17M /etc
25K /lib/dhcpcd
146M /lib/firmware
18K /lib/modprobe.d
27M /lib/modules
512 /lib/systemd
8.5M /lib/udev
186M /lib
4.4M /lib64/security
1.8M /lib64/elogind
12K /lib64/pkgconfig
47M /lib64
8.0K /run/blkid
244K /run/udev
4.0K /run/hook-state
4.0K /run/dbus
12K /run/elogind
4.0K /run/avahi-daemon
1.3M /run/docker
60K /run/libvirt
1.8M /run
24M /sbin
595M /usr/bin
29M /usr/lib
815M /usr/lib64
90M /usr/libexec
118M /usr/local
50M /usr/sbin
278M /usr/share
7.2M /usr/src
928K /usr/doc
300K /usr/include
20K /usr/info
1.0M /usr/man
2.0G /usr
20K /tmp/emhttp
748K /tmp/plugins
368K /tmp/notifications
4.0K /tmp/unraid.patch
136K /tmp/appdata.backup
13M /tmp/community.applications
16K /tmp/unassigned.devices
12K /tmp/usb_manager
18M /tmp/fix.common.problems
32K /tmp/user.scripts
56K /tmp/gui.search
18M /tmp/CA_logs
4.0K /tmp/ca_notices
20K /tmp/tailscale
4.0K /tmp/unraidcheck
50M /tmp
2.4G /
0 /mnt/addons
0 /mnt/rootshare
0 /mnt


Finished.
NOTE: If there is any subdirectory from /mnt appearing in this list, then that means that you have (most likely) a docker app which is directly referencing a non-existant disk or cache pool
script: memorystorage.plg executed
Executing hook script: post_plugin_checks

I'll mention this in case it's relevant: I recently restarted my server and found that all my Docker containers had disappeared, so I had to restore them (thank goodness for CA "Previous Apps" and the AppData Backup plugin). I also temporarily forgot about the "restore" functionality in the AppData Backup plugin, so I did a bit of manual appdata restoring before wiping that out and redoing it with the plugin. I still have a couple of Docker containers that won't start and that I need to troubleshoot, but nothing that's, like, restarting over and over again (to my knowledge).

indigo-diagnostics-20250210-0904.zip

Edited February 10, 2025 by dispatchrabbi
Added the note about having looked on the forums already.

February 10, 2025

Community Expert

The output of the script looks OK, so it must be missing something.

What do you get from command line with this?

ls -lah /mnt

February 10, 2025

Author

Thanks for taking a look. Here's what I get:

root@indigo:~# ls -lah /mnt
total 0
drwxr-xr-x 15 root   root  300 Feb  7 21:39 ./
drwxr-xr-x 20 root   root  440 Feb 10 09:04 ../
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 addons/
drwxr-xr-x  4 root   root   80 Feb 10 09:12 cache/
drwxrwxrwx  6 nobody users  80 Feb 10 09:12 disk1/
drwxrwxrwx  7 nobody users 100 Feb 10 09:12 disk2/
drwxrwxrwx  7 nobody users 108 Feb 10 09:12 disk3/
drwxrwxrwx  6 nobody users  62 Feb 10 10:00 disk4/
drwxrwxrwx  2 nobody users   6 Feb 10 09:12 disk5/
drwxrwxrwx  7 nobody users 111 Feb 10 09:12 disk6/
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 disks/
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 remotes/
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 rootshare/
drwxrwxrwx  1 nobody users  80 Feb 10 10:00 user/
drwxrwxrwx  1 nobody users  80 Feb 10 10:00 user0/

February 10, 2025

Community Expert
Solution

That's not correct for cache. Looks like cache was unmountable when you booted (and still is).

Since Docker and VM Manager were enabled, their related shares (appdata, domains, system) got created on the array. Then probably some container or something specified a host path of /mnt/cache when it didn't actually exist, so it got created in rootfs.

Disable Docker and VM Manager in Settings, and leave them disabled until you get cache fixed.
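
A minimal sketch of one way to confirm whether a path such as /mnt/cache is a real mount or just a directory sitting in rootfs. It assumes a Linux mount table at /proc/mounts; the function name `is_mounted` and its optional second argument are made up for illustration and are not part of Unraid.

```shell
#!/bin/bash
# is_mounted PATH [MOUNTS_FILE]
# Succeeds (exit 0) if PATH appears as a mount point in the mount table.
# MOUNTS_FILE defaults to /proc/mounts; the parameter exists only so the
# function can also be tried against a saved copy of the table.
is_mounted() {
    local path="$1" mounts="${2:-/proc/mounts}"
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' "$mounts"
}

# Example: report whether the cache pool is really mounted.
if is_mounted /mnt/cache; then
    echo "/mnt/cache is a mounted filesystem"
else
    echo "/mnt/cache is NOT mounted; files written there would land in rootfs"
fi
```

If the check reports "NOT mounted" while the directory still exists, that matches the situation described above: something created the path while the pool was unavailable.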

February 10, 2025

Author

Thanks for the pointer! From there and some other places, I think I am back on track. In case this helps someone in the future, here's what I did from here:

Discovered which Docker container was referencing cache directly and changed it so it points at the path under /mnt/user instead
Disabled Docker and VM Manager (I don't use any VMs, so that'll actually just stay off indefinitely)
Stopped the array, then started it. The Main page shows the cache disk as "Unmountable: wrong or no file system"
Looked around for others who had had this problem. Found this thread and post.
The cache drive will mount with mount -o rescue=all,ro /dev/nvme0n1p1 /tempmount but will not mount with mount -o ro /dev/nvme0n1p1 /tempmount
I mounted the cache drive and copied the data off of it, just in case
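
For the first step in the list above, one way to track down a container that references the cache path directly is to search the saved Docker templates. This is only a sketch: the template directory used below is an assumption about where Unraid keeps user templates, and `find_cache_refs` is a made-up helper name.

```shell
#!/bin/bash
# find_cache_refs DIR
# Prints the names of files under DIR that mention /mnt/cache.
# DIR is assumed to hold the Docker template XML files.
find_cache_refs() {
    local dir="$1"
    # -r: recurse, -l: print only file names, -F: match the literal string.
    grep -rlF '/mnt/cache' "$dir" 2>/dev/null
}

# Assumed template location on Unraid; adjust if yours differs.
find_cache_refs /boot/config/plugins/dockerMan/templates-user \
    || echo "no direct /mnt/cache references found"
```

Because the match is a plain substring, a pool named something like /mnt/cache2 would also be reported, so each hit is worth checking by eye.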

When I try to mount the drive with mount -o ro /dev/nvme0n1p1 /tempmount, it tells me:

root@indigo:/# mount -o ro /dev/nvme0n1p1 /tempmount/
mount: /tempmount: can't read superblock on /dev/nvme0n1p1.
       dmesg(1) may have more information after failed mount system call.

And then dmesg tells me:

[ 2952.544039] BTRFS info (device nvme0n1p1: state C): last unmount of filesystem 1fa9864a-4ec2-4a12-b8b6-39615023a732
[ 2964.813750] BTRFS: device fsid 1fa9864a-4ec2-4a12-b8b6-39615023a732 devid 1 transid 8774759 /dev/nvme0n1p1 scanned by mount (117261)
[ 2964.814021] BTRFS info (device nvme0n1p1): first mount of filesystem 1fa9864a-4ec2-4a12-b8b6-39615023a732
[ 2964.814031] BTRFS info (device nvme0n1p1): using crc32c (crc32c-intel) checksum algorithm
[ 2964.814034] BTRFS info (device nvme0n1p1): using free space tree
[ 2964.815733] BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 5596, gen 0
[ 2964.834878] BTRFS info (device nvme0n1p1): enabling ssd optimizations
[ 2964.834880] BTRFS info (device nvme0n1p1): auto enabling async discard
[ 2964.834881] BTRFS info (device nvme0n1p1): start tree-log replay
[ 2964.857252] BTRFS error (device nvme0n1p1): incorrect extent count for 1655832576000; counted 5563, expected 5562
[ 2964.857258] BTRFS error (device nvme0n1p1: state A): Transaction aborted (error -5)
[ 2964.857261] BTRFS: error (device nvme0n1p1: state A) in btrfs_recover_log_trees:7174: errno=-5 IO failure
[ 2964.857266] BTRFS: error (device nvme0n1p1: state EA) in btrfs_replay_log:2084: errno=-5 IO failure (Failed to recover log tree)
[ 2964.858164] BTRFS error (device nvme0n1p1: state EA): open_ctree failed

Looking at the btrfs rescue options and having found this thread, I went ahead and ran btrfs rescue zero-log /dev/nvme0n1p1. I stopped my array and then started it again, and the cache drive mounted! Things appear to be good so far.
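
A rough sketch of the check behind that decision, for anyone comparing their own dmesg output. The helper name `log_replay_failed` is hypothetical. It only looks for the "Failed to recover log tree" message quoted above and echoes the order of commands used in this thread; it does not run any of them. The umount step is an assumption, since the thread does not show it.

```shell
#!/bin/bash
# log_replay_failed [FILE]
# Succeeds if the kernel log text (FILE, or stdin when omitted) contains the
# "Failed to recover log tree" message seen in the dmesg output above.
log_replay_failed() {
    grep -q 'Failed to recover log tree' "${1:--}"
}

# Example: inspect the live kernel log. Nothing is modified here; the
# commands are only echoed as a reminder of the order used in this thread.
if dmesg 2>/dev/null | log_replay_failed; then
    echo "Log tree replay failed. Order used in this thread:"
    echo "  1. mount -o rescue=all,ro /dev/nvme0n1p1 /tempmount   (copy data off)"
    echo "  2. umount /tempmount   (assumed step, not shown in the thread)"
    echo "  3. btrfs rescue zero-log /dev/nvme0n1p1"
else
    echo "No log tree replay failure found in dmesg"
fi
```

Clearing the log tree discards whatever was still in it, which is why copying the data off first, as described above, is the safer order.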

For posterity's sake, here's what I get now when I run ls -lah /mnt:

root@indigo:~# ls -lah /mnt
total 16K
drwxr-xr-x 15 root   root  300 Feb 10 15:59 ./
drwxr-xr-x 19 root   root  420 Feb 10 15:59 ../
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 addons/
drwxrwxrwx  1 nobody users  60 Feb  7 04:40 cache/
drwxrwxrwx  6 nobody users  80 Feb 10 14:40 disk1/
drwxrwxrwx  7 nobody users 100 Feb 10 14:40 disk2/
drwxrwxrwx  7 nobody users 108 Feb 10 14:40 disk3/
drwxrwxrwx  5 nobody users  48 Feb 10 16:00 disk4/
drwxrwxrwx  2 nobody users   6 Feb 10 14:40 disk5/
drwxrwxrwx  8 nobody users 135 Feb 10 15:09 disk6/
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 disks/
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 remotes/
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 rootshare/
drwxrwxrwx  1 nobody users  80 Feb 10 16:00 user/
drwxrwxrwx  1 nobody users  80 Feb 10 16:00 user0/

I then re-enabled Docker and started my containers back up. So far, everything seems to be working as it should.

@trurl, thank you for your quick responses. I'd have been pulling my hair out without your help.

February 10, 2025

Community Expert

You probably still have some cleanup to do:

5 hours ago, trurl said:

Since Docker and VM Manager were enabled, their related shares (appdata, domains, system) got created on the array.

Post new diagnostics

February 10, 2025

Author

New diagnostics, as requested!

indigo-diagnostics-20250210-1842.zip

February 11, 2025

Community Expert

Feb 10 18:00:18 indigo kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 33689159 off 2238828544 csum 0x604860f1 expected csum 0xda8efc4a mirror 1

This is usually bad RAM.

You must not even attempt to run any computer unless RAM is working perfectly. Everything goes through RAM. The OS and other executable code, your data. Everything. The CPU can't do anything with anything until it is loaded into RAM.
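
To see how often this kind of message shows up, a small counting helper can be pointed at a syslog file. This is a sketch under assumptions: `count_csum_failures` is a made-up name, and /var/log/syslog is assumed to be where the kernel messages end up.

```shell
#!/bin/bash
# count_csum_failures [FILE]
# Prints how many lines in FILE (default /var/log/syslog) report a btrfs
# checksum mismatch like the one quoted above. Prints 0 if FILE is unreadable.
count_csum_failures() {
    local log="${1:-/var/log/syslog}"
    [ -r "$log" ] || { echo 0; return 0; }
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so "|| true" keeps the function's status clean.
    grep -c 'BTRFS.*csum failed' "$log" || true
}

count_csum_failures
```

A count that keeps growing after a reboot suggests the problem is ongoing rather than left over from an earlier crash. `btrfs device stats` on the mounted pool reports the filesystem's own persistent error counters as a second data point.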

February 11, 2025

Author

Oof. Yeah, okay, that's a big deal.

Aside from, e.g., making sure RAM is seated correctly, is there any diagnostic or other thing worth doing before I just go get new RAM? Or is it not worth it?

February 19, 2025

Author

Okay, finally got some new RAM. I've been keeping the server off since the last post.

Booted up the server, had a bit of a heck of a time because apparently my SSL certs got revoked or expired somewhere in there. Finally got it up and going to the point where I could grab new diagnostics, then shut the machine back down.

I've attached the new diagnostics. Mind taking a look and letting me know if things look good?

indigo-diagnostics-20250219-1505.zip

February 19, 2025

Community Expert

memtest is on the boot menu

February 19, 2025

Author

Ah, very cool to have it built in. I don't think I noticed that before.

memtest was a pass! Anything else I should look at, or do you think I'm good to go?

Thank you for all of your help through this.

indigo-diagnostics-20250219-1848.zip

February 20, 2025

Community Expert

Your appdata, domains, system shares have files on the array.

Ideally, these would all be on cache or another pool, so Docker/VM will perform better and so array disks can spin down (these files are always open).
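
A quick way to see which array disks currently hold files for those shares, sketched under the assumption that each share is a top-level folder on /mnt/diskN. The helper name `disks_with_share` and its optional second argument are invented for this example.

```shell
#!/bin/bash
# disks_with_share SHARE [MNT_ROOT]
# Prints each array disk directory (MNT_ROOT/diskN) that contains a
# non-empty top-level folder named SHARE. MNT_ROOT defaults to /mnt.
disks_with_share() {
    local share="$1" root="${2:-/mnt}" d
    for d in "$root"/disk[0-9]*; do
        # Skip disks without the folder, and folders that are empty.
        [ -d "$d/$share" ] || continue
        [ -n "$(ls -A "$d/$share" 2>/dev/null)" ] && echo "$d"
    done
    return 0
}

for s in appdata domains system; do
    echo "== $s =="
    disks_with_share "$s"
done
```

An empty list under a share name means no array disk holds files for it. The glob `disk[0-9]*` deliberately skips /mnt/disks, which is not an array disk.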

February 20, 2025

Author

Ah, yeah, that one I’m aware of. I was in the middle of fixing that when the rest of this started happening. I didn’t want to start moving anything around while troubleshooting in case there was a chance something might get corrupted or stuck in a bad state. It’s on my to-fix list once I’m clear though.


Solved by trurl
