February 10, 2025

Hey all,

The Fix Common Problems plugin gave me the error that "Rootfs file is getting full", so, as directed in this post, I am posting in General Support asking for help.

I took a look in various suspicious places (syslog, docker.log, nginx logs) but couldn't find anything obvious that stuck out to me. I also took a look around the forums, but none of the previous topics that came up seem to match my case. Any help y'all could give me is greatly appreciated!

I've attached my diagnostics, and here is the output of Squid's memorystorage script:

plugin: installing: memorystorage.plg
Executing hook script: pre_plugin_checks
plugin: downloading: memorystorage.plg ... done
Executing hook script: pre_plugin_checks

This script may take a few minutes to run, especially if you are manually mounting a remote share outside of /mnt/disks or /mnt/remotes

/usr/bin/du --exclude=/mnt/user --exclude=/mnt/user0 --exclude=/mnt/disks --exclude=/proc --exclude=/sys --exclude=/var/lib/docker --exclude=/boot --exclude=/mnt -h -d2 / 2>/dev/null | grep -v 0$' '
4.0K /root/.config
28K /root
2.3M /var/lib
960K /var/log
16K /var/spool
36K /var/tmp
16K /var/named
4.0M /var/cache
28K /var/state
4.0K /var/kerberos
112M /var/local
23M /var/sa
141M /var
12M /bin
116K /etc/X11
60K /etc/profile.d
8.0K /etc/acpi
80K /etc/default
176K /etc/bash_completion.d
4.0K /etc/sasl2
8.0K /etc/dbus-1
4.0K /etc/cron.weekly
4.0K /etc/cron.d
20K /etc/cron.daily
4.0K /etc/cron.hourly
4.0K /etc/cron.monthly
12K /etc/elogind
112K /etc/pam.d
24K /etc/modprobe.d
9.4M /etc/udev
4.0K /etc/sensors.d
48K /etc/logrotate.d
92K /etc/mc
48K /etc/mcelog
4.0K /etc/nvme
576K /etc/ssh
260K /etc/ssl
44K /etc/security
12K /etc/sysstat
44K /etc/apcupsd
24K /etc/avahi
2.7M /etc/file
40K /etc/nginx
24K /etc/php-fpm.d
8.0K /etc/php.d
20K /etc/samba
8.0K /etc/ssmtp
8.0K /etc/libnl
152K /etc/lvm
4.0K /etc/pkcs11
8.0K /etc/OpenCL
40K /etc/zfs
12K /etc/gtk-3.0
8.0K /etc/fonts
340K /etc/libvirt-
360K /etc/libvirt
8.0K /etc/rsyslog.d
17M /etc
25K /lib/dhcpcd
146M /lib/firmware
18K /lib/modprobe.d
27M /lib/modules
512 /lib/systemd
8.5M /lib/udev
186M /lib
4.4M /lib64/security
1.8M /lib64/elogind
12K /lib64/pkgconfig
47M /lib64
8.0K /run/blkid
244K /run/udev
4.0K /run/hook-state
4.0K /run/dbus
12K /run/elogind
4.0K /run/avahi-daemon
1.3M /run/docker
60K /run/libvirt
1.8M /run
24M /sbin
595M /usr/bin
29M /usr/lib
815M /usr/lib64
90M /usr/libexec
118M /usr/local
50M /usr/sbin
278M /usr/share
7.2M /usr/src
928K /usr/doc
300K /usr/include
20K /usr/info
1.0M /usr/man
2.0G /usr
20K /tmp/emhttp
748K /tmp/plugins
368K /tmp/notifications
4.0K /tmp/unraid.patch
136K /tmp/appdata.backup
13M /tmp/community.applications
16K /tmp/unassigned.devices
12K /tmp/usb_manager
18M /tmp/fix.common.problems
32K /tmp/user.scripts
56K /tmp/gui.search
18M /tmp/CA_logs
4.0K /tmp/ca_notices
20K /tmp/tailscale
4.0K /tmp/unraidcheck
50M /tmp
2.4G /
0 /mnt/addons
0 /mnt/rootshare
0 /mnt

Finished.

NOTE: If there is any subdirectory from /mnt appearing in this list, then that means that you have (most likely) a docker app which is directly referencing a non-existant disk or cache pool

script: memorystorage.plg executed
Executing hook script: post_plugin_checks

I'll mention this in case it's relevant: I recently restarted my server and found that all my docker containers had disappeared, so I had to restore them (thank goodness for CA "Previous Apps" and the AppData Backup plugin). I also temporarily forgot about the "restore" functionality in the AppData Backup plugin, so I did a bit of manual appdata restoring before wiping that out and redoing it with the plugin. I still have a couple of docker containers that won't start and that I need to troubleshoot, but nothing that's, like, restarting over and over again (to my knowledge).

indigo-diagnostics-20250210-0904.zip

Edited February 10, 2025 by dispatchrabbi
Added the note about having looked on the forums already.
February 10, 2025
Community Expert

The output of the script looks OK, so it must be missing something. What do you get from the command line with this?

ls -lah /mnt
February 10, 2025
Author

Thanks for taking a look. Here's what I get:

root@indigo:~# ls -lah /mnt
total 0
drwxr-xr-x 15 root   root  300 Feb  7 21:39 ./
drwxr-xr-x 20 root   root  440 Feb 10 09:04 ../
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 addons/
drwxr-xr-x  4 root   root   80 Feb 10 09:12 cache/
drwxrwxrwx  6 nobody users  80 Feb 10 09:12 disk1/
drwxrwxrwx  7 nobody users 100 Feb 10 09:12 disk2/
drwxrwxrwx  7 nobody users 108 Feb 10 09:12 disk3/
drwxrwxrwx  6 nobody users  62 Feb 10 10:00 disk4/
drwxrwxrwx  2 nobody users   6 Feb 10 09:12 disk5/
drwxrwxrwx  7 nobody users 111 Feb 10 09:12 disk6/
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 disks/
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 remotes/
drwxrwxrwt  2 nobody users  40 Feb  7 19:56 rootshare/
drwxrwxrwx  1 nobody users  80 Feb 10 10:00 user/
drwxrwxrwx  1 nobody users  80 Feb 10 10:00 user0/
February 10, 2025
Community Expert
Solution

That's not correct for cache. It looks like cache was unmountable when you booted (and still is).

Since Docker and VM Manager were enabled, their related shares (appdata, domains, system) got created on the array. Then some container, or something else, probably specified a host path of /mnt/cache when the pool wasn't actually mounted, so the path got created in rootfs instead.

Disable Docker and VM Manager in Settings, and leave them disabled until you get cache fixed.
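For anyone who wants to confirm this kind of situation from the command line, one option is to ask which filesystem a path actually sits on. The following is a minimal sketch, assuming a POSIX `df` and `awk`; the helper names `mount_of`, `is_mounted`, and `check_pool` are made up for illustration and are not part of Unraid.

```shell
#!/bin/sh
# Print the mount point of the filesystem that holds the given path.
# Assumption: the mount point contains no spaces (true for /mnt/cache, /mnt/diskN).
mount_of() {
    df -P "$1" | awk 'NR==2 {print $NF}'
}

# Succeed only if the path is itself a mount point, not a plain directory.
# Pass the path without a trailing slash.
is_mounted() {
    [ "$(mount_of "$1")" = "$1" ]
}

# Report a path that exists as a directory but has nothing mounted on it.
check_pool() {
    if [ -d "$1" ] && ! is_mounted "$1"; then
        echo "$1 exists but is not a mount point"
    fi
}
```

Running `check_pool /mnt/cache` on a healthy system should print nothing. If it prints a message, the directory was created by something other than the pool being mounted, which matches the case described above.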
February 10, 2025
Author

Thanks for the pointer! From there and some other places, I think I am back on track. In case this helps someone in the future, here's what I did from here:

- Discovered which docker container was referencing cache directly and changed it to point at the path under /mnt/user instead.
- Disabled Docker and VM Manager (I don't use any VMs, so that one will just stay off indefinitely).
- Stopped the array, then started it. The Main page shows the cache disk as "Unmountable: wrong or no file system".
- Looked around for others who had had this problem. Found this thread and post.
- The cache drive will mount with mount -o rescue=all,ro /dev/nvme0n1p1 /tempmount but will not mount with mount -o ro /dev/nvme0n1p1 /tempmount.
- I mounted the cache drive and copied the data off of it, just in case.

When I try to mount the drive with mount -o ro /dev/nvme0n1p1 /tempmount, it tells me:

root@indigo:/# mount -o ro /dev/nvme0n1p1 /tempmount/
mount: /tempmount: can't read superblock on /dev/nvme0n1p1.
       dmesg(1) may have more information after failed mount system call.
And then dmesg tells me:

[ 2952.544039] BTRFS info (device nvme0n1p1: state C): last unmount of filesystem 1fa9864a-4ec2-4a12-b8b6-39615023a732
[ 2964.813750] BTRFS: device fsid 1fa9864a-4ec2-4a12-b8b6-39615023a732 devid 1 transid 8774759 /dev/nvme0n1p1 scanned by mount (117261)
[ 2964.814021] BTRFS info (device nvme0n1p1): first mount of filesystem 1fa9864a-4ec2-4a12-b8b6-39615023a732
[ 2964.814031] BTRFS info (device nvme0n1p1): using crc32c (crc32c-intel) checksum algorithm
[ 2964.814034] BTRFS info (device nvme0n1p1): using free space tree
[ 2964.815733] BTRFS info (device nvme0n1p1): bdev /dev/nvme0n1p1 errs: wr 0, rd 0, flush 0, corrupt 5596, gen 0
[ 2964.834878] BTRFS info (device nvme0n1p1): enabling ssd optimizations
[ 2964.834880] BTRFS info (device nvme0n1p1): auto enabling async discard
[ 2964.834881] BTRFS info (device nvme0n1p1): start tree-log replay
[ 2964.857252] BTRFS error (device nvme0n1p1): incorrect extent count for 1655832576000; counted 5563, expected 5562
[ 2964.857258] BTRFS error (device nvme0n1p1: state A): Transaction aborted (error -5)
[ 2964.857261] BTRFS: error (device nvme0n1p1: state A) in btrfs_recover_log_trees:7174: errno=-5 IO failure
[ 2964.857266] BTRFS: error (device nvme0n1p1: state EA) in btrfs_replay_log:2084: errno=-5 IO failure (Failed to recover log tree)
[ 2964.858164] BTRFS error (device nvme0n1p1: state EA): open_ctree failed

Looking at the btrfs rescue options and having found this thread, I went ahead and ran btrfs rescue zero-log /dev/nvme0n1p1. I stopped my array and then started it again, and the cache drive mounted! Things appear to be good so far.
For posterity's sake, here's what I get now when I run ls -lah /mnt:

root@indigo:~# ls -lah /mnt
total 16K
drwxr-xr-x 15 root   root  300 Feb 10 15:59 ./
drwxr-xr-x 19 root   root  420 Feb 10 15:59 ../
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 addons/
drwxrwxrwx  1 nobody users  60 Feb  7 04:40 cache/
drwxrwxrwx  6 nobody users  80 Feb 10 14:40 disk1/
drwxrwxrwx  7 nobody users 100 Feb 10 14:40 disk2/
drwxrwxrwx  7 nobody users 108 Feb 10 14:40 disk3/
drwxrwxrwx  5 nobody users  48 Feb 10 16:00 disk4/
drwxrwxrwx  2 nobody users   6 Feb 10 14:40 disk5/
drwxrwxrwx  8 nobody users 135 Feb 10 15:09 disk6/
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 disks/
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 remotes/
drwxrwxrwt  2 nobody users  40 Feb 10 15:58 rootshare/
drwxrwxrwx  1 nobody users  80 Feb 10 16:00 user/
drwxrwxrwx  1 nobody users  80 Feb 10 16:00 user0/

I then re-enabled Docker and started my containers back up. So far, everything seems to be working as it should.

@trurl, thank you for your quick responses. I'd have been pulling my hair out without your help.
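The recovery sequence described in this post can be summarized as a short dry-run sketch. It only prints the commands in the order they were used rather than running them, so nothing touches a disk until you copy a line out by hand. The function name `recovery_plan` and the backup destination are hypothetical; the post does not say where the data was copied to or which copy tool was used, so `cp -a` is an assumption.

```shell
#!/bin/sh
# Dry run: print the recovery commands in order instead of executing them.
# Arguments: DEVICE, temporary MOUNTPOINT, BACKUP directory (hypothetical).
recovery_plan() {
    dev=$1
    mnt=$2
    backup=$3
    echo "mkdir -p $mnt"
    # Read-only rescue mount, which worked when a plain read-only mount did not.
    echo "mount -o rescue=all,ro $dev $mnt"
    # Copy the data off before changing anything on the device (cp -a is assumed).
    echo "cp -a $mnt/. $backup/"
    echo "umount $mnt"
    # Clears the log tree that failed to replay; writes held only in that log are lost.
    echo "btrfs rescue zero-log $dev"
}
```

For example, `recovery_plan /dev/nvme0n1p1 /tempmount /mnt/disk1/cache_backup` prints the five commands for the device in this thread. Keeping the backup step ahead of `zero-log` is the important part, since `zero-log` modifies the filesystem.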
February 10, 2025
Community Expert

You probably still have some cleanup to do:

5 hours ago, trurl said:
    Since Docker and VM Manager were enabled, their related shares (appdata, domains, system) got created on the array.

Post new diagnostics.
February 11, 2025
Community Expert

Feb 10 18:00:18 indigo kernel: BTRFS warning (device nvme0n1p1): csum failed root 5 ino 33689159 off 2238828544 csum 0x604860f1 expected csum 0xda8efc4a mirror 1

This is usually bad RAM.

You must not even attempt to run any computer unless RAM is working perfectly. Everything goes through RAM. The OS and other executable code, your data. Everything. The CPU can't do anything with anything until it is loaded into RAM.
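To see how widespread warnings like the one quoted above are, the syslog can be summarized per device. This is a rough sketch that assumes the kernel message format shown in this post; the function name `csum_failures` is invented for illustration.

```shell
#!/bin/sh
# Read a syslog on stdin and print "<device> <count>" for every device
# that logged a Btrfs "csum failed" warning.
csum_failures() {
    # Pull out the device name, ignoring an optional ": state ..." suffix.
    sed -n 's/.*BTRFS warning (device \([^):]*\)[^)]*): csum failed.*/\1/p' |
        sort | uniq -c | awk '{print $2, $1}'
}
```

Typical usage would be `csum_failures < /var/log/syslog`. A count that keeps growing after the filesystem has been repaired is a hint that the underlying cause is still present.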
February 11, 2025
Author

Oof. Yeah, okay, that's a big deal. Aside from, e.g., making sure RAM is seated correctly, is there any diagnostic or other thing worth doing before I just go get new RAM? Or is it not worth it?

February 19, 2025
Author

Okay, finally got some new RAM. I've been keeping the server off since the last post. Booted up the server, had a bit of a heck of a time because apparently my SSL certs got revoked or expired somewhere in there. Finally got it up and going to the point where I could grab new diagnostics, then shut the machine back down.

I've attached the new diagnostics. Mind taking a look and letting me know if things look good?

indigo-diagnostics-20250219-1505.zip

February 19, 2025
Author

Ah, very cool to have it built in. I don't think I noticed that before.

memtest was a pass! Anything else I should look at, or do you think I'm good to go? Thank you for all of your help through this.

indigo-diagnostics-20250219-1848.zip
February 20, 2025
Community Expert

Your appdata, domains, and system shares have files on the array. Ideally, these would all be on cache or another pool so that Docker/VM will perform better, and so that array disks can spin down, since these files are always open.
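A quick way to see which array disks hold copies of those shares is to look for the share folders directly under each disk mount. This is a minimal sketch, assuming array disks are mounted as disk1, disk2, and so on under a common root (as in the /mnt listing earlier in the thread); the function name `shares_on_array` is hypothetical.

```shell
#!/bin/sh
# Usage: shares_on_array ROOT SHARE...
# Print every ROOT/diskN/SHARE directory that exists.
shares_on_array() {
    root=$1
    shift
    for share in "$@"; do
        for dir in "$root"/disk[0-9]*/"$share"; do
            # An unmatched glob stays literal and fails the -d test, so it is skipped.
            [ -d "$dir" ] && echo "$dir"
        done
    done
    return 0
}
```

Running `shares_on_array /mnt appdata domains system` lists the per-disk folders that would need to move to the pool. It only reports locations and does not move anything.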
February 20, 2025
Author

Ah, yeah, that one I’m aware of. I was in the middle of fixing that when the rest of this started happening. I didn’t want to start moving anything around while troubleshooting in case there was a chance something might get corrupted or stuck in a bad state. It’s on my to-fix list once I’m clear though.