thatja Posted May 17 Author
Managed to get this off the flash drive in /boot/logs: syslog

thatja Posted May 17 Author
Crashed 10 minutes after a reboot this time: syslog

thatja Posted May 17 Author
plexified-diagnostics-20240517-1052.zip
/mnt inaccessible again.
Edited May 17 by thatja
JorgeB Posted May 17
I still don't see anything relevant logged. Did you try without mergerfs?

thatja Posted May 17 Author
Just now, JorgeB said: "I still don't see anything relevant logged. Did you try without mergerfs?"
I have been using the system all morning without mergerfs and it was working fine; 10 minutes after I mounted mergerfs I got a crash. I am highly suspecting either failing USB or mergerfs to be the cause. That is worrying, as my setup depends on mergerfs, and I'm not sure what's changed, because for 6 months it has been solid.
Rysz Posted May 17
Something is taking down your system at night:
May 17 01:00:35 Plexified emhttpd: unclean shutdown detected
Sorry, but you really need to start listening to us and re-trace your steps on what you have changed/updated recently... especially regarding your plugins. There are a ton of additional (and a few of them of quite invasive nature) plugins installed on your server... any of which could cause this issue. The fact that it isn't even able to generate a non-empty diagnostics package further underlines the fact that there's something seriously wrong with your server at the moment.
Again, you need to start listening to the advice given: try disabling your plugins one by one and see if and when your server starts working again. You needing some plugins for your daily business doesn't change the fact that it's impossible to diagnose the problem without disabling some plugins at least temporarily.
You also, as already pointed out by @JorgeB, need to set up the syslog server to see what is happening before the crash and not just afterwards. So far we've only seen the logs after the system reboots, not from before, which would likely show the problem.
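Once a syslog that survives reboots is available, one way to see what was logged just before a crash is to print the lines that precede each "unclean shutdown detected" marker. The following is only a minimal sketch: the helper name lines_before_crash is hypothetical, and the example path /boot/logs/syslog is an assumption taken from where the log was recovered earlier in this thread.

```shell
#!/bin/bash
# lines_before_crash FILE [N]
# Hypothetical helper: print each "unclean shutdown detected" line from FILE
# together with the N lines logged immediately before it (default 50).
lines_before_crash() {
    local file="$1" n="${2:-50}"
    # grep -B N prints N lines of leading context before every match
    grep -B "$n" -- 'unclean shutdown detected' "$file"
}

# Example call (path is an assumption, adjust to wherever the log is archived):
#   lines_before_crash /boot/logs/syslog 100
```

If the marker never appears, grep prints nothing and returns a non-zero status, which by itself says the archived log does not cover a reboot after a crash.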
JorgeB Posted May 17
1 minute ago, thatja said: "I am highly suspecting either failing USB"
A failing USB drive usually leaves traces in the syslog, so I don't think that is the problem, but you can try a different one.
thatja Posted May 17 Author
11 minutes ago, Rysz said: "Something is taking down your system at night: May 17 01:00:35 Plexified emhttpd: unclean shutdown detected Sorry, but you really need to start listening to us and re-trace your steps on what you have changed/updated recently... especially regarding your plugins. There are a ton of additional (and a few of them of quite invasive nature) plugins installed on your server... any of which could cause this issue. The fact that it isn't even able to generate a non-empty diagnostics package further underlines the fact that there's something seriously wrong with your server at the moment. Again, you need to start listening to the advice given: try disabling your plugins one by one and see if and when your server starts working again. You needing some plugins for your daily business doesn't change the fact that it's impossible to diagnose the problem without disabling some plugins at least temporarily. You also, as already pointed out by @JorgeB, need to set up the syslog server to see what is happening before the crash and not just afterwards. So far we've only seen the logs after the system reboots, not from before, which would likely show the problem."
This is where I am stuck. The unclean shutdown was because I could not get into SSH or the UI; that was the first crash at 1 AM.
Secondly, the last two syslogs provided above were captured before I restarted the server (after /mnt became inaccessible). They came from /boot/logs, as I did enable the syslog server.
I have had the server running with nothing else at all (no Docker containers except Plex, no mergerfs) and it was fine. As soon as I mounted my mounts, I got another crash.
Could you please elaborate on this? "(and a few of them of quite invasive nature)"
Edited May 17 by thatja
Rysz Posted May 17
4 minutes ago, thatja said: "This is where I am stuck. The unclean shutdown was because I could not get into SSH or the UI; that was the first crash at 1 AM. Secondly, the last two syslogs provided above were captured before I restarted the server (after /mnt became inaccessible). They came from /boot/logs, as I did enable the syslog server. I have had the server running with nothing else at all (no Docker containers except Plex, no mergerfs) and it was fine. As soon as I mounted my mounts, I got another crash."
OK, and where are the logs from what happened before 1 AM? Because the server seems to have crashed and rebooted at 1 AM, we need to know what happened before. There's no indication in the logs that mergerFS isn't operating as it should. The opposite, actually: the fact that it is doing garbage collection until the very end of your logs shows it's still running. 🤔

thatja Posted May 17 Author
3 minutes ago, Rysz said: "OK, and where are the logs from what happened before 1 AM? Because the server seems to have crashed and rebooted at 1 AM, we need to know what happened before. There's no indication in the logs that mergerFS isn't operating as it should. The opposite, actually: the fact that it is doing garbage collection until the very end of your logs shows it's still running. 🤔"
The unclean shutdown was because power was pulled from the server. This wasn't a crash related to UNRAID but a power outage on my end; sorry for the confusion regarding that.
The crashes today caused by UNRAID or something else occurred at around 10:30 AM and 10:50 AM. Those are the events the syslogs above cover, before and after.
Also, nothing at all has changed between when things were working well and the first ever crash relating to this. All I've done is update plugins and Docker containers where updates were available. I had 6 months without issue until the first crash happened, at the time this thread was created.
Edited May 17 by thatja
Rysz Posted May 17
10 minutes ago, thatja said: "The unclean shutdown was because power was pulled from the server. This wasn't a crash related to UNRAID but a power outage on my end; sorry for the confusion regarding that. The crashes today caused by UNRAID or something else occurred at around 10:30 AM and 10:50 AM. Those are the events the syslogs above cover, before and after."
Well, there's nothing in the logs to indicate a failure of any kind around those times, related to mergerFS or not. But the fact that it fails to even generate a diagnostics package makes me think that the rootfs ramdisk (at /) is either full (with some plugin writing to it non-stop and filling it up), not accessible, or otherwise broken somehow. It isn't even able to write the syslog or any other files into the diagnostics package, which leads me back to my earlier belief that it has something to do with the RAM.
How much RAM do you have on your server? How did you shut down your server after it crashed? There's nothing in the logs anymore after your last SSH login to the crashed server.
Edited May 17 by Rysz
thatja Posted May 17 Author
7 minutes ago, Rysz said: "Well, there's nothing in the logs to indicate a failure of any kind around those times, related to mergerFS or not. But the fact that it fails to even generate a diagnostics package makes me think that the rootfs ramdisk (at /) is either full (with some plugin writing to it non-stop and filling it up), not accessible, or otherwise broken somehow. It isn't even able to write the syslog or any other files into the diagnostics package, which leads me back to my earlier belief that it has something to do with the RAM. How much RAM do you have on your server? How did you shut down your server after it crashed? There's nothing in the logs anymore after your last SSH login to the crashed server."
How would I find out about the rootfs ramdisk being full? Or likewise, whether a plugin is writing to it?
I have 96 GB of RAM in the server. I restarted the system via reboot over SSH, using an app on my phone called Termius; only the web UI SSH isn't responsive.
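As a rough illustration of how the "is the rootfs ramdisk full?" question could be checked from a shell, the sketch below reports how full the filesystem holding / is and lists the largest top-level directories on that filesystem only. Both function names are hypothetical. One caveat worth stating plainly: du still has to stat every entry it meets, so if a mount point that sits directly on the root filesystem is hung, biggest_dirs may block as well; it is probably best run while the system is still responsive.

```shell
#!/bin/bash
# rootfs_use_pct (hypothetical helper)
# Print how full the filesystem holding / is, as an integer percentage.
rootfs_use_pct() {
    # -P forces one line per filesystem, so column 5 is always the use percentage
    df -P / | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# biggest_dirs [N] (hypothetical helper)
# List the N largest directories (in KiB) directly under /, root filesystem only.
biggest_dirs() {
    local n="${1:-10}"
    # -x keeps du on the root filesystem, so other mounted filesystems are not descended into
    du -x -k -d 1 / 2>/dev/null | sort -n | tail -n "$n"
}

# Example:
#   echo "rootfs is $(rootfs_use_pct)% full"
#   biggest_dirs 5
```

Running biggest_dirs a few minutes apart and comparing the numbers would show whether something keeps writing to the ramdisk.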
Rysz Posted May 17
15 minutes ago, thatja said: "How would I find out about the rootfs ramdisk being full? Or likewise, whether a plugin is writing to it? I have 96 GB of RAM in the server. I restarted the system via reboot over SSH, using an app on my phone called Termius; only the web UI SSH isn't responsive."
OK, that's very interesting, because if you restarted via the reboot command it should show more in the syslogs. It should show it shutting down services, the array, etc... but there's nothing after your last SSH login, which again makes me think that the ramdisk is full or otherwise unwritable at that point.
The next time it gets stuck, don't instantly reboot, but SSH into it first and run the following commands:
df -h
cat /etc/mtab
ls -la /mnt
Please post the output of those commands here before rebooting your server. Feel free to enable mergerFS again and wait for it to get stuck again, just so we can be sure. 🙂
Also... where did you put the mergerFS mount commands, and how are you running them?
Edited May 17 by Rysz
thatja Posted May 17 Author
1 minute ago, Rysz said: "OK, that's very interesting, because if you restarted via the reboot command it should show more in the syslogs. It should show it shutting down services, the array, etc... but there's nothing after your last SSH login, which again makes me think that the ramdisk is full. The next time it gets stuck, don't instantly reboot, but SSH into it first and run the following command: df -h Please post the output of that command here before rebooting your server. Feel free to enable mergerFS again and wait for it to get stuck again, just so we can be sure. 🙂 Also - where did you put the mergerFS mount commands, and how are you running them?"
Okay, I will do that.
As for mergerfs: when I boot up my server, I have a bash script that I created that mounts my rclone, mergerfs and autoscan. I run this file around a minute after I start my array. Here's the script:

    #!/bin/bash

    # Start a screen session named "files"
    screen -dmS files

    # Attach to the "files" screen session and execute the first command
    screen -S google -X stuff $'rclone mount --config=/mnt/nvme/plexified/mounts/rclone/rclone.conf --allow-other --no-traverse --vfs-cache-mode full --cache-dir /mnt/nvmedl/plexified/mounts/googlecache/ --vfs-cache-max-size 250G --dir-cache-time 96h --vfs-fast-fingerprint --vfs-refresh --drive-impersonate [email protected] googledecrypted: /mnt/nvmedl/plexified/mounts/google/\n'

    # Wait for the command to start
    sleep 2

    # Execute the mergerfs commands
    mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0000/:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0001:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0002:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0003:/mnt/nvmedl/plexified/mounts/google/MoviesSrc/0004/ /mnt/nvmedl/plexified/mounts/moviesrc/Movies/

    sleep 2 # Wait for 2 seconds before running the next mergerfs command
    mergerfs -o defaults,allow_other,use_ino,category.create=ff,fsname=mergerFS /mnt/user/plexdata/:/mnt/nvmedl/plexified/mounts/moviesrc=NC:/mnt/nvmedl/plexified/mounts/google/Data=NC /mnt/nvmedl/plexified/mounts/secret/

    # Wait for 30 seconds before starting autoscan
    sleep 30

    # Start a screen session named "autoscan", change to the correct directory, and then run the autoscan command
    screen -dmS autoscan
    screen -S autoscan -X stuff $'cd /mnt/nvme/plexified/services/autoscan\n'
    screen -S autoscan -X stuff $'./autoscan_v1.4.0_linux_amd64\n'

Then I start my docker containers.
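The script above relies on a fixed "sleep 2" between starting the rclone mount and layering mergerfs on top of it. Purely as an illustration, and not as a diagnosis of the crashes in this thread, a startup script could instead wait until the rclone mount point actually shows up in the kernel's mount table. The helper names is_mounted and wait_for_mount are hypothetical, and reading /proc/mounts by default is an assumption about the environment.

```shell
#!/bin/bash
# is_mounted PATH [MOUNTS_FILE] (hypothetical helper)
# Succeed if PATH is listed as a mount point in MOUNTS_FILE (default /proc/mounts).
is_mounted() {
    local path="${1%/}" mounts="${2:-/proc/mounts}"
    # "${1%/}" strips one trailing slash; restore "/" if that emptied the path
    [ -n "$path" ] || path="/"
    # Field 2 of each mounts line is the mount point
    awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' "$mounts"
}

# wait_for_mount PATH [SECONDS] [MOUNTS_FILE] (hypothetical helper)
# Poll once per second until PATH is mounted; fail after SECONDS (default 30).
wait_for_mount() {
    local path="$1" limit="${2:-30}" mounts="${3:-/proc/mounts}" waited=0
    until is_mounted "$path" "$mounts"; do
        [ "$waited" -ge "$limit" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Example, placed before the first mergerfs line:
#   wait_for_mount /mnt/nvmedl/plexified/mounts/google/ 60 || exit 1
```

The design choice here is simply to fail loudly when the lower mount never appears, rather than letting mergerfs be mounted over an empty directory.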
Rysz Posted May 17
OK, I updated my previous post with two more commands to run when it gets stuck. That should hopefully narrow down the problem.

AgentXXL Posted May 17
@thatja I've been using the mergerfs plugin for a few months now and have seen no issues similar to yours. Looking through the syslogs you've managed to capture, there is nothing I can see that indicates a mergerfs problem. I suspect a RAM issue. I would suggest shutting down and running a RAM test using Memtest86, for at least 24 to 36 hours, since your crashes appear to happen within that time frame.
Also, just to confirm: you do have the syslog server (Settings --> Syslog Server) set to archive the syslog to a share/folder? Your syslogs don't seem to be retaining anything prior to the reboots/crashes, so they're a little less useful.
Edited May 17 by AgentXXL
thatja Posted May 31 Author
It has just happened again after almost 12 days of uptime.

thatja Posted May 31 Author
On 5/17/2024 at 11:48 AM, Rysz said: "OK, that's very interesting, because if you restarted via the reboot command it should show more in the syslogs. It should show it shutting down services, the array, etc... but there's nothing after your last SSH login, which again makes me think that the ramdisk is full or otherwise unwritable at that point. The next time it gets stuck, don't instantly reboot, but SSH into it first and run the following commands: df -h, cat /etc/mtab and ls -la /mnt. Please post the output of those commands here before rebooting your server. Feel free to enable mergerFS again and wait for it to get stuck again, just so we can be sure. 🙂 Also... where did you put the mergerFS mount commands, and how are you running them?"
Okay, so I've just tried running the first one:
df -h
And my SSH window is hanging at the moment. It has been for around 3 minutes now.
thatja Posted May 31 Author
root@Plexified:~# cat /etc/mtab
rootfs / rootfs rw,size=49452720k,nr_inodes=12363180,inode64 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev,noexec,relatime,size=32768k,mode=755,inode64 0 0
/dev/sda1 /boot vfat rw,noatime,nodiratime,fmask=0177,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,flush,errors=remount-ro 0 0
/dev/loop0 /lib squashfs ro,relatime,errors=continue 0 0
overlay /lib overlay rw,relatime,lowerdir=/lib,upperdir=/var/local/overlay/lib,workdir=/var/local/overlay-work/lib 0 0
/dev/loop1 /usr squashfs ro,relatime,errors=continue 0 0
overlay /usr overlay rw,relatime,lowerdir=/usr,upperdir=/var/local/overlay/usr,workdir=/var/local/overlay-work/usr 0 0
devtmpfs /dev devtmpfs rw,relatime,size=8192k,nr_inodes=12363180,mode=755,inode64 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,relatime,inode64 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
hugetlbfs /hugetlbfs hugetlbfs rw,relatime,pagesize=2M 0 0
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
tmpfs /var/log tmpfs rw,relatime,size=131072k,mode=755,inode64 0 0
rootfs /mnt rootfs rw,size=49452720k,nr_inodes=12363180,inode64 0 0
tmpfs /mnt/disks tmpfs rw,relatime,size=1024k,inode64 0 0
tmpfs /mnt/remotes tmpfs rw,relatime,size=1024k,inode64 0 0
tmpfs /mnt/addons tmpfs rw,relatime,size=1024k,inode64 0 0
tmpfs /mnt/rootshare tmpfs rw,relatime,size=1024k,inode64 0 0
/dev/md1p1 /mnt/disk1 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md2p1 /mnt/disk2 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md3p1 /mnt/disk3 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md4p1 /mnt/disk4 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md5p1 /mnt/disk5 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md6p1 /mnt/disk6 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md7p1 /mnt/disk7 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/nvme0n1p1 /mnt/nvme xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/nvme1n1p1 /mnt/nvmedl btrfs rw,noatime,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ 0 0
shfs /mnt/user0 fuse.shfs rw,nosuid,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
shfs /mnt/user fuse.shfs rw,nosuid,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
/dev/loop2 /var/lib/docker btrfs rw,noatime,ssd,space_cache=v2,subvolid=5,subvol=/ 0 0
/dev/loop2 /var/lib/docker/btrfs btrfs rw,noatime,ssd,space_cache=v2,subvolid=5,subvol=/ 0 0
nsfs /run/docker/netns/fdcd16e64b60 nsfs rw 0 0
nsfs /run/docker/netns/default nsfs rw 0 0
/dev/loop3 /etc/libvirt btrfs rw,noatime,ssd,space_cache=v2,subvolid=5,subvol=/ 0 0
googledecrypted: /mnt/nvmedl/plexified/mounts/google fuse.rclone rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
mergerFS /mnt/nvmedl/plexified/mounts/moviesrc/Movies fuse.mergerfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
mergerFS /mnt/nvmedl/plexified/mounts/secret fuse.mergerfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
nsfs /run/docker/netns/93f74b6ab06f nsfs rw 0 0
nsfs /run/docker/netns/5b147fc09a9a nsfs rw 0 0
nsfs /run/docker/netns/21aa0e5b657b nsfs rw 0 0
nsfs /run/docker/netns/efbd08065b39 nsfs rw 0 0
nsfs /run/docker/netns/d628991141dd nsfs rw 0 0
nsfs /run/docker/netns/021970ea8a50 nsfs rw 0 0
nsfs /run/docker/netns/cf6e4881fffc nsfs rw 0 0
nsfs /run/docker/netns/ef2fa253537f nsfs rw 0 0
nsfs /run/docker/netns/3a977d309e1d nsfs rw 0 0
nsfs /run/docker/netns/2382e5f02b25 nsfs rw 0 0
nsfs /run/docker/netns/a3dd8518c453 nsfs rw 0 0
nsfs /run/docker/netns/29bd7cce0d5e nsfs rw 0 0
tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=9893912k,nr_inodes=2473478,mode=700,inode64 0 0
nsfs /run/docker/netns/b5ada1918a90 nsfs rw 0 0
nsfs /run/docker/netns/3ca37854a52b nsfs rw 0 0
nsfs /run/docker/netns/db6327bbe313 nsfs rw 0 0

That's what I get when I run cat /etc/mtab. The other two just hang without any output.
Edited May 31 by thatja
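Since df -h and ls -la /mnt hang while cat /etc/mtab still answers, one possible next step is to probe each FUSE mount listed in the mount table individually, with a time limit, to see which one blocks. The sketch below is an assumption-laden illustration: probe_fuse_mounts is a hypothetical name, it relies on the coreutils timeout command, and it assumes the blocked call can be ended with SIGKILL. A process stuck in uninterruptible I/O may not respond even to that, in which case the probe itself would stall on that entry.

```shell
#!/bin/bash
# probe_fuse_mounts [MTAB] [SECONDS] (hypothetical helper)
# For every fuse.* entry in MTAB (default /etc/mtab), try to stat the mount
# point with a time limit and report ok / HUNG / ERR for it.
probe_fuse_mounts() {
    local mtab="${1:-/etc/mtab}" limit="${2:-5}"
    local dev mnt fstype rest rc
    while read -r dev mnt fstype rest; do
        case "$fstype" in
            fuse.*) ;;        # e.g. fuse.rclone, fuse.mergerfs, fuse.shfs
            *) continue ;;    # skip non-FUSE filesystems
        esac
        rc=0
        # SIGKILL cannot be caught by the child; stdin is detached so the
        # probe never consumes lines from the mtab being read by the loop
        timeout -s KILL "$limit" stat -- "$mnt" >/dev/null 2>&1 </dev/null || rc=$?
        case "$rc" in
            0)       echo "ok   $mnt" ;;
            124|137) echo "HUNG $mnt" ;;   # timed out (137 when ended by KILL)
            *)       echo "ERR  $mnt" ;;   # failed quickly, e.g. endpoint not connected
        esac
    done < "$mtab"
}

# Example:
#   probe_fuse_mounts /etc/mtab 5
```

Mount points containing spaces appear escaped (as \040) in the mount table; this sketch does not decode them, which is fine for the paths shown in the output above.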
thatja Posted May 31 Author
Getting diagnostics through the UI also freezes.
Trying to get them via "diagnostics" inside SSH also just hangs.

Rysz Posted May 31
Just now, thatja said: "Getting diagnostics through the UI also freezes. Trying to get them via "diagnostics" inside SSH also just hangs."
Can you try:
df -h /

thatja Posted May 31 Author
Just now, Rysz said: "Can you try: df -h /"

thatja Posted May 31 Author
/mnt is completely inaccessible as well.

Rysz Posted May 31
Just now, thatja said:
OK, that rules out the theory of a full rootfs ramdisk. Did you test your RAM sticks with memtest in the meantime?