
Entire folder directories become unresponsive



Just now, JorgeB said:

I still don't see anything relevant logged, did you try without mergerfs?

 

I have been using the system all morning without mergerfs and it was working fine; 10 minutes after I mounted mergerfs I got a crash. I strongly suspect that either a failing USB or mergerfs is the cause. That is worrying, as my setup depends on mergerfs, and I'm not sure what has changed, because it has been solid for 6 months.

Link to comment

Something is taking down your system at night:

May 17 01:00:35 Plexified emhttpd: unclean shutdown detected

 

Sorry, but you really need to start listening to us and re-trace your steps on what you have changed/updated recently... especially regarding your plugins. There are a ton of additional (and a few of them of quite invasive nature) plugins installed on your server... any of which could cause this issue. The fact that it isn't even able to generate a non-empty diagnostics package further underlines the fact that there's something seriously wrong with your server at the moment.

 

Again, you need to start listening to the advice given: try disabling your plugins one by one and see if and when your server starts working again. The fact that you need some plugins for your daily business doesn't change the fact that it's impossible to diagnose the problem without disabling some plugins, at least temporarily. As already pointed out by @JorgeB, you also need to set up the syslog server to see what is happening before the crash and not just afterwards. So far we've only seen the logs from after the system reboots, not from before, and those would likely show the problem.
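For anyone following along, one way to get at the entries written just before a crash is to print the lines that precede each boot marker in an archived syslog. This is only a minimal sketch: the function name `lines_before_marker` is made up for illustration, and it assumes the archived log is a plain text file in which every boot logs a recognisable first line (on most Linux systems the kernel banner containing "Linux version").

```shell
#!/bin/bash
# Print up to N lines that precede each occurrence of a marker string in a
# log file, i.e. what was logged right before that event. The matching line
# itself is printed as well.
# Usage: lines_before_marker <logfile> <marker> [N]
# (hypothetical helper, not part of Unraid)
lines_before_marker() {
    local logfile=$1 marker=$2 count=${3:-50}
    # -F: treat the marker as a fixed string, -B: lines of leading context
    grep -F -B "$count" -- "$marker" "$logfile"
}
```

A call could look like `lines_before_marker /boot/logs/syslog 'Linux version' 100`, where the file name under /boot/logs is an assumption and should be replaced with whatever the syslog server actually wrote there.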

 

Link to comment
Posted (edited)

This is where I am stuck: the unclean shutdown was because I could not get into SSH or the UI. That was the first crash, at 1 AM.

 

Secondly, the last two syslogs provided above are from before I restarted the server (after /mnt became inaccessible). They came from /boot/logs, as I did enable the syslog server.

 

I have run the server with nothing running at all (no Docker containers except Plex, no mergerfs) and it was fine. As soon as I mounted my mounts, I got another crash.

 

Could you please elaborate on this?

 

Quote

(and a few of them of quite invasive nature)

 

 

Edited by thatja
Link to comment

OK and where are the logs from what happened before 01am?

Because the server seems to have crashed and rebooted at 01am, we need to know what happened before.

 

There's no indication in the logs that mergerFS isn't operating as it should.

The opposite, actually: the fact that it is doing garbage collection until the very end of your logs shows it's still running. 🤔

Link to comment
Posted (edited)

The unclean shutdown was because power was pulled from the server, this wasn't a crash related to UNRAID but a power outage on my end, sorry for the confusion regarding that.

 

The crashes today, caused by Unraid or something else, occurred at around 10:30 AM and 10:50 AM. Those are the events that the syslogs above cover, before and after.

 

Also, nothing at all has changed between when things were working well and the first crash relating to this. All I've done is update plugins and Docker containers when updates were available. I had 6 months without issue until the first crash, which happened around the time this thread was created.

Edited by thatja
Link to comment
Posted (edited)

Well, there's nothing in the logs to indicate a failure of any kind around those times, related to mergerFS or not. But the fact that it fails to even generate a diagnostics package makes me think that the rootfs ramdisk (at /) is either full (with some plugin writing to it non-stop and filling it up), not accessible, or otherwise broken somehow. It isn't even able to write the syslog or any other files into the diagnostics package, which leads me back to my earlier belief that it has something to do with the RAM. How much RAM do you have in your server? And how did you shut down your server after it crashed? There's nothing in the logs after your last SSH login to the crashed server.
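As a rough illustration of how the "is the ramdisk full, and what is filling it" question could be checked from a shell, here is a small sketch. The helper names `fs_used_percent` and `largest_dirs` are invented for this example, and the time limits are arbitrary; both calls are wrapped in `timeout` in case they block.

```shell
#!/bin/bash
# Print the used percentage of the filesystem that holds a path (default: /).
# df -P produces stable POSIX-format output with one line per filesystem.
fs_used_percent() {
    local path=${1:-/}
    timeout 10 df -P -- "$path" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# List the ten largest entries (sizes in KiB) directly below a path.
# -x keeps du from crossing into other mounted filesystems.
largest_dirs() {
    local path=${1:-/}
    timeout 60 du -x -k --max-depth=1 -- "$path" 2>/dev/null | sort -n | tail -n 10
}

echo "filesystem at / is $(fs_used_percent /)% used"
```

If the percentage is close to 100, running `largest_dirs /` a few minutes apart and comparing the results would show which directory is growing.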

 

Edited by Rysz
Link to comment

How would I find out whether the rootfs ramdisk is full, or whether a plugin is writing to it?

 

I have 96 GB of RAM in the server. I restarted the system with the reboot command over SSH from my phone, using an app called Termius; only the web UI SSH isn't responsive.

Link to comment
Posted (edited)

OK that's very interesting because if you restarted via reboot command it should show more in the syslogs. It should show it shutting down services, the array etc... but there's nothing after your last SSH login, which again makes me think that the ramdisk is full or otherwise unwritable at that point.

 

The next time it gets stuck, don't instantly reboot, but SSH into it first and run the following commands:

df -h

and

cat /etc/mtab

and

ls -la /mnt

Please post the output of those commands here then, before rebooting your server.
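Because any of these commands can block on an unresponsive mount, it may help to give each one a time limit so the others still get a chance to run. The following is a sketch only: `run_limited` and `collect_diagnostics` are hypothetical helper names and the 10-second limit is arbitrary. A process stuck in uninterruptible I/O may ignore the signal that `timeout` sends, in which case this wrapper can block as well.

```shell
#!/bin/bash
# Run a command with a time limit and say so when it did not finish, so one
# blocked command does not prevent the remaining diagnostics from running.
# Usage: run_limited <seconds> <command> [args...]
run_limited() {
    local limit=$1 status=0
    shift
    echo "### $*"
    timeout "$limit" "$@" || status=$?
    # GNU timeout exits with status 124 when the time limit was reached
    if [ "$status" -eq 124 ]; then
        echo "(timed out after ${limit}s)"
    fi
    return "$status"
}

# The three commands requested above, each limited to 10 seconds.
collect_diagnostics() {
    run_limited 10 df -h
    run_limited 10 cat /etc/mtab
    run_limited 10 ls -la /mnt
}
```

Running `collect_diagnostics 2>&1 | tee /boot/diag.txt` would also keep a copy of the output; the target path is just an example.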

 

Feel free to enable mergerFS again and wait for it to get stuck again, just so we can be sure. 🙂 

Also... where did you put the mergerFS mount commands, how are you running them?

 

Edited by Rysz
Link to comment

Okay, I will do that. 

 

As for mergerfs: I have a bash script that I created, which mounts rclone and mergerfs and starts autoscan. I run this script about a minute after I start my array.

 

Here's the script

 

#!/bin/bash

# Start a screen session named "files"
screen -dmS files

# Send the rclone mount command to the "files" screen session created above
screen -S files -X stuff $'rclone mount --config=/mnt/nvme/plexified/mounts/rclone/rclone.conf --allow-other --no-traverse --vfs-cache-mode full --cache-dir /mnt/nvmedl/plexified/mounts/googlecache/ --vfs-cache-max-size 250G --dir-cache-time 96h --vfs-fast-fingerprint --vfs-refresh --drive-impersonate [email protected] googledecrypted: /mnt/nvmedl/plexified/mounts/google/\n'

# Wait for the command to start
sleep 2

# Execute the mergerfs commands
mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0000/:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0001:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0002:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0003:/mnt/nvmedl/plexified/mounts/google/MoviesSrc/0004/ /mnt/nvmedl/plexified/mounts/moviesrc/Movies/

sleep 2 # Wait for 2 seconds before running the next mergerfs command

mergerfs -o defaults,allow_other,use_ino,category.create=ff,fsname=mergerFS /mnt/user/plexdata/:/mnt/nvmedl/plexified/mounts/moviesrc=NC:/mnt/nvmedl/plexified/mounts/google/Data=NC /mnt/nvmedl/plexified/mounts/secret/

# Wait for 30 seconds before starting autoscan
sleep 30

# Start a screen session named "autoscan", change to the correct directory, and then run the autoscan command
screen -dmS autoscan
screen -S autoscan -X stuff $'cd /mnt/nvme/plexified/services/autoscan\n'
screen -S autoscan -X stuff $'./autoscan_v1.4.0_linux_amd64\n'

 

Then I start my docker containers.
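A side note on the fixed `sleep 2` between starting rclone and running mergerfs: a script like this could instead wait until the rclone directory really is a mount point before layering mergerfs on top of it. The sketch below is an assumption about how that might be done, with `wait_for_mount` as a made-up helper name; nothing in this thread shows that the timing is related to the hangs.

```shell
#!/bin/bash
# Wait until a directory is an active mount point, polling once per second.
# Returns 0 as soon as it is mounted, 1 if the limit (in seconds) expires.
# Usage: wait_for_mount <directory> [max_seconds]
wait_for_mount() {
    local dir=$1 max_wait=${2:-60} waited=0
    until mountpoint -q "$dir"; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the script above this could replace the first `sleep 2`, for example `wait_for_mount /mnt/nvmedl/plexified/mounts/google 60 || exit 1`, so that mergerfs is only started once the rclone mount exists.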

Link to comment
Posted (edited)

@thatja I've been using the mergerfs plugin for a few months now and have seen no issues similar to yours. Looking through the syslogs you've managed to capture, I can see nothing that indicates a mergerfs problem. I suspect a RAM issue. I would suggest shutting down and running a RAM test using Memtest86 for at least 24 to 36 hours, since your crashes appear to happen within that time frame.

 

Also, just to confirm: you do have the syslog server (Settings --> Syslog Server) set to archive the syslog to a share/folder? Your syslogs don't seem to retain anything from before the reboots/crashes, so they're a little less useful.

Edited by AgentXXL
Link to comment
  • 2 weeks later...
On 5/17/2024 at 11:48 AM, Rysz said:

 

OK that's very interesting because if you restarted via reboot command it should show more in the syslogs. It should show it shutting down services, the array etc... but there's nothing after your last SSH login, which again makes me think that the ramdisk is full or otherwise unwritable at that point.

 

The next time it gets stuck, don't instantly reboot, but SSH into it first and run the following commands:

df -h

and

cat /etc/mtab

and

ls -la /mnt

Please post the output of those commands here then, before rebooting your server.

 

Feel free to enable mergerFS again and wait for it to get stuck again, just so we can be sure. 🙂 

Also... where did you put the mergerFS mount commands, how are you running them?

 

 

Okay, so I've just tried running the first one

 

df -h

 

And my SSH window is hanging at the moment. It has been for around 3 minutes now.

Link to comment
Posted (edited)
root@Plexified:~# cat /etc/mtab
rootfs / rootfs rw,size=49452720k,nr_inodes=12363180,inode64 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
tmpfs /run tmpfs rw,nosuid,nodev,noexec,relatime,size=32768k,mode=755,inode64 0 0
/dev/sda1 /boot vfat rw,noatime,nodiratime,fmask=0177,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,flush,errors=remount-ro 0 0
/dev/loop0 /lib squashfs ro,relatime,errors=continue 0 0
overlay /lib overlay rw,relatime,lowerdir=/lib,upperdir=/var/local/overlay/lib,workdir=/var/local/overlay-work/lib 0 0
/dev/loop1 /usr squashfs ro,relatime,errors=continue 0 0
overlay /usr overlay rw,relatime,lowerdir=/usr,upperdir=/var/local/overlay/usr,workdir=/var/local/overlay-work/usr 0 0
devtmpfs /dev devtmpfs rw,relatime,size=8192k,nr_inodes=12363180,mode=755,inode64 0 0
devpts /dev/pts devpts rw,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,relatime,inode64 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
hugetlbfs /hugetlbfs hugetlbfs rw,relatime,pagesize=2M 0 0
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
tmpfs /var/log tmpfs rw,relatime,size=131072k,mode=755,inode64 0 0
rootfs /mnt rootfs rw,size=49452720k,nr_inodes=12363180,inode64 0 0
tmpfs /mnt/disks tmpfs rw,relatime,size=1024k,inode64 0 0
tmpfs /mnt/remotes tmpfs rw,relatime,size=1024k,inode64 0 0
tmpfs /mnt/addons tmpfs rw,relatime,size=1024k,inode64 0 0
tmpfs /mnt/rootshare tmpfs rw,relatime,size=1024k,inode64 0 0
/dev/md1p1 /mnt/disk1 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md2p1 /mnt/disk2 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md3p1 /mnt/disk3 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md4p1 /mnt/disk4 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md5p1 /mnt/disk5 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md6p1 /mnt/disk6 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/md7p1 /mnt/disk7 xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/nvme0n1p1 /mnt/nvme xfs rw,noatime,nouuid,attr2,inode64,logbufs=8,logbsize=32k,noquota 0 0
/dev/nvme1n1p1 /mnt/nvmedl btrfs rw,noatime,ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ 0 0
shfs /mnt/user0 fuse.shfs rw,nosuid,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
shfs /mnt/user fuse.shfs rw,nosuid,nodev,noatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
/dev/loop2 /var/lib/docker btrfs rw,noatime,ssd,space_cache=v2,subvolid=5,subvol=/ 0 0
/dev/loop2 /var/lib/docker/btrfs btrfs rw,noatime,ssd,space_cache=v2,subvolid=5,subvol=/ 0 0
nsfs /run/docker/netns/fdcd16e64b60 nsfs rw 0 0
nsfs /run/docker/netns/default nsfs rw 0 0
/dev/loop3 /etc/libvirt btrfs rw,noatime,ssd,space_cache=v2,subvolid=5,subvol=/ 0 0
googledecrypted: /mnt/nvmedl/plexified/mounts/google fuse.rclone rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other 0 0
mergerFS /mnt/nvmedl/plexified/mounts/moviesrc/Movies fuse.mergerfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
mergerFS /mnt/nvmedl/plexified/mounts/secret fuse.mergerfs rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other 0 0
nsfs /run/docker/netns/93f74b6ab06f nsfs rw 0 0
nsfs /run/docker/netns/5b147fc09a9a nsfs rw 0 0
nsfs /run/docker/netns/21aa0e5b657b nsfs rw 0 0
nsfs /run/docker/netns/efbd08065b39 nsfs rw 0 0
nsfs /run/docker/netns/d628991141dd nsfs rw 0 0
nsfs /run/docker/netns/021970ea8a50 nsfs rw 0 0
nsfs /run/docker/netns/cf6e4881fffc nsfs rw 0 0
nsfs /run/docker/netns/ef2fa253537f nsfs rw 0 0
nsfs /run/docker/netns/3a977d309e1d nsfs rw 0 0
nsfs /run/docker/netns/2382e5f02b25 nsfs rw 0 0
nsfs /run/docker/netns/a3dd8518c453 nsfs rw 0 0
nsfs /run/docker/netns/29bd7cce0d5e nsfs rw 0 0
tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=9893912k,nr_inodes=2473478,mode=700,inode64 0 0
nsfs /run/docker/netns/b5ada1918a90 nsfs rw 0 0
nsfs /run/docker/netns/3ca37854a52b nsfs rw 0 0
nsfs /run/docker/netns/db6327bbe313 nsfs rw 0 0

 

That's what I get when I run cat /etc/mtab

 

The other two just hang without an output.
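Since `df -h` and `ls -la /mnt` both touch every mount, one possible way to narrow down which mount is not answering is to probe each mount point separately with a time limit. This is a sketch under assumptions: `find_unresponsive_mounts` is a hypothetical helper, the 3-second limit is arbitrary, and a probe that ends up in uninterruptible I/O may not be stopped by `timeout`.

```shell
#!/bin/bash
# Probe every mount point listed in a mounts table (default: /proc/mounts)
# and print the ones whose stat call does not return within the time limit.
# Mount points containing spaces (escaped as \040 in the table) are not handled.
# Usage: find_unresponsive_mounts [table] [seconds]
find_unresponsive_mounts() {
    local table=${1:-/proc/mounts} limit=${2:-3}
    local dev mp rest status
    while read -r dev mp rest; do
        status=0
        timeout "$limit" stat -- "$mp" >/dev/null 2>&1 || status=$?
        # GNU timeout exits with 124 when the time limit was reached
        if [ "$status" -eq 124 ]; then
            echo "$mp"
        fi
    done < "$table"
}
```

Running `find_unresponsive_mounts` with no arguments checks the live mount table; any path it prints is a candidate for the mount that makes the other commands hang.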

 

image.png.859dbf926ce92e9f385caa7341948715.png

Edited by thatja
Link to comment
