thatja Posted May 13
Between 4 and 18 hours after server start, my entire folder structure becomes unresponsive, I cannot start or stop Docker services, and I get an error. My Plex server becomes unresponsive. Other apps such as the Sonarr UI still load, but they can't import, delete, or access any files because the storage system is locked up. The only thing that fixes this is a hard reboot of the server, which is not practical to do multiple times per day. It can run anywhere from 4 to 18 hours before getting stuck in this state again. I have attached diagnostics to the post, but I'm not sure it's a full set, as I'm struggling to gain access due to the lockups. Can anyone advise on how to troubleshoot this?
plexified-diagnostics-20240513-0855.zip
thatja (Author) Posted May 13
When running the diagnostics download, it gets stuck on:
sed -ri 's/^(share(Comment|ReadList|WriteList)=")[^"]+/\1.../' '/plexified-diagnostics-20240513-0907/shares/a-----a.cfg' 2>/dev/null
thatja (Author) Posted May 13
/mnt is inaccessible via FTP or SSH.
JorgeB Posted May 13
The syslog in the diags is empty. Try this, then attach the resulting file here:
cp /var/log/syslog /boot/syslog.txt
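Since the lockups recur and each one ends in a hard reboot, it may help to keep every capture rather than overwrite a single file. The following is a minimal sketch of the same copy with a timestamped destination name. The function name `save_syslog` and the file naming scheme are assumptions made for illustration; only `/var/log/syslog` and `/boot` come from the post above.

```shell
#!/bin/bash
# Sketch only: copy the in-RAM syslog to the flash drive under a timestamped
# name so that each capture survives a hard reboot and does not overwrite the
# previous one. The function name and the naming scheme are assumptions.
save_syslog() {
    local src="${1:-/var/log/syslog}"   # source log, as in the cp command above
    local dest_dir="${2:-/boot}"        # flash drive mount point on Unraid
    local dest
    dest="${dest_dir}/syslog-$(date +%Y%m%d-%H%M%S).txt"
    mkdir -p "$dest_dir" || return 1
    cp -- "$src" "$dest" || return 1
    echo "$dest"                        # print the path of the copy
}
```

Calling `save_syslog` with no arguments copies `/var/log/syslog` to `/boot` and prints the name of the new file, which can then be attached to a post.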
JorgeB Posted May 13
I'm not seeing anything logged. Is the server having the issue now?
thatja (Author) Posted May 13 (edited)
The server had issues between 2:50 AM and 5:20 AM, and those are the logs I grabbed when I woke up. I've since rebooted the machine and all is working again. However, this has been the pattern for the past 5 days: it stays up for 16-18 hours before the same problem occurs, and it's done it every day for the past 4 days. The only real difference has been updating the Nvidia GPU driver; the server had been stable for 39 days before the GPU driver upgrade. But could a GPU driver really cause the issue that's occurring?
Edited May 13 by thatja
JorgeB Posted May 13
I missed the syslog you posted before my first post; the one in the diags was empty. I'm still not seeing any errors, but I do see this around the time you mention:
May 13 02:48:58 Plexified mergerfs[20489]: running basic garbage collection
May 13 02:48:58 Plexified mergerfs[20489]: threadpool (fuse.read): spawning 32 threads w/ max queue depth 32
May 13 02:48:58 Plexified mergerfs[20489]: read-thread-count=32; process-thread-count=-1; process-thread-queue-depth=-1; pin-threads=false;
May 13 02:49:00 Plexified mergerfs[20580]: running basic garbage collection
May 13 02:49:00 Plexified mergerfs[20580]: threadpool (fuse.read): spawning 32 threads w/ max queue depth 32
May 13 02:49:00 Plexified mergerfs[20580]: read-thread-count=32; process-thread-count=-1; process-thread-queue-depth=-1; pin-threads=false;
Do you still have the issue if you don't use mergerfs?
thatja (Author) Posted May 13
7 minutes ago, JorgeB said:
"I missed the syslog you posted before my first post; the one in the diags was empty. I'm still not seeing any errors, but I do see this around the time you mention:
May 13 02:48:58 Plexified mergerfs[20489]: running basic garbage collection
May 13 02:48:58 Plexified mergerfs[20489]: threadpool (fuse.read): spawning 32 threads w/ max queue depth 32
May 13 02:48:58 Plexified mergerfs[20489]: read-thread-count=32; process-thread-count=-1; process-thread-queue-depth=-1; pin-threads=false;
May 13 02:49:00 Plexified mergerfs[20580]: running basic garbage collection
May 13 02:49:00 Plexified mergerfs[20580]: threadpool (fuse.read): spawning 32 threads w/ max queue depth 32
May 13 02:49:00 Plexified mergerfs[20580]: read-thread-count=32; process-thread-count=-1; process-thread-queue-depth=-1; pin-threads=false;
Do you still have the issue if you don't use mergerfs?"
Well, I'm not sure what that means per se, but I've used mergerfs since I first started using UNRAID around December 2023, and mergerfs is crucial to my setup: I am merging my Google Drive with my Unraid array, where new files are stored, so without mergerfs my system doesn't really work. Mergerfs has been updated quite a lot over the past couple of months, so it could be a bad update, I guess, but I'm not sure whether there is a way to downgrade mergerfs.
thatja (Author) Posted May 13
Worth noting: I got the system back up at 9:50 AM this morning, it's now 15:56, and I haven't had a crash, and this is with mergerfs running too. Not sure whether that rules mergerfs out or not.
JorgeB Posted May 13
25 minutes ago, thatja said: "Not sure if that rules mergerfs out, or not."
Not really, but it would be the first thing I would test: run without it for a few hours, if you can, just for testing.
Rysz Posted May 15 (edited)
Can you please post the mergerFS scripts where you are setting up your mergerFS mounts? I see no actual errors regarding mergerFS, but let's see your scripts just to be sure. 🙂
mergerFS garbage collection is normal and occurs every 15 minutes by default (according to the manual).
Also, the log posted starts with a system reboot at 02:45am. Did you do this reboot, or did the system crash and reboot itself? You say the trouble started at 02:50am, which would be after that 02:45am reboot, so was it a crash or a user-triggered reboot?
Also, just to provide a timeline here, since you say the troubles started around 5 days ago:
The mergerFS backend (the actual binary) was last updated 26/03/2024.
The mergerFS frontend (calling your mergerFS mount scripts) was last updated 26/04/2024. Those frontend changes were minor, only introducing a timeout so that array start cannot get stuck.
So both of these updates would have been well outside the 5 days in which you experienced trouble. But please do post your mergerFS mount scripts nevertheless, you never know! 🙂
@JorgeB: This seems more like a general system problem (perhaps RAM-related?) to me. It's also weird that the diagnostics did not include a syslog; perhaps there were problems writing to the rootfs (RAM) disk? Also, the user said /mnt itself was inaccessible, and that directory should always exist regardless of any mounts being there.
Edited May 15 by Rysz
JorgeB Posted May 15
39 minutes ago, Rysz said: "Seems more like a general system problem"
It may well be. I just wanted the user to test without mergerfs to rule that out, since I can see nothing else relevant in the log that would explain folders going away.
Rysz Posted May 15
Just now, JorgeB said: "It may well be. I just wanted the user to test without mergerfs to rule that out, since I can see nothing else relevant in the log that would explain folders going away."
Yes, that's definitely a good idea, I was already thinking a step further there. 😄
thatja (Author) Posted May 15 (edited)
Hi. I didn't reboot the server at 2:45 AM. As for my mergerfs setup, it's a pretty simple pair of commands that has been working since I started using UNRAID in December:
mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0000/:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0001:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0002:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0003:/mnt/nvmedl/plexified/mounts/google/MoviesSrc/0004/ /mnt/nvmedl/plexified/mounts/moviesrc/Movies/
and
mergerfs -o defaults,allow_other,use_ino,category.create=ff,fsname=mergerFS /mnt/user/plexdata/:/mnt/nvmedl/plexified/mounts/moviesrc=NC:/mnt/nvmedl/plexified/mounts/google/Data=NC /mnt/nvmedl/plexified/mounts/secret/
Worth noting that /mnt/user/plexdata is my array; the rest are all rclone mounts, merged to make /mnt/nvmedl/plexified/mounts/secret/. nvmedl is the name of my cache drive, and it is an NVMe drive, as the name suggests.
Over the past 5 days the lockup had been happening after around 18 hours; however, yesterday the server managed 1 day and 7 hours of uptime before the same thing happened around 5 hours ago. Again, the only fix was to hard reboot the server. I did try unmounting the mergerfs folders, but that didn't help.
Edited May 15 by thatja
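Long colon-separated branch lists are easy to mistype, so it can help to keep one branch per line and join them in a script. The following is a minimal sketch, assuming a hypothetical helper named `join_branches`; the branches, options, and target are copied unchanged from the second command in the post above. It only prints the resulting command so that it can be reviewed before anything is mounted.

```shell
#!/bin/bash
# Sketch only: build the colon-separated branch list for mergerfs from an
# array, one branch per line. join_branches is a hypothetical helper name.
join_branches() {
    local IFS=:
    echo "$*"
}

# Branches of the second mount, copied from the post
# (=NC is the mergerfs "no create" branch mode).
branches=(
    "/mnt/user/plexdata/"
    "/mnt/nvmedl/plexified/mounts/moviesrc=NC"
    "/mnt/nvmedl/plexified/mounts/google/Data=NC"
)
target="/mnt/nvmedl/plexified/mounts/secret/"

# Print the command instead of running it, so it can be checked first.
echo mergerfs -o defaults,allow_other,use_ino,category.create=ff,fsname=mergerFS \
    "$(join_branches "${branches[@]}")" "$target"
```

Removing the leading `echo` would run the mount itself; the same pattern could be applied to the first command's five MoviesSrc branches.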
Rysz Posted May 15
7 minutes ago, thatja said:
"Hi. I didn't reboot the server at 2:45 AM. As for my mergerfs setup, it's a pretty simple pair of commands that has been working since I started using UNRAID in December:
mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0000/:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0001:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0002:/mnt/nvmedl/plexified/mounts/google/Data/MoviesSrc/0003:/mnt/nvmedl/plexified/mounts/google/MoviesSrc/0004/ /mnt/nvmedl/plexified/mounts/moviesrc/Movies/
and
mergerfs -o defaults,allow_other,use_ino,category.create=ff,fsname=mergerFS /mnt/user/plexdata/:/mnt/nvmedl/plexified/mounts/moviesrc=NC:/mnt/nvmedl/plexified/mounts/google/Data=NC /mnt/nvmedl/plexified/mounts/secret/
Worth noting that /mnt/user/plexdata is my array; the rest are all rclone mounts, merged to make /mnt/nvmedl/plexified/mounts/secret/. nvmedl is the name of my cache drive, and it is an NVMe drive, as the name suggests."
Looks good to me. I'm guessing you're running this through array_start.sh or array_start_complete.sh?
Something definitely shut down your server before 02:45am, because the log starts with a server boot at 02:45am. Did you notice any parity checks or anything else that would indicate an unclean shutdown has happened?
Honestly, if you changed nothing in the mergerFS scripts and they have worked since December, I'd start looking at the GPU driver or a general RAM issue. It might be worth running an extended memtest to see whether your RAM experiences any trouble after (x) hours of testing.
But @JorgeB is definitely more experienced at general support than me, so take this with a grain of salt. I don't think mergerFS is causing this, but as suggested I would try disabling it first and see if the problems still happen.
thatja (Author) Posted May 15
3 hours ago, Rysz said:
"Looks good to me. I'm guessing you're running this through array_start.sh or array_start_complete.sh?
Something definitely shut down your server before 02:45am, because the log starts with a server boot at 02:45am. Did you notice any parity checks or anything else that would indicate an unclean shutdown has happened?
Honestly, if you changed nothing in the mergerFS scripts and they have worked since December, I'd start looking at the GPU driver or a general RAM issue. It might be worth running an extended memtest to see whether your RAM experiences any trouble after (x) hours of testing.
But @JorgeB is definitely more experienced at general support than me, so take this with a grain of salt. I don't think mergerFS is causing this, but as suggested I would try disabling it first and see if the problems still happen."
Funny you should mention the GPU driver: it did update a couple of hours before the first outage occurred. I will try going back one driver version!
thatja (Author) Posted May 16
And it's down again. I've just woken up and my server is in the same state as before. Not sure what to try next.
JorgeB Posted May 16
Post a new syslog in case there's something there now.
Rysz Posted May 16
17 minutes ago, thatja said: "And it's down again. I've just woken up and my server is in the same state as before. Not sure what to try next."
Best to post the diagnostics package now; hopefully there'll be a log this time. Was mergerFS disabled this time?
thatja (Author) Posted May 16
4 hours ago, Rysz said: "Best to post the diagnostics package now; hopefully there'll be a log this time. Was mergerFS disabled this time?"
I was unable to get into Unraid at all, even to get diagnostics. I can't really start my Plex server without mergerfs, so I'm not sure what to do.
Rysz Posted May 16
26 minutes ago, thatja said: "I was unable to get into Unraid at all, even to get diagnostics. I can't really start my Plex server without mergerfs, so I'm not sure what to do."
Even after a restart, diagnostics can be useful, so please do post them if you're able to access the server now.
JorgeB Posted May 16
1 hour ago, thatja said: "I was unable to get into Unraid at all,"
You can enable the syslog server and post that after it happens again.
thatja (Author) Posted May 17
19 hours ago, JorgeB said: "You can enable the syslog server and post that after it happens again."
Okay, so the server restarted again. I can't access the UI but can SSH in, and when I run "diagnostics" it just sticks on:
root@Plexified:/mnt# diagnostics
Starting diagnostics collection...
root@Plexified:/mnt# diagnostics
Starting diagnostics collection...
Could this be a case of a failing flash drive, or bad files on the flash drive? /mnt is once again inaccessible.
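When commands such as diagnostics hang and /mnt stops responding, one general Linux check that can be run over SSH without touching /mnt is to look for processes stuck in uninterruptible sleep (state D), which is how a process blocked on unresponsive storage usually appears. This is not something suggested in the thread, only a minimal sketch of a standard technique; the function names `filter_dstate` and `list_dstate` are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: show processes in uninterruptible sleep (STAT starting with D).
# filter_dstate reads `ps -eo pid,stat,wchan:32,comm` style output on stdin
# and keeps the header line plus every line whose second column begins with D.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# list_dstate runs ps and applies the filter. The wchan column shows the
# kernel function each process is waiting in, when the kernel exposes it.
list_dstate() {
    ps -eo pid,stat,wchan:32,comm | filter_dstate
}
```

Running `list_dstate` during a lockup and posting the output alongside the syslog may show which processes are blocked and what they are waiting on.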