Web GUI not loading, server freezing up.


I've attached a screenshot of the issue, as well as my diagnostics.
(Please let me know if there is any sensitive information I should have redacted.)

I've had a few unclean shutdowns lately as a result of my server completely locking up.

For months now I've been dealing with my Web GUI not fully loading. After a few hours it would typically come back up, and I could access it without making any changes.

 

The timing seemed random, though while troubleshooting I've noticed messages in the syslog like this one:
 

crond[1863]: exit status 255 from user root php /usr/local/emhttp/plugins/community.applications/scripts/notices.php > /dev/null 2>&1

 

I've recently gone through and deleted several shares while following the TRaSH Guides to optimize my setup.
This does not appear to have affected my issue one way or the other, but I wanted to mention it in case it is relevant.

 

Fix Common Problems has not reported anything since I corrected the recent issues.

I switched from macvlan to ipvlan but my issue was present before and after that change.

 

One final thing to note: when the Web GUI stopped loading completely, I could still access all of my shares and my Docker containers, and I could open the terminal both from the shortcut in the Web GUI and via SSH.

Screenshot 2023-12-27 105620.png

 

Edited by techlovin
Link to comment
38 minutes ago, JorgeB said:

Forgot to mention, for the GUI not loading try booting in safe mode.


The Web GUI does load, though. It loads and works just fine for several hours, then stops for a few hours, and then comes back again.

Edit:
Though I suppose you may be suggesting safe mode to determine whether the Web GUI still stops working periodically, as it does now?

 

Update:
I'm currently running the system in safe mode, and I'll leave it that way for a day or two to monitor it for further issues.
I'm also setting up a syslog server now to help with monitoring, as extra wear on the flash drive doesn't sound like a good idea.

Edited by techlovin
Link to comment

I'm trying to research Unraid's safe mode option. I'd like to know what it actually does besides booting without plugins.

 

In Windows, safe mode gives you a very basic, stripped-down version of the OS with minimal drivers and features.

 

What else does Unraid's safe mode do besides disabling plugins?

Edited by techlovin
Link to comment
5 hours ago, techlovin said:

What else does Unraid's safe mode do besides disabling plugins?

 

As far as I know that is all it does.   

 

The reason for safe mode is that plugins can install new components into the base Unraid OS and if these are incompatible with a particular release it can cause issues with unpredictable symptoms.

Link to comment

Well, sometime in the night while I was sleeping, the server froze up again.

I could ping it this morning, but the Web GUI, Docker, and shares were all inaccessible.

 

I'm working on replacing the flash drive, because I have no idea what else to look for.

I've also run another set of diagnostics.

 

 

 

Edited by techlovin
Link to comment
7 hours ago, JorgeB said:

Did you enable the syslog server?


Yes, I enabled the syslog server and set up a Raspberry Pi for logging.

I'm not sure whether I should have mirrored the syslog to the flash drive as well (I was trying to avoid potential wear, as there was a warning about that).

 

I've attached the syslog.

 

On line 16 you can see the last reported entry before I had to long-press the power button to shut down.
 

2023-12-28T01:21:43-08:00 Friday monitor: Stop running nchan processes

 

 

Edited by techlovin
Link to comment
17 minutes ago, JorgeB said:

Unfortunately there's nothing relevant logged, this usually points to a hardware issue, make sure this has been taken care of:

 

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173

 


I don't recall ever adjusting power control in the BIOS to mitigate issues with C-States.

I'll work on addressing that issue, and will follow back up tomorrow.

Link to comment

I set "Power Supply Idle Control" to Typical Current Idle, and also disabled C-States in the BIOS.

I'm not concerned with the power consumption of my server and would prefer as much stability as possible with my Ryzen build.

 

Will update this post tomorrow with status.

Edited by techlovin
Link to comment

Woke up again to my server completely frozen.

I tried removing unraidsafemode from the boot options to look at my plugins, but the Web GUI failed to load, just as in the screenshot above.

Booted back into Safe Mode.

 

Noticed these entries in the syslog before it went down.

 

2023-12-28T14:10:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T15:07:00-08:00 Friday monitor: Stop running nchan processes
2023-12-28T15:07:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T16:05:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T17:03:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T18:00:02-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T18:58:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T19:56:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T20:54:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T21:20:58-08:00 Friday webGUI: Successful login user root from 192.168.1.69
2023-12-28T21:21:35-08:00 Friday monitor: Stop running nchan processes
2023-12-28T21:28:08-08:00 Friday monitor: Stop running nchan processes
2023-12-28T21:43:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T22:42:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-28T23:41:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-29T01:09:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-29T02:07:01-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :
2023-12-29T03:00:57-08:00 Friday crond[1866]: failed parsing crontab for user root: *"2,4,¶,8(90,12 1˛ $(date +%e -d -7days) -le w ]] && /usr/local/sbin/mdcmd check NOCORReCT &> /dev/null || :

 

Edited by techlovin
Link to comment
12 minutes ago, itimpi said:

Those look to be related to the built-in parity check scheduling.    The fact the entries are invalid suggests to me that you either have RAM issues or you have corruption on your flash drive.

 

I've prepared a new flash drive. Hopefully it's not my RAM, because I don't have any spares on hand.
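
For anyone who wants to check a crontab file for this kind of damage, here is a minimal Python sketch. It assumes a healthy entry contains only printable ASCII, and the name `suspicious_lines` is my own, not anything from Unraid. It only catches non-ASCII and control characters such as the `¶` and `˛` in the log entries above; corruption that happens to stay within plain ASCII would pass unnoticed.

```python
import string

# Assumption: a healthy crontab entry contains only printable ASCII plus tab.
ALLOWED = set(string.printable) - set("\x0b\x0c\r\n")


def suspicious_lines(text):
    """Return (line_number, line) pairs that contain unexpected characters."""
    flagged = []
    for number, line in enumerate(text.split("\n"), start=1):
        if any(ch not in ALLOWED for ch in line):
            flagged.append((number, line))
    return flagged
```

The function takes the file contents as a string, so it can be run against a copy of the file taken from the flash drive as well as the live one.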

Link to comment

The server locked up again. It seems to be happening every night at approximately 3:15am.

The syslog had no meaningful info around the time of the freeze.

 

2023-12-29T21:52:48-08:00 Friday monitor: Stop running nchan processes
2023-12-29T23:18:52-08:00 Friday webGUI: Successful login user root from 192.168.1.203
2023-12-29T23:19:21-08:00 Friday monitor: Stop running nchan processes
2023-12-29T23:41:24-08:00 Friday monitor: Stop running nchan processes
2023-12-30T00:23:28-08:00 Friday monitor: Stop running nchan processes
2023-12-30T01:43:02-08:00 Friday monitor: Stop running nchan processes
2023-12-30T03:07:20-08:00 Friday webGUI: Successful login user root from 192.168.1.203
2023-12-30T03:08:05-08:00 Friday monitor: Stop running nchan processes
2023-12-30T03:16:27-08:00 Friday emhttpd: Starting services...

 

I'm now running on a new flash drive, though given how consistent the timing is, I don't think this is a hardware issue.
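
To compare what gets logged around the same time each night, the remote syslog can be filtered by time of day. This is a minimal Python sketch, assuming every entry starts with an ISO 8601 timestamp like the ones above. The function name and the 03:00 to 03:30 window are my own choices for illustration.

```python
from datetime import datetime, time


def entries_in_window(lines, start, end):
    """Return (timestamp, rest_of_line) pairs whose time of day is in [start, end]."""
    matches = []
    for line in lines:
        stamp, _, rest = line.partition(" ")
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines that do not begin with an ISO timestamp
        # .time() drops the UTC offset, so this compares local wall-clock time.
        if start <= when.time() <= end:
            matches.append((when, rest))
    return matches


if __name__ == "__main__":
    log = [
        "2023-12-30T01:43:02-08:00 Friday monitor: Stop running nchan processes",
        "2023-12-30T03:08:05-08:00 Friday monitor: Stop running nchan processes",
        "2023-12-30T03:16:27-08:00 Friday emhttpd: Starting services...",
    ]
    for when, rest in entries_in_window(log, time(3, 0), time(3, 30)):
        print(when.isoformat(), rest)
```

Running it over several days of logs would show whether anything is logged just before each freeze, or whether the log simply goes quiet.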

 

 

Link to comment
10 hours ago, itimpi said:

It might be worth checking there is nothing scheduled to run at around that time.    I would be looking through the various .cron files stored under /etc/rc.cron.

 

That directory does not appear to exist.

 

I did look through the cron.d directory and found one file named root, and have attached a screenshot of its contents.

 

Screenshot 2023-12-30 142244.png

Screenshot 2023-12-30 142438.png
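
As a rough way to see whether any cron entry would fire around 3:15am, here is a minimal Python sketch that checks only the minute and hour fields of an entry. It is not a full cron parser: it ignores the day-of-month, month, and day-of-week fields, and it does not understand shortcuts such as `@daily`. The function names are my own, and the entries used to try it out should be copied from the actual cron file.

```python
def field_matches(field, value, low, high):
    """Check one cron field (supports *, lists, ranges, and /step) against a value."""
    for part in field.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            first, last = base.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(base)
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def fires_at(entry, hour, minute):
    """True if the entry's minute and hour fields both match the given time."""
    fields = entry.split()
    minute_field, hour_field = fields[0], fields[1]
    return (field_matches(minute_field, minute, 0, 59)
            and field_matches(hour_field, hour, 0, 23))
```

Calling `fires_at(entry, 3, 15)` for each non-comment line in the file gives a shortlist of jobs worth a closer look.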

Link to comment

Noticed a warning in my syslog.

 

2023-12-30T22:35:01-08:00 Friday kernel: TCP: request_sock_TCP: Possible SYN flooding on port 8181. Sending cookies.  Check SNMP counters.

 

Port 8181 is my Tautulli container.

I checked the Tautulli logs and found the following.

 

2023-12-30 22:35:40 - ERROR :: CP Server Thread-9 : Failed to access uri endpoint /status/sessions. Request timed out: HTTPConnectionPool(host='192.168.1.10', port=32400): Read timed out. (read timeout=15)
2023-12-30 22:35:40 - WARNING :: CP Server Thread-9 : Tautulli Pmsconnect :: Unable to parse XML for get_current_activity: 'NoneType' object has no attribute 'getElementsByTagName'.
2023-12-30 22:35:40 - WARNING :: CP Server Thread-9 : Unable to retrieve data for get_activity.
2023-12-30 22:35:44 - ERROR :: Thread-17 (run) : Failed to access uri endpoint /status/sessions. Request timed out: HTTPConnectionPool(host='192.168.1.10', port=32400): Read timed out. (read timeout=15)
2023-12-30 22:35:44 - WARNING :: Thread-17 (run) : Tautulli Pmsconnect :: Unable to parse XML for get_current_activity: 'NoneType' object has no attribute 'getElementsByTagName'.
2023-12-30 22:35:59 - ERROR :: Thread-17 (run) : Failed to access uri endpoint /status/sessions. Request timed out: HTTPConnectionPool(host='192.168.1.10', port=32400): Read timed out. (read timeout=15)
2023-12-30 22:35:59 - WARNING :: Thread-17 (run) : Tautulli Pmsconnect :: Unable to parse XML for get_current_activity: 'NoneType' object has no attribute 'getElementsByTagName'.

{REPEATED}

2023-12-30 22:57:00 - ERROR :: CP Server Thread-7 : Failed to access uri endpoint /status/sessions. Request timed out: HTTPConnectionPool(host='192.168.1.10', port=32400): Read timed out. (read timeout=15)
2023-12-30 22:57:00 - WARNING :: CP Server Thread-7 : Tautulli Pmsconnect :: Unable to parse XML for get_current_activity: 'NoneType' object has no attribute 'getElementsByTagName'.
2023-12-30 22:57:00 - WARNING :: CP Server Thread-7 : Unable to retrieve data for get_activity.
2023-12-30 22:57:27 - WARNING :: CP Server Thread-11 : Failed to get image /library/metadata/281641/thumb, falling back to poster.
2023-12-30 22:57:27 - WARNING :: CP Server Thread-8 : Failed to get image /library/metadata/281635/thumb, falling back to poster.

 

Now this looks like something that could bring my server down. From some research, it looks like I may need to adjust some settings in the container or on my firewall.
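
To get a sense of how often these timeouts occur, the Tautulli log can be tallied per minute. This is a minimal Python sketch that assumes the log line format shown above (`YYYY-MM-DD HH:MM:SS - LEVEL :: ...`). The function name is my own.

```python
from collections import Counter


def timeouts_per_minute(lines):
    """Count ERROR lines mentioning 'Read timed out', keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        if " - ERROR :: " in line and "Read timed out" in line:
            counts[line[:16]] += 1  # first 16 characters hold date, hour, minute
    return counts
```

Comparing the busiest minutes against the syslog timestamps would show whether the timeouts line up with the SYN flooding warning or with the freezes.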

Link to comment

It went down again at exactly 3:15am, with nothing in the syslog.

It froze, and I couldn't access the GUI, shares, or any containers.

 

Is there some way I can check what it may be doing at that time?
 

I've connected a monitor to it so that I might be able to see something.

 

I've set up the syslog to mirror to flash in case something is being missed.

 

Edit:

Each time it has done this over the last three days, I could see the light blinking on my Unraid flash drive.

I waited it out once for a couple hours but it was still down.

The last two days I've just been doing an unclean shutdown and rebooting.

While it happens I'm just streaming Plex from my server.

 

Edit2:

It went down again, less than 45 minutes after the first freeze.

Running MemTest now, will update this thread after an extended test.

Edited by techlovin
Link to comment
