(6.8.3) syslog filling up with nginx errors, causing system to be unusable

rav · August 1, 2020

For the past couple of weeks, I have had nginx errors filling up my system log, causing the Unraid UI to become unresponsive.

I see the log full of errors like this:

Aug  1 16:55:30 Alexandria nginx: 2020/08/01 16:55:30 [alert] 10170#10170: worker process 19461 exited on signal 6
Aug  1 16:55:32 Alexandria nginx: 2020/08/01 16:55:32 [alert] 10170#10170: worker process 19472 exited on signal 6
Aug  1 16:55:34 Alexandria nginx: 2020/08/01 16:55:34 [alert] 10170#10170: worker process 19474 exited on signal 6
Aug  1 16:55:36 Alexandria nginx: 2020/08/01 16:55:36 [alert] 10170#10170: worker process 19622 exited on signal 6
Aug  1 16:55:38 Alexandria nginx: 2020/08/01 16:55:38 [alert] 10170#10170: worker process 19624 exited on signal 6
Aug  1 16:55:40 Alexandria nginx: 2020/08/01 16:55:40 [alert] 10170#10170: worker process 19634 exited on signal 6
Aug  1 16:55:42 Alexandria nginx: 2020/08/01 16:55:42 [alert] 10170#10170: worker process 19637 exited on signal 6
Aug  1 16:55:44 Alexandria nginx: 2020/08/01 16:55:44 [alert] 10170#10170: worker process 19640 exited on signal 6
Aug  1 16:55:46 Alexandria nginx: 2020/08/01 16:55:46 [alert] 10170#10170: worker process 19719 exited on signal 6
Aug  1 16:55:48 Alexandria nginx: 2020/08/01 16:55:48 [alert] 10170#10170: worker process 19723 exited on signal 6
Aug  1 16:55:50 Alexandria nginx: 2020/08/01 16:55:50 [alert] 10170#10170: worker process 19725 exited on signal 6
Aug  1 16:55:52 Alexandria nginx: 2020/08/01 16:55:52 [alert] 10170#10170: worker process 19728 exited on signal 6

The nginx error log is full of this:

2020/08/01 16:55:38 [alert] 10170#10170: worker process 19624 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/08/01 16:55:40 [alert] 10170#10170: worker process 19634 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/08/01 16:55:42 [alert] 10170#10170: worker process 19637 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/08/01 16:55:44 [alert] 10170#10170: worker process 19640 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/08/01 16:55:46 [alert] 10170#10170: worker process 19719 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/08/01 16:55:48 [alert] 10170#10170: worker process 19723 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/08/01 16:55:50 [alert] 10170#10170: worker process 19725 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/08/01 16:55:52 [alert] 10170#10170: worker process 19728 exited on signal 6

I saw this for the first time the week of 7/20/2020 (not sure of the exact date). I found some threads with similar issues. Following their advice, I ended up restarting nginx (/etc/rc.d/rc.nginx restart), shutting down the array, and rebooting the system.

The problem occurred again on 7/27, 7/29, 7/31, and now 8/1. Today is the first day I caught it in progress. Logs are at 5% right now. Both the syslog.txt and nginx error.log are attached, along with diagnostics.
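For anyone who wants to keep an eye on how full the log space is while this is happening, here is a minimal Python sketch. It assumes the logs live on the filesystem mounted at /var/log and that Python 3 is installed on the server, which is not something a stock Unraid install is guaranteed to have. The function name `log_usage_percent` is made up for illustration.

```python
import shutil


def log_usage_percent(path="/var/log"):
    """Return how full the filesystem holding `path` is, as a percentage.

    The default path is an assumption about where the logs are stored.
    """
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a total size of zero.
        return 0.0
    return 100.0 * usage.used / usage.total
```

Calling `log_usage_percent()` from a scheduled script and sending a notification once it passes some threshold would give a warning before the log space is completely full.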

Prior to these errors, I had recently added 2 dockers: Navidrome and NginxProxyManager. All of my dockers and VMs seem to function normally until the log folders are full, and then nothing seems to function.

I found this thread where the user seems to suggest that the Dynamix System Temp plugin is the cause. https://forums.unraid.net/topic/90727-nginx-errors-filling-logs/

I uninstalled that plugin but the issue still occurs.

There are a number of other threads indicating the same problem, but no solution that I can find.

https://forums.unraid.net/topic/84681-odd-ngnix-error-filling-up-my-syslog/

https://forums.unraid.net/topic/90942-webui-is-angry-with-me/

https://forums.unraid.net/topic/88948-get-this-nginix-error-on-the-server-log-and-the-plex-transs-is-buggy-with-green-lag/

Can anyone find anything wrong in my logs or diagnostics? Any advice on resolving this?

Thanks!

20200801 logs.zip alexandria-diagnostics-20200801-1727.zip

rav · August 6, 2020

I still could use some help with this. After rebooting the server on Aug 1, I observed the same issues recurring in the logs on Aug 3.

Following the advice from a similar thread (https://forums.unraid.net/topic/95420-683-syslog-filling-up-with-nginx-errors-causing-system-to-be-unusble/) I set the dockers to not autostart, I disabled the VMs, and I rebooted into safe mode. This was around noon.

At 14:25:24, I saw the same errors in the syslog. They stopped on their own at 14:55:23.

Aug  3 14:55:23 Alexandria nginx: 2020/08/03 14:55:23 [alert] 2084#2084: worker process 26373 exited on signal 6

The next morning, Aug 4, I saw the same errors from 10:25:42 to 11:22:23.

Aug  4 11:22:23 Alexandria nginx: 2020/08/04 11:22:23 [alert] 2084#2084: worker process 15116 exited on signal 6

Note that while the errors were occurring, I tried to telnet to the machine to collect logs. I was able to do that, but the machine was acting sluggish. I was not able to collect diagnostics at all during this time.

After 11:22:23 on Aug 4, I have not seen any such errors in the syslog. I have monitored for 2 days. I ended up starting one docker (MariaDB, so I could use Kodi). I noticed no ill effects during this time. I was able to collect logs and diagnostics as normal. The machine seems perfectly fine now.

The timing of the errors in the syslog coincides with the error messages in the nginx error.log, but I don't know enough about nginx to identify those errors or what could be going on.
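To make the start and end of each episode easier to see (for example, 14:25:24 to 14:55:23 on Aug 3), here is a rough Python sketch that groups the "exited on signal 6" lines into bursts. It assumes the syslog lines look exactly like the excerpts quoted in this thread. The helper names `crash_times` and `burst_windows` and the 5-minute gap are my own choices for illustration, not anything from Unraid or nginx.

```python
import re
from datetime import datetime, timedelta

# Matches the nginx timestamp embedded in each line, as in the excerpts above:
# "2020/08/01 16:55:30 [alert] 10170#10170: worker process 19461 exited on signal 6"
CRASH_RE = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[alert\] \d+#\d+: "
    r"worker process \d+ exited on signal 6"
)


def crash_times(lines):
    """Yield a datetime for every worker-crash line found in `lines`."""
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            yield datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")


def burst_windows(times, max_gap=timedelta(minutes=5)):
    """Group chronologically ordered crash times into (start, end, count) bursts.

    A new burst starts whenever the gap since the previous crash exceeds
    `max_gap` (an arbitrary threshold chosen for this sketch).
    """
    windows = []
    for t in times:
        if windows and t - windows[-1][1] <= max_gap:
            start, _, count = windows[-1]
            windows[-1] = (start, t, count + 1)
        else:
            windows.append((t, t, 1))
    return windows


def report(path):
    """Print one line per burst found in the log file at `path`."""
    with open(path, errors="replace") as handle:
        for start, end, count in burst_windows(crash_times(handle)):
            print(f"{start} -> {end}: {count} worker crashes")
```

Running `report("syslog.txt")` on a saved copy of the syslog prints one line per episode. Those windows can then be compared against anything else that runs on a schedule or connects to the web UI around the same times.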

Could there be some external process hitting the Unraid UI and causing this type of error? I can't imagine what that would be. I see some reports that it could be related to Dynamix System Temp (https://forums.unraid.net/topic/90727-nginx-errors-filling-logs/), but I removed that plugin and the errors still occur. I saw some reports that it could be related to accessing the Unraid UI from Safari (can't find that thread now), but I've only ever accessed the system from Chrome or Firefox on Windows, Mac, Linux, and Android.

Anyone have any ideas what's going on or what else I can check? I thought maybe it was a plugin, but the issue occurred twice in the first day upon running in safe mode, so I guess that rules out plugins?

20200806 logs.zip alexandria-diagnostics-20200806-0515.zip
