[Solved] 6.5.0 nginx error messages

John_M · April 9, 2018

And a third server also running 6.5.1-rc5 in normal mode with plugins but no dockers and no VM service also started producing nginx errors several hours into a parity check.

Server "northolt" has completely different hardware from the other two. It's an HP Microserver Gen8 with Intel Celeron processor.

Normal boot mode diagnostics after completion: northolt-diagnostics-20180409-1547.zip

eschultz · April 9, 2018

@John_M When you run your parity-checks, do you leave the browser open on the Main page the whole time? If so, which browser and version are you using?

I have four 6.5.0 (or newer) machines here running a parity-check and trying to reproduce the issue you're seeing. About 20min in so far and htop isn't showing any nginx memory growth yet. I have Firefox browser tabs open to each of the servers.

John_M · April 9, 2018

1 minute ago, eschultz said:

@John_M When you run your parity-checks, do you leave the browser open on the Main page the whole time? If so, which browser and version are you using?

I have four 6.5.0 (or newer) machines here running a parity-check and trying to reproduce the issue you're seeing. About 20min in so far and htop isn't showing any nginx memory growth yet. I have Firefox browser tabs open to each of the servers.

I do leave a browser open all the time, but for these tests it has spent most of its time on the Dashboard page. I'm using Safari (I know it isn't your favourite browser!) version 11.1 (12605.1.33.1.3) on macOS 10.12.6. I can try either Firefox or Chrome if you'd like me to, or I could start more parity checks, close the windows and reopen them tomorrow. Or I could use Windows 10 and either Edge or Firefox - anything to be of assistance.

I don't see quite the huge increase in nginx memory usage that some people are seeing, but I've just checked my three servers (now just sitting idle - I haven't rebooted) and nginx is using 3.3 GB, 4.3 GB and 6.1 GB on lapulapu, mandaue and northolt, respectively. L and M are 16 GB machines and N has 8 GB, so the usage is significant in N's case.
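For anyone wanting to compare figures like these across servers, here is a minimal sketch that totals the resident memory of all nginx processes from 'ps' output. The function name 'sum_nginx_rss_kib' is made up for illustration. RSS counts shared pages once per process, so the total may overstate real usage when the master and workers map the same shared memory zone.

```shell
#!/bin/bash
# Sum the resident set size (KiB) of every process whose command name is
# exactly "nginx". Expects two columns on stdin: RSS in KiB, command name.
# Hypothetical helper, not part of unRAID.
sum_nginx_rss_kib() {
    awk '$2 == "nginx" { total += $1 } END { print total + 0 }'
}

# Typical use on a live system (not run here):
#   ps -e -o rss= -o comm= | sum_nginx_rss_kib
```

Running it every few minutes during a parity check and noting the values should show whether usage is climbing steadily or staying flat.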

Since changing to 6.5.1-rc5 it takes several hours for the error messages to start but when they do they come in at intervals of one second, if that helps.

eschultz · April 10, 2018

7 hours ago, John_M said:

I do leave a browser open all the time, but for these tests it has spent most of its time on the Dashboard page. I'm using Safari (I know it isn't your favourite browser!) version 11.1 (12605.1.33.1.3) on macOS 10.12.6. I can try either Firefox or Chrome if you'd like me to, or I could start more parity checks, close the windows and reopen them tomorrow. Or I could use Windows 10 and either Edge or Firefox - anything to be of assistance.

I don't see quite the huge increase in nginx memory usage that some people are seeing, but I've just checked my three servers (now just sitting idle - I haven't rebooted) and nginx is using 3.3 GB, 4.3 GB and 6.1 GB on lapulapu, mandaue and northolt, respectively. L and M are 16 GB machines and N has 8 GB, so the usage is significant in N's case.

Since changing to 6.5.1-rc5 it takes several hours for the error messages to start but when they do they come in at intervals of one second, if that helps.

I think I can reproduce it after a while with Safari (I'm also on macOS 10.12.6). Once I completely close Safari from the Dock, the memory seems to free up... meanwhile Firefox still has the Dashboard page open in a tab.

I'm messing around with 'nchan_stub_status' to see what's happening and it looks like the Stored Messages stays at 3 for a while but then begins to increase after 30-40min but only when Safari is connected it seems. I'm trying with Chrome now to see if it exhibits the same behavior as Safari.
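A rough sketch of how the Stored Messages counter could be watched over time is below. It assumes the plain-text "key: value" output that 'nchan_stub_status' produces and that the status location is reachable over HTTP. The function names and the URL in the usage comment are hypothetical, since the thread does not say where the status endpoint was exposed.

```shell
#!/bin/bash
# Print the value of the "stored messages" line from nchan_stub_status
# text supplied on stdin.
stored_messages() {
    awk -F': *' '$1 == "stored messages" { print $2 }'
}

# Poll a status URL and print one timestamped reading per interval.
# $1 = URL of a location with nchan_stub_status enabled (hypothetical)
# $2 = seconds between readings (defaults to 60)
watch_stored_messages() {
    local url="$1" interval="${2:-60}"
    while true; do
        printf '%s stored=%s\n' "$(date '+%F %T')" \
            "$(curl -s "$url" | stored_messages)"
        sleep "$interval"
    done
}

# Example (hypothetical URL, not run here):
#   watch_stored_messages http://localhost/nchan_stub_status 60
```

A count that stays flat with one browser and starts growing only when another connects would match the behaviour described above.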

When you get a chance, and are able, could you give Firefox a shot? You might first need to close the unRAID tabs in Safari, then fully close Safari to get a clean state.

I see two issues for nchan possibly related but I'd like to determine which conditions (Safari?) are causing this first before chiming in there:

https://github.com/slact/nchan/issues/413

https://github.com/slact/nchan/issues/445

John_M · April 10, 2018

8 hours ago, eschultz said:

When you get a chance, and are able, could you give Firefox a shot? You might first need to close the unRAID tabs in Safari, then fully close Safari to get a clean state.

Thanks Eric. I'm glad you've been able to reproduce the problem. I shut down server M and quit Safari completely (CMD-Q). Then I restarted the server in Safe Mode and opened a Firefox browser tab on it, started the array, and kicked off a non-correcting parity check. In the first few minutes I stayed on the Main tab and checked that the disks were being read at about the expected speed (so I was looking at MB/s, not read and write counts - in case this makes any difference). Then switched to the Tools page to check on the syslog. Then switched to the Dashboard page where I'll leave it alone, with the browser open for the rest of the day. Dashboard is reporting Memory Usage at 7% at the moment. I'll report back later.

John_M · April 11, 2018

21 hours ago, eschultz said:

When you get a chance, and are able, could you give Firefox a shot? You might first need to close the unRAID tabs in Safari, then fully close Safari to get a clean state.

The parity check completed without any nginx error messages appearing in the syslog, when using Firefox browser. Memory usage is now 8% as reported by the Dashboard. I'm going to stay with Firefox for the time being but if you'd like me to do any further testing I'm willing.

Safe Mode diagnostics (Firefox browser) after completion: mandaue-diagnostics-20180411-0131.zip

eschultz · April 11, 2018

38 minutes ago, John_M said:

The parity check completed without any nginx error messages appearing in the syslog, when using Firefox browser. Memory usage is now 8% as reported by the Dashboard. I'm going to stay with Firefox for the time being but if you'd like me to do any further testing I'm willing.

Thanks for testing. I suspect you can boot back into normal mode and start using VMs and Docker apps again, since it seems to be an issue related only to Safari (most of the machines I was using to test nginx memory usage had active VMs and Docker apps running).

I can also confirm Chrome works fine too -- no memory increases throughout a parity-check with Chrome. Now on to see what's special about Safari...

John_M · April 11, 2018

5 minutes ago, eschultz said:

Thanks for testing. I suspect you can boot back into normal mode and start using VMs and Docker apps again, since it seems to be an issue related only to Safari (most of the machines I was using to test nginx memory usage had active VMs and Docker apps running).

I can also confirm Chrome works fine too -- no memory increases throughout a parity-check with Chrome. Now on to see what's special about Safari...

Many thanks, Eric. I'll reboot and use things as normal but stick with Firefox for the time being. I rather like the ability to use the built-in terminal which Safari doesn't allow.

John_M · April 16, 2018

I updated to unRAID 6.5.1-rc6, re-booted to normal mode and set another parity check running overnight. This time I used the Safari browser and left a window open on the Dashboard page of the GUI. No nginx error messages were reported. I'll mark the thread as solved.

Normal mode diagnostics (Safari browser) after completion: northolt-diagnostics-20180416-1301.zip

limetech · April 17, 2018

On 4/16/2018 at 5:45 AM, John_M said:

No nginx error messages were reported. I'll mark the thread as solved.

Thank you for all your help with this. Safari seems to do things differently than the other major browsers, which confuses nginx/nchan, and it does not support websockets at all if Basic authentication is in use. We're working on a better long-term solution for the latter issue, but for now it's probably best to avoid Safari if possible.

Ascii227 · December 9, 2020

Hi,

Apologies to necro this thread but it is the only one on the forum about this specific issue. Please let me know if it is more appropriate to create a new thread, but I thought as it is related it should stay here.

Last night I put a new disk into my server and started the disk clear. This morning when I came to the server the syslog is full of errors identical to the OP:

Dec 9 08:11:31 AsQ-NAS nginx: 2020/12/09 08:11:31 [crit] 4885#4885: ngx_slab_alloc() failed: no memory
Dec 9 08:11:31 AsQ-NAS nginx: 2020/12/09 08:11:31 [error] 4885#4885: shpool alloc failed
Dec 9 08:11:31 AsQ-NAS nginx: 2020/12/09 08:11:31 [error] 4885#4885: nchan: Out of shared memory while allocating message of size 10229. Increase nchan_max_reserved_memory.
Dec 9 08:11:31 AsQ-NAS nginx: 2020/12/09 08:11:31 [error] 4885#4885: *97557 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Dec 9 08:11:31 AsQ-NAS nginx: 2020/12/09 08:11:31 [error] 4885#4885: MEMSTORE:00: can't create shared message for channel /disks
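To gauge how widespread errors like these are in a log, a small sketch such as the following could count the "can't create shared message" lines per nchan channel. The function name 'count_failed_channels' is invented for illustration, and the pattern is based only on the message text quoted in this thread.

```shell
#!/bin/bash
# Read syslog text on stdin and print "<channel> <count>" for every nchan
# channel that logged "can't create shared message".
count_failed_channels() {
    sed -n "s/.*can't create shared message for channel \(.*\)\$/\1/p" \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Typical use (not run here):
#   count_failed_channels < /var/log/syslog
```

If only one channel such as /disks appears, that suggests a single publisher is filling the shared memory rather than every page of the GUI.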

I followed this thread through and I can see it was an issue with Safari; however, I use Windows and Firefox. Also, I do not leave my browser open. All browsers and connections to the server were closed while the disk clear was running and the error messages occurred. I can't find anything else related to these nginx memory errors in my logs or on the forum.

I am using unRaid 6.8.3 and have had no other issues.

Diagnostics attached and taken while the new disk clear is still running, any advice would be much appreciated. Thanks very much.

asq-nas-diagnostics-20201209-0923.zip

Edited December 9, 2020 by Ascii227

frakman1 · January 4, 2021

UnRaid version: 6.8.3

I have a similar issue. I noticed that the log bar in Dashboard was at 100%.

I checked the syslog file and it had this error repeated forever across syslog and syslog.1 and syslog.2:

Jan  4 04:54:03 Tower nginx: 2021/01/04 04:54:03 [error] 3115#3115: MEMSTORE:00: can't create shared message for channel /disks
Jan  4 04:54:04 Tower nginx: 2021/01/04 04:54:04 [crit] 3115#3115: ngx_slab_alloc() failed: no memory
Jan  4 04:54:04 Tower nginx: 2021/01/04 04:54:04 [error] 3115#3115: shpool alloc failed
Jan  4 04:54:04 Tower nginx: 2021/01/04 04:54:04 [error] 3115#3115: nchan: Out of shared memory while allocating message of size 9693. Increase nchan_max_reserved_memory.
Jan  4 04:54:04 Tower nginx: 2021/01/04 04:54:04 [error] 3115#3115: *5463770 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"

I don't have scheduled parity enabled so it can't be related to that.

The output of df and du for that folder is:

root@Tower:~# df -h /var/log
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           128M  128M     0 100% /var/log
root@Tower:~# du -sm /var/log/*
1	/var/log/apcupsd.events
1	/var/log/apcupsd.events.1
1	/var/log/apcupsd.events.2
1	/var/log/apcupsd.events.3
1	/var/log/btmp
1	/var/log/btmp.1
0	/var/log/cron
0	/var/log/debug
1	/var/log/dmesg
1	/var/log/docker.log
1	/var/log/faillog
1	/var/log/lastlog
1	/var/log/libvirt
0	/var/log/maillog
0	/var/log/messages
0	/var/log/nfsd
55	/var/log/nginx
0	/var/log/packages
1	/var/log/pkgtools
0	/var/log/plugins
0	/var/log/removed_packages
0	/var/log/removed_scripts
1	/var/log/samba
0	/var/log/scripts
0	/var/log/secure
0	/var/log/setup
0	/var/log/spooler
0	/var/log/swtpm
1	/var/log/syslog
34	/var/log/syslog.1
34	/var/log/syslog.2
4	/var/log/wtmp
2	/var/log/wtmp.1
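The same check can be scripted. The sketch below pulls the Use% figure out of 'df' output and lists the largest entries from 'du' output. Both function names are hypothetical, and the parsing assumes the column layout shown in the listing above.

```shell
#!/bin/bash
# Print the Use% value (without the % sign) from the first data row of
# `df` output supplied on stdin.
log_usage_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print the N largest entries (default 5) from `du -sm` output on stdin.
largest_entries() {
    sort -rn | head -n "${1:-5}"
}

# Typical use (not run here):
#   df -h /var/log | log_usage_percent
#   du -sm /var/log/* | largest_entries 3
```

Applied to the listing above, the largest entries would be the nginx folder followed by the two rotated syslog files.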

I looked into the nginx folder and saw the same syslog error but also several repeated errors before it:

2021/01/02 15:35:48 [alert] 8095#8095: worker process 22309 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2021/01/02 15:35:50 [alert] 8095#8095: worker process 22319 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2021/01/02 15:35:52 [alert] 8095#8095: worker process 22402 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2021/01/02 15:35:54 [alert] 8095#8095: worker process 22410 exited on signal 6
2021/01/02 15:35:55 [crit] 22421#22421: ngx_slab_alloc() failed: no memory
2021/01/02 15:35:55 [error] 22421#22421: shpool alloc failed
2021/01/02 15:35:55 [error] 22421#22421: nchan: Out of shared memory while allocating message of size 9693. Increase nchan_max_reserved_memory.
2021/01/02 15:35:55 [error] 22421#22421: *4870623 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
2021/01/02 15:35:55 [error] 22421#22421: MEMSTORE:00: can't create shared message for channel /disks
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.

I have plenty of unused RAM so I'm not sure why it's complaining about memory.

Any ideas?

*** UPDATE ***

I created more room by deleting one of the error.log files in /var/log/nginx

I then ran these commands to restart the webserver components (but noticed that only the first command was needed to stop the deluge of error log messages):

/etc/rc.d/rc.nginx restart 
/etc/rc.d/rc.nginx reload 
/etc/rc.d/rc.php-fpm restart 
/etc/rc.d/rc.php-fpm reload
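The two steps that mattered (freeing space, then restarting nginx) could be wrapped in a small helper that defaults to a dry run, sketched below. The helper names and the DRY_RUN variable are invented for illustration, and the log path in the usage comment is only an example, since the post does not say which error.log file was deleted.

```shell
#!/bin/bash
# Run a command, or just print it when DRY_RUN is 1 (the default).
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'would run: %s\n' "$*"
    else
        "$@"
    fi
}

# Delete one log file to free space on /var/log, then restart nginx.
# $1 = path of the log file to remove (chosen by the caller)
recover_nginx() {
    run_step rm -f -- "$1"
    run_step /etc/rc.d/rc.nginx restart
}

# Example (hypothetical file name, prints the steps without running them):
#   recover_nginx /var/log/nginx/error.log.1
# To actually execute the steps:
#   DRY_RUN=0 recover_nginx /var/log/nginx/error.log.1
```

Restarting after the delete also makes nginx release any handle it still holds on the removed file, which is what actually returns the space to the tmpfs.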

Edited January 4, 2021 by frakman1
