[Solved] 6.5.0 nginx error messages



SOLVED: The problem was specific to the Safari web browser and was fixed in unRAID 6.5.1-rc6.

 

I'm getting a series of nginx-related error messages in my syslog that I've never seen before:

Mar 25 19:23:49 Northolt nginx: 2018/03/25 19:23:49 [crit] 7004#7004: ngx_slab_alloc() failed: no memory
Mar 25 19:23:49 Northolt nginx: 2018/03/25 19:23:49 [error] 7004#7004: shpool alloc failed
Mar 25 19:23:49 Northolt nginx: 2018/03/25 19:23:49 [error] 7004#7004: nchan: Out of shared memory while allocating message of size 293. Increase nchan_max_reserved_memory.
Mar 25 19:23:49 Northolt nginx: 2018/03/25 19:23:49 [error] 7004#7004: *224281 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/cpuload?buffer_length=1 HTTP/1.1", host: "localhost"
Mar 25 19:23:49 Northolt nginx: 2018/03/25 19:23:49 [error] 7004#7004: MEMSTORE:00: can't create shared message for channel /cpuload
Mar 25 19:31:26 Northolt bunker: Verify task for disk3 finished, duration: 13 hr, 31 min, 25 sec.

They are repeated over and over and began during a scheduled run of the Dynamix File Integrity plugin to check the contents of disk3. Since the end of that task they have stopped appearing. It looks as though something is failing to allocate memory.

 

northolt-diagnostics-20180325-1948.zip

 

EDIT: I found this from two years ago but it was supposedly fixed.

Edited by John_M
Found a reference, marked as solved
Link to comment
  • John_M changed the title to 6.5.0 nginx error messages
25 minutes ago, shaunsund said:

I'll post my diags after OOM messages. I like nginx, but the memory handling/configuration seems to need some work.

 

I don't see any similarities with my case at all. You had an OOM - it looks as though python triggered it - and nginx got reaped. I'm seeing nginx/nchan allocation failures and not a single OOM. There are things you can try to avoid the OOMs, or you might want to start your own thread.

Link to comment

I'm now seeing this on my other two servers during the monthly parity check:

Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [crit] 13132#13132: ngx_slab_alloc() failed: no memory
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: shpool alloc failed
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: nchan: Out of shared memory while allocating message of size 3181. Increase nchan_max_reserved_memory.
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: *280770 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/var?buffer_length=1 HTTP/1.1", host: "localhost"
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: MEMSTORE:00: can't create shared message for channel /var
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [crit] 13132#13132: ngx_slab_alloc() failed: no memory
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: shpool alloc failed
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: nchan: Out of shared memory while allocating message of size 8911. Increase nchan_max_reserved_memory.
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: *280771 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Apr  1 14:42:03 Mandaue nginx: 2018/04/01 14:42:03 [error] 13132#13132: MEMSTORE:00: can't create shared message for channel /disks

and

Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [crit] 7019#7019: ngx_slab_alloc() failed: no memory
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: shpool alloc failed
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: nchan: Out of shared memory while allocating message of size 3177. Increase nchan_max_reserved_memory.
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: *1328700 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/var?buffer_length=1 HTTP/1.1", host: "localhost"
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: MEMSTORE:00: can't create shared message for channel /var
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [crit] 7019#7019: ngx_slab_alloc() failed: no memory
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: shpool alloc failed
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: nchan: Out of shared memory while allocating message of size 5178. Increase nchan_max_reserved_memory.
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: *1328701 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Apr  1 10:02:49 Lapulapu nginx: 2018/04/01 10:02:49 [error] 7019#7019: MEMSTORE:00: can't create shared message for channel /disks

It seems to be associated with periods of heavy disk activity. Am I really the only one seeing this? Diagnostics in OP.

 

Link to comment

Just to confirm, the third server is doing the same during its parity check:

Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [crit] 7004#7004: ngx_slab_alloc() failed: no memory
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: shpool alloc failed
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: nchan: Out of shared memory while allocating message of size 3181. Increase nchan_max_reserved_memory.
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: *1542554 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/var?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: MEMSTORE:00: can't create shared message for channel /var
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [crit] 7004#7004: ngx_slab_alloc() failed: no memory
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: shpool alloc failed
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: nchan: Out of shared memory while allocating message of size 5233. Increase nchan_max_reserved_memory.
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: *1542555 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Apr 1 14:46:15 Northolt nginx: 2018/04/01 14:46:15 [error] 7004#7004: MEMSTORE:00: can't create shared message for channel /disks

In each case this is while the parity check is in progress. Does anyone know what "channel /disks" and "channel /var" refer to? In my OP it was "channel /cpuload".

 

Edited by John_M
Added the fact that it's during parity check
Link to comment

Would anyone be prepared to check whether they are seeing similar entries in their syslog, please? I'm using 6.5.0 on two servers and 6.5.1-rc3 on the third and the error messages seem to appear during periods of high disk activity - today during a monthly parity check and last Sunday during a scheduled Dynamix File Integrity scan - and they repeat over and over.

 

Link to comment

Inspired by this thread

I updated the server that first showed the error to version 6.5.1-rc3 and restarted in Safe Mode. I then ran a non-correcting parity check and waited. After a while the errors started appearing in the syslog. I was beginning to think that it was caused by the SNMP plugin but that doesn't seem to be the case. I have no dockers running and VMs are disabled. The nginx worker process is using around 800 MB of RAM.
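For anyone wanting to check the same figure on their own server, the worker memory can be read with a generic procps invocation (nothing unRAID-specific here; RSS is reported in KiB):

```shell
# Show PID, resident set size (KiB) and command name for all nginx processes
ps -C nginx -o pid=,rss=,comm=
```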

 

northolt-diagnostics-20180402-0207.zip

 

Link to comment

The parity check I started last night in Safe Mode completed successfully but my syslog is full of nginx-related errors just like the ones above. So the issue is not related to a plugin. It isn't related to a docker either - I had the docker service enabled but no dockers running. I have no VMs on this server and the VM service is permanently disabled. I've attached the diagnostics taken just after the parity check completed with zero errors.

 

northolt-diagnostics-20180402-1258.zip

 

Link to comment
1 hour ago, John_M said:

I updated one server to 6.5.1-rc4 in the hope it would fix the problem but it didn't.

 

There are three topics going for this same issue & we have been trying to reproduce without success. Seems like you can reproduce this issue, right?

If so, please try adding this line in your 'go' file just before emhttp is started:

sed -i 's/$arg_buffer_length/1/g' /etc/rc.d/rc.nginx

Then reboot (sorry) and see if the issue goes away.

Link to comment
4 minutes ago, limetech said:

 

There are three topics going for this same issue & we have been trying to reproduce without success. Seems like you can reproduce this issue, right?

If so, please try adding this line in your 'go' file just before emhttp is started:


sed -i 's/$arg_buffer_length/1/g' /etc/rc.d/rc.nginx

Then reboot (sorry) and see if the issue goes away.

 

Thanks for the reply, Tom. I've searched and haven't found any other threads about this specific error spamming the syslog. It might well be linked with the problem some people are seeing where nginx uses an excessive amount (several gigabytes) of RAM but that isn't what I'm seeing. Nor has anyone else confirmed that they are seeing my particular problem.

 

I can indeed reproduce my issue simply by running a parity check and waiting. So thanks for the suggestion. I'll edit my go file and reboot, start a non-correcting parity check and report back tomorrow. Thanks again.
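For anyone following along, the suggested sed simply replaces the literal string $arg_buffer_length with 1 wherever it appears in rc.nginx. A quick demonstration on a made-up sample line (not the actual rc.nginx contents):

```shell
# sed treats a mid-pattern "$" as a literal character, so this replaces
# every occurrence of the text "$arg_buffer_length" with "1".
# The sample directive below is invented for illustration only.
echo 'push_to "/pub/cpuload?buffer_length=$arg_buffer_length";' \
  | sed 's/$arg_buffer_length/1/g'
# prints: push_to "/pub/cpuload?buffer_length=1";
```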

Link to comment

@limetech The modification to the go file did not fix the problem. Errors from nginx started to spam the syslog about two hours after the start of a parity check. This time I'm getting three similar sets of five messages per second relating to the "channels" /var, /disks and /cpuload and the syslog rapidly fills up:

Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [crit] 7210#7210: ngx_slab_alloc() failed: no memory
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: shpool alloc failed
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: nchan: Out of shared memory while allocating message of size 3181. Increase nchan_max_reserved_memory.
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: *31760 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/var?buffer_length=1 HTTP/1.1", host: "localhost"
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: MEMSTORE:00: can't create shared message for channel /var
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [crit] 7210#7210: ngx_slab_alloc() failed: no memory
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: shpool alloc failed
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: nchan: Out of shared memory while allocating message of size 8888. Increase nchan_max_reserved_memory.
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: *31761 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: MEMSTORE:00: can't create shared message for channel /disks
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [crit] 7210#7210: ngx_slab_alloc() failed: no memory
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: shpool alloc failed
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: nchan: Out of shared memory while allocating message of size 339. Increase nchan_max_reserved_memory.
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: *31762 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/cpuload?buffer_length=1 HTTP/1.1", host: "localhost"
Apr  8 07:04:42 Mandaue nginx: 2018/04/08 07:04:42 [error] 7210#7210: MEMSTORE:00: can't create shared message for channel /cpuload
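
At that rate the syslog growth is easy to estimate. A back-of-envelope calculation, assuming an average of ~150 bytes per log line (the lines above range from roughly 100 to 250 bytes):

```shell
# 3 channels x 5 lines each per second, ~150 bytes/line (assumed average)
lines_per_sec=$((3 * 5))
bytes_per_hour=$((lines_per_sec * 150 * 3600))
echo "approx $((bytes_per_hour / 1024 / 1024)) MiB of syslog per hour"
# prints: approx 7 MiB of syslog per hour
```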

I'll keep one server on 6.5.1-rc4 for experimenting with but I'm going to have to move the others back to 6.4.1. Is there anything else I could try? I have already tried Safe Mode.

 

Diagnostics grabbed during parity check. mandaue-diagnostics-20180408-1035.zip

 

Link to comment

I've tried searching the forum and the only reports I can find that mention these error messages (apart from a couple of false alarms) are written by me. It might be related to the high memory usage issue, but one person experiencing that issue has confirmed that he isn't seeing this one.

 

I've tried Googling the error messages but I'm only seeing reports from 3 to 5 years ago. For example, a search for "ngx_slab_alloc() failed: no memory" reveals this from 2013 and this from 2016. What is interesting is that the same guy, Wandenberg, answers both questions, and in 2013 he said that there was a known issue with the nginx code, which surely must have been fixed by now. As part of his reply he gives a link but I really don't have a clue what they are talking about.

 

I have the issue on all three servers and it's 100% repeatable - just run a parity check and stand back - it takes a couple of hours for the error messages to start. That's the case for normal boot and safe mode, with no Dockers running and no VMs. I first saw the problem with unRAID 6.5.0 and I've seen it with 6.5.1-rc3 and now -rc4. I don't believe I can be alone but so far nobody has confirmed that they have the problem too. If you don't check your syslog you might well not notice as the system continues to function, though it certainly affects performance.

Link to comment
7 hours ago, John_M said:

I have the issue on all three servers and it's 100% repeatable - just run a parity check and stand back - it takes a couple of hours for the error messages to start.

 

Please try 6.5.1-rc5 on the next branch.  This brings nginx and nchan up to their latest versions which will be necessary in order to correspond with the developers to help solve this problem.  What would also be most helpful, if it doesn't make a difference in repeatability, is to run in safe mode just to be as vanilla as possible.

 

I think this issue is related to the other nginx/nchan issues being reported, just manifesting a little differently.  Thx for your help!

Link to comment
3 minutes ago, limetech said:

 

Please try 6.5.1-rc5 on the next branch.  This brings nginx and nchan up to their latest versions which will be necessary in order to correspond with the developers to help solve this problem.  What would also be most helpful, if it doesn't make a difference in repeatability, is to run in safe mode just to be as vanilla as possible.

 

I think this issue is related to the other nginx/nchan issues being reported, just manifesting a little differently.  Thx for your help!

 

Thanks Tom. I'm downloading it now. I'll restart in safe mode with no dockers or VMs and run a parity check again.

Link to comment
3 hours ago, limetech said:

Right, you can get rid of that.  It should have had absolutely no effect, just wanted to confirm that.

 

Thanks. I'll delete the line completely before my next reboot.

 

Parity check has been running for 4 hours 40 minutes and the nginx error messages have not appeared this time. This is a definite improvement. I'll let it run its course and see how it stands tomorrow.

Link to comment

A different server, also running 6.5.1-rc5 but in normal mode with plugins, two dockers and a lightweight Observium VM running, produced the same nginx errors during a parity check, though it took longer for the messages to start appearing than with previous unRAID versions.

 

Servers "lapulapu" and "mandaue" have similar hardware specs: consumer Asus motherboards, AMD A88X chipsets, socket FM2/FM2+ processors, 16 GB RAM, Dell H310 HBAs.

 

Normal boot mode diagnostics after completion: lapulapu-diagnostics-20180409-1235.zip

Link to comment
  • John_M changed the title to [Solved] 6.5.0 nginx error messages
