nginx running out of shared memory



1046480893_Screenshot2023-03-10104843.thumb.png.67bd1dc36d2093026bb0c941f50c74e8.png

 

- Saw these errors, so I ran "/etc/rc.d/rc.nginx restart".

- The connection to the system went down (as expected). It took about 15 minutes to come back online on My Server, but it still wasn't reachable.
- I VPN'd to the router but still couldn't reach the WebUI via the LAN IP.

- Ping returned a reply, so I SSH'd into the server and ran 'sudo reboot'.

 

This is especially scary when you're working in another country 👀 

Link to comment

I just had the same thing happen to me for the first time. I stopped the array, ran rc.nginx stop, waited for the service and API to shut down, then ran a start. For the time being I'm not getting the error anymore. I did have the red triangle (!) on the myserver; I disabled it, went to work, and when I came home it had reconnected to the API. I restarted it, all seemed well, and it did a flash backup.

 

And yes, everything is updated.

Edited by MyKroFt
Link to comment
  • 1 month later...
1 hour ago, FriarTuck said:

The same thing is happening to me. UnRAID used to be so solid... I've tried most things and can't fix this. I wanted to leave this running unattended for a few months when I travel, but this will no longer be possible.

If you haven't already, I would recommend installing "CA Auto Update Applications" and setting it to check for updates for every container and plugin daily. Then reinstall the MyServer/Unraid Connect plugin (with a reboot in between).

LMK how that works for you.

Link to comment
  • 1 month later...

Last month I updated from 6.9.2 (after staying on it for ages) to 6.11.5 then updated to 6.12.0 then 6.12.1.

 

Since this time I have had ongoing issues with massive spikes in CPU usage. Usually 100% and it cannot recover. I have updated plugins, read 100 forum threads, changed settings within Tips & Tweaks (vm.dirty_background_ratio to 3, vm.dirty_ratio to 6, changed the CPU scaling governor to performance), changed my MB & CPU, changed SATA card, removed a HDD, pinned all dockers to various CPU cores, removed unraid access to a few cores, among various other things suggested in these forums... Still no fix.

 

Today I ran the "/etc/rc.d/rc.nginx restart" command. It stopped the CPU from spiking, then maxed out my RAM?! Then the server froze after that and I had to hard reboot.

 

Generally these spikes occur about once per day. Diagnostics attached. 

 

Does anyone have any other ideas?

 

Example prior to CPU isolation:

 

2023-06-29.screenshot.jpg

 

Example after CPU isolation:

 

2023-07-01.screenshot.jpg

 

What HTOP looks like on one occasion:

 

2023-06-29.screenshot (1).jpg

 

Old CPU:

 

920754983_2023-06-20.screenshot(4).jpg.1dbf1ba61bc2a32733c55e3614e3f9c4.jpg

 

spaldounraid-diagnostics-20230701-1754.zip

Edited by DrSpaldo
Link to comment
  • 2 weeks later...

On 6.12.2 and similar error:

Jul 13 17:34:17 kasumi nginx: 2023/07/13 17:34:17 [crit] 35027#35027: ngx_slab_alloc() failed: no memory
Jul 13 17:34:17 kasumi nginx: 2023/07/13 17:34:17 [error] 35027#35027: shpool alloc failed
Jul 13 17:34:17 kasumi nginx: 2023/07/13 17:34:17 [error] 35027#35027: nchan: Out of shared memory while allocating message of size 233. Increase nchan_max_reserved_memory.
Jul 13 17:34:17 kasumi nginx: 2023/07/13 17:34:17 [error] 35027#35027: *5835100 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/wireguard?buffer_length=1 HTTP/1.1", host: "localhost"
Jul 13 17:34:17 kasumi nginx: 2023/07/13 17:34:17 [error] 35027#35027: MEMSTORE:01: can't create shared message for channel /wireguard

Restarted nginx but no luck, will restart the OS next.
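For anyone trying to work out which dashboard channel is hitting the limit: the final MEMSTORE line in the excerpt above names it (/wireguard in this case). Below is a rough sketch for tallying those failures per channel. The function name count_nchan_failures is made up for illustration, and the default path /var/log/syslog is an assumption about where these messages land on your system.

```shell
#!/bin/bash
# Tally nchan "can't create shared message" failures per channel.
# Hypothetical helper; the default log path is an assumption.
count_nchan_failures() {
    local log="${1:-/var/log/syslog}"
    # Keep only the MEMSTORE failure lines, strip everything up to the
    # channel name, then count occurrences per channel (most frequent first).
    grep -F "can't create shared message for channel" "$log" \
        | sed 's/.*for channel //' \
        | sort | uniq -c | sort -rn
}
```

Running `count_nchan_failures /var/log/syslog` prints one line per channel with its failure count. It only summarises what is already in the log; it does not explain why the shared memory filled up.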

I also checked whether Nginx Proxy Manager might be the cause, but the /pub location makes me think a browser on another computer is still logged into the dashboard and is causing this issue.

 

Could we get a list of remote connections to the dashboard and shut them down manually?
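One possible way to at least see who is connected, sketched under a few assumptions: the WebUI is listening on ports 80 and 443 (adjust if yours differ), `ss` from iproute2 is available on the server, and the function names below are invented for this example. It only lists peers; it does not close anything, and clients coming through a reverse proxy will show up as the proxy's address.

```shell
#!/bin/bash
# Summarise peer addresses from `ss -Htn` output read on stdin.
# The peer address:port is the last column; strip the port and count.
summarise_peers() {
    awk 'NF {print $NF}' \
        | sed 's/:[0-9]*$//' \
        | sort | uniq -c | sort -rn
}

# List clients with established TCP connections to the web UI.
# Ports 80/443 are an assumption; change them to match your setup.
list_webui_clients() {
    ss -Htn state established '( sport = :80 or sport = :443 )' | summarise_peers
}
```

Calling `list_webui_clients` prints a connection count next to each client address, which should be enough to track down a forgotten browser tab on another machine.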

Edited by martial
typo fix
Link to comment
14 hours ago, FlyingTexan said:

My advice is to get rid of NGINX and do a cloudflare zero trust tunnel.  Removes another potential point of failure and was very easy to set up. Made my life much easier. 

 

 

 

I think we need to buy a domain name ???

Link to comment
On 7/13/2023 at 5:59 PM, FlyingTexan said:

My advice is to get rid of NGINX and do a cloudflare zero trust tunnel.  Removes another potential point of failure and was very easy to set up. Made my life much easier. 

 

 

@FlyingTexan Which of the "cloudflared" version in "apps" do you recommend?
I ended up using "CloudflaredTunnel" which uses the "cloudflare/cloudflared" container. Thanks for the recommendation.

Edited by martial
update content
Link to comment
  • 3 weeks later...

Started happening to me a few days ago.  Sucks.  Restarting the server sorts things out, but I cannot otherwise find the cause, and it keeps coming back.  The log fills with the following, repeated:

 

Aug  1 00:10:53 BigChief nginx: 2023/08/01 00:10:53 [crit] 7226#7226: ngx_slab_alloc() failed: no memory
Aug  1 00:10:53 BigChief nginx: 2023/08/01 00:10:53 [error] 7226#7226: shpool alloc failed
Aug  1 00:10:53 BigChief nginx: 2023/08/01 00:10:53 [error] 7226#7226: nchan: Out of shared memory while allocating message of size 15979. Increase nchan_max_reserved_memory.
Aug  1 00:10:53 BigChief nginx: 2023/08/01 00:10:53 [error] 7226#7226: *895289 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Aug  1 00:10:53 BigChief nginx: 2023/08/01 00:10:53 [error] 7226#7226: MEMSTORE:00: can't create shared message for channel /disks

 

Link to comment

Reporting that I encountered this for the first time today on 6.12.3.

 

It manifests as empty pages on the Array, Pool and Boot device tabs.

 

The log is just filled with this:

 

Aug 8 13:45:08 UNRAID nginx: 2023/08/08 13:45:08 [error] 9990#9990: nchan: Out of shared memory while allocating message of size 19386. Increase nchan_max_reserved_memory.
Aug 8 13:45:08 UNRAID nginx: 2023/08/08 13:45:08 [error] 9990#9990: *2742737 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Aug 8 13:45:08 UNRAID nginx: 2023/08/08 13:45:08 [error] 9990#9990: MEMSTORE:01: can't create shared message for channel /disks
Aug 8 13:45:08 UNRAID nginx: 2023/08/08 13:45:08 [crit] 9990#9990: ngx_slab_alloc() failed: no memory
Aug 8 13:45:08 UNRAID nginx: 2023/08/08 13:45:08 [error] 9990#9990: shpool alloc failed

 

Link to comment

Same issue here... log is filling up with:

 

Aug  8 21:16:07 MyServer nginx: 2023/08/09 06:16:07 [crit] 16233#16233: ngx_slab_alloc() failed: no memory
Aug  8 21:16:07 MyServer nginx: 2023/08/09 06:16:07 [error] 16233#16233: shpool alloc failed
Aug  8 21:16:07 MyServer nginx: 2023/08/09 06:16:07 [error] 16233#16233: nchan: Out of shared memory while allocating message of size 8894. Increase nchan_max_reserved_memory.
Aug  8 21:16:07 MyServer nginx: 2023/08/09 06:16:07 [error] 16233#16233: *1133944 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/devices?buffer_length=1 HTTP/1.1", host: "localhost"
Aug  8 21:16:07 MyServer nginx: 2023/08/09 06:16:07 [error] 16233#16233: MEMSTORE:01: can't create shared message for channel /devices

 

Link to comment
  • 4 weeks later...

I'm seeing these errors for the first time (or noticing for first time) on two different servers running 6.12.3 and 6.12.4.  I haven't noticed any negative impact though, but I do have GUI tabs and browser terminal sessions open to both servers.  Is the thought that leaving the terminal session open is what causes this error?  Or a tab to the Unraid GUI?  Or both?

Link to comment
4 hours ago, dboonthego said:

I'm seeing these errors for the first time (or noticing for first time) on two different servers running 6.12.3 and 6.12.4.  I haven't noticed any negative impact though, but I do have GUI tabs and browser terminal sessions open to both servers.  Is the thought that leaving the terminal session open is what causes this error?  Or a tab to the Unraid GUI?  Or both?

try changing macvlan to ipvlan and do the above posted cloudflare zerotrust tunnel.

Link to comment

I just started seeing this error in my logs too. I upgraded to 6.12.4 yesterday. I usually keep a Firefox tab open with my Dashboard displayed, but for the past day I've had a tab open in Edge instead. When I went to look at it after it had sat for a few hours, all the animations on the dashboard (CPU/RAM/temps, etc.) started going super fast, as if they were all fast-forwarding to catch up with the current time. Then they all returned to normal speed. When I looked in the logs (checking for other issues), I found all these entries, which coincide with the time I was looking.

 

Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [crit] 12634#12634: ngx_slab_alloc() failed: no memory
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: shpool alloc failed
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: nchan: Out of shared memory while allocating message of size 28129. Increase nchan_max_reserved_memory.
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: *351766 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/devices?buffer_length=1 HTTP/1.1", host: "localhost"
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: MEMSTORE:01: can't create shared message for channel /devices
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [crit] 12634#12634: ngx_slab_alloc() failed: no memory
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: shpool alloc failed
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: nchan: Out of shared memory while allocating message of size 16811. Increase nchan_max_reserved_memory.
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: *351769 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Sep 10 01:39:22 Valaskjalf nginx: 2023/09/10 01:39:22 [error] 12634#12634: MEMSTORE:01: can't create shared message for channel /disks
Sep 10 01:39:23 Valaskjalf nginx: 2023/09/10 01:39:23 [crit] 12634#12634: ngx_slab_alloc() failed: no memory
Sep 10 01:39:23 Valaskjalf nginx: 2023/09/10 01:39:23 [error] 12634#12634: shpool alloc failed
Sep 10 01:39:23 Valaskjalf nginx: 2023/09/10 01:39:23 [error] 12634#12634: nchan: Out of shared memory while allocating message of size 28129. Increase nchan_max_reserved_memory.
Sep 10 01:39:23 Valaskjalf nginx: 2023/09/10 01:39:23 [error] 12634#12634: *351776 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/devices?buffer_length=1 HTTP/1.1", host: "localhost"

 

valaskjalf-diagnostics-20230910-0222.zip

Link to comment

I too have a tab always open in Firefox to my dashboard, but I have been seeing less of my logs filling lately (logs still at 14% after 16 days -- awaiting a few more days to see the list of issues on 6.12.4)
 

The two things I did were to use Cloudflare tunnels to reach my dashboard (Cloudflare Zero Trust, with a 6-digit PIN sent to a selected email address to give me access to the dashboard and selected services) instead of accessing it through Nginx Proxy Manager (NPM), and to increase the memory allocated to NPM in the extra parameters: 

--memory=4G

from 1G

 

Hopefully others can reproduce this

Link to comment

Me too.  I just saw these - my system was recently upgraded to 6.12.4.  I don't remember ever seeing these under 6.11.5; if I did, they didn't cause problems, as my system could go for months without issues.  Not sure it is causing any major problems in 6.12.4 - at least not yet, other than filling the logs.

Link to comment

Seeing this problem now. Similar to the last post, I was also fine running 6.12.3 but updated to 6.12.4 a few weeks ago. 

 

Ran this:

> /var/log/syslog
/etc/rc.d/rc.syslog stop
/etc/rc.d/rc.syslog start

It cleared the log file and the web console log is working. 

 

But I noticed that the Log utilization bar on the Dashboard isn't reflecting the newly emptied syslog. 

image.png.81c063d2b492e3d7345ef50f1479be7d.png
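A possible explanation, offered as an assumption rather than something confirmed in this thread: if the Dashboard bar tracks the whole /var/log filesystem rather than the syslog file alone, other files in that directory could still be holding the space. A small sketch to check; the function name log_usage is made up, and /var/log as the default is an assumption.

```shell
#!/bin/bash
# Show how full the log filesystem is and which entries use the most space.
# Hypothetical helper; the default directory is an assumption.
log_usage() {
    local dir="${1:-/var/log}"
    # Overall usage of the filesystem that holds the directory.
    df -h "$dir"
    # Every file and subdirectory with its size in KiB, largest last.
    du -ak "$dir" 2>/dev/null | sort -n | tail -n 10
}
```

Running `log_usage` with no arguments inspects /var/log. If something other than syslog is at the bottom of the list, that would account for the bar not moving after the syslog was cleared.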

 

uptime:

06:54:26 up 6 days, 18:53,  1 user,  load average: 2.18, 2.11, 2.17

 

 

Link to comment
