nginx running out of shared memory

semtex41 · February 12

I am still running 6.12.4, and prior to today my uptime was over 2 months. Unfortunately, I have been working with the Unraid server webpage open and forgot to check the logs. The system locked up within 5 hours and I had to hard reboot.

Are people still seeing this issue on 6.12.6?

*** NOTE:
My uptime was only that stable because I change the server webpage tab to unraid.net when I am not actively using the web console.

Edited February 12 by semtex41
adding additional info

JohnnyGrey · March 14

Has anyone found a fix for this? This is happening in Unraid 6.12.8. I recently added an "on pool first start" script to increase the size of my /var/log folder to 512mb, since I have 64gb of RAM to use, so luckily it didn't crash, but I see easily hundreds of thousands of these errors in my logs.

I have noticed over the years that if the dashboard is left open, live polling eventually skyrockets to multiple updates per second. I wonder if this is what's causing it?

Edited March 14 by JohnnyGrey

kaares · March 15

I've given up on having the page open. I just have btop running in a terminal to keep an eye on it now

I didn't have a clean shutdown the whole first year I had the server running because of this bug.

martial · March 15

28 minutes ago, kaares said:

I've given up on having the page open. I just have btop running in a terminal to keep an eye on it now

I didn't have a clean shutdown the whole first year I had the server running because of this bug.

One thing I have noticed is that it sometimes happens when I access the Dashboard directly through NPM.

When I access that same dashboard through a Cloudflare tunnel, I have not seen it happen yet.

Varean · April 6

On 3/14/2024 at 6:22 PM, JohnnyGrey said:

Has anyone found a fix for this? This is happening in Unraid 6.12.8. I recently added an "on pool first start" script to increase the size of my /var/log folder to 512mb, since I have 64gb of RAM to use, so luckily it didn't crash, but I see easily hundreds of thousands of these errors in my logs.

I have noticed over the years that if the dashboard is left open, live polling eventually skyrockets to multiple updates per second. I wonder if this is what's causing it?

I just recently upgraded from 6.9.2 to 6.12.8 and started running into a similar issue.

The symptoms are basically that the system becomes unresponsive, and I can't even SSH into it or ping it. Today, after letting it sit for about 3 hours while it was doing a parity check, I tried to load the GUI and got a 404 nginx error; restarting the syslog service allowed me to load the page correctly. Once parity is finished, I am thinking of downgrading back to 6.9.2, since it was much more stable, to see if the issue persists.

Is there a way to monitor your log folder size? I am never able to see if that's getting too full before I just have an issue - and what script did you use to increase the size of it?

Edited April 6 by Varean
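
On the question of watching the log folder, here is a minimal sketch. It assumes /var/log is its own mount (on Unraid it is normally a RAM-backed tmpfs) and that `df -P` is available. The function names `log_usage_pct` and `check_log_usage` and the 80% threshold are made up for illustration, and this is not the script JohnnyGrey used.

```shell
#!/bin/bash
# Hypothetical helper: print how full the filesystem holding a directory is,
# as a bare integer percentage.
log_usage_pct() {
    # -P forces the POSIX one-line-per-filesystem format; field 5 is the use %.
    df -P "${1:-/var/log}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical check: report usage and return non-zero once it reaches a limit.
check_log_usage() {
    local dir="${1:-/var/log}" limit="${2:-80}" pct
    pct="$(log_usage_pct "$dir")"
    if [ "$pct" -ge "$limit" ]; then
        echo "WARNING: $dir is ${pct}% full"
        return 1
    fi
    echo "OK: $dir is ${pct}% full"
}

# Example: check_log_usage /var/log 80
# Assumption: because /var/log is a tmpfs, it can be grown at runtime with
#   mount -o remount,size=512m /var/log
```

Run from cron or a User Script, a check like this could flag a filling log before the GUI stops responding. A plain `df -h /var/log` from the console gives the same number by hand.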

quack7017 · April 10

On 10/23/2023 at 11:21 PM, semtex41 said:

After another week, I have determined a few things:

Closing all tabs prevents the errors from building up/cascading.

The browser type doesn't seem to matter. Crashes and log growth happen with Edge, Chrome, and Firefox.

My appdata backup (which runs on Monday mornings) has been one of the triggers for the nginx errors in the logs. When the tab is open, the log fills up with the errors while the scheduled job is running. I do not blame the plugin, because when the tab (all tabs) are closed, the errors are not generated. Closing the tab today prevented a hard crash like last week, which required a hard shutdown.

This is a webserver based interface. If the primary mechanism for accessing the OS causes the OS to consistently crash, then it is a bug.

I totally agree and hope Limetech will step in to fix it as soon as possible

wayner · April 10

1 hour ago, quack7017 said:

I totally agree and hope Limetech will step in to fix it as soon as possible

We all agree, but the post you are quoting is from six months ago - do we even know if Limetech is addressing it? There have been several new releases of unRAID since then.

pixeldoc81 · April 17

Similar error messages are filling up the syslog on my server, but not the local log.

Unraid 6.12.10

The Unraid Connect plugin was installed; I have uninstalled it now.

root@srv:~# grep -o 'Increase nchan_max_reserved_memory' /mnt/user/system/syslog-127.0.0.1.log | wc -l
74452

root@srv:~# awk -v phrase="Increase nchan_max_reserved_memory" '{count += gsub(phrase, "")} END {print count}' /mnt/user/system/syslog-127.0.0.1.log
74452

root@srv:~# awk -v phrase="Increase nchan_max_reserved_memory" '{count += gsub(phrase, "")} END {print count}' /mnt/user/system/syslog-127.0.0.1.log.1 
261773

root@srv:~# grep -o '"/usr/local/emhttp/us"' /mnt/user/system/syslog-127.0.0.1.log

root@pd-srv:~# du -h /mnt/user/system/syslog-127.0.0.1.log*
48M     /mnt/user/system/syslog-127.0.0.1.log
159M    /mnt/user/system/syslog-127.0.0.1.log.1
548M    /mnt/user/system/syslog-127.0.0.1.log.2
1.6G    /mnt/user/system/syslog-127.0.0.1.log.3
1.6G    /mnt/user/system/syslog-127.0.0.1.log.4

root@srv:~# du -h -d 1 /var/log
0       /var/log/pwfail
16K     /var/log/unraid-api
0       /var/log/preclear
0       /var/log/swtpm
2.5M    /var/log/samba
0       /var/log/plugins
28K     /var/log/pkgtools
0       /var/log/nginx
0       /var/log/nfsd
16K     /var/log/libvirt
3.1M    /var/log

Edited April 17 by pixeldoc81
Added more infos.

posidron · April 21

Hm, I am seeing the same error messages and behavior. In my case it seems to be caused by my Windows VM, which performs some heavy, memory-intensive operations. I have assigned it a maximum of 32 GB out of my 128 GB, yet it seems to trigger some leak in the VM or host (?) that consumes all of my host memory. nginx then cannot allocate any further memory, and ultimately my entire Unraid crashes. It is only a theory, since not everybody with this issue is running a Windows VM, but when my Windows VM is not running, all seems fine. The VirtIO drivers are the latest stable version and the machine type is Q35-7.2.

Yonix · April 23

Hello,

Just adding to this. I see the same behavior here,

using NPM plus (Docker).

The system crashes completely only when I pass credentials for authentication over HTTPS.

For context: I'm running GLPI (a ticketing system) in Docker with MariaDB, on a closed network loop that overlays 2 VLANs.

Docker is configured in IPVLAN mode and all my VLANs are defined in the network config (it's quite a complex one).

GLPI faces the intranet and has 2 addresses: one HTTP and one HTTPS.

Note that I also have 12 other websites actively proxied by the dockerized NPM, passing credentials and authentication 18 hours a day, 7 days a week, and the phenomenon only occurs with GLPI over HTTPS (that is, when the request goes through the nginx of the dockerized NPM).

Now, using GLPI: when I log in with my credentials over HTTP, I have no issues.

Over HTTPS, the credentials go through, then it starts to crumble bit by bit (sometimes very quickly) and the page stops loading (especially the modules querying the DB directly).

SOMEHOW, this directly impacts Unraid's own nginx, and I have to reboot to regain control of my services.

Sometimes I still get a responsive Unraid UI for a few seconds, but I have never managed to restart all the services before Unraid's nginx freezes completely.

I now try to use the Unraid UI as little as possible, and I tend to do so when few people are working.

It happened recently with another program (also deployed as two Docker containers, 1 app + 1 DB, on separate VLANs).

The logs repeat indefinitely:

Mar 19 21:38:23 THEMIS nginx: 2024/03/19 21:38:23 [crit] 16723#16723: ngx_slab_alloc() failed: no memory
Mar 19 21:38:23 THEMIS nginx: 2024/03/19 21:38:23 [error] 16723#16723: shpool alloc failed
Mar 19 21:38:23 THEMIS nginx: 2024/03/19 21:38:23 [error] 16723#16723: nchan: Out of shared memory while allocating message of size 5492. Increase nchan_max_reserved_memory.
Mar 19 21:38:23 THEMIS nginx: 2024/03/19 21:38:23 [error] 16723#16723: *717484 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/notify?buffer_length=1 HTTP/1.1", host: "localhost"
Mar 19 21:38:23 THEMIS nginx: 2024/03/19 21:38:23 [error] 16723#16723: MEMSTORE:00: can't create shared message for channel /notify
Mar 19 21:38:24 THEMIS nginx: 2024/03/19 21:38:24 [crit] 16723#16723: ngx_slab_alloc() failed: no memory
Mar 19 21:38:24 THEMIS nginx: 2024/03/19 21:38:24 [error] 16723#16723: shpool alloc failed
Mar 19 21:38:24 THEMIS nginx: 2024/03/19 21:38:24 [error] 16723#16723: nchan: Out of shared memory while allocating message of size 4753. Increase nchan_max_reserved_memory.
Mar 19 21:38:24 THEMIS nginx: 2024/03/19 21:38:24 [error] 16723#16723: *717490 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Mar 19 21:38:24 THEMIS nginx: 2024/03/19 21:38:24 [error] 16723#16723: MEMSTORE:00: can't create shared message for channel /disks

PS: I'm using a Dell PowerEdge based on a Xeon (8 cores / 16 threads) with 96 GB of RAM.

Edited April 23 by Yonix
adding machine spec
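
To see which nchan channels are failing in a log like the one above, a minimal sketch follows. It assumes the messages keep exactly the wording shown in that excerpt. The function names are made up for illustration, and the syslog path is passed in as an argument rather than assumed.

```shell
#!/bin/bash
# Hypothetical helper: count lines reporting nchan running out of shared memory.
nchan_oom_count() {
    grep -c 'Out of shared memory' "$1"
}

# Hypothetical helper: count "can't create shared message" errors per channel,
# busiest channel first.
nchan_errors_by_channel() {
    grep -o "can't create shared message for channel /[^ ]*" "$1" \
        | awk '{ print $NF }' \
        | sort | uniq -c | sort -rn
}

# Example: nchan_errors_by_channel /var/log/syslog
```

A breakdown dominated by one or two channels would point at a specific dashboard publisher, while an even spread would suggest the shared pool as a whole is simply full.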

martial · April 29

Instead of killing every nginx process, I was looking for the master process (i.e. not the ones run by containerd-shim-runc-v2).

So far the simplest way I have found is to run:

ps -axfo pid,ppid,uname,cmd | grep nginx | grep -v '\\_'

i.e. get the hierarchy of processes and remove the ones started by another process.

This returns only one value so far.

You can then kill that process and start it again with:

/etc/rc.d/rc.nginx start
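
The steps above could be wrapped up roughly as follows. This is a sketch under assumptions: it selects the host's nginx master by parent PID 1 rather than by filtering the forest view, on the assumption that container nginx masters are children of a shim or container init instead. Both function names are hypothetical, and the only Unraid-specific command is the `rc.nginx start` already mentioned.

```shell
#!/bin/bash
# Hypothetical helper: read "pid ppid user cmd..." lines on stdin and print the
# PID of the first nginx master whose parent is PID 1 (assumed to be the host's).
find_host_nginx_master() {
    awk '$2 == 1 && index($0, "nginx: master process") { print $1; exit }'
}

# Hypothetical wrapper: stop the host master, then start nginx again.
restart_host_nginx() {
    local pid
    pid="$(ps -eo pid,ppid,user,args | find_host_nginx_master)"
    if [ -z "$pid" ]; then
        echo "no host nginx master found" >&2
        return 1
    fi
    kill "$pid"
    sleep 2                      # give the workers a moment to exit
    /etc/rc.d/rc.nginx start
}

# Example (from SSH or the local console): restart_host_nginx
```

Keeping the PID lookup separate from the restart makes it possible to run just the lookup first and confirm it picks the expected process before anything is killed.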

techwiz2100 · May 13

Hey all,

Just wanted to chime in with a temporary workaround until Unraid fixes nginx.

In the nginx conf file (/etc/nginx/nginx.conf) add the following:

nchan_shared_memory_size 512M;

just before the line

include /etc/nginx/conf.d/servers.conf;

and that should increase the nchan memory limit from the default of 128M to 512M.

Restart nginx for the parameter to take effect, then you can clean up /var/log/nginx/* and /var/log/syslog.*

Also, if this continues to cause problems, there's also an nchan parameter called 'nchan_message_timeout' which defaults to 1h. You can reduce this value to a shorter interval to clear/expire the messages in the queue faster. Be warned that messing with that setting may result in missed updates and could have other consequences.

Good luck.
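
For anyone who would rather script that edit than do it by hand, a minimal sketch is below. It assumes the include line appears exactly as written above. The function name `add_nchan_limit` is made up, and since Unraid's root filesystem lives in RAM, the change likely has to be reapplied after each reboot.

```shell
#!/bin/bash
# Hypothetical helper: insert "nchan_shared_memory_size <size>;" just before the
# servers.conf include line, keeping a .bak copy of the original file.
add_nchan_limit() {
    local conf="$1" size="${2:-512M}"
    if grep -q 'nchan_shared_memory_size' "$conf"; then
        echo "nchan_shared_memory_size already present in $conf"
        return 0
    fi
    cp "$conf" "$conf.bak" || return 1
    awk -v size="$size" '
        !done && index($0, "include /etc/nginx/conf.d/servers.conf;") {
            match($0, /^[ \t]*/)     # reuse the indentation of the include line
            print substr($0, 1, RLENGTH) "nchan_shared_memory_size " size ";"
            done = 1
        }
        { print }
    ' "$conf.bak" > "$conf"
    if ! grep -q 'nchan_shared_memory_size' "$conf"; then
        echo "include line not found; $conf left unchanged" >&2
        return 1
    fi
}

# Example: add_nchan_limit /etc/nginx/nginx.conf 512M
```

The function is safe to run more than once because it does nothing when the directive is already present. nginx still needs a restart afterwards for the new limit to apply.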

martial · May 13

27 minutes ago, techwiz2100 said:
the nginx conf file (/etc/nginx/nginx.conf) add the following:
nchan_shared_memory_size 512M;
just before the line
include /etc/nginx/conf.d/servers.conf;
and that should increase the nchan memory limit from default of 128M to 512M.

Restart nginx for the parameter to take effect, then you can clean up /var/log/nginx/* and /var/log/syslog.*

Thank you for that.

In my case the servers.conf include is the last line in my configuration file.

The `/etc/rc.d/rc.nginx restart` confirms that the syntax is okay too ;)

Nemsys · May 29

As one previous poster commented, uninstalling the Unraid Connect plugin fixed this issue for me as well.

-Nemsys

tictoc · June 1

On 5/29/2024 at 1:21 AM, Nemsys said:

As one previous poster commented, uninstalling the Unraid Connect plugin fixed this issue for me as well.

-Nemsys

That seemed to work for me too!

Woogz · June 20

I have this issue but I don't have an /etc/nginx folder. I'd prefer not to uninstall Unraid Connect if I don't need to.

JonathanM · June 20

2 hours ago, Woogz said:

I don't have an /etc/nginx folder.

That screenshot appears to show the root folder of the krusader container running on Unraid, not Unraid's OS folder.

You need to use the console or ssh login to get to Unraid's root, type mc if you need a more graphical way to navigate around.

Rexile · June 25

I don't have Unraid Connect installed and it happens to me regularly as well. The restart nginx script from page 6 on this thread does help, but sadly it doesn't work for me from browsers, only when connecting to my server via commandline/SSH. Really hope they can fix this soon, it's really annoying.

Sanderluc · July 4

On 5/13/2024 at 4:20 PM, martial said:

Thank you for that.

In my case the servers.conf include is the last line in my configuration file.

The `/etc/rc.d/rc.nginx restart` confirms that the syntax is okay too

Never do this from inside the GUI. Otherwise Nginx won't come back online 😂

God_TM · July 4

On 6/25/2024 at 1:12 AM, Rexile said:

I don't have Unraid Connect installed and it happens to me regularly as well. The restart nginx script from page 6 on this thread does help, but sadly it doesn't work for me from browsers, only when connecting to my server via commandline/SSH. Really hope they can fix this soon, it's really annoying.

Couldn’t you put the command into a User Script and then kick it off to run in the background?

Sanderluc · July 7

On 7/5/2024 at 1:32 AM, God_TM said:

Couldn’t you put the command into a User Script and then kick it off to run in the background?

Already thought of that. I have already applied a patch now to prevent the crash.
