nginx running out of shared memory


DBJordan


Just now, John_M said:

 

Still using -rc2? 6.9.1 is the current version.

 

yeah. until now, updating hasn't been a high priority because i wasn't having any issues. it takes my r720 forever and a day to reboot, so i'm usually slow to update. will be updating today.

Link to comment

The only correlation I have figured out might have something to do with using the web browser terminals.  I've tried to avoid using them and I haven't had an instance in a while now.  I've also made sure they were all closed down right after I used them. Before I was using them and possibly leaving them open on different machines.

Not sure if that's it, but that's the only thing I can think of that might slightly correlate.

 

Link to comment
  • 4 weeks later...

This has happened a couple times to me as well. In the last few weeks I've left browsers open and viewing the dashboard page for many days at a time (I'm the guy with 10 windows, 50 tabs in each, and since I've been working on a few things I end up with a bunch of Unraid dashboard / Main windows open).

 

I'm using Chrome. I can confirm that running "/etc/rc.d/rc.nginx restart" via SSH resolved the problem for me (temporarily?) before my system completely locked up. The last time this happened, the Web UI was completely hosed and I had to force a restart (I forgot to try SSH).
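For anyone who wants to script the temporary fix above, here is a minimal sketch. It only checks whether the recent syslog contains the nchan error and leaves the restart itself as a commented usage line. The function name and the 200-line window are made up for illustration; /var/log/syslog and the restart command are the ones mentioned in this thread.

```shell
# Hypothetical helper: exits 0 if the last 200 lines of the given log
# contain the nchan "Out of shared memory" error, non-zero otherwise.
nchan_out_of_memory() {
    tail -n 200 "$1" 2>/dev/null | grep -q 'nchan: Out of shared memory'
}

# Example use over SSH (this restarts the web UI, so it is left commented out):
#   if nchan_out_of_memory /var/log/syslog; then
#       /etc/rc.d/rc.nginx restart
#   fi
```

As later replies point out, this only treats the symptom, and a few people found that the restart did not help at all.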

 

Link to comment
  • 1 month later...

I just had this problem myself. I usually keep the Dashboard open in Firefox on my home PC at all times, and when I'm away I also have it open on my laptop. I'm running 6.9.2 and this is the first time I've run into this issue. I rebooted my server before finding the temporary fix listed above. I know the webgui becomes unresponsive once the log fills up and I didn't want to chance it. The other thing I've been doing differently is tinkering with VMs, and I had left one on for a few hours, but that shouldn't have caused it. Just throwing in my info in case it helps lead to a solution. I couldn't grab any logs; they were taking too long to compile. I just snagged a screenshot of the syslog so I could "google" it, which led me here.

Link to comment
  • 2 weeks later...
  • 3 weeks later...

I've also noticed the logs in /var/log/nginx throwing the out of shared memory error...

Spamming this:

2021/06/19 01:10:53 [crit] 6642#6642: ngx_slab_alloc() failed: no memory
2021/06/19 01:10:53 [error] 6642#6642: shpool alloc failed
2021/06/19 01:10:53 [error] 6642#6642: nchan: Out of shared memory while allocating channel /cpuload. Increase nchan_max_reserved_memory.
2021/06/19 01:10:53 [error] 6642#6642: *5824862 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/cpuload?buffer_length=1 HTTP/1.1", host: "localhost"
2021/06/19 01:10:53 [crit] 6642#6642: ngx_slab_alloc() failed: no memory
2021/06/19 01:10:53 [error] 6642#6642: shpool alloc failed
2021/06/19 01:10:53 [error] 6642#6642: nchan: Out of shared memory while allocating channel /var. Increase nchan_max_reserved_memory.
2021/06/19 01:10:53 [alert] 6642#6642: *5824863 header already sent while keepalive, client: 10.9.0.240, server: 0.0.0.0:80
2021/06/19 01:10:53 [alert] 27152#27152: worker process 6642 exited on signal 11
2021/06/19 01:10:53 [crit] 6798#6798: ngx_slab_alloc() failed: no memory
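To see which nchan channels are hitting the limit, a rough sketch like the one below can tally the errors per channel. It reads log text on stdin and relies only on the message format shown above; the function name is hypothetical.

```shell
# Hypothetical helper: count nchan out-of-shared-memory errors per channel.
# Reads log lines on stdin, prints "<count> <channel>" with the largest count first.
count_nchan_channels() {
    grep -o 'Out of shared memory while allocating channel [^ ]*' \
        | awk '{
              channel = $NF
              sub(/\.$/, "", channel)   # drop the sentence-ending period
              count[channel]++
          }
          END { for (c in count) print count[c], c }' \
        | LC_ALL=C sort -k1,1nr -k2
}

# Example: count_nchan_channels < /var/log/syslog
```

Run against the excerpt above, it would list /cpuload and /var once each.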

 

Link to comment

I've also been having this issue. Has anyone found a solution? Or at least a way to increase the log size permanently, since 128M seems pretty small and for some reason when the logs roll they're saved in the same directory, which kinda negates the point of rolling in the first place (so the log doesn't fill up in the first place!).

Link to comment

The actual problem is that nginx keeps crashing, not that the logs get full. Each time nginx crashes, it logs the crash and restarts. It then crashes again, logs it, and restarts. This keeps happening, which is why the logs fill up so quickly. But the full log itself isn't the problem, and isn't why Unraid slows down until you reboot. The bug causing nginx to crash is the problem.
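A quick way to check whether you are in the crash loop described above is to count how often nginx workers died and with which signal. This is only a sketch that assumes the "worker process N exited on signal S" message format seen in the logs in this thread; the function name is hypothetical.

```shell
# Hypothetical helper: tally nginx worker exits by signal number.
# Reads log lines on stdin, prints one "signal <n>: <count>" line per signal.
count_worker_exits() {
    grep -o 'worker process [0-9]* exited on signal [0-9]*' \
        | awk '{ exits[$NF]++ }
               END { for (s in exits) print "signal " s ": " exits[s] }' \
        | LC_ALL=C sort
}

# Example: count_worker_exits < /var/log/syslog
```

On Linux, signal 11 is SIGSEGV and signal 6 is SIGABRT, so a large count for either one points at nginx itself crashing rather than at the log size.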

Link to comment

I'm on Unraid Version 6.9.2 now and this still keeps happening to me. 

 

I increased my log size to 1 GB, but it just keeps filling up, and I don't want to keep restarting, so I just delete the old syslogs and nginx logs (both filled with the same errors).

 

Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [error] 3377#3377: nchan: Out of shared memory while allocating channel /disks. Increase nchan_max_reserved_memory.
Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [error] 3377#3377: *1125090 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [crit] 3377#3377: ngx_slab_alloc() failed: no memory
Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [error] 3377#3377: shpool alloc failed

 

I do keep multiple chrome tabs with unraid open for a very long time but this was never a problem before, it just started in december, fixed itself and now it came back?

unraidserver-diagnostics-20210703-1357.zip

Link to comment
On 7/3/2021 at 7:59 AM, Inenting said:

I'm on Unraid Version 6.9.2 now and this still keeps happening to me. 

 

I increased my log size to 1 GB, but it just keeps filling up, and I don't want to keep restarting, so I just delete the old syslogs and nginx logs (both filled with the same errors).

 


Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [error] 3377#3377: nchan: Out of shared memory while allocating channel /disks. Increase nchan_max_reserved_memory.
Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [error] 3377#3377: *1125090 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [crit] 3377#3377: ngx_slab_alloc() failed: no memory
Jul  3 04:45:04 unraidserver nginx: 2021/07/03 04:45:04 [error] 3377#3377: shpool alloc failed

 

I do keep multiple chrome tabs with unraid open for a very long time but this was never a problem before, it just started in december, fixed itself and now it came back?

unraidserver-diagnostics-20210703-1357.zip

Again, the logs filling up are a symptom, not the disease. You could increase the log size to the size of your array and it wouldn't help: nginx crashing is the problem, not the log that the crashes generate.

Link to comment
  • 1 month later...

Unraid 6.9.2

 

I had this start being an issue for me too, and I'd just recently switched over to using Safari on a Mac. I always keep a tab open to my Unraid machine. After I found this and the query re: Safari I switched to Brave (was just giving Safari a chance really), rebooted the server to clear the log storage and haven't seen the message or an increase in logs (still on 1%) in the last ~6d. 

Link to comment

This happened to me again last night. I've been really good about not leaving web terminal windows open, and I didn't have any open last night or for a while before that.

What I did have different was I had set my grafana window to auto refresh every 10s.  I wonder if that had anything to do with this problem?

 

Also..  The restart command didn't completely fix it this time.  I was still getting a bunch of these...

Aug 23 09:09:49 Tower nginx: 2021/08/23 09:09:49 [alert] 25382#25382: worker process 1014 exited on signal 6
Aug 23 09:09:50 Tower nginx: 2021/08/23 09:09:50 [alert] 25382#25382: worker process 1044 exited on signal 6
Aug 23 09:09:51 Tower nginx: 2021/08/23 09:09:51 [alert] 25382#25382: worker process 1202 exited on signal 6
Aug 23 09:09:53 Tower nginx: 2021/08/23 09:09:53 [alert] 25382#25382: worker process 1243 exited on signal 6
Aug 23 09:09:54 Tower nginx: 2021/08/23 09:09:54 [alert] 25382#25382: worker process 1275 exited on signal 6
Aug 23 09:09:55 Tower nginx: 2021/08/23 09:09:55 [alert] 25382#25382: worker process 1311 exited on signal 6
Aug 23 09:09:56 Tower nginx: 2021/08/23 09:09:56 [alert] 25382#25382: worker process 1342 exited on signal 6
Aug 23 09:09:57 Tower nginx: 2021/08/23 09:09:57 [alert] 25382#25382: worker process 1390 exited on signal 6
Aug 23 09:09:58 Tower nginx: 2021/08/23 09:09:58 [alert] 25382#25382: worker process 1424 exited on signal 6
Aug 23 09:09:59 Tower nginx: 2021/08/23 09:09:59 [alert] 25382#25382: worker process 1455 exited on signal 6
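To put a number on how fast the workers are respawning and dying, a sketch like this can group the crash messages by minute. It assumes the syslog layout shown above (month, day, HH:MM:SS, host, then the nginx message), so it would not work on the raw nginx error log, which has a different timestamp format. The function name is hypothetical.

```shell
# Hypothetical helper: count nginx worker crashes per minute from syslog lines.
# Reads syslog text on stdin, prints "<Mon> <day> <HH:MM> <count>".
crashes_per_minute() {
    awk '/worker process [0-9]+ exited on signal/ {
             minute = $1 " " $2 " " substr($3, 1, 5)   # keep HH:MM, drop seconds
             crashes[minute]++
         }
         END { for (m in crashes) print m, crashes[m] }' \
        | LC_ALL=C sort
}

# Example: crashes_per_minute < /var/log/syslog
```

For the ten lines pasted above, this reports all 10 crashes in the single minute 09:09, which is roughly one crash per second.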

 

I started to kill my dockers, and after I stopped the HASSIO group, that message stopped. I restarted the docker group and it hasn't come back.

 

I really wish we could get to the bottom of this!! 

FYI..  I'm now on 6.9.2

 

Link to comment
  • 2 weeks later...

Hi

I found this page after experiencing the same issue myself. 

Running 6.9.2.

I am guilty of leaving Chrome windows open with my Dashboard and/or Docker container list, which i've now closed.

 

My log file hit 100% and i eventually managed to give it more space (my terminal window keeps closing and resetting).
 

I've also run the '/etc/rc.d/rc.nginx restart' command. 


When i saw 'nginx' in the log, i wondered if it was a problem with my SWAG container, which i've closed.

None of these measures have stopped the terminal window connection from constantly closing.

 

No errors were found on an extended 'Fix Common Problems' scan.

I have recently switched Plex to use a ram disk for transcodes, but i've never seen it anywhere near maxing out the system (i'm running 24 GB of RAM). 

I'm going to try a reboot.

 

Any thoughts or suggestions appreciated!

 

Cheers!

darren

 

syslog.zip

Edited by dp100
typos!
Link to comment
2 hours ago, Flemming said:

 

I have the same problem and /etc/rc.d/rc.nginx don't fix anything, not even temporarily

 

strange that /etc/rc.d/rc.nginx restart didn't fix it.

I assume you made room in /var/log for more messages to come through?

 

After the restart did you still have stuff spewing in the log file?

 

I do recall a time where I had to do a restart to fix it completely.

 

Jim

Link to comment
6 hours ago, jbuszkie said:

 

strange that /etc/rc.d/rc.nginx restart didn't fix it.

I assume you made room in /var/log for more messages to come through?

 

After the restart did you still have stuff spewing in the log file?

 

I do recall a time where I had to do a restart to fix it completely.

 

Jim

Hi Jim,
Thanks for the comments.

 

I'm fairly sure that '/etc/rc.d/rc.nginx restart' didn't fix it, but as the terminal window remained unusable (constantly closing/opening a connection) i was struggling to properly assess things, plus it was getting late.

In the end i lost my patience and rebooted the system as i went to bed.  Since then, the server has been fine and my log has stayed at 1%.


Although it seems 'fixed' for now, i'd love to know what the cause is. 

Considering the number of users complaining of this issue affecting them too, there must be some underlying common bug or cause...

 

cheers

darren


 

Link to comment
  • 2 weeks later...

I've just started running into this issue as well, on a machine running unRAID 6.10.0-rc1 with 64 GB of memory, which has always been plenty, though I too am guilty of leaving unRAID WebUI tabs open (as well as mosh/ssh sessions running). Interestingly, I'm able to access the pages of the WebUI and see the header and nav buttons, as well as the major sections of pages like the Dashboard and Main, but they're empty and don't show my disks, docker containers, etc. The only way I've been able to (temporarily) “fix” the issue is by restarting my server.

 

I tried restarting nginx with /etc/rc.d/rc.nginx restart, but it didn't make any difference, as you can see in these logs:

Sep 17 02:25:29 vulfTower nginx: 2021/09/17 02:25:29 [alert] 25209#25209: worker process 32623 exited on signal 6
Sep 17 02:25:29 vulfTower nginx: 2021/09/17 02:25:29 [alert] 25209#25209: worker process 32645 exited on signal 6
Sep 17 02:25:31 vulfTower nginx: 2021/09/17 02:25:31 [alert] 25209#25209: worker process 32647 exited on signal 6
Sep 17 02:25:31 vulfTower nginx: 2021/09/17 02:25:31 [alert] 25209#25209: worker process 334 exited on signal 6
Sep 17 02:25:31 vulfTower nginx: 2021/09/17 02:25:31 [alert] 25209#25209: worker process 339 exited on signal 6
Sep 17 02:25:32 vulfTower nginx: 2021/09/17 02:25:32 [alert] 25209#25209: worker process 350 exited on signal 6
Sep 17 02:25:33 vulfTower nginx: 2021/09/17 02:25:33 [alert] 25209#25209: worker process 405 exited on signal 6
Sep 17 02:25:33 vulfTower nginx: 2021/09/17 02:25:33 [alert] 25209#25209: worker process 565 exited on signal 6
Sep 17 02:25:33 vulfTower nginx: 2021/09/17 02:25:33 [alert] 25209#25209: worker process 591 exited on signal 6
Sep 17 02:25:34 vulfTower nginx: 2021/09/17 02:25:34 [alert] 25209#25209: worker process 594 exited on signal 6
Sep 17 02:25:34 vulfTower rsyslogd: file '/var/log/syslog'[2] write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help OS error: No space left on device [v8.2102.0 try https://www.rsyslog.com/e/2027 ]
Sep 17 02:25:34 vulfTower rsyslogd: action 'action-0-builtin:omfile' (module 'builtin:omfile') message lost, could not be processed. Check for additional error messages before this one. [v8.2102.0 try https://www.rsyslog.com/e/2027 ]
Sep 17 02:25:34 vulfTower rsyslogd: rsyslogd[internal_messages]: 561 messages lost due to rate-limiting (500 allowed within 5 seconds)
Sep 17 02:25:34 vulfTower rsyslogd: file '/var/log/syslog'[2] write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help OS error: No space left on device [v8.2102.0 try https://www.rsyslog.com/e/2027 ]
Sep 17 02:25:34 vulfTower rsyslogd: action 'action-0-builtin:omfile' (module 'builtin:omfile') message lost, could not be processed. Check for additional error messages before this one. [v8.2102.0 try https://www.rsyslog.com/e/2027 ]

ad infinitum

 

I'm also getting these errors in my logs:

Sep 17 02:32:54 vulfTower nginx: 2021/09/17 02:32:54 [alert] 25209#25209: worker process 4936 exited on signal 6


Anyway, I ran du to check my log sizes and df to check how full my cache drives and boot flash drive are, to see if I could identify the issue. You can see the results below. While my NGINX log and syslog are large, neither seems large enough to disable access to the WebUI, and neither of my cache drives (where I store my syslog backup) is full.

❯ du -sh /var/log/*
4.0K	/var/log/apcupsd.events
0	/var/log/btmp
0	/var/log/cron
0	/var/log/debug
88K	/var/log/dmesg
1012K	/var/log/docker.log
0	/var/log/faillog
2.8M	/var/log/file.activity.log
16K	/var/log/gitflash
4.0K	/var/log/lastlog
4.0K	/var/log/libvirt
4.0K	/var/log/maillog
0	/var/log/messages
0	/var/log/nfsd
52M	/var/log/nginx
0	/var/log/packages
0	/var/log/pkgtools
0	/var/log/plugins
0	/var/log/preclear.disk.log
0	/var/log/pwfail
0	/var/log/removed_packages
0	/var/log/removed_scripts
0	/var/log/removed_uninstall_scripts
20K	/var/log/samba
0	/var/log/scripts
0	/var/log/secure
0	/var/log/setup
0	/var/log/spooler
0	/var/log/swtpm
69M	/var/log/syslog
3.6M	/var/log/syslog.1
0	/var/log/vfio-pci
8.0K	/var/log/wtmp

 

❯ df -h /mnt/cache
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf1       466G  306G  157G  67% /mnt/cache

 

❯ df -h /mnt/cache_io
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdk1       932G  244G  688G  27% /mnt/cache_io

 

❯ df -h /boot
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       7.5G  893M  6.6G  12% /boot
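Note that the df calls above cover the cache drives and the flash drive, but not /var/log itself, which is where rsyslogd reports "No space left on device". Assuming /var/log is its own small filesystem (an earlier reply mentions a 128M log size), running df -h /var/log would show that directly. The sketch below adds up du -sh output instead; the function name is hypothetical, and it only understands the K, M and G suffixes.

```shell
# Hypothetical helper: sum the human-readable sizes printed by `du -sh`.
# Reads du output on stdin (size in the first column), prints the total in MiB.
sum_du_human() {
    awk '{
             unit = substr($1, length($1), 1)   # last character: K, M, G or a digit
             num  = $1 + 0                      # numeric part, e.g. "2.8M" -> 2.8
             if      (unit == "K") kib += num
             else if (unit == "M") kib += num * 1024
             else if (unit == "G") kib += num * 1024 * 1024
             else                  kib += num / 1024   # bare number: bytes
         }
         END { printf "%.1f MiB\n", kib / 1024 }'
}

# Example: du -sh /var/log/* | sum_du_human
```

Fed the du listing above, the entries come to roughly 128 MiB, which would be consistent with a full 128M log filesystem even though no single file looks huge.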

 

Any ideas? For some reason, I'm currently unable to download my diagnostics, but any advice or ideas would be much appreciated. Cheers!

Link to comment
