nginx running out of shared memory


DBJordan


6 hours ago, xthursdayx said:

Any ideas? For some reason, I'm currently unable to download my diagnostics, but any advice or ideas would be much appreciated. Cheers!

 

You posted the space available for your cache and others..  But how much space do you have left in /var?  Was it full?

 

The first thing I do is delete the syslog.1 to make some space so it can write the log, then I restart nginx.   Then I tail the syslog to see if the writes stop. My syslog.1 is usually huge and frees up a lot of space so it can write to the syslog

The time before last time, I still had some of those errors in the syslog after the restart..  So I was going to stop my dockers one by one and see if it stopped.  And well it did with my first one.

Two days ago when this happened to me..  I didn't have to do that.  The restart was all I needed...
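
A minimal sketch of the check-then-clean routine described above, assuming a Bash shell on the server. The function name `usage_pct` and the 90% threshold are illustrative choices and not anything Unraid ships; the delete and restart steps are left commented out so nothing is removed by accident.

```shell
#!/bin/bash
# usage_pct PATH: print the Use% of the filesystem holding PATH as a bare integer.
usage_pct() {
    # df -P keeps one line per filesystem; field 5 of line 2 is "NN%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

LOG_DIR="${LOG_DIR:-/var/log}"   # where the syslog lives in this thread
THRESHOLD="${THRESHOLD:-90}"     # hypothetical cut-off, pick your own

pct="$(usage_pct "$LOG_DIR" 2>/dev/null)"
pct="${pct:-0}"                  # fall back to 0 if df could not read the path

if [ "$pct" -ge "$THRESHOLD" ]; then
    echo "$LOG_DIR is ${pct}% full; consider removing rotated logs"
    # rm -f "$LOG_DIR/syslog.1"      # frees space, as described above
    # /etc/rc.d/rc.nginx restart     # then restart nginx
    # tail -f "$LOG_DIR/syslog"      # and watch whether the errors stop
else
    echo "$LOG_DIR is ${pct}% full"
fi
```

Running it with `THRESHOLD=75` would flag the log filesystem earlier, before it is completely full.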

Link to comment

@limetech  Is there any way you guys can start looking at this?  There are more and more folks that are seeing this.

 

For the rest of us...  Maybe we should start listing our active dockers to see if one of them is triggering the bug. Maybe there is one common to all of us that's the cause.  If we can give limetech a place to start, so they can trigger this condition more frequently, it would probably help them..

 

For me I have

Home_assist bunch

ESPHome

stuckless-sagetv-server-java16

CrashPlanPRO

crazifuzzy-opendct

Grafana

Influxdb

MQTT

 

I have no running VMs

My last two fails did not have any web terms open at all. I may have had a putty terminal, but I don't think that would cause it?

I do have the dashboard open on several machines (randomly powered on) at the same time..

 

Jim

 

Link to comment
6 hours ago, jbuszkie said:

 

You posted the space available for your cache and others..  But how much space do you have left in /var?  Was it full?

 

The first thing I do is delete the syslog.1 to make some space so it can write the log, then I restart nginx.   Then I tail the syslog to see if the writes stop. My syslog.1 is usually huge and frees up a lot of space so it can write to the syslog

The time before last time, I still had some of those errors in the syslog after the restart..  So I was going to stop my dockers one by one and see if it stopped.  And well it did with my first one.

Two days ago when this happened to me..  I didn't have to do that.  The restart was all I needed...

 

Yeah, you're right, I should have included that. I checked /var and it was not full. See:
 

df -h /var
Filesystem      Size  Used Avail Use% Mounted on
rootfs           32G  7.0G   25G  23% /

However /var/log does show as 100% full:
 

❯ df -h /var/log
Filesystem      Size  Used Avail Use% Mounted on
tmpfs           128M  128M     0 100% /var/log

I deleted syslog.1 and syslog.2, restarted nginx and tailed my syslog. No more nchan out of memory errors, but I'm getting this error constantly:
 

nginx: 2021/09/17 16:53:47 [alert] 2815#2815: worker process 11259 exited on signal 6
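
Since /var/log is a 128M tmpfs in the output above, it can help to see which files are actually eating it before deleting anything. A rough sketch, assuming GNU or BSD `find`, `du` and `sort` are available; `largest_files` is a made-up helper name, not an Unraid command.

```shell
#!/bin/bash
# largest_files DIR [N]: list the N largest regular files under DIR, biggest first.
# Output lines are "<size in KiB><tab><path>".
largest_files() {
    local dir="$1" count="${2:-5}"
    # -xdev keeps find on the same filesystem (here, the /var/log tmpfs).
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "$count"
}

largest_files "${LOG_DIR:-/var/log}" 5
```

If the rotated syslog files top the list, that matches what others in this thread have seen.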

 

Link to comment

Just happened to me for the first time. Running UnRaid 6.9.2. I caught it with the logs at 77% full. I deleted syslog.1 and usage dropped to 51%. Can I delete syslog, too, to get usage down more? I restarted nginx and the errors stopped. They are now replaced with the following error every 2 seconds:

 

Sep 30 08:41:23 unRaid5 nginx: 2021/09/30 08:41:23 [alert] 12576#12576: worker process 18277 exited on signal 6

 

How to stop this error message? Reboot?

 

Things I did yesterday that may or may not matter:

 

Installed shinobi

Stopped openvpn-as docker and installed and setup VPN Manager plugin

I had been using MS Edge on laptop with a pinned tab to UnRaid. Switched to Firefox, also pinned a tab to UnRaid, but both were running at the same time

 

unraid5-diagnostics-20210930-0843.zip

Link to comment
1 hour ago, jcato said:

 

 

Sep 30 08:41:23 unRaid5 nginx: 2021/09/30 08:41:23 [alert] 12576#12576: worker process 18277 exited on signal 6

 

How to stop this error message? Reboot?
 

 

When I had this (I believe) I stopped my dockers one by one until it stopped coming..  And for me it was just the first one...  Reboot will work as well...

 

Link to comment
  • 2 weeks later...

I'm also experiencing this issue. I'm running v6.9.2. 

 

I restarted nginx as others have and started tailing the logs and it seems to have stopped for the time being. 

 

2021/10/09 14:01:50 [error] 11655#11655: *1640711 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
2021/10/09 14:01:51 [crit] 11655#11655: ngx_slab_alloc() failed: no memory
2021/10/09 14:01:51 [error] 11655#11655: shpool alloc failed
2021/10/09 14:01:51 [error] 11655#11655: nchan: Out of shared memory while allocating channel /cpuload. Increase nchan_max_reserved_memory.
2021/10/09 14:01:51 [alert] 11655#11655: *1640714 header already sent while keepalive, client: 192.168.x.x, server: 0.0.0.0:80
2021/10/09 14:01:51 [alert] 10574#10574: worker process 11655 exited on signal 11
2021/10/09 14:01:51 [crit] 12579#12579: ngx_slab_alloc() failed: no memory
2021/10/09 14:01:51 [error] 12579#12579: shpool alloc failed
2021/10/09 14:01:51 [error] 12579#12579: nchan: Out of shared memory while allocating channel /cpuload. Increase nchan_max_reserved_memory.
2021/10/09 14:01:51 [error] 12579#12579: *1640716 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/cpuload?buffer_length=1 HTTP/1.1", host: "localhost"
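
To get a feel for how fast these messages pile up, one option is simply counting them. A small sketch, assuming the errors land in /var/log/syslog as in the other posts here; `count_nchan_oom` is an illustrative name, and the match string is copied from the log lines above.

```shell
#!/bin/bash
# count_nchan_oom FILE: count lines reporting nchan running out of shared memory.
count_nchan_oom() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'nchan: Out of shared memory' "$1"
}

SYSLOG="${SYSLOG:-/var/log/syslog}"
if [ -r "$SYSLOG" ]; then
    echo "nchan out-of-shared-memory lines in $SYSLOG: $(count_nchan_oom "$SYSLOG")"
fi
```

Running it again a minute later shows whether the count is still climbing after an nginx restart.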

 

Any help with this @limetech would be appreciated.  

Link to comment
  • 3 weeks later...

I'm getting this error too. I received an email from my server this morning; I can't even remember how I set this up.

 

error: error setting owner of /var/log/nginx/error.log to uid 0 and gid 0: Operation not permitted

 

Could it be something to do with the logs not being able to be cleared?

Link to comment
  • 2 weeks later...
On 11/1/2021 at 6:56 PM, huskycdn said:

I've been having this issue for the past year (currently on version 6.9.2). I was using Chrome and switched to Firefox. I just reboot the server; it happens every other month or so. It would be nice to hear an official reply from limetech.

 

I agree. I've seen this happening for who knows how long... I've got life to deal with and then this crap crops up. Thank goodness it's not stopping anything right now but it'd be nice to know that this won't end up being a bigger problem than "I can't access the web terminal".

Link to comment

It has happened every 7-14 days for me since I updated to 6.9. I found that these few lines (entered one after another over SSH) fix it every time the bug happens (i.e. no restarts needed):

 

/usr/sbin/nginx -s stop
/usr/sbin/nginx
/etc/rc.d/rc.php-fpm restart

 

My 6.8 had easily 200+ days uptime without that issue.
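
For anyone who wants those three commands in one place, here is a hedged sketch of a wrapper. The paths are the ones quoted in this post; the `run` and `restart_webgui` helpers and the `DRY_RUN` switch are my own additions for illustration, so you can preview the steps before running them for real.

```shell
#!/bin/bash
# Paths below are the ones quoted in the post; override them if yours differ.
NGINX_BIN="${NGINX_BIN:-/usr/sbin/nginx}"
PHP_FPM_RC="${PHP_FPM_RC:-/etc/rc.d/rc.php-fpm}"

# run CMD...: echo the command, and execute it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# restart_webgui: the three steps from the post, in the same order.
restart_webgui() {
    run "$NGINX_BIN" -s stop
    run "$NGINX_BIN"
    run "$PHP_FPM_RC" restart
}

# Preview only: prints the commands without touching nginx or php-fpm.
DRY_RUN=1 restart_webgui
```

Calling `restart_webgui` without `DRY_RUN=1` from an SSH session would actually perform the restart.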

Link to comment
1 hour ago, Squid said:

Q.  Are you guys perchance leaving the dashboard open on a browser tab and never closing it?

I am not..  I'm more diligent now about closing any web terminals, and I try to keep the browser on the settings page.

I did notice I got this once when I had a grafana webpage auto refreshing.  Not sure if that was related at all..

 

Even if we do have the dashboard open all the time, it shouldn't matter??

Link to comment
On 11/13/2021 at 3:14 PM, Squid said:

Q.  Are you guys perchance leaving the dashboard open on a browser tab and never closing it?

I was on 6.8; now I do not - not the dashboard, not any of the unraid pages.

Also, I noticed that when that happens all the websockets (and ajax calls) are failing - even if the dashboard page loads, the dockers or unassigned devices aren't going to be loaded, or maybe one of them will load and another will fail. I can't even shut down/restart from the GUI - I need to time it between web server restarts. And the web terminal restarts every few seconds, so my "fix" can only be applied via normal SSH.

Link to comment
  • 1 month later...

Had this a couple of times; I was rebuilding, so I didn't really care as I was due restarts anyway.

 

Just had it whilst chasing what makes my Docker Log fill up so quickly.

 

Specifically I am running Microsoft Edge Version 97.0.1072.55 (Official build) (64-bit), and I opened the terminal from the dashboard, as soon as I entered the command

"co=$(docker inspect --format='{{.Name}}' $(docker ps -aq --no-trunc) | sed 's/^.\(.*\)/\1/' | sort); for c_name in $co; do c_size=$(docker inspect --format={{.ID}} $c_name | xargs -I @ sh -c 'ls -hl /var/lib/docker/containers/@/@-json.log' | awk '{print $5 }'); YE='\033[1;33m'; NC='\033[0m'; PI='\033[1;35m'; RE='\033[1;31m'; case "$c_size" in *"K"*) c_size=${YE}$c_size${NC};; *"M"*) p=${c_size%.*}; q=${p%M*}; r=${#q}; if [[ $r -lt 3 ]]; then c_size=${PI}$c_size${NC}; else c_size=${RE}$c_size${NC}; fi ;; esac; echo -e "$c_name $c_size"; done "

 

I started getting the nginx hissy fit

Jan 10 10:18:44 TheNewdaleBeast nginx: 2022/01/10 10:18:44 [error] 3871#3871: nchan: Out of shared memory while allocating message of size 10016. Increase nchan_max_reserved_memory.
Jan 10 10:18:44 TheNewdaleBeast nginx: 2022/01/10 10:18:44 [error] 3871#3871: *4554856 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/update2?buffer_length=1 HTTP/1.1", host: "localhost"
Jan 10 10:18:44 TheNewdaleBeast nginx: 2022/01/10 10:18:44 [error] 3871#3871: MEMSTORE:00: can't create shared message for channel /update2)

I have 256GB Memory, and was at less than 11% utilisation so that wasn't an obvious issue, the output from df -h shows no issues either.
Filesystem      Size  Used Avail Use% Mounted on
rootfs          126G  812M  126G   1% /
tmpfs            32M  2.5M   30M   8% /run
/dev/sda1        15G  617M   14G   5% /boot
overlay         126G  812M  126G   1% /lib/firmware
overlay         126G  812M  126G   1% /lib/modules
devtmpfs        126G     0  126G   0% /dev
tmpfs           126G     0  126G   0% /dev/shm
cgroup_root     8.0M     0  8.0M   0% /sys/fs/cgroup
tmpfs           128M   28M  101M  22% /var/log
tmpfs           1.0M     0  1.0M   0% /mnt/disks
tmpfs           1.0M     0  1.0M   0% /mnt/remotes
/dev/md1         13T  6.4T  6.4T  51% /mnt/disk1
/dev/md2         13T  9.5T  3.3T  75% /mnt/disk2
/dev/md4         13T  767G   12T   6% /mnt/disk4
/dev/md5        3.7T  3.0T  684G  82% /mnt/disk5
/dev/md8         13T  9.6T  3.2T  76% /mnt/disk8
/dev/md10        13T  9.1T  3.7T  72% /mnt/disk10
/dev/md11        13T  8.9T  3.9T  70% /mnt/disk11
/dev/md12        13T  7.0T  5.8T  55% /mnt/disk12
/dev/md15        13T  5.7T  7.2T  45% /mnt/disk15
/dev/sdq1       1.9T  542G  1.3T  30% /mnt/app-sys-cache
/dev/sdh1       1.9T   17M  1.9T   1% /mnt/download-cache
/dev/nvme0n1p1  932G  221G  711G  24% /mnt/vm-cache
shfs            106T   60T   46T  57% /mnt/user0
shfs            106T   60T   46T  57% /mnt/user
/dev/loop2       35G   13G   21G  38% /var/lib/docker
/dev/loop3      1.0G  4.2M  905M   1% /etc/libvirt
tmpfs            26G     0   26G   0% /run/user/0

 

I then ran the following commands in order to restore connectivity (after first restart I had no access)

root@TheNewdaleBeast:~# /etc/rc.d/rc.nginx restart
Checking configuration for correct syntax and
then trying to open files referenced in configuration...
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
Shutdown Nginx gracefully...
Found no running processes.
Nginx is already running
root@TheNewdaleBeast:~# /etc/rc.d/rc.nginx restart
Nginx is not running
root@TheNewdaleBeast:~# /etc/rc.d/rc.nginx start
Starting Nginx server daemon...
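
The transcript above looks like the first restart tried to start nginx before the old process had fully gone away, though that is only a guess from the messages. A sketch of one way to avoid that, assuming `pgrep` is installed; `wait_gone` is a hypothetical helper, and the real stop/start commands are left commented out.

```shell
#!/bin/bash
# wait_gone NAME [TRIES]: poll until no process named exactly NAME remains.
# Returns 0 once it is gone, 1 if it is still there after TRIES one-second polls.
wait_gone() {
    local name="$1" tries="${2:-10}"
    while pgrep -x "$name" >/dev/null; do
        tries=$((tries - 1))
        [ "$tries" -le 0 ] && return 1
        sleep 1
    done
    return 0
}

# Intended use on the server (commented out so the sketch has no side effects):
# /usr/sbin/nginx -s stop && wait_gone nginx 10 && /etc/rc.d/rc.nginx start
```

The stop and start commands in the comment are the ones already quoted in this thread; only the wait in between is new.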

 

and normal service was restored

 

Jan 10 10:22:40 TheNewdaleBeast root: Starting unraid-api v2.26.14
Jan 10 10:22:40 TheNewdaleBeast root: Loading the "production" environment.
Jan 10 10:22:41 TheNewdaleBeast root: Daemonizing process.
Jan 10 10:22:41 TheNewdaleBeast root: Daemonized successfully!
Jan 10 10:22:42 TheNewdaleBeast unraid-api[13364]: ✔️ UNRAID API started successfully!

 

Not sure if this is helpful, but to me it appeared that the issue was caused by (or coincided with) use of the web terminal.
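
For reference, a shorter way to see which container logs are large, without the colour handling of the one-liner above. This is only a sketch: it relies on the same `/var/lib/docker/containers/<id>/<id>-json.log` layout that the one-liner uses, prints container IDs rather than names, and `json_log_sizes` is a made-up function name.

```shell
#!/bin/bash
# json_log_sizes [ROOT]: print "<KiB><tab><path>" for each container JSON log,
# largest first.
json_log_sizes() {
    local root="${1:-/var/lib/docker/containers}"
    # Same file layout the one-liner relies on: <root>/<id>/<id>-json.log
    find "$root" -mindepth 2 -maxdepth 2 -type f -name '*-json.log' \
        -exec du -k {} + 2>/dev/null | sort -rn
}

json_log_sizes
```

Mapping an ID back to a container name still needs `docker inspect --format='{{.Name}}' <id>`, as in the original command.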

Link to comment
On 11/13/2021 at 4:14 PM, Squid said:

Q.  Are you guys perchance leaving the dashboard open on a browser tab and never closing it?

Well, yes, pretty much. I like to keep an eye on my server with multiple tabs for main, dashboard and several dockers opened up, normally for several hours a day.

Link to comment
12 hours ago, huskycdn said:

Well, yes, pretty much. I like to keep an eye on my server with multiple tabs for main, dashboard and several dockers opened up, normally for several hours a day.

This issue will hopefully disappear once 6.10 is released.  After a certain amount of time, Main / Dashboard will reload themselves

Link to comment
39 minutes ago, Squid said:

This issue will hopefully disappear once 6.10 is released.  After a certain amount of time, Main / Dashboard will reload themselves

Is @limetech even looking at this?  He chimed in a while ago and then crickets...  There really are a lot of us that have this issue.  Now I don't know if this "a lot" is still in the noise compared to the total number of unraid users, so it's not a big enough issue...

I've just had this happen twice last week...  And then sometimes it goes for months without hitting this.. *sigh*

Link to comment

Most reports do not get replied to.  That doesn't mean nothing is being looked at.  As I said

3 hours ago, Squid said:

After a certain amount of time, Main / Dashboard will reload themselves

Which should solve at least some of it, since running out of shared memory was one of the symptoms that resulted in the above.

Link to comment
3 hours ago, Squid said:

This issue will hopefully disappear once 6.10 is released.  After a certain amount of time, Main / Dashboard will reload themselves

7 minutes ago, Squid said:

Most reports do not get replied to.  That doesn't mean nothing is being looked at.  As I said

Which should solve at least some of it, since running out of shared memory was one of the symptoms that resulted in the above.

 

I normally don't have main or dashboard open.  For me, it's the docker window and a couple of the docker web interfaces.

And in the past it was open web terminals (I only use putty now!). So I don't know if the 6.10 release will fix it for everyone..

Fingers crossed that it will though!!! And I do hope that they are looking into this as it's slightly annoying...  Not bad enough to bug them constantly...  but always hoping for a post here from them saying that they found the issue! 🙂

 

Link to comment
  • 2 weeks later...

Having the same issue here on 6.9.2, these commands did resolve it without a reboot:

 

On 11/13/2021 at 9:06 PM, vforge said:
/usr/sbin/nginx -s stop
/usr/sbin/nginx
/etc/rc.d/rc.php-fpm restart

 

I do leave the dashboard open on some devices but would think there should be a way to do this without causing a catastrophe, perhaps it could timeout open dashboards after a certain time if it's causing a memory leak.

Link to comment
