• [6.12] Unraid webui stop responding then Nginx crash


    H3ms
    • Retest Urgent

    Hi,

     

    Another day, another problem haha.

     

    Since the update to 6.12 (RC6 was OK) my webui stops working correctly.

    The webui stops refreshing itself automatically (I need F5 to refresh the data on screen).

     

    Then finally, nginx crashes and the interface stops responding (on port 8080 for me). I have to SSH into the server to find the nginx PID, kill it and then start nginx.
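    For anyone else stuck at that point, a rough sketch of the SSH recovery steps (the pid-file location and the rc script path are assumptions based on a stock Unraid/Slackware layout — verify them on your own box):

```shell
#!/bin/sh
# Locate the wedged nginx master: prefer the pid file, fall back to pgrep.
pid=$(cat /var/run/nginx.pid 2>/dev/null || pgrep -o -x nginx)

if [ -n "$pid" ]; then
    kill "$pid"               # stop the hung master
    sleep 1
    /etc/rc.d/rc.nginx start  # Slackware-style init script (assumed path)
else
    echo "nginx not running"
fi
```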

     

    I've set up a syslog server earlier, but I don't find anything related to this except:

    2023/06/15 20:42:50 [alert] 14427#14427: worker process 15209 exited on signal 6
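    As a quick triage step, those worker aborts can be counted straight from the log; a sketch (sample lines inlined here — on a live server, point the grep at /var/log/syslog instead):

```shell
#!/bin/sh
# Two sample syslog lines stand in for the real log file.
cat > /tmp/nginx_crashes.sample <<'EOF'
2023/06/15 20:42:50 [alert] 14427#14427: worker process 15209 exited on signal 6
2023/06/15 20:43:12 [alert] 14427#14427: worker process 15344 exited on signal 6
EOF

# Signal 6 is SIGABRT: the worker aborted itself rather than being killed.
grep -c 'exited on signal 6' /tmp/nginx_crashes.sample   # → 2
```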

     

    I attached the syslog and the diag file.

     

    Thx in advance.

     

    nas-icarus-diagnostics-20230615-2146.zip syslog




    User Feedback

    Recommended Comments



    53 minutes ago, SpaceInvader said:

    Another thing I just noticed is that with IPv6 enabled, there is always this /usr/sbin/atd process running an nginx reload every three seconds.

     

    This is triggered because the DHCP client on your system is continuously adding and removing IP addresses.

    It is not yet clear to me why this is happening, though. Still studying the diagnostics to find a clue.
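    If it helps, that churn should show up in the syslog as rapid add/delete pairs from the DHCP client. A sketch (sample lines inlined, with a documentation-prefix address as placeholder; the exact dhcpcd message wording is an assumption — grep your real /var/log/syslog):

```shell
#!/bin/sh
# Sample lines standing in for a real syslog capture.
cat > /tmp/dhcp_churn.sample <<'EOF'
Jun 15 20:42:10 nas dhcpcd[1701]: br0: deleting address 2001:db8::1/128
Jun 15 20:42:13 nas dhcpcd[1701]: br0: adding address 2001:db8::1/128
Jun 15 20:42:16 nas dhcpcd[1701]: br0: deleting address 2001:db8::1/128
EOF

# A count that keeps growing means the client is flapping the address.
grep -Ec 'dhcpcd.*(adding|deleting) address' /tmp/dhcp_churn.sample   # → 3
```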

     

    I have seen similar behavior in other diagnostics (together with failing services), and disabling IPv6 and running IPv4 only solves the problem.

     

     

    2 hours ago, bonienl said:

    @SpaceInvader

    Can you test IPv6 and start the system in safe mode?

     

     

    IPv6 connectivity definitely works in both directions. The NAS has internet access and can be reached from the local network through IPv6. The reason I want to use IPv6 in the first place is that it is more reliable and faster with my provider, since they use DS-Lite.

    Interesting that it is continuously changing IPs; I didn't see that in the log.

    I'll reboot it in safe mode with IPv6 on and will add the diagnostics if it crashes again.


    So it just crashed in safe mode. I also disabled my VPN before booting into safe mode, to exclude that as well.

    The crash was at about 2:48 in the log. At about 2:32 I tried enabling and then disabling NFS again, since it also seems weird that the NFS/RPC stuff attempts to start even though it is disabled in the UI.

    The array was stopped during this.

    The nginx error log is also attached, since I think it is not included in the diag zip and it contains more lines than the syslog.

     

    -------

    I also just did another test with my second network interface disabled and only IPv6 enabled (instead of both). It crashed the same way; see the attached file. There are also a bunch of these messages:

     emhttpd: error: get_limetech_time, 251: Connection timed out (110): -2 (7)

    which seems to be an unrelated bug, with some Unraid server not supporting IPv6.

     

    I'm out of ideas, since I basically disabled everything possible and it still crashes.

     

    nas-diagnostics-20230717-0250.zip nginx_error.log nas-diagnostics-20230717-0412_only_ipv6.zip

    On 6/25/2023 at 10:22 AM, H3ms said:

    No crash since I'm not using the webgui at all...

    But I have a new error appearing a lot:

     [2023/06/25 16:19:13.971533,  0] ../../source3/nmbd/nmbd_packets.c:761(queue_query_name)
    Jun 25 16:19:13 NAS-ICARUS nmbd[21478]:   queue_query_name: interface 2 has NULL IP address !

     

    nas-icarus-diagnostics-20230625-1619.zip

     

    Is there an update on this error? I'm seeing it in my logs; it doesn't look to be critical, just a nuisance?

     

    What is interface 2 - is that ETH1? 

    In my case, ETH1 is bonded to ETH0, so it won't have its own IP address?
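    One way to check is to list the kernel's interface indices against their names (a sketch; note nmbd may number interfaces by its own internal order rather than the kernel ifindex, so treat this as a hint rather than a definitive mapping):

```shell
#!/bin/sh
# Print "<ifindex>: <name>" for every network device the kernel knows about.
for d in /sys/class/net/*; do
    printf '%s: %s\n' "$(cat "$d/ifindex")" "$(basename "$d")"
done
```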

    7 hours ago, SpaceInvader said:

    So it just crashed in safe mode. I also disabled my VPN before booting into safe mode, to exclude that as well.

     

    Thanks for testing

     

    8 hours ago, SpaceInvader said:

    I tried enabling and then disabling NFS again

     

    This is weird in your logs: because NFS is not enabled, it should not get started at all. Yet it does, and it gives lots of errors.

    In my testing enabling / disabling NFS in the GUI gives the correct behavior when starting the system. I never get the errors seen in your log. I checked all your config files and can't find anything wrong. A mystery!

     

    Somehow I think the RPC/NFS errors are related to the NGINX errors; in other words, the source of these errors makes these services fail.

     

    8 hours ago, SpaceInvader said:

    The nginx error log is also attached,

     

    Can you post the content of the file /etc/nginx/conf.d/servers.conf?

     

    8 hours ago, SpaceInvader said:

    which seems to be an unrelated bug,

     

    Yeah, unrelated - it has nothing to do with the problem. The Limetech site doesn't respond over IPv6.

     

    8 hours ago, SpaceInvader said:

    I'm out of ideas,

     

    We keep on investigating this issue.

     

    6 hours ago, coolspot said:

    Is there an update on this error? I'm seeing it in my logs, it doesn't look to be critical, just a nuisance?

     

    This is a bug in netbios. You can ignore the message, or disable netbios to stop it.
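    For reference, the GUI toggle should map to Samba's global setting (an assumed mapping — the supported way on Unraid is the NetBIOS option under the SMB settings page):

```ini
# smb.conf fragment — the Samba equivalent of disabling NetBIOS (assumed mapping)
[global]
    disable netbios = yes
```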

     


    I think I might be seeing a similar issue after updating to 6.12. I've been documenting it in another thread (possibly the wrong place):

     

     


    I haven't updated to 6.12.3 yet, since everything except the webui is working and it's laborious to shut all the services and VMs down and start them up again. The strange thing is that I have another Unraid server with an almost identical config, on the same network as the server whose webui is crashing. I upgraded Unraid on both at the same time, but server 2 has been running without webui crashes since the update.

    The difference between them is that Unraid server 1 is the "main" server with more containers and also VMs, while server 2 is just file storage. I don't know if that has anything to do with the problem.

     

    Unraid server 1 (the one with webui crashing):

    Containers:

    • elasticsearch
    • piwigo
    • endlessh
    • netdata
    • diskover
    • mariadb
    • mysql
    • Plex
    • deluge

    Plugins:

    • community.applications.plg - 2023.07.03  (Up to date)
    • dynamix.active.streams.plg - 2023.02.19  (Up to date)
    • dynamix.cache.dirs.plg - 2023.02.19  (Up to date)
    • dynamix.file.integrity.plg - 2023.03.26  (Up to date)
    • dynamix.system.temp.plg - 2023.02.04b  (Up to date)
    • file.activity.plg - 2023.06.15  (Up to date)
    • fix.common.problems.plg - 2023.04.26  (Up to date)
    • open.files.plg - 2023.06.12  (Up to date)
    • tips.and.tweaks.plg - 2023.07.05  (Up to date)
    • unbalance.plg - v2021.04.21  (Up to date)
    • unRAIDServer.plg - 6.12.2
    • user.scripts.plg - 2023.03.29  (Up to date)

    VMs:

    • 3 VMs running

     

    Unraid2 (stable):

    Containers:

    • netdata

    Plugins:

    • community.applications.plg - 2023.07.21  (Update available: 2023.07.03)
    • dynamix.file.integrity.plg - 2023.03.26  (Up to date)
    • fix.common.problems.plg - 2023.04.26  (Update available: 2023.07.16)
    • unassigned.devices.plg - 2023.07.04  (Update available: 2023.07.16)
    • unassigned.devices-plus.plg - 2023.04.15  (Up to date)
    • unbalance.plg - v2021.04.21  (Up to date)
    • unRAIDServer.plg - 6.12.2
    • user.scripts.plg - 2023.03.29  (Update available: 2023.07.16)

    VMs:

    • No VMs
    8 hours ago, bonienl said:

    Can you post the content of the file /etc/nginx/conf.d/servers.conf?

    The file is attached. Btw, the webui is accessible directly over both IPv6 and IPv4.

     

    That I can't stop RPC/NFS from starting also seems very weird to me. Maybe I'll try a completely clean install on another USB stick later to see if something is going on with my install.

    servers.conf

     

     

    So after testing the fresh install I just created with the USB Creator, I got the exact same issue! The only thing I did was enable IPv4+IPv6 and SSH.

    diagnostics-fresh-install.zip


    I would like to add another data point for this issue.

    I'm new to Unraid and set it up just two weeks ago. I started with 6.12.2. Everything worked great until two days ago.

    The symptom is exactly the same as in this post: the webgui crashed after a few hours of running, and I have to kill nginx and restart it to access the webgui again. More interestingly, I restarted my server the first time this issue occurred; even though the array wasn't running and none of the Dockers and VMs had started, the webgui still crashed.

    After reading through this post, I highly suspect this is related to IPv6, because as far as I remember the last thing I did on my server was set up the qBittorrent docker. One step of that was enabling IPv6 in my Unraid network settings, since the BT tracker I'm using only allows IPv6 connections.

    Let me know if I should attach anything here or make a new post to help solve this issue. Thanks!

    1 hour ago, SpaceInvader said:

    So after testing the fresh install I just created with the USB Creator, I got the exact same issue! The only thing I did was enable IPv4+IPv6 and SSH.

    diagnostics-fresh-install.zip

    I think we end up with the same conclusion. At this point I wouldn't worry about SSH, since I hadn't enabled SSH yet when my first crash happened.

    26 minutes ago, sbihero said:

    I think we end up with the same conclusion. At this point I wouldn't worry about SSH, since I hadn't enabled SSH yet when my first crash happened.

     

    Yeah, the only reason I enabled SSH was to be able to use the diagnostics command and restart nginx without a reboot, since a reboot would lose the syslog.

    The fresh install I just tested also did not have any storage devices set up, and I hadn't even activated the trial (so a bunch of stuff stayed disabled).

     

    --------------

    Big update! I just found that setting the address assignment to static for IPv6 resolves the atd process reloading nginx, and there are no more entries in the nginx error log.

    (screenshot attached: grafik.png)

    The address settings are unmodified from the suggested values.

     

    So this probably means there is an issue with the Unraid DHCP client for IPv6.

    1 hour ago, SpaceInvader said:

    So after testing the fresh install I just created with the USB Creator, I got the exact same issue!

     

    It is not exactly the same. In this fresh install NFS is not started (correct) because it isn't enabled, while your previous log showed NFS getting started (wrong) with lots of errors.

     

    It is really strange that from the moment nginx is started there are errors; I can't explain that.

    IPv4 and IPv6 both look alright.

     

    Don't know if it is related to your ethernet controller card, which is a Realtek 2.5G version (but operating on 1G).

    If you have another NIC, perhaps it is worth a try.

     

    1 minute ago, bonienl said:

    Don't know if it is related to your ethernet controller card, which is a Realtek 2.5G version (but operating on 1G).

    If you have another NIC, perhaps it is worth a try.

    In my other tests with the full OS I actually used the 1Gig NIC (PCI card) to connect to the router. In my usual setup the 2.5Gbit NIC is in bridge mode to my PC. But I tested both individually, with the other disabled.

     

    I don't know if you saw my last message yet, but it seems to be related to the DHCP function specifically.

    34 minutes ago, SpaceInvader said:

    Big update! I just found that setting the address assignment to static for IPv6 resolves the atd process reloading nginx, and there are no more entries in the nginx error log.

     

    Wow, great find. Let me digest this and see if I can come up with a possible solution.

     

    Quote

    I don't know if you saw my last message yet, but it seems to be related to the DHCP function specifically.

     

    Maybe, maybe not, but at least we have a pointer to work on.

    Thx

     

    10 minutes ago, bonienl said:

    @SpaceInvader

    Question: when you change back to DHCP assignment, does the problem come back too?

     

    Yes, it comes back, and when setting it to static again it goes away again.

    I attached a log where I switch from static to automatic at 0:47, which results in the nginx errors again, and then switch back at 0:49.

     

    nas-diagnostics-20230718-0050.zip


    I just had my server go unresponsive. Things had been going fine for several hours; then I started the Jellyfin container and the whole server went tango uniform. As usual for me when my server is in this state, there's no way to recover the diagnostics, but I did capture this on my syslog server:

    Jul 17 20:00:57	unRAID	kern	info	kernel	docker0: port 2(veth52713cd) entered blocking state
    Jul 17 20:00:57	unRAID	kern	info	kernel	docker0: port 2(veth52713cd) entered disabled state
    Jul 17 20:00:57	unRAID	kern	info	kernel	device veth52713cd entered promiscuous mode
    Jul 17 20:01:07	unRAID	kern	info	kernel	eth0: renamed from vethb8bbec1
    Jul 17 20:01:07	unRAID	kern	info	kernel	IPv6: ADDRCONF(NETDEV_CHANGE): veth52713cd: link becomes ready
    Jul 17 20:01:07	unRAID	kern	info	kernel	docker0: port 2(veth52713cd) entered blocking state
    Jul 17 20:01:07	unRAID	kern	info	kernel	docker0: port 2(veth52713cd) entered forwarding state
    Jul 17 20:01:16	unRAID	daemon	warning	php-fpm[7233]	[WARNING] [pool www] child 30889 exited on signal 9 (SIGKILL) after 43.290647 seconds from start
    Jul 17 20:01:18	unRAID	daemon	warning	php-fpm[7233]	[WARNING] [pool www] child 30890 exited on signal 9 (SIGKILL) after 45.306223 seconds from start
    Jul 17 20:01:20	unRAID	daemon	warning	php-fpm[7233]	[WARNING] [pool www] child 30891 exited on signal 9 (SIGKILL) after 47.270700 seconds from start
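    Those php-fpm lines can be summarized with a bit of awk if anyone wants to compare timings (sample lines inlined; the field offsets assume this exact syslog layout):

```shell
#!/bin/sh
# Two sample lines standing in for the syslog server's capture.
cat > /tmp/phpfpm.sample <<'EOF'
Jul 17 20:01:16 unRAID daemon warning php-fpm[7233] [WARNING] [pool www] child 30889 exited on signal 9 (SIGKILL) after 43.290647 seconds from start
Jul 17 20:01:18 unRAID daemon warning php-fpm[7233] [WARNING] [pool www] child 30890 exited on signal 9 (SIGKILL) after 45.306223 seconds from start
EOF

# Signal 9 is SIGKILL: something outside php-fpm killed these children.
awk '/SIGKILL/ { printf "child %s killed after %ss\n", $(NF-10), $(NF-3) }' /tmp/phpfpm.sample
# → child 30889 killed after 43.290647s
# → child 30890 killed after 45.306223s
```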

     

     

    When I get it back up I'll either disable IPv6 entirely or follow SpaceInvader's instructions above and see if it resolves things for me.

     

    -edit

     

    Turns out I already had it set to IPv4 only. I tried changing it to IPv4+IPv6 and setting the IPv6 to static using the settings in SpaceInvader's post, but then I got an error starting Docker about 2a02:908:1060:c0::7c31 being an invalid address. So I was unable to try that.

    19 hours ago, stanger89 said:

    Turns out I already had it set to IPv4 only. I tried changing it to IPv4+IPv6 and setting the IPv6 to static using the settings in SpaceInvader's post, but then I got an error starting Docker about 2a02:908:1060:c0::7c31 being an invalid address. So I was unable to try that.

    You can't set your IPv6 address to the same one as in someone else's post. Set your box to "automatic assignment".


    Unfortunately I've got IPv6 disabled on my router too, so automatic assignment didn't populate with anything.  I've reverted back to 6.11.5 until all this gets sorted out.


    I'm on 6.12.3 and I still get this behavior:

     

    Jul 19 03:37:45 trantor nginx: 2023/07/19 03:37:45 [alert] 30307#30307: worker process 6589 exited on signal 6
    Jul 19 03:37:46 trantor nginx: 2023/07/19 03:37:46 [alert] 30307#30307: worker process 6769 exited on signal 6
    Jul 19 03:37:48 trantor nginx: 2023/07/19 03:37:48 [alert] 30307#30307: worker process 6819 exited on signal 6
    Jul 19 03:37:50 trantor nginx: 2023/07/19 03:37:50 [alert] 30307#30307: worker process 6888 exited on signal 6
    Jul 19 03:37:55 trantor nginx: 2023/07/19 03:37:55 [alert] 30307#30307: worker process 7004 exited on signal 6
    Jul 19 03:37:58 trantor nginx: 2023/07/19 03:37:58 [alert] 30307#30307: worker process 7256 exited on signal 6

     

    Starting a cloudflared docker seems to trigger this, but it doesn't get fixed immediately after stopping the cloudflared docker container either.


    Since setting the IPv6 config to static it has been stable! The NFS service also stopped attempting to start.
    Unfortunately I can't keep it static forever, since my provider semi-regularly changes the assigned prefix.

     

    I think the issue stanger89 has is different, since this did not occur for me.

    On 7/18/2023 at 3:19 AM, stanger89 said:
    Jul 17 20:01:16	unRAID	daemon	warning	php-fpm[7233]	[WARNING] [pool www] child 30889 exited on signal 9 (SIGKILL) after 43.290647 seconds from start

    as opposed to these messages when it crashes for me:

    Quote

    [alert] 5835#5835: worker process 25751 exited on signal 6
    [alert] 5835#5835: shared memory zone "memstore" was locked by 25751

    Also, only the webui is unresponsive; SSH and other services still work after the webui crashes.


    After upgrading to 6.12.3, I still have problems with nginx and IPv6. I'm tired and have decided to give up. I will rebuild my system with TrueNAS this weekend.

    On 7/20/2023 at 1:48 AM, Beermedlar said:

    After upgrading to 6.12.3, I still have problems with nginx and IPv6. I'm tired and have decided to give up. I will rebuild my system with TrueNAS this weekend.

    I've done the same; my patience has been exhausted. Now running version 6.11.5 - so far so good.






  • Status Definitions

     

    Open = Under consideration.

     

    Solved = The issue has been resolved.

     

    Solved version = The issue has been resolved in the indicated release version.

     

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

     

    Retest = Please retest in latest release.


    Priority Definitions

     

    Minor = Something not working correctly.

     

    Urgent = Server crash, data loss, or other showstopper.

     

    Annoyance = Doesn't affect functionality but should be fixed.

     

    Other = Announcement or other non-issue.