Nginx Errors Filling Logs



I had this issue (among other weirdness) with a previous version of UnRaid, so I decided to start fresh: I installed a new version of UnRaid and imported only my data. I rebuilt my Docker containers (did not reuse docker.img) but did reuse my libvirt.img, so my VMs came across with no issues.

 

After everything was said and done, everything was working great for a couple days. This morning though once again I'm seeing these errors in the syslog which I was also seeing in the logs before I did the reinstall:

 

Apr  7 06:55:06 Unraid-Host nginx: 2020/04/07 06:55:06 [alert] 25415#25415: worker process 4423 exited on signal 11
Apr  7 06:55:07 Unraid-Host kernel: nginx[4424]: segfault at 38 ip 00000000004dc37e sp 00007ffd526fdad0 error 6 in nginx[421000+105000]
Apr  7 06:55:07 Unraid-Host kernel: Code: 5c 41 5d 41 5e c3 66 0f 1f 44 00 00 48 8b 7c 24 08 e8 c6 d4 f4 ff 49 89 c6 48 85 c0 0f 88 aa 01 00 00 48 89 ef e8 32 c0 02 00 <4c> 89 70 08 66 0f 1f 44 00 00 b8 01 00 00 00 48 83 c4 10 5b 5d 41
Apr  7 06:55:07 Unraid-Host nginx: 2020/04/07 06:55:07 [alert] 25415#25415: worker process 4424 exited on signal 11
Apr  7 06:55:07 Unraid-Host kernel: nginx[4428]: segfault at 38 ip 00000000004dc37e sp 00007ffd526fdad0 error 6 in nginx[421000+105000]
Apr  7 06:55:07 Unraid-Host kernel: Code: 5c 41 5d 41 5e c3 66 0f 1f 44 00 00 48 8b 7c 24 08 e8 c6 d4 f4 ff 49 89 c6 48 85 c0 0f 88 aa 01 00 00 48 89 ef e8 32 c0 02 00 <4c> 89 70 08 66 0f 1f 44 00 00 b8 01 00 00 00 48 83 c4 10 5b 5d 41
Apr  7 06:55:07 Unraid-Host nginx: 2020/04/07 06:55:07 [alert] 25415#25415: worker process 4428 exited on signal 11
Apr  7 06:55:07 Unraid-Host kernel: nginx[4430]: segfault at 38 ip 00000000004dc37e sp 00007ffd526fdad0 error 6 in nginx[421000+105000]
Apr  7 06:55:07 Unraid-Host kernel: Code: 5c 41 5d 41 5e c3 66 0f 1f 44 00 00 48 8b 7c 24 08 e8 c6 d4 f4 ff 49 89 c6 48 85 c0 0f 88 aa 01 00 00 48 89 ef e8 32 c0 02 00 <4c> 89 70 08 66 0f 1f 44 00 00 b8 01 00 00 00 48 83 c4 10 5b 5d 41
Apr  7 06:55:07 Unraid-Host emhttpd: error: publish, 244: Connection reset by peer (104): read
Apr  7 06:55:07 Unraid-Host nginx: 2020/04/07 06:55:07 [alert] 25415#25415: worker process 4430 exited on signal 11
Apr  7 06:55:07 Unraid-Host kernel: nginx[4431]: segfault at 38 ip 00000000004dc37e sp 00007ffd526fdad0 error 6 in nginx[421000+105000]
Apr  7 06:55:07 Unraid-Host kernel: Code: 5c 41 5d 41 5e c3 66 0f 1f 44 00 00 48 8b 7c 24 08 e8 c6 d4 f4 ff 49 89 c6 48 85 c0 0f 88 aa 01 00 00 48 89 ef e8 32 c0 02 00 <4c> 89 70 08 66 0f 1f 44 00 00 b8 01 00 00 00 48 83 c4 10 5b 5d 41

 

When I see these logs, I'm not able to get a consistent connection to the UI. It hangs and I have to reload the page multiple times. When I then restart Nginx, I see this in the logs indefinitely:

 

Apr  7 06:56:35 Unraid-Host nginx: 2020/04/07 06:56:35 [alert] 5073#5073: worker process 6418 exited on signal 6
Apr  7 06:56:37 Unraid-Host nginx: 2020/04/07 06:56:37 [alert] 5073#5073: worker process 6481 exited on signal 6
Apr  7 06:56:39 Unraid-Host nginx: 2020/04/07 06:56:39 [alert] 5073#5073: worker process 6506 exited on signal 6
Apr  7 06:56:41 Unraid-Host nginx: 2020/04/07 06:56:41 [alert] 5073#5073: worker process 6520 exited on signal 6
Apr  7 06:56:43 Unraid-Host nginx: 2020/04/07 06:56:43 [alert] 5073#5073: worker process 6594 exited on signal 6
Apr  7 06:56:45 Unraid-Host nginx: 2020/04/07 06:56:45 [alert] 5073#5073: worker process 6599 exited on signal 6
Apr  7 06:56:47 Unraid-Host nginx: 2020/04/07 06:56:47 [alert] 5073#5073: worker process 6696 exited on signal 6
Apr  7 06:56:49 Unraid-Host nginx: 2020/04/07 06:56:49 [alert] 5073#5073: worker process 6775 exited on signal 6
Apr  7 06:56:51 Unraid-Host nginx: 2020/04/07 06:56:51 [alert] 5073#5073: worker process 6780 exited on signal 6

This causes my /var/log to fill up very quickly and the UI of my UnRaid server to be sluggish. Does anyone happen to be able to point me in the right direction? Running version 6.8.3. 

 

Thanks

 

  • 2 weeks later...
  • 1 month later...

So I decided to build a new UnRaid host since I kept having issues with this one. The new host has a 3700X, 32GB of RAM and a new motherboard. I moved the USB drive over as is, since it was a fresh install from when I rebuilt it 3 months ago, along with the same disks (3x8TB data, 2x500GB SSD cache). After 48 hours I'm getting the same errors in the logs, Nginx is crashing, and the UI is extremely sluggish.

 

These are the two logs that are filling up /var/log within hours. Sometimes restarting Nginx fixes everything, other times it requires a full reboot of the Host:

 

root@Unraid-Host:~# tail -n 5 /var/log/nginx/error.log
2020/06/09 09:21:46 [alert] 30084#30084: worker process 26338 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/06/09 09:21:47 [alert] 30084#30084: worker process 26341 exited on signal 6
ker process: ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
2020/06/09 09:21:49 [alert] 30084#30084: worker process 26346 exited on signal 6
root@Unraid-Host:~# tail -n 5 /var/log/syslog
Jun  9 09:21:49 Unraid-Host nginx: 2020/06/09 09:21:49 [alert] 30084#30084: worker process 26346 exited on signal 6
Jun  9 09:21:50 Unraid-Host nginx: 2020/06/09 09:21:50 [alert] 30084#30084: worker process 26490 exited on signal 6
Jun  9 09:21:51 Unraid-Host nginx: 2020/06/09 09:21:51 [alert] 30084#30084: worker process 26494 exited on signal 6
Jun  9 09:21:52 Unraid-Host nginx: 2020/06/09 09:21:52 [alert] 30084#30084: worker process 26496 exited on signal 6
Jun  9 09:21:53 Unraid-Host nginx: 2020/06/09 09:21:53 [alert] 30084#30084: worker process 26504 exited on signal 6
root@Unraid-Host:~#
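
When the log is growing this fast, a summary can be easier to share than a raw tail. The following is a minimal sketch, not an Unraid tool: it assumes Python 3 is available (on the server or on a machine the log has been copied to), and `summarize` is a hypothetical helper name. It counts worker exits per signal (on Linux, signal 11 is SIGSEGV and signal 6 is SIGABRT) and collects the distinct assertion messages, using only the line formats shown in the output above.

```python
import re
from collections import Counter

# Matches lines such as:
#   2020/06/09 09:21:46 [alert] 30084#30084: worker process 26338 exited on signal 6
EXIT_RE = re.compile(r"worker process (\d+) exited on signal (\d+)")

# Matches assertion lines such as:
#   ./nchan-1.2.6/src/store/spool.c:479: spool_fetch_msg: Assertion `...' failed.
ASSERT_RE = re.compile(r"(\S+\.c):(\d+): (\w+): Assertion `(.+?)' failed")


def summarize(lines):
    """Count worker exits per signal and tally distinct assertion failures.

    `lines` is any iterable of log lines (an open file works).
    Returns (exits, assertions), both collections.Counter objects.
    """
    exits = Counter()
    assertions = Counter()
    for line in lines:
        exit_match = EXIT_RE.search(line)
        if exit_match:
            exits[int(exit_match.group(2))] += 1
        assert_match = ASSERT_RE.search(line)
        if assert_match:
            source_file = assert_match.group(1)
            line_no = int(assert_match.group(2))
            expression = assert_match.group(4)
            assertions[(source_file, line_no, expression)] += 1
    return exits, assertions
```

Calling `summarize(open("/var/log/nginx/error.log", errors="replace"))` would return the two counters for the log shown above; the same function also works on `/var/log/syslog`, since the regular expressions ignore the syslog prefix.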

I've also been trying to collect a new diagnostics file, but I'm unable to do so right now as the UI just hangs, even after restarting Nginx. I will try to collect new ones tonight when I reboot it.

Edited by Frostbite2600
Adding info at the end

I started seeing a couple of SMART errors on one of my 512GB SSD cache drives, so I went ahead and got 2x1TB SSDs and invoked the mover to move all files off of the cache. Once I had replaced the drives, the mover moved everything back to the cache. After 48 hours the errors are back and the UI is locking up.

 

I'm unable to get diagnostics at this time since the UI won't allow me to download while these errors are occurring, though it's the same errors spamming the logs as above. 

 

Anyone have any suggestions by chance?

25 minutes ago, johnnie.black said:

You are overclocking the RAM considering the CPU and the number of DIMMs used (see here); this is known to cause stability issues with Ryzen. Since 2 DIMMs are single rank and the others dual rank, I'm not sure which speed is correct, 1866 or 2133. Probably 1866, and that's where I would start.

According to the link you showed, 3rd gen Ryzen should be able to handle DDR4-3200 when 2 of 4 slots are in use, which is my configuration. Mine is running at the rated speed for the specific RAM that I have (2666):

 

root@Unraid-Host:~# dmidecode --type memory | grep -A 5 "Manufacturer: Kingston" | grep -v Serial
	Manufacturer: Kingston
	Asset Tag: Not Specified
	Part Number: KHX2666C16/16G
	Rank: 2
	Configured Memory Speed: 2667 MT/s
--
	Manufacturer: Kingston
	Asset Tag: Not Specified
	Part Number: KHX2666C16/16G
	Rank: 2
	Configured Memory Speed: 2667 MT/s
root@Unraid-Host:~#

Are you saying that even though I'm well within the acceptable range for RAM frequency, I should try clocking it slower to see whether that changes anything?
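
The grep above filters on one manufacturer string, so it would miss DIMMs from another vendor. As a rough alternative, here is a minimal sketch that reads the full `dmidecode --type memory` text and lists every populated slot. The function names are hypothetical, Python 3 is assumed to be available, and the field names `Size` and `Locator` are assumed to appear in the output alongside the fields shown above (older dmidecode versions may label the speed field differently).

```python
def parse_memory_devices(text):
    """Return one dict of 'Key: Value' fields per 'Memory Device' block."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            current = {}
            devices.append(current)
        elif not line or line.startswith("Handle "):
            current = None  # a blank line or a new handle ends the block
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    return devices


def populated_speeds(text):
    """List (locator, part number, rank, configured speed) per populated slot."""
    rows = []
    for dev in parse_memory_devices(text):
        if dev.get("Size", "").startswith("No Module"):
            continue  # empty slot
        rows.append((
            dev.get("Locator"),
            dev.get("Part Number"),
            dev.get("Rank"),
            dev.get("Configured Memory Speed"),
        ))
    return rows
```

The input text could come from `subprocess.run(["dmidecode", "--type", "memory"], capture_output=True, text=True, check=True).stdout` when run as root, or from a saved copy of the command's output.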

16 minutes ago, Frostbite2600 said:

According to the link you showed, 3rd gen Ryzen should be able to handle DDR4-3200

Then you posted the wrong diags:

 

Apr 17 18:10:48 Unraid-Host kernel: smpboot: CPU0: AMD Ryzen 7 1800X Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)

 


Yeah, those are the original diags. I can't get new ones to post because the UI won't allow me to download them. After removing everything else from the equation I went out and bought a 3700X and a new motherboard, and the issues are still happening after 2-3 days.

On 6/9/2020 at 11:22 AM, Frostbite2600 said:

So I decided to build a new UnRaid host since I kept having issues with this one. The new host has a 3700X, 32GB of RAM and a new motherboard. I moved the USB drive over as is, since it was a fresh install from when I rebuilt it 3 months ago, along with the same disks (3x8TB data, 2x500GB SSD cache). After 48 hours I'm getting the same errors in the logs, Nginx is crashing, and the UI is extremely sluggish.

 

3 hours ago, Frostbite2600 said:

Yeah that's the original diags, I can't get new ones to post because the UI won't allow me to download them.

Can you get diagnostics immediately after booting? That would be better than nothing. You might also try setting up the Syslog Server so you can capture the syslog from before the hang:

 

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=781601

 

5 hours ago, trurl said:

Can you get diagnostics immediately after booting? That would be better than nothing. You might also try setting up the Syslog Server so you can capture the syslog from before the hang:

 

https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=781601

 

Attached! As for the syslog: the UnRaid server itself never stops responding. I can still SSH into it, and all my VMs and containers are running as expected. It's just the UI that's sluggish and won't let me download the diagnostics bundle, even after restarting Nginx. I'm still able to access all logs on the UnRaid server and can manually grab and tar them if needed.

 

Thanks!

unraid-host-diagnostics-20200615-2113.zip
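
Grabbing the logs by hand over SSH, as mentioned above, could look roughly like the sketch below. It is not a replacement for the diagnostics bundle (which includes more than logs), and it assumes Python 3 is available on the host. `bundle_logs` is a hypothetical helper name, and the destination directory is whatever persistent location suits you.

```python
import tarfile
import time
from pathlib import Path


def bundle_logs(paths, dest_dir, prefix="manual-logs"):
    """Write the given files into a timestamped .tar.gz and return its path.

    Files that do not exist are skipped. Each file is stored under its base
    name, so two inputs with the same file name would collide.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M")
    archive = dest / f"{prefix}-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for path in map(Path, paths):
            if path.exists():
                tar.add(str(path), arcname=path.name)
    return archive
```

For the two files shown earlier in this thread, a call such as `bundle_logs(["/var/log/syslog", "/var/log/nginx/error.log"], "/boot/logs")` would write the archive to the flash drive, assuming `/boot` is where the flash device is mounted and has enough free space.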


This is a tricky one.  I think we need to go step by step in testing to verify the source of the issue.  First, disable your plugins, your VMs, and your Docker Containers and reboot.  Let the system just idle with the array started for a while and see if the server remains responsive.  Then turn on your Docker containers.  If you run into issues, reboot with docker disabled, and start turning containers on one by one.  If you can get all containers running, next we move on to VMs.  There is just nothing glaring in the logs immediately before all the error messages start showing up, so we need to resort to isolating the Apps and VMs to figure it out.


So I built another temporary UnRaid host to migrate my VMs to. On the existing UnRaid server I stopped everything, disabled VMs, disabled Docker, removed all plugins and rebooted, so it's literally just a NAS doing nothing else. The errors came back after a couple of days like clockwork. Diagnostics for that are attached.

 

The temporary host is my old 1800X with a fresh installation of UnRaid. I SCP'd each VM's disk image over and created new VMs pointing to the images, so that I wasn't bringing a corrupt libvirt image or anything else over. I also brought over a single Docker container (UniFi, for my network) by SCP'ing its appdata folder and installing the container. After a couple of days, this host also started throwing the same errors. Interestingly, they stopped after a day or so, but I have a feeling they will come back. Exact same errors.

 

Both diagnostics are attached. The "Unraid-Host" is my permanent UnRaid host. The "Tower" is the temporary. 

 

unraid-host-diagnostics-20200627-1139.zip tower-diagnostics-20200627-1141.zip


Update:

It's been 5 days of uptime and the errors haven't come back since disabling the Dynamix System Temp plugin. I've re-enabled the docker containers and Virtual Machines on the Primary Unraid server this morning and so far so good. Not confident to call this "solved" quite yet as I'd like to go a week or so without the errors, but so far it looks promising. 

 

Thanks
