  • [6.12] Unraid WebUI stops responding, then Nginx crashes


    H3ms
    • Retest Urgent

    Hi,

     

    Another day, another problem haha.

     

    Since the update to 6.12 (RC6 was OK), my WebUI stopped working correctly.

    The WebUI stops refreshing itself automatically (I need to press F5 to refresh the data on screen).

     

    Then finally, nginx crashes and the interface stops responding (on port 8080 for me). I have to SSH into the server to find the nginx PID, kill it and then start nginx.
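
    For reference, that manual restart is roughly the following (a rough sketch; /etc/rc.d/rc.nginx is the stock rc script on Unraid, and <PID> is whatever the first command reports):

    ps aux | grep "nginx: master"    # find the nginx master process PID
    kill <PID>                       # stop the hung master (replace <PID> with the number above)
    /etc/rc.d/rc.nginx start         # start nginx again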

     

    I set up a syslog server earlier, but I don't find anything related to this except: 

    2023/06/15 20:42:50 [alert] 14427#14427: worker process 15209 exited on signal 6

     

    I attached the syslog and the diag file.

     

    Thx in advance.

     

    nas-icarus-diagnostics-20230615-2146.zip syslog




    User Feedback

    Recommended Comments



    Any news on how to fix this yet? I'm frequently hard rebooting my server to get the WebGUI to work. Should I just revert to the previous Unraid version?

    Link to comment
    1 minute ago, laffer98 said:

    Any news on how to fix this yet? I'm frequently hard rebooting my server to get the WebGUI to work. Should I just revert to the previous Unraid version?

    Look at the first page; there is a way to restart the WebGUI's Nginx.

    But it's terminal only. 

     

    Link to comment

    I'm having the same problem with nginx crashing.

    It started after I upgraded to 6.12. I updated to 6.12.1, but the same error comes back. 

    WebGUI is down, everything else seems to be running.

     

    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 25930 exited on signal 6
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: shared memory zone "memstore" was locked by 25930
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 25931 exited on signal 6
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: shared memory zone "memstore" was locked by 25931
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 25932 exited on signal 6
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: shared memory zone "memstore" was locked by 25932
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 25933 exited on signal 6
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: shared memory zone "memstore" was locked by 25933
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 25941 exited on signal 6
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: shared memory zone "memstore" was locked by 25941
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 25964 exited on signal 6
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: shared memory zone "memstore" was locked by 25964
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 25986 exited on signal 6
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: shared memory zone "memstore" was locked by 25986
    Jun 27 12:31:28 NEST nginx: 2023/06/27 12:31:28 [alert] 7261#7261: worker process 26001 exited on signal 6
    
    Jun 27 12:46:49 NEST nginx: 2023/06/27 12:46:49 [alert] 5701#5701: *40 open socket #22 left in connection 12
    Jun 27 12:46:49 NEST nginx: 2023/06/27 12:46:49 [alert] 5701#5701: *34 open socket #23 left in connection 13
    Jun 27 12:46:49 NEST nginx: 2023/06/27 12:46:49 [alert] 5701#5701: *53 open socket #29 left in connection 19
    Jun 27 12:46:49 NEST nginx: 2023/06/27 12:46:49 [alert] 5701#5701: aborting
    Jun 27 12:46:51 NEST nginx: 2023/06/27 12:46:51 [alert] 7083#7083: *287 open socket #3 left in connection 10
    Jun 27 12:46:51 NEST nginx: 2023/06/27 12:46:51 [alert] 7083#7083: *289 open socket #4 left in connection 11
    Jun 27 12:46:51 NEST nginx: 2023/06/27 12:46:51 [alert] 7083#7083: *291 open socket #15 left in connection 12
    Jun 27 12:46:51 NEST nginx: 2023/06/27 12:46:51 [alert] 7083#7083: *293 open socket #24 left in connection 13

     

    Has anyone managed to downgrade without the error coming back?

     

     

    nest-diagnostics-20230627-1301.zip

    Link to comment

    Same problem, right after 6.12 RC6. The log shows 'shared mem locked' errors, lots of them. Then IPv6 goes blank and the server can't get a new address. Docker and the VMs are closed, and the array is stopped.

    One idea I haven't tried: only use IPv4.

    Link to comment

    I have the same problem: every 2-3 days nginx stops working with this error:

    Jun 29 08:05:09 ramiroserver nginx: 2023/06/29 08:05:09 [error] 28347#28347: nchan: Out of shared memory while allocating channel /disks. Increase nchan_max_reserved_memory.
    Jun 29 08:05:09 ramiroserver nginx: 2023/06/29 08:05:09 [error] 28347#28347: *668759 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
    Jun 29 08:05:09 ramiroserver nginx: 2023/06/29 08:05:09 [crit] 28347#28347: ngx_slab_alloc() failed: no memory

    After nginx restart everything works again.
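
    In case anyone wants to experiment with the limit the error mentions, here is a rough sketch (note: /etc/nginx/nginx.conf is generated by Unraid and any edit is lost on reboot, and 256M is just an illustrative value, not a recommendation):

    # in the http { } block of /etc/nginx/nginx.conf
    nchan_max_reserved_memory 256M;

    # then restart nginx to pick it up
    /etc/rc.d/rc.nginx restart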

    Link to comment

    Please retest in 6.12.2, it has fixes that should help with the issues mentioned here. If you continue to have issues we need diagnostics from 6.12.2 after the problem occurs. Note that it is difficult to help multiple people in a single thread because problems that seem related can have different solutions.

    Link to comment
    13 hours ago, ljm42 said:

    Please retest in 6.12.2, it has fixes that should help with the issues mentioned here. If you continue to have issues we need diagnostics from 6.12.2 after the problem occurs. Note that it is difficult to help multiple people in a single thread because problems that seem related can have different solutions.

    Still happening. I SSH'd in, but diagnostics hangs via the CLI. Here are diagnostics taken after restarting nginx.

    smurf-raid-diagnostics-20230630-0719.zip
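
    (For reference, the CLI capture is just the diagnostics command; assuming the stock tooling, the zip lands in the logs folder on the flash drive:)

    diagnostics    # writes the diagnostics zip to /boot/logs/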

    Link to comment

    Can you test with IPv6 disabled?

    In your log this is happening continuously. The IPv6 address is added and deleted over and over again.

     

    Jun 29 15:38:30 Smurf-RAID dhcpcd[1805]: br0: pid 0 deleted address fd2e:8ae:70ec:1:26be:5ff:fee1:1747/64
    Jun 29 15:38:30 Smurf-RAID dhcpcd[1805]: br0: part of a Router Advertisement expired
    Jun 29 15:38:30 Smurf-RAID dhcpcd[1805]: br0: deleting route to fd2e:8ae:70ec:1::/64
    Jun 29 15:38:30 Smurf-RAID dhcpcd[1805]: br0: adding address fd2e:8ae:70ec:1:26be:5ff:fee1:1747/64
    Jun 29 15:38:30 Smurf-RAID dhcpcd[1805]: br0: adding route to fd2e:8ae:70ec:1::/64
    Jun 29 15:41:31 Smurf-RAID dhcpcd[1805]: br0: expired address fd2e:8ae:70ec:1:26be:5ff:fee1:1747/64
    Jun 29 15:41:31 Smurf-RAID dhcpcd[1805]: br0: part of a Router Advertisement expired
    Jun 29 15:41:31 Smurf-RAID dhcpcd[1805]: br0: deleting route to fd2e:8ae:70ec:1::/64
    Jun 29 15:41:31 Smurf-RAID dhcpcd[1805]: br0: adding address fd2e:8ae:70ec:1:26be:5ff:fee1:1747/64
    Jun 29 15:41:31 Smurf-RAID dhcpcd[1805]: br0: adding route to fd2e:8ae:70ec:1::/64
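
    To check whether your own server shows the same churn, a quick grep of the syslog should surface it (assuming the default /var/log/syslog location):

    grep "Router Advertisement expired" /var/log/syslog
    # repeated hits a few minutes apart indicate the same add/expire loop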

     

    Link to comment

    The Docker rollback in 6.12.2 has fixed this issue so far. It's been stable for 4 days now; it couldn't stay up for more than 40 minutes before that.

    Link to comment
    21 hours ago, laffer98 said:

    I'm now getting this error.

     

    failed (98: Address already in use)

    tower-diagnostics-20230702-1357 - Copy.zip

     

    Please remove this line from your go script:
    cd /boot/packages && find . -name '*.auto_install' -type f -print | sort | xargs -n1 sh -c 

     

    If you need custom packages, create a folder called "extra" on the flash drive and place the files there; they will be installed on boot.
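
    A minimal sketch of that (assuming the flash drive is mounted at /boot as usual; the package name below is just a placeholder):

    mkdir -p /boot/extra                          # "extra" folder on the flash drive
    cp /path/to/custom-package.txz /boot/extra/   # anything placed here is installed at boot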

     

    But while you are tracking down this issue, don't add those files to extra. We need to simplify things to figure out what is going on.

     

    Please boot without any custom packages and provide updated diagnostics showing the problem.

    Link to comment
    On 7/2/2023 at 9:57 AM, lavavex said:

    I have this issue with 6.12.2 as well, where the WebGUI just wouldn't even start; then the server crashed at some point, and in the middle of the night the WebGUI came back, I guess... here is a diag.

    lavavex-unraid-diagnostics-20230702-1156.zip

     

    I don't see any problems here, were you having issues at the time you generated these diagnostics?

     

    Actually, there is a potential problem, not sure about this:

    Jul  2 09:48:17 Lavavex-Unraid kernel: TCP: request_sock_TCP: Possible SYN flooding on port 8123. Sending cookies.  Check SNMP counters.

     

    Link to comment
    44 minutes ago, ljm42 said:

     

    I don't see any problems here, were you having issues at the time you generated these diagnostics?

     

    Actually, there is a potential problem, not sure about this:

    Jul  2 09:48:17 Lavavex-Unraid kernel: TCP: request_sock_TCP: Possible SYN flooding on port 8123. Sending cookies.  Check SNMP counters.

     

    That's Home Assistant; I'm assuming it is just getting a lot of data from sensors or something.

    • Like 1
    Link to comment
    On 6/29/2023 at 12:11 PM, ramiro said:

    I have the same problem: every 2-3 days nginx stops working with this error:

    Jun 29 08:05:09 ramiroserver nginx: 2023/06/29 08:05:09 [error] 28347#28347: nchan: Out of shared memory while allocating channel /disks. Increase nchan_max_reserved_memory.
    Jun 29 08:05:09 ramiroserver nginx: 2023/06/29 08:05:09 [error] 28347#28347: *668759 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
    Jun 29 08:05:09 ramiroserver nginx: 2023/06/29 08:05:09 [crit] 28347#28347: ngx_slab_alloc() failed: no memory

    After nginx restart everything works again.

     

    For everyone having issues with nginx, please try closing all tabs in all browsers on all devices that point to the webgui. It is possible that nchan isn't properly handling things when tabs go to sleep, and that forgotten tabs left open somewhere could be causing this out-of-memory issue. Check for open tabs on other computers, phones, etc. And then restart the server.

    Link to comment

    I haven't changed the way I use the webgui... The update is the problem, not our way of using it.

    There is nothing more in the syslogs on my side; I see nothing, but nginx still crashes and spawns random errors like:

    2023/07/03 23:07:44 [alert] 26013#26013: *361456 open socket #32 left in connection 17
    2023/07/03 23:07:44 [alert] 26013#26013: aborting
    2023/07/03 23:08:54 [info] 10721#10721: Using 116KiB of shared memory for nchan in /etc/nginx/nginx.conf:161
    2023/07/03 23:08:54 [info] 10721#10721: Using 131072KiB of shared memory for nchan in /etc/nginx/nginx.conf:161
    2023/07/03 23:08:54 [alert] 1738#1738: *361633 open socket #3 left in connection 13

    at random times since 6.12 RC7. 

    What changes have you made between RC6 and RC7?

     

    When the WebGUI is closed everywhere, there are no crashes. But it's not a solution :/

     

    • Upvote 1
    Link to comment
    5 hours ago, ljm42 said:

    try closing all tabs in all browsers on all devices that point to the webgui

    This is what I have been doing, and for me the crash has only occurred while I was actively using the WebUI, sometimes after only a few minutes. So it definitely seems to be related to the WebUI being active. I have also had a few instances where the WebUI was unresponsive for ~10 s and then responsive again afterwards.
     

    Link to comment

    Update:

     

    The nginx crashes and open-socket errors have stopped. No crash for 24 hours. 

     

    Changes made in Unraid: 

    - set Interface eth0 to IPv4 only

     

    Changes made in WIN10:

    - Chrome updated to Version 114.0.5735.199 (Official Build) (64-bit)

     

    Most of my crashes happened while using Chrome. I fell back to Edge for some days; it also crashed while I was using it, but not as fast as when using Chrome. 

     

    Change made in router:

    - Disabled IPv6 -> this resulted in DNS trouble that made me rebuild the AiMesh from scratch. 
    I had to rebuild the AiMesh (running an AC5300 as main, with an AC88U and an AC68U as nodes).

     

    24 hours have passed since the router rebuild (reinstall), and the Unraid server WebGUI is still running. 

     

    Fingers crossed!

    • Like 2
    Link to comment





  • Status Definitions

     

    Open = Under consideration.

     

    Solved = The issue has been resolved.

     

    Solved version = The issue has been resolved in the indicated release version.

     

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

     

    Retest = Please retest in latest release.


    Priority Definitions

     

    Minor = Something not working correctly.

     

    Urgent = Server crash, data loss, or other showstopper.

     

    Annoyance = Doesn't affect functionality but should be fixed.

     

    Other = Announcement or other non-issue.
