• [6.9.1] bug with nginx / nchan "exited on signal 6"


    Dovy6
    • Urgent

    There appears to be a bug with nchan as seen in this link: https://github.com/slact/nchan/issues/534

    This appears to be related to the issues I am having, and this is the second or third time it has happened to me. Out of nowhere, with no obvious trigger, my syslog fills with hundreds of messages like these:

    root@unraid:~# tail /var/log/syslog
    Mar 15 00:45:47 unraid nginx: 2021/03/15 00:45:47 [alert] 3161#3161: worker process 4945 exited on signal 6
    Mar 15 00:45:49 unraid nginx: 2021/03/15 00:45:49 [alert] 3161#3161: worker process 4964 exited on signal 6
    Mar 15 00:45:51 unraid nginx: 2021/03/15 00:45:51 [alert] 3161#3161: worker process 4985 exited on signal 6
    Mar 15 00:45:53 unraid nginx: 2021/03/15 00:45:53 [alert] 3161#3161: worker process 5003 exited on signal 6
    Mar 15 00:45:55 unraid nginx: 2021/03/15 00:45:55 [alert] 3161#3161: worker process 5023 exited on signal 6

    This repeats forever until the logs fill up, and while this is happening Unraid grinds slowly to a halt.

    tail -n 50 /var/log/nginx/error.log shows this:

    root@unraid:~# tail -n 50 /var/log/nginx/error.log
    2021/03/15 00:45:20 [alert] 3161#3161: worker process 4358 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:22 [alert] 3161#3161: worker process 4427 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:22 [alert] 3161#3161: worker process 4454 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:24 [alert] 3161#3161: worker process 4461 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:26 [alert] 3161#3161: worker process 4514 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:27 [alert] 3161#3161: worker process 4584 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:28 [alert] 3161#3161: worker process 4599 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:29 [alert] 3161#3161: worker process 4607 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:30 [alert] 3161#3161: worker process 4659 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:31 [alert] 3161#3161: worker process 4712 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:32 [alert] 3161#3161: worker process 4747 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:34 [alert] 3161#3161: worker process 4776 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:36 [alert] 3161#3161: worker process 4795 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:38 [alert] 3161#3161: worker process 4816 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:40 [alert] 3161#3161: worker process 4850 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:41 [alert] 3161#3161: worker process 4872 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:43 [alert] 3161#3161: worker process 4886 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:45 [alert] 3161#3161: worker process 4908 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:47 [alert] 3161#3161: worker process 4945 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:49 [alert] 3161#3161: worker process 4964 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:51 [alert] 3161#3161: worker process 4985 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:53 [alert] 3161#3161: worker process 5003 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.

    I happened to be logged in to the server via SSH when it happened this time, and I was able to run '/etc/rc.d/rc.nginx stop'. This terminates nginx (which obviously means I cannot use the Unraid GUI), but it appears to stop the system from grinding to a halt.
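
    In case it helps anyone else recover without a reboot, here is a rough sketch building on that stop command. The 'start' action is an assumption about the stock /etc/rc.d/rc.nginx script, so check the script on your own system before relying on it:

    # Stop the crash-looping webGui nginx, wait a moment, then bring it back up.
    # 'start' is assumed to be supported by the stock rc.nginx script.
    /etc/rc.d/rc.nginx stop
    sleep 5
    /etc/rc.d/rc.nginx start
    # Watch whether the crash loop starts again:
    tail -f /var/log/nginx/error.log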

     

    Please see the linked forum thread, where some others have noted that this may be related to an old, stale Unraid tab open in a browser somewhere. I will try to track down any open tabs. I only have one other computer that might possibly have a tab open, but I don't have access to it at this exact moment, so I can't test that theory right now.
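
    If anyone else wants to chase the stale-tab theory, here is a rough sketch for listing which client IPs currently hold connections to the webGui. It assumes the ss tool from iproute2 is available and that the GUI is on the default ports 80/443:

    # Count established connections to the webGui ports per client IP.
    # Adjust the ports if your GUI listens elsewhere.
    ss -tn state established '( sport = :80 or sport = :443 )' \
      | awk 'NR>1 {split($4, peer, ":"); print peer[1]}' \
      | sort | uniq -c | sort -rn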

     

    I am unfortunately unable to trigger this bug on demand.

    I was able to generate a diagnostics.zip, but I'm having trouble uploading it right now. I think it's a permissions issue. I'll attach it once I figure that out.

     

    Thanks for your help, everyone.

     




    Recommended Comments



    I'm running 6.10-RC2, and while I've lived with this being the norm even when running 6.9.2, I decided to investigate.
    I run my Unraid server behind pfSense with HAProxy & Let's Encrypt, which lets me run Unraid as a subdomain with an FQDN and cert; Unraid is not accessible from the WAN without VPN due to an ACL rule.
    I too can't use the web terminal for any length of time because it keeps refreshing; I had a similar issue with Home Assistant.

    Adding this to Home Assistant's backend within HAProxy resolved the refreshing issue.

    [Screenshot: HAProxy backend settings for Home Assistant]
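
    For anyone not using the pfSense GUI, the equivalent in a raw haproxy.cfg backend looks roughly like this. The backend name, server address, and timeout values below are illustrative placeholders rather than a copy of my actual settings; 'timeout tunnel' is the setting that matters for long-lived WebSocket connections:

    # Illustrative backend for a service that keeps a WebSocket open (e.g. Home Assistant).
    # Names, address, and timeouts are placeholders.
    backend homeassistant_backend
        # Keep upgraded WebSocket connections open well past the normal
        # client/server timeouts so the UI stops reconnecting/refreshing.
        timeout tunnel  1h
        timeout server  30s
        server  homeassistant 192.168.1.10:8123 check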

     

    And as an extra step: while reading this thread, @Mgutt gave me an idea.
     

    Quote

    GET ws://tower:5000/webterminal/ws HTTP/1.1

     

    I may need to add a second backend to handle the ws:// connection (see the sketch below).

    You can find similar uses of this if you run Vaultwarden with websockets enabled...
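
    As a sketch of what that second backend could look like in raw haproxy.cfg terms, based only on the GET line quoted above (all names, addresses, and ports here are placeholders, and I haven't tested this yet):

    # Illustrative frontend/backend pair routing the web terminal's WebSocket
    # upgrade to its own backend. Everything except the /webterminal/ws path
    # (taken from the quoted request) is a placeholder.
    frontend unraid_https
        bind :443
        acl is_ws_upgrade   hdr(Upgrade) -i websocket
        acl is_webterminal  path_beg /webterminal/ws
        use_backend unraid_ws if is_ws_upgrade is_webterminal
        default_backend unraid_gui

    backend unraid_ws
        timeout tunnel 1h
        server unraid 192.168.1.2:80

    backend unraid_gui
        server unraid 192.168.1.2:80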

     

    I'll set mine up and report back results.

    Edited by CIA

    Not sure if this is helpful or unhelpful...

     

    Pretty fresh install of Unraid 6.9.2 as a first-time user. The logs kept filling up, and Plex crashing was my indicator.

     

    Thankfully I came across this topic about multiple browser windows being open, because it was getting pretty infuriating. Of course, I'm the kind of fella who has about 5 devices going at any one time (phones, iPad, laptops, desktop), and I've been researching and configuring the install. I wasn't surviving more than five days. I had to go through each one (desktop last), and then it stopped.

     

    snippets:

    Feb 13 11:22:46 sparkraid nginx: 2022/02/13 11:22:46 [error] 12299#12299: *391121 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.175, server: , request: "POST /plugins/dynamix.system.temp/include/SystemTemp.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "sparkraid.lan", referrer: "http://sparkraid.lan/Dashboard"
    Feb 13 11:22:47 sparkraid nginx: 2022/02/13 11:22:47 [error] 12299#12299: *391118 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.175, server: , request: "POST /plugins/dynamix.system.temp/include/SystemTemp.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "sparkraid.lan", referrer: "http://sparkraid.lan/Dashboard"
    Feb 13 11:22:47 sparkraid nginx: 2022/02/13 11:22:47 [alert] 12297#12297: worker process 12299 exited on signal 6
    Feb 13 11:22:49 sparkraid nginx: 2022/02/13 11:22:49 [alert] 12297#12297: worker process 2878 exited on signal 6
    Feb 13 11:22:50 sparkraid nginx: 2022/02/13 11:22:50 [alert] 12297#12297: worker process 2925 exited on signal 6
    Feb 13 11:22:51 sparkraid nginx: 2022/02/13 11:22:51 [alert] 12297#12297: worker process 2945 exited on signal 6
    Feb 13 11:22:52 sparkraid nginx: 2022/02/13 11:22:52 [alert] 12297#12297: worker process 2955 exited on signal 6
    Feb 13 11:22:53 sparkraid nginx: 2022/02/13 11:22:53 [alert] 12297#12297: worker process 2997 exited on signal 6
    Feb 13 11:22:54 sparkraid nginx: 2022/02/13 11:22:54 [alert] 12297#12297: worker process 3029 exited on signal 6
    Feb 13 11:22:55 sparkraid nginx: 2022/02/13 11:22:55 [alert] 12297#12297: worker process 3048 exited on signal 6
    Feb 13 11:22:56 sparkraid nginx: 2022/02/13 11:22:56 [alert] 12297#12297: worker process 3058 exited on signal 6
    Feb 13 11:22:57 sparkraid nginx: 2022/02/13 11:22:57 [alert] 12297#12297: worker process 3079 exited on signal 6
    Feb 13 11:22:58 sparkraid nginx: 2022/02/13 11:22:58 [alert] 12297#12297: worker process 3129 exited on signal 6
    Feb 13 11:22:59 sparkraid nginx: 2022/02/13 11:22:59 [alert] 12297#12297: worker process 3153 exited on signal 6

    and so on... 

     

    For what it's worth, that IP is my iPad.
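
    In case it helps anyone else figure out which device a stale tab is on, here is a quick-and-dirty way to count the client IPs showing up in the nginx error log (plain grep/sort/uniq, nothing Unraid-specific):

    # Count how often each client IP appears in nginx's error log;
    # the "client: x.x.x.x" field is part of the error lines above.
    grep -o 'client: [0-9.]*' /var/log/nginx/error.log | sort | uniq -c | sort -rn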

     

     

    running containers:

    netdata/netdata

    plexinc/pms-docker

    tautulli/tautulli

    lscr.io/linuxserver/nextcloud (started before install though)

     

    1 VM

    haos_ova-7.1.qcow2

     

    Apps:

    Appdata Backup/Restore v2

    Auto Update Applications

    binhex-krusader (not running)

    Community Applications

    Dynamix Active Streams

    Dynamix System Buttons

    Dynamix System Info

    Fix Common Problems

    luckyBackup (not running)

    My Servers

    NerdPack GUI

    netdata

    nextcloud

    Plex-Media-Server

    Portainer (not running)

    Preclear Disk

    speedtest (not running)

    speedtest-tracker (not running)

    tautulli

    Unassigned Devices

    Unassigned Devices Plus (Addon)

    watchtower (not running)

     


    This is absolutely still a problem in 2023; I'm getting this on version 6.11.5.

     

    Is it possible for Unraid to move away from nchan? It seems ridiculous that this bug has existed for three years now.


    When a process exits on signal 6 (SIGABRT), it has aborted itself, typically because an assertion failed or the C library detected a fatal error such as heap corruption or a failed memory allocation. That matches the nchan assertion messages in the error log above.
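
    A quick way to confirm the signal-to-name mapping and what an aborting process reports, from any shell (standard bash/coreutils behaviour, nothing Unraid-specific):

    kill -l 6              # prints ABRT: signal 6 is SIGABRT
    sh -c 'kill -ABRT $$'  # a subshell that aborts itself...
    echo $?                # ...reports exit status 134 (128 + 6)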

     

    Start by booting your system in safe mode, without plugins, Docker, or VMs running.

    Then start adding things back one at a time.

     




