• [6.9.1] bug with nginx / nchan "exited on signal 6"


    Dovy6
    • Urgent

    There appears to be a bug with nchan as seen in this link: https://github.com/slact/nchan/issues/534

    This appears to be related to the issue I am having. This is the second or third time it has happened to me: out of nowhere, with no obvious trigger, my syslog fills with hundreds of messages like these:

    root@unraid:~# tail /var/log/syslog
    Mar 15 00:45:47 unraid nginx: 2021/03/15 00:45:47 [alert] 3161#3161: worker process 4945 exited on signal 6
    Mar 15 00:45:49 unraid nginx: 2021/03/15 00:45:49 [alert] 3161#3161: worker process 4964 exited on signal 6
    Mar 15 00:45:51 unraid nginx: 2021/03/15 00:45:51 [alert] 3161#3161: worker process 4985 exited on signal 6
    Mar 15 00:45:53 unraid nginx: 2021/03/15 00:45:53 [alert] 3161#3161: worker process 5003 exited on signal 6
    Mar 15 00:45:55 unraid nginx: 2021/03/15 00:45:55 [alert] 3161#3161: worker process 5023 exited on signal 6

    This repeats indefinitely until the logs fill up, and while it is happening Unraid slowly grinds to a halt.
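    For anyone trying to gauge how quickly the errors are piling up, here is a minimal sketch (assuming the stock Unraid syslog path shown above):

    root@unraid:~# grep -c 'exited on signal 6' /var/log/syslog
    root@unraid:~# sleep 60; grep -c 'exited on signal 6' /var/log/syslog    # compare the two counts a minute apart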

    Tailing /var/log/nginx/error.log shows this:

    root@unraid:~# tail -n 50 /var/log/nginx/error.log
    2021/03/15 00:45:20 [alert] 3161#3161: worker process 4358 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:22 [alert] 3161#3161: worker process 4427 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:22 [alert] 3161#3161: worker process 4454 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:24 [alert] 3161#3161: worker process 4461 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:26 [alert] 3161#3161: worker process 4514 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:27 [alert] 3161#3161: worker process 4584 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:28 [alert] 3161#3161: worker process 4599 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:29 [alert] 3161#3161: worker process 4607 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:30 [alert] 3161#3161: worker process 4659 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:31 [alert] 3161#3161: worker process 4712 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:32 [alert] 3161#3161: worker process 4747 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:34 [alert] 3161#3161: worker process 4776 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:36 [alert] 3161#3161: worker process 4795 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:38 [alert] 3161#3161: worker process 4816 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:40 [alert] 3161#3161: worker process 4850 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:41 [alert] 3161#3161: worker process 4872 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:43 [alert] 3161#3161: worker process 4886 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:45 [alert] 3161#3161: worker process 4908 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:47 [alert] 3161#3161: worker process 4945 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:49 [alert] 3161#3161: worker process 4964 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:51 [alert] 3161#3161: worker process 4985 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.
    2021/03/15 00:45:53 [alert] 3161#3161: worker process 5003 exited on signal 6
    ker process: ./nchan-1.2.7/src/store/spool.c:479: spool_fetch_msg: Assertion `spool->msg_status == MSG_INVALID' failed.

    I happened to be logged in to the server via SSH when it occurred this time, so I was able to run '/etc/rc.d/rc.nginx stop'. That terminates nginx (which obviously means I cannot use the Unraid GUI), but it does appear to stop the system from grinding to a halt.

     

    Please see the linked thread, where some others have noted that this may be related to an old, stale Unraid tab left open in a browser somewhere. I will try to track down any open tabs. The only other computer that might have a tab open isn't accessible to me at the moment, so I can't test that theory right now.

     

    I am unfortunately unable to trigger this bug on demand.

    I was able to generate a diagnostics.zip, but I'm having trouble uploading it right now. I think it's a permissions issue. I'll attach it once I figure that out.
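    (For what it's worth, the diagnostics zip can also be generated from the shell; the command name and output location below are my understanding of the stock Unraid CLI, so verify them on your own install.)

    root@unraid:~# diagnostics
    root@unraid:~# ls -lh /boot/logs/*-diagnostics-*.zip    # the zip should land here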

     

    Thanks for your help, everyone.

     




    User Feedback

    Recommended Comments



    I've found any and all Unraid tabs on all my computers and either closed or restarted them, and the problem still persists, so that is not actually the fix. My next step will have to be rebooting the server, which in my experience fixes the problem, but only temporarily. Once again, unfortunately, I'm not quite sure how to intentionally reproduce the issue. I do know it renders my server nigh unusable, so I've upped the Priority to Urgent. If there are any other troubleshooting steps to take, please advise.

    Link to comment

    I went and restarted the browsers on my laptop while I had PuTTY open in several windows running 'tail -f /var/log/syslog' and 'tail -f /var/log/nginx/error.log'.

     

    The errors stopped after restarting the browsers. I am attaching an updated diagnostics.zip from after the restart, in the hope that it can help.

    unraid-diagnostics-20210315-1541.zip

    Link to comment

     

    On 3/31/2021 at 6:12 AM, relink said:

    I am having the exact same issue. I only have tabs open on one computer at the moment. I am trying to get the diagnostics but I can't get it to download right now.

     

    Your issue is not related. The logs show:
     

    Mar 31 09:07:40 SERVERUS kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
    Mar 31 09:07:40 SERVERUS kernel: caller _nv000708rm+0x1af/0x200 [nvidia] mapping multiple BARs

     

    Here is a starting point: 

     

    Link to comment

    When you say:

       I happened to be using my server, logged in via ssh, when this happened

    and:

       I had putty open in several windows

    Are you referring to the built-in web terminal that you access from the Unraid webGUI? Or do you mean you are using the standalone application called PuTTY to SSH into your server, and not the built-in web terminal?

     

    Both ways are valid; I'm just trying to narrow down what could be happening.

    Link to comment

    Also, what browsers are you using? If you use multiple, have you noticed that the issue resolves when you close a particular one?

    Link to comment

    I am using the standalone app, PuTTY, to access my server. I have not yet determined how to intentionally trigger this, unfortunately, so I can't explore and figure out exactly where the bug gets triggered. I access Unraid with both Firefox and Chrome, and yeah, I'm the guy with 3 browser windows and 30 tabs open in each. I'm not certain, but to the best of my recollection it was Chrome that I restarted to fix the problem. I can't promise that I'm correct, though, unfortunately.

    Link to comment

    Please let me know if there's anything you can suggest I do to try to trigger it, to help isolate the bug. Also, be advised that I updated to 6.9.2 today. Again, this only happens occasionally (it has happened to me at least 3 separate times), but I don't know why or when.

    Link to comment

    Thanks! Sorry to hammer on this point, but are you saying you don't use the web terminal at all, so we can rule it out as a potential source of the problem?

    Link to comment

    I have used the web terminal before. The last time this error cropped up I was using PuTTY, though. I'm not sure if the web terminal is a potential source. I can open the web terminal and play around and see if I can trigger the bug... Any clues as to what I should do that would potentially cause it? Any help from my logs attached above, from when the error happened and when I was able to clear it by restarting web browsers?

     

    Link to comment
    On 4/14/2021 at 11:52 AM, ljm42 said:

    Your issue is not related.

    Thank you for this, I believe you may have helped me solve another issue I was having. 

    • Like 1
    Link to comment
    On 4/14/2021 at 11:13 AM, Dovy6 said:

    Any clues as to what I should do that would potentially cause it?

     

    You mentioned that you solved the problem by closing forgotten tabs on another computer. Can you estimate how long those tabs had been left open? More than 7 days?

    Link to comment

    Almost definitely. I use the tabs often, though. I have had the same browser windows open on my laptop for months now, both Firefox and Chrome, each with at least 2 Unraid tabs. Recently I've been using my desktop computer more and my laptop less, meaning my laptop tabs have been stale for a while. Probably over a week since I've used them...

    Link to comment

    I'm having a similar issue. The log is filling up in two places, though: one is /var/log/nginx and the other is /var/log/syslog. They both record the same error.

     

    [screenshot: the same repeated "exited on signal 6" log entries]

     

    I restart my machine every 2 days, so nothing is stale on my end.

    root@Tower:~# du -sm /var/log/*
    0   /var/log/btmp
    0   /var/log/cron
    0   /var/log/debug
    1   /var/log/dmesg
    1   /var/log/docker.log
    1   /var/log/faillog
    1   /var/log/lastlog
    1   /var/log/libvirt
    0   /var/log/maillog
    0   /var/log/messages
    0   /var/log/nfsd
    54  /var/log/nginx
    0   /var/log/packages
    1   /var/log/pkgtools
    0   /var/log/plugins
    0   /var/log/pwfail
    0   /var/log/removed_packages
    0   /var/log/removed_scripts
    0   /var/log/removed_uninstall_scripts
    1   /var/log/samba
    0   /var/log/scripts
    0   /var/log/secure
    0   /var/log/setup
    0   /var/log/spooler
    0   /var/log/swtpm
    75  /var/log/syslog
    0   /var/log/vfio-pci
    1   /var/log/wtmp

     

     

    Edited by Mustafa
    Link to comment

    Instead of restarting the server I did the following:

    /etc/rc.d/rc.nginx restart

     

    I have Dynamix System Stats installed, so I also went ahead and increased the amount of memory allocated to the log from ~120 MB to 300 MB.
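    For anyone who wants to grow the log space without a reboot, a minimal sketch, assuming /var/log is the usual tmpfs mount on Unraid:

    root@Tower:~# mount -o remount,size=300m /var/log
    root@Tower:~# df -h /var/log    # confirm the new size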

    Edited by Mustafa
    grammar buddy, grammar
    Link to comment

    On the one hand I hope this helps someone (myself included); on the other hand, I'm afraid I'll show off my ignorance with the things I'm about to say.

    I'm not sure if I'm the only one who did this... but I *THINK* this might have been my problem.

     

    I was watching my logs to make sure they weren't doing the "exited on signal 6" thing, but of course they were whenever nginx was running. I have a proxy host set up for tower, and I got to thinking: could that DNS name somehow be interfering? I'm not a networking guy per se, but I was resolving the hostname tower long before I set up nginx. When I use PuTTY to connect I go to tower over port 22, but could the name resolution be getting forwarded to nginx? I don't think so... but like I said, I'm not a networking guy.

     

    Anyway, I disabled the forwarder entry for tower and the messages seem to have stopped. I made a new DNS entry for Unraid and it seems to work just fine. Now it's maybe just a matter of getting out of the tower habit?
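    If anyone wants to check the same thing on their end, a quick sketch (the hostnames are mine; substitute your own):

    # see what the old name actually resolves to, from the client doing the connecting
    nslookup tower
    # and compare with the new entry you point at the server directly
    nslookup unraid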

     

    Can anyone else confirm similar behavior?

    Edited by allroy1975
    Link to comment

    My PC woke up from sleep and the web terminal is now reloading endlessly. Even a newly opened web terminal does it. While this happens, the logs fill with the error from this bug report:

    nginx: <datetime> [alert] 8330#8330: worker process <random_number> exited on signal 6

     

    Now I'm trying to investigate the problem. First, here is what the browser's network monitor shows:

     

    /webterminal/token is requested as follows:

    GET /webterminal/token HTTP/1.1
    Host: tower:5000
    Connection: keep-alive
    Pragma: no-cache
    Cache-Control: no-cache
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36
    Accept: */*
    Referer: http://tower:5000/webterminal/
    Accept-Encoding: gzip, deflate
    Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
    Cookie: users_view=user1; db-box2=2%3B0%3B1; test.sdd=short; test.sdb=short; test.sdc=short; diskio=diskio; ca_startupButton=topperforming; port_select=eth1; unraid_11111111621fb6eace5f11d511111111=11111111365ea67534cf76a011111111; ca_dockerSearchFlag=false; ca_searchActive=true; ca_categoryName=undefined; ca_installMulti=false; ca_categoryText=Search%20for%20webdav; ca_sortIcon=true; ca_filter=webdav; ca_categories_enabled=%5Bnull%2C%22installed_apps%22%2C%22inst_docker%22%2C%22inst_plugins%22%2C%22previous_apps%22%2C%22prev_docker%22%2C%22prev_plugins%22%2C%22onlynew%22%2C%22new%22%2C%22random%22%2C%22topperforming%22%2C%22trending%22%2C%22Backup%3A%22%2C%22Cloud%3A%22%2C%22Network%3A%22%2C%22Network%3AFTP%22%2C%22Network%3AWeb%22%2C%22Network%3AOther%22%2C%22Plugins%3A%22%2C%22Productivity%3A%22%2C%22Tools%3A%22%2C%22Tools%3AUtilities%22%2C%22All%22%2C%22repos%22%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%5D; ca_selectedMenu=All; ca_data=%7B%22docker%22%3A%22%22%2C%22section%22%3A%22AppStore%22%2C%22selected_category%22%3A%22%22%2C%22subcategory%22%3A%22%22%2C%22selected_subcategory%22%3A%22%22%2C%22selected%22%3A%22%7B%5C%22docker%5C%22%3A%5B%5D%2C%5C%22plugin%5C%22%3A%5B%5D%2C%5C%22deletePaths%5C%22%3A%5B%5D%7D%22%2C%22lastUpdated%22%3A0%2C%22nextpage%22%3A0%2C%22prevpage%22%3A0%2C%22currentpage%22%3A1%2C%22searchFlag%22%3Atrue%2C%22searchActive%22%3Atrue%2C%22previousAppsSection%22%3A%22%22%7D; col=1; dir=0; docker_listview_mode=basic; one=tab1
    

     

    response:

    HTTP/1.1 200 OK
    Server: nginx
    Date: Sat, 28 Aug 2021 16:35:53 GMT
    Content-Type: application/json;charset=utf-8
    Content-Length: 13
    Connection: keep-alive
    
    

    content:

    {"token": ""}

     

    ws://tower:5000/webterminal/ws is requested:

    GET ws://tower:5000/webterminal/ws HTTP/1.1
    Host: tower:5000
    Connection: Upgrade
    Pragma: no-cache
    Cache-Control: no-cache
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36
    Upgrade: websocket
    Origin: http://tower:5000
    Sec-WebSocket-Version: 13
    Accept-Encoding: gzip, deflate
    Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
    Cookie: users_view=user1; db-box2=2%3B0%3B1; test.sdd=short; test.sdb=short; test.sdc=short; diskio=diskio; ca_startupButton=topperforming; port_select=eth1; unraid_11111111621fb6eace5f11d511111111=11111111365ea67534cf76a011111111; ca_dockerSearchFlag=false; ca_searchActive=true; ca_categoryName=undefined; ca_installMulti=false; ca_categoryText=Search%20for%20webdav; ca_sortIcon=true; ca_filter=webdav; ca_categories_enabled=%5Bnull%2C%22installed_apps%22%2C%22inst_docker%22%2C%22inst_plugins%22%2C%22previous_apps%22%2C%22prev_docker%22%2C%22prev_plugins%22%2C%22onlynew%22%2C%22new%22%2C%22random%22%2C%22topperforming%22%2C%22trending%22%2C%22Backup%3A%22%2C%22Cloud%3A%22%2C%22Network%3A%22%2C%22Network%3AFTP%22%2C%22Network%3AWeb%22%2C%22Network%3AOther%22%2C%22Plugins%3A%22%2C%22Productivity%3A%22%2C%22Tools%3A%22%2C%22Tools%3AUtilities%22%2C%22All%22%2C%22repos%22%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%5D; ca_selectedMenu=All; ca_data=%7B%22docker%22%3A%22%22%2C%22section%22%3A%22AppStore%22%2C%22selected_category%22%3A%22%22%2C%22subcategory%22%3A%22%22%2C%22selected_subcategory%22%3A%22%22%2C%22selected%22%3A%22%7B%5C%22docker%5C%22%3A%5B%5D%2C%5C%22plugin%5C%22%3A%5B%5D%2C%5C%22deletePaths%5C%22%3A%5B%5D%7D%22%2C%22lastUpdated%22%3A0%2C%22nextpage%22%3A0%2C%22prevpage%22%3A0%2C%22currentpage%22%3A1%2C%22searchFlag%22%3Atrue%2C%22searchActive%22%3Atrue%2C%22previousAppsSection%22%3A%22%22%7D; col=1; dir=0; docker_listview_mode=basic; one=tab1
    Sec-WebSocket-Key: aaaaaaaa3CNW7Y3Waaaaaaaa
    Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
    Sec-WebSocket-Protocol: tty
    
    

    response:

    HTTP/1.1 101 Switching Protocols
    Server: nginx
    Date: Sat, 28 Aug 2021 16:35:53 GMT
    Connection: upgrade
    Upgrade: WebSocket
    Sec-WebSocket-Accept: aaaaaaaaFh/OM7XjuLssaaaaaaaa
    Sec-WebSocket-Protocol: tty
    
    

    content:

    data:undefined,

     

    EDIT: Ah, damn it. I closed one of the still-open GUI tabs, and with that the web terminal no longer reloads 😩

     

    So there seems to be a connection between the background process that loads notifications and the web terminal. I will try to investigate further when it happens again.

     

    But we can compare against the requests that happen when the bug is not present. This time it loads three different URLs:

     

    ws://tower:5000/webterminal/ws

    GET ws://tower:5000/webterminal/ws HTTP/1.1
    Host: tower:5000
    Connection: Upgrade
    Pragma: no-cache
    Cache-Control: no-cache
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36
    Upgrade: websocket
    Origin: http://tower:5000
    Sec-WebSocket-Version: 13
    Accept-Encoding: gzip, deflate
    Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
    Cookie: users_view=user1; db-box2=2%3B0%3B1; test.sdd=short; test.sdb=short; test.sdc=short; diskio=diskio; ca_startupButton=topperforming; port_select=eth1; unraid_11111111621fb6eace5f11d511111111=11111111365ea67534cf76a011111111; ca_dockerSearchFlag=false; ca_searchActive=true; ca_categoryName=undefined; ca_installMulti=false; ca_categoryText=Search%20for%20webdav; ca_sortIcon=true; ca_filter=webdav; ca_categories_enabled=%5Bnull%2C%22installed_apps%22%2C%22inst_docker%22%2C%22inst_plugins%22%2C%22previous_apps%22%2C%22prev_docker%22%2C%22prev_plugins%22%2C%22onlynew%22%2C%22new%22%2C%22random%22%2C%22topperforming%22%2C%22trending%22%2C%22Backup%3A%22%2C%22Cloud%3A%22%2C%22Network%3A%22%2C%22Network%3AFTP%22%2C%22Network%3AWeb%22%2C%22Network%3AOther%22%2C%22Plugins%3A%22%2C%22Productivity%3A%22%2C%22Tools%3A%22%2C%22Tools%3AUtilities%22%2C%22All%22%2C%22repos%22%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%5D; ca_selectedMenu=All; ca_data=%7B%22docker%22%3A%22%22%2C%22section%22%3A%22AppStore%22%2C%22selected_category%22%3A%22%22%2C%22subcategory%22%3A%22%22%2C%22selected_subcategory%22%3A%22%22%2C%22selected%22%3A%22%7B%5C%22docker%5C%22%3A%5B%5D%2C%5C%22plugin%5C%22%3A%5B%5D%2C%5C%22deletePaths%5C%22%3A%5B%5D%7D%22%2C%22lastUpdated%22%3A0%2C%22nextpage%22%3A0%2C%22prevpage%22%3A0%2C%22currentpage%22%3A1%2C%22searchFlag%22%3Atrue%2C%22searchActive%22%3Atrue%2C%22previousAppsSection%22%3A%22%22%7D; col=1; dir=0; docker_listview_mode=basic; one=tab1
    Sec-WebSocket-Key: aaaaaaaaqoOk/3z+aaaaaaaa
    Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits
    Sec-WebSocket-Protocol: tty
    
    

    response

    HTTP/1.1 101 Switching Protocols
    Server: nginx
    Date: Sat, 28 Aug 2021 16:51:11 GMT
    Connection: upgrade
    Upgrade: WebSocket
    Sec-WebSocket-Accept: aaaaaaaaDWIMhZ8VeZoxaaaaaaaa
    Sec-WebSocket-Protocol: tty
    
    

    This time, there was no content!

     

    /webterminal/ request:

    GET /webterminal/ HTTP/1.1
    Host: tower:5000
    Connection: keep-alive
    Pragma: no-cache
    Cache-Control: no-cache
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
    Referer: http://tower:5000/Docker
    Accept-Encoding: gzip, deflate
    Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
    Cookie: users_view=user1; db-box2=2%3B0%3B1; test.sdd=short; test.sdb=short; test.sdc=short; diskio=diskio; ca_startupButton=topperforming; port_select=eth1; unraid_11111111621fb6eace5f11d511111111=11111111365ea67534cf76a011111111; ca_dockerSearchFlag=false; ca_searchActive=true; ca_categoryName=undefined; ca_installMulti=false; ca_categoryText=Search%20for%20webdav; ca_sortIcon=true; ca_filter=webdav; ca_categories_enabled=%5Bnull%2C%22installed_apps%22%2C%22inst_docker%22%2C%22inst_plugins%22%2C%22previous_apps%22%2C%22prev_docker%22%2C%22prev_plugins%22%2C%22onlynew%22%2C%22new%22%2C%22random%22%2C%22topperforming%22%2C%22trending%22%2C%22Backup%3A%22%2C%22Cloud%3A%22%2C%22Network%3A%22%2C%22Network%3AFTP%22%2C%22Network%3AWeb%22%2C%22Network%3AOther%22%2C%22Plugins%3A%22%2C%22Productivity%3A%22%2C%22Tools%3A%22%2C%22Tools%3AUtilities%22%2C%22All%22%2C%22repos%22%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%5D; ca_selectedMenu=All; ca_data=%7B%22docker%22%3A%22%22%2C%22section%22%3A%22AppStore%22%2C%22selected_category%22%3A%22%22%2C%22subcategory%22%3A%22%22%2C%22selected_subcategory%22%3A%22%22%2C%22selected%22%3A%22%7B%5C%22docker%5C%22%3A%5B%5D%2C%5C%22plugin%5C%22%3A%5B%5D%2C%5C%22deletePaths%5C%22%3A%5B%5D%7D%22%2C%22lastUpdated%22%3A0%2C%22nextpage%22%3A0%2C%22prevpage%22%3A0%2C%22currentpage%22%3A1%2C%22searchFlag%22%3Atrue%2C%22searchActive%22%3Atrue%2C%22previousAppsSection%22%3A%22%22%7D; col=1; dir=0; docker_listview_mode=basic; one=tab1
    

    response:

    HTTP/1.1 200 OK
    Server: nginx
    Date: Sat, 28 Aug 2021 16:51:11 GMT
    Content-Type: text/html
    Content-Length: 112878
    Connection: keep-alive
    content-encoding: gzip
    
    

    content:

    <!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"><meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1"><title>ttyd - Terminal</title>
    ...

     

    /webterminal/token request

    GET /webterminal/token HTTP/1.1
    Host: tower:5000
    Connection: keep-alive
    Pragma: no-cache
    Cache-Control: no-cache
    User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.159 Safari/537.36
    Accept: */*
    Referer: http://tower:5000/webterminal/
    Accept-Encoding: gzip, deflate
    Accept-Language: de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7
    Cookie: users_view=user1; db-box2=2%3B0%3B1; test.sdd=short; test.sdb=short; test.sdc=short; diskio=diskio; ca_startupButton=topperforming; port_select=eth1; unraid_11111111621fb6eace5f11d511111111=11111111365ea67534cf76a011111111; ca_dockerSearchFlag=false; ca_searchActive=true; ca_categoryName=undefined; ca_installMulti=false; ca_categoryText=Search%20for%20webdav; ca_sortIcon=true; ca_filter=webdav; ca_categories_enabled=%5Bnull%2C%22installed_apps%22%2C%22inst_docker%22%2C%22inst_plugins%22%2C%22previous_apps%22%2C%22prev_docker%22%2C%22prev_plugins%22%2C%22onlynew%22%2C%22new%22%2C%22random%22%2C%22topperforming%22%2C%22trending%22%2C%22Backup%3A%22%2C%22Cloud%3A%22%2C%22Network%3A%22%2C%22Network%3AFTP%22%2C%22Network%3AWeb%22%2C%22Network%3AOther%22%2C%22Plugins%3A%22%2C%22Productivity%3A%22%2C%22Tools%3A%22%2C%22Tools%3AUtilities%22%2C%22All%22%2C%22repos%22%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%2Cnull%5D; ca_selectedMenu=All; ca_data=%7B%22docker%22%3A%22%22%2C%22section%22%3A%22AppStore%22%2C%22selected_category%22%3A%22%22%2C%22subcategory%22%3A%22%22%2C%22selected_subcategory%22%3A%22%22%2C%22selected%22%3A%22%7B%5C%22docker%5C%22%3A%5B%5D%2C%5C%22plugin%5C%22%3A%5B%5D%2C%5C%22deletePaths%5C%22%3A%5B%5D%7D%22%2C%22lastUpdated%22%3A0%2C%22nextpage%22%3A0%2C%22prevpage%22%3A0%2C%22currentpage%22%3A1%2C%22searchFlag%22%3Atrue%2C%22searchActive%22%3Atrue%2C%22previousAppsSection%22%3A%22%22%7D; col=1; dir=0; docker_listview_mode=basic; one=tab1
    

    response

    HTTP/1.1 200 OK
    Server: nginx
    Date: Sat, 28 Aug 2021 16:51:11 GMT
    Content-Type: application/json;charset=utf-8
    Content-Length: 13
    Connection: keep-alive
    

    content:

    {"token": ""}

     

     

    EDIT: OK, I hit this bug again. This time the Shares and Main tabs were open in parallel while the terminal kept reloading. After closing the Main tab, it stopped. This time I will leave only the Main tab open to verify the connection to that page.
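    In case it helps anyone poke at this outside a browser, here is a rough sketch of the same two requests with curl. This is only a sketch: the host/port match my setup above, and you would have to copy a valid unraid_... session cookie out of your own browser's network monitor (the placeholders below are not real values):

    # token endpoint: a healthy server answers 200 OK with {"token": ""}
    curl -si --max-time 5 --cookie 'unraid_<session-id>=<value>' http://tower:5000/webterminal/token

    # websocket handshake: a healthy server answers 101 Switching Protocols
    curl -si --max-time 5 --cookie 'unraid_<session-id>=<value>' \
         -H 'Connection: Upgrade' -H 'Upgrade: websocket' \
         -H 'Sec-WebSocket-Version: 13' -H 'Sec-WebSocket-Key: aaaaaaaa3CNW7Y3Waaaaaaaa' \
         -H 'Sec-WebSocket-Protocol: tty' \
         http://tower:5000/webterminal/ws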

    Link to comment

    I also have this problem, and it's really annoying!

    I really need a solution somehow. Otherwise I'm very happy with unRAID, but this problem is killing everything; I can't work with the system at all.

    Any ideas for a solution?

    Link to comment

    Same issue here. What I also noticed is that if I open the web terminal while this is happening, the terminal just keeps disconnecting and reconnecting.

     

    After restarting the GUI, the web terminal won't load anymore; the browser shows a 503 Bad Gateway error while this appears in the log:

     

    Oct 28 15:31:17 Tower nginx: 2021/10/28 15:31:17 [crit] 31423#31423: *20062687 connect() to unix:/var/run/ttyd.sock failed (2: No such file or directory) while connecting to upstream, client: 192.168.1.56, server: , request: "GET /webterminal/ HTTP/1.1", upstream: "http://unix:/var/run/ttyd.sock:/", host: "tower.local", referrer: "http://tower.local/Main"
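    When it gets into that state, a quick sketch of what I'd check (paths taken from the error message above; whether a GUI restart brings the socket back is just something to test, not a guarantee):

    ls -l /var/run/ttyd.sock     # the socket nginx says is missing
    /etc/rc.d/rc.nginx restart   # then see whether the web terminal comes back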

     

     

     

     

    Link to comment

    I have the same issue. The logs fill up with that error message. When this happens, I'm unable to use the web terminal (it repeatedly disconnects and reconnects), and I'm unable to SSH into the machine either. The only solution I've found is to enable Telnet (yuck!) so I can get into the machine, clear out the nginx logs, and restart the nginx service.
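    Roughly what I run once I'm in over Telnet; a sketch, assuming the stock log paths:

    truncate -s 0 /var/log/nginx/error.log   # empty the runaway log in place
    /etc/rc.d/rc.nginx restart               # then restart the GUI's nginx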

    Link to comment

    I have experienced this same problem, and it's frustrating to see the same issue recurring across multiple threads for years. I don't think I've seen anything from the Unraid devs on it other than a suggestion to run a memory check. Multiple years, dozens of users, and near silence.

     

    My uptime lasts about 20-25 days before the logs start filling up, and once the log hits 100% the GUI has major issues. Most of the web UI still works in the sense that you can move around sluggishly, but as noted the websockets fail completely. The only time I look at the Dashboard is to confirm that the log has hit 100% and it's time to restart, after which I would suffer a forced multi-day parity check. (I recently discovered that was the result of restarting with an SSH session open, so I'm glad that problem at least is resolved.)

     

    Usually I know the logs are full when the Docker menu starts misbehaving.

     

    My first suggestion is to send a series of notifications when the log is nearing capacity, like the ones we get for drive utilization and temperature. This is an absolute necessity. It won't be a big window in terms of reacting to the issue, but it's something.
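    Until something like that exists, here is a rough sketch of a self-help version that could run from cron (or a user script); the notify helper path and flags are my assumption based on the stock webGui scripts, so check them on your own install:

    #!/bin/bash
    # warn once /var/log passes 80% full
    PCT=$(df /var/log | awk 'NR==2 {gsub("%","",$5); print $5}')
    if [ "$PCT" -ge 80 ]; then
        /usr/local/emhttp/webGui/scripts/notify -i warning \
            -s "Log partition at ${PCT}%" \
            -d "/var/log is ${PCT}% full - nginx/nchan may be flooding the syslog"
    fi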

     

    In my most recent experience it took about 6 hours, with 7500+ errors arriving every couple of seconds. I'm not sure how usable the system is during this phase.

     

    line 2829: Jan  3 18:27:53 Tera nginx: 2022/01/03 18:27:53 [alert] 16163#16163: worker process 15731 exited on signal 6
    ....
    line 10504 Jan  4 00:21:40 Tera nginx: 2022/01/04 00:21:40 [alert] 16163#16163: worker process 14537 exited on signal 6
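    (Quick arithmetic on that excerpt, just to pin the rate down:)

    # 10504 - 2829 alert lines between 18:27:53 and 00:21:40 the next day
    echo $(( 10504 - 2829 ))                                     # 7675 worker aborts
    echo $(( (24*3600 + 21*60 + 40) - (18*3600 + 27*60 + 53) ))  # 21227 seconds elapsed
    # 21227 / 7675 is roughly one abort every 2.8 seconds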

     

    Then nginx and nchan start to fail with memory allocation errors and things get much worse; that phase only lasts about 8 minutes before the logs are completely full and it just stops.

     

    Jan  4 00:21:41 Tera nginx: 2022/01/04 00:21:41 [crit] 14620#14620: ngx_slab_alloc() failed: no memory
    ...
    Jan  4 00:29:57 Tera nginx: 2022/01/04 00:29:57 [error] 12073#12073: *8145363 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/cpuload?buffer_length=1 HTTP/1.1", hos

     

    And then 3 days later, on Jan 7, I discovered Unraid was having issues, because there were zero notifications coming through.

     

    In my use case I don't use web terminals. I think that failure is just another indicator of the logs being full, like the Docker menu and Dashboard failures. Simple HTML templates still work, but anything more advanced fails.

     

    What does ring a bell, though, is the idea of stale tabs. I have a desktop and a laptop that I rotate through, and each has a Brave tab open to Unraid for admin convenience. I usually sit on the Docker or Main tab. I don't think there should be anything wrong with that, but as a workaround for now I will stop doing it, or see if I can find an extension to reload the tab each day. I remember the days when the server only went down for an Unraid upgrade; those days have been sorely missed for years.

     

    I find it interesting that quite a few people are just restarting nginx and life goes on. What are you doing about the logs being at 100%? Just ignoring it, or is there a service to restart for that as well?

    Edited by thespooler
    Link to comment

    A process aborting on signal 6 can happen for different reasons. The most common are (1) not enough memory to execute, i.e. your system doesn't have enough RAM for everything you are asking of it, and (2) a wrong library being pulled in, which usually happens when other installed packages ship conflicting library files. For such cases it is recommended to run Unraid in safe mode and check whether the problem persists.

     

    To tackle open web sessions, we have changed the mode of operation to an event-driven model in Unraid 6.10, which is currently available as a release candidate (rc).

     

    Link to comment

    @bonienl I appreciate you commenting on this.  I'm happy to hear there is a change going forward that might address this. 

     

    While it's true we might not all suffer this issue for the same reasons, I see some people hit it after a few days, so I'm thankful mine only happens after 20+ days, which I appreciate also makes it harder to understand what is happening. It feels like a slow memory leak to me.

     

    Regarding memory: in my case, when things start misbehaving I always hit the Dashboard to verify the logs are at 100%, and conveniently Memory Utilization is (going from my own memory) the only other panel still working. So I always get a sense of memory use afterwards, and it was never anything that alarmed me. But here is the memory report from diagnostics once the system is toast.

     

                  total        used        free      shared  buff/cache   available
    Mem:           11Gi       3.6Gi       1.7Gi       1.0Gi       6.1Gi       6.6Gi
    Swap:            0B          0B          0B
    Total:         11Gi       3.6Gi       1.7Gi

     

    nginx was still running with the same process ID as indicated by the logs, so I don't think any additional memory was suddenly freed up once it started throwing signal 6s. Without a notification that the logs are filling up, it's hard to catch this in real time, and then there is such a short window before it's all over.

     

    After the thousands of lone signal 6 errors, memory errors start mixing in with them:

     

    Jan  4 00:21:43 Tera nginx: 2022/01/04 00:21:43 [crit] 14673#14673: ngx_slab_alloc() failed: no memory
    Jan  4 00:21:43 Tera nginx: 2022/01/04 00:21:43 [error] 14673#14673: shpool alloc failed
    Jan  4 00:21:43 Tera nginx: 2022/01/04 00:21:43 [error] 14673#14673: nchan: Out of shared memory while allocating message of size 7391. Increase nchan_max_reserved_memory.
    Jan  4 00:21:43 Tera nginx: 2022/01/04 00:21:43 [error] 14673#14673: *8142824 nchan: error publishing message (HTTP status code 500), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
    Jan  4 00:21:43 Tera nginx: 2022/01/04 00:21:43 [error] 14673#14673: MEMSTORE:00: can't create shared message for channel /disks
    Jan  4 00:21:44 Tera nginx: 2022/01/04 00:21:44 [alert] 16163#16163: worker process 14673 exited on signal 6

     

    And these signal 6s (if not all of them) are coming from the nchan worker's process ID.
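    For reference on the "Increase nchan_max_reserved_memory" hint in those lines, here is a sketch of how to see what the stock config does. The directive itself is from nchan's documentation; whether a manual edit of Unraid's generated config would stick is an open question:

    # look for nchan's shared-memory settings in the generated nginx config
    grep -rn 'nchan' /etc/nginx/ | grep -i 'memory'
    # per the nchan docs, the pool is raised in the http{} block, e.g.:
    #   nchan_max_reserved_memory 256M;
    # but /etc/nginx/ is regenerated by the webGui, so changes may not survive a reboot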

     

    And finally, to tie it all back to old logins: the thousands of signal 6s come right after this chunk in the logs. Though these two entries are 20+ minutes apart, they are back to back in the log:

     

    Jan  3 18:02:18 Tera nginx: 2022/01/03 18:02:18 [error] 15731#15731: *8026641 limiting requests, excess: 20.409 by zone "authlimit", client: 192.168.0.101, server: , request: "PROPFIND /login HTTP/1.1", host: "tera"
    Jan  3 18:27:53 Tera nginx: 2022/01/03 18:27:53 [alert] 16163#16163: worker process 15731 exited on signal 6

     

    Does the GUI use WebDAV calls like PROPFIND? That IP is my main desktop, and I don't use anything WebDAV-ish. I don't know if Brave might try to do some discovery, but it seems like a stretch.

     

    Not sure if your third suggestion is referring to plugins. Everything extravagant runs in Docker; the majority of my plugins are yours or Squid's.

     

    With my 20+ days of uptime, I'm going to avoid going into safe mode for now.

    Link to comment




