[SOLVED] getting 'upstream timed out' after replacing cache drive


Elvin


Hi,

 

This is my first post, and I must say thank you for the great product. I've been using Unraid for the past few months and am very happy with it.

 

Yesterday I replaced my cache drive with a larger one. I followed SpaceInvaderOne's tutorial and it was uneventful. I also removed a dedicated download drive that had been attached via the Unassigned Devices plugin, and moved all downloads back to the cache drive. It all seemed fine.

 

Then, after an hour, the GUI crashed. I SSHed into the server and found that the PHP upstream had crashed; restarting php-fpm brought the GUI back. A couple of hours later it crashed again.

 

I brought the GUI back once more and found something interesting in the log: even while the GUI is working there are still upstream timeouts, so it seems to be just a matter of time before the GUI crashes again:

Feb 18 11:48:48 Tower nginx: 2020/02/18 11:48:48 [error] 16849#16849: *104952 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.71, server: , request: "POST /webGui/include/DashUpdate.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 11:49:44 Tower nginx: 2020/02/18 11:49:44 [error] 16849#16849: *105155 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.71, server: , request: "POST /webGui/include/DashUpdate.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 11:49:59 Tower nginx: 2020/02/18 11:49:59 [error] 16849#16849: *104680 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.71, server: , request: "POST /webGui/include/DashUpdate.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 11:50:16 Tower nginx: 2020/02/18 11:50:16 [error] 16849#16849: *105627 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.71, server: , request: "POST /webGui/include/DashUpdate.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 11:50:32 Tower nginx: 2020/02/18 11:50:32 [error] 16849#16849: *105699 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.71, server: , request: "POST /webGui/include/DashUpdate.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 11:50:48 Tower nginx: 2020/02/18 11:50:48 [error] 16849#16849: *105277 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.71, server: , request: "POST /webGui/include/DashUpdate.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 11:50:48 Tower php-fpm[26006]: [WARNING] [pool www] server reached max_children setting (50), consider raising it
Feb 18 11:51:04 Tower nginx: 2020/02/18 11:51:04 [error] 16849#16849: *104952 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.71, server: , request: "POST /webGui/include/DashUpdate.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 11:51:10 Tower kernel: traps: lsof[8531] general protection ip:14609c129b8e sp:45be87439b02ae3c error:0
Feb 18 11:51:10 Tower kernel: traps: lsof[4871] general protection ip:14fb3507bb8e sp:9afe3aed78c50d54 error:0
Feb 18 11:51:10 Tower kernel: in libc-2.30.so[14609c10a000+16b000]
Feb 18 11:51:10 Tower kernel: traps: lsof[6790] general protection ip:146f3c982b8e sp:71c958dbcce8fe8a error:0
Feb 18 11:51:10 Tower kernel: in libc-2.30.so[14fb3505c000+16b000]
Feb 18 11:51:10 Tower kernel: in libc-2.30.so[146f3c963000+16b000]
Feb 18 11:51:10 Tower kernel: traps: lsof[21992] general protection ip:1501699fdb8e sp:b2932a95f542dac8 error:0 in libc-2.30.so[1501699de000+16b000]
Feb 18 11:51:10 Tower kernel: traps: lsof[19946] general protection ip:14c1be0efb8e sp:7a6d759637c27e85 error:0
Feb 18 11:51:10 Tower kernel: traps: lsof[8654] general protection ip:153a2e6cfb8e sp:f30d8a89e719c2be error:0
Feb 18 11:51:10 Tower kernel: in libc-2.30.so[14c1be0d0000+16b000]
Feb 18 11:51:10 Tower kernel: in libc-2.30.so[153a2e6b0000+16b000]
Feb 18 11:51:10 Tower kernel: traps: lsof[17877] general protection ip:153e1a39cb8e sp:99345f40b304935b error:0 in libc-2.30.so[153e1a37d000+16b000]
Feb 18 11:51:10 Tower kernel: traps: lsof[15864] general protection ip:153972906b8e sp:d876a80e085a9958 error:0 in libc-2.30.so[1539728e7000+16b000]
Feb 18 11:51:10 Tower kernel: traps: lsof[23966] general protection ip:153940ddbb8e sp:ce22525a5af6fe85 error:0
Feb 18 11:51:10 Tower kernel: traps: lsof[27813] general protection ip:148d43c9bb8e sp:57ec0bae37944cf2 error:0
Feb 18 11:51:10 Tower kernel: in libc-2.30.so[153940dbc000+16b000]
Feb 18 11:51:10 Tower kernel: in libc-2.30.so[148d43c7c000+16b000]
Feb 18 11:52:37 Tower kernel: do_general_protection: 39 callbacks suppressed
Feb 18 11:52:37 Tower kernel: traps: lsof[23691] general protection ip:14ee0fc0bb8e sp:ea914e071e71beaa error:0 in libc-2.30.so[14ee0fbec000+16b000]
Feb 18 11:52:37 Tower kernel: traps: lsof[21567] general protection ip:145fc8ee9b8e sp:fa082ceef5f9c066 error:0 in libc-2.30.so[145fc8eca000+16b000]
Feb 18 11:57:29 Tower kernel: traps: lsof[30030] general protection ip:1495ddd49b8e sp:f0860d63a8c3cba0 error:0 in libc-2.30.so[1495ddd2a000+16b000]
Feb 18 12:03:55 Tower kernel: traps: lsof[15334] general protection ip:14f574e29b8e sp:1a359b26c3b01784 error:0 in libc-2.30.so[14f574e0a000+16b000]
Feb 18 12:03:55 Tower kernel: traps: lsof[19560] general protection ip:152bf0fa6b8e sp:d4afcd2a711b7170 error:0
Feb 18 12:03:55 Tower kernel: traps: lsof[21562] general protection ip:14ef1ce11b8e sp:5f2b2cd7bd8fd805 error:0
Feb 18 12:03:55 Tower kernel: traps: lsof[17517] general protection ip:152d24cabb8e sp:f848db7e58aa3b82 error:0
Feb 18 12:03:55 Tower kernel: in libc-2.30.so[152bf0f87000+16b000]
Feb 18 12:03:55 Tower kernel: in libc-2.30.so[152d24c8c000+16b000]
Feb 18 12:03:55 Tower kernel: in libc-2.30.so[14ef1cdf2000+16b000]
Feb 18 12:03:55 Tower kernel: traps: lsof[13413] general protection ip:14cb69079b8e sp:867e9a61d935833c error:0 in libc-2.30.so[14cb6905a000+16b000]
Feb 18 12:03:55 Tower kernel: traps: lsof[11381] general protection ip:149d4397fb8e sp:dc8913e53531c716 error:0 in libc-2.30.so[149d43960000+16b000]
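
For anyone triaging a similar log, here is a minimal Python sketch that tallies the three symptoms visible above: nginx upstream timeouts per request path, php-fpm `max_children` warnings, and `lsof` general protection traps. The function name `summarize` and the `SAMPLE` list are made up for illustration, and the regular expressions only assume the line shapes shown in this excerpt.

```python
import re
from collections import Counter

# Patterns are based only on the line formats seen in the excerpt above.
TIMEOUT_RE = re.compile(r'upstream timed out .*?request: "[A-Z]+ (\S+) HTTP')
MAX_CHILDREN_RE = re.compile(r"reached max_children setting \(\d+\)")
LSOF_TRAP_RE = re.compile(r"traps: lsof\[\d+\] general protection")


def summarize(lines):
    """Return (timeouts per request path, max_children warnings, lsof traps)."""
    timeouts = Counter()
    max_children_hits = 0
    lsof_traps = 0
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts[match.group(1)] += 1
        elif MAX_CHILDREN_RE.search(line):
            max_children_hits += 1
        elif LSOF_TRAP_RE.search(line):
            lsof_traps += 1
    return timeouts, max_children_hits, lsof_traps


# Hypothetical sample: three lines copied from the log above.
SAMPLE = [
    'Feb 18 11:50:48 Tower nginx: 2020/02/18 11:50:48 [error] 16849#16849: '
    '*105277 upstream timed out (110: Connection timed out) while reading '
    'response header from upstream, client: 192.168.50.71, server: , '
    'request: "POST /webGui/include/DashUpdate.php HTTP/1.1", '
    'upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", '
    'referrer: "http://tower/Dashboard"',
    "Feb 18 11:50:48 Tower php-fpm[26006]: [WARNING] [pool www] server "
    "reached max_children setting (50), consider raising it",
    "Feb 18 11:51:10 Tower kernel: traps: lsof[8531] general protection "
    "ip:14609c129b8e sp:45be87439b02ae3c error:0",
]

if __name__ == "__main__":
    timeouts, full, traps = summarize(SAMPLE)
    for path, count in timeouts.most_common():
        print(f"{count:5d} timeouts  {path}")
    print(f"max_children warnings: {full}")
    print(f"lsof traps: {traps}")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize(open("/var/log/syslog", errors="replace"))`. That path is an assumption about where the syslog lives on the server.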

 

I find it hard to believe that the PHP upstream problem was caused by the cache drive replacement, but the problem did occur right after it, so I'm mentioning the process anyway.

 

Diagnostics file attached.

 

Any help is appreciated. Thanks!

tower-diagnostics-20200218-1221.zip

 

 

UPDATE 1:

After another restart I noticed some calls to /plugins/preclear.disk/Preclear.php that I did not initiate:

Feb 18 15:53:16 Tower kernel: mdcmd (81): spindown 4
Feb 18 15:55:05 Tower nginx: 2020/02/18 15:55:05 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 15:57:11 Tower nginx: 2020/02/18 15:57:11 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 15:59:17 Tower nginx: 2020/02/18 15:59:17 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 16:01:23 Tower nginx: 2020/02/18 16:01:23 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 16:03:29 Tower nginx: 2020/02/18 16:03:29 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 16:05:34 Tower nginx: 2020/02/18 16:05:34 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 16:07:40 Tower nginx: 2020/02/18 16:07:40 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 16:09:46 Tower nginx: 2020/02/18 16:09:46 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 16:10:49 Tower sshd[22350]: Accepted none for root from 192.168.50.8 port 60295 ssh2
Feb 18 16:11:52 Tower nginx: 2020/02/18 16:11:52 [error] 16849#16849: *134467 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 192.168.50.8, server: , request: "POST /plugins/preclear.disk/Preclear.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock", host: "tower", referrer: "http://tower/Dashboard"
Feb 18 16:19:44 Tower login[25404]: ROOT LOGIN on '/dev/pts/1'
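
All of the Preclear.php timeouts above carry the same nginx connection number (`*134467`), so the spacing between them may hint at a single client request being retried on a fixed schedule. A minimal Python sketch for measuring that spacing follows; `timeout_gaps` and `SAMPLE` are hypothetical names, and the pattern only assumes the nginx line format shown in this log.

```python
import re
from datetime import datetime

# Captures the nginx timestamp and the connection number (the "*NNN" field).
LINE_RE = re.compile(
    r"nginx: (\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[error\] \d+#\d+: "
    r"\*(\d+) upstream timed out"
)


def timeout_gaps(lines):
    """Map connection number -> seconds between its consecutive timeouts."""
    last_seen = {}
    gaps = {}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        conn = match.group(2)
        if conn in last_seen:
            delta = int((stamp - last_seen[conn]).total_seconds())
            gaps.setdefault(conn, []).append(delta)
        last_seen[conn] = stamp
    return gaps


# Hypothetical sample: the first two Preclear.php entries, shortened.
SAMPLE = [
    "Feb 18 15:55:05 Tower nginx: 2020/02/18 15:55:05 [error] 16849#16849: "
    "*134467 upstream timed out (110: Connection timed out) while reading "
    "response header from upstream",
    "Feb 18 15:57:11 Tower nginx: 2020/02/18 15:57:11 [error] 16849#16849: "
    "*134467 upstream timed out (110: Connection timed out) while reading "
    "response header from upstream",
]

if __name__ == "__main__":
    for conn, deltas in timeout_gaps(SAMPLE).items():
        print(f"connection *{conn}: gaps in seconds = {deltas}")
```

A steady gap on one connection number would be consistent with one stuck request retrying, whereas many different connection numbers (as in the DashUpdate.php entries earlier) would point to separate requests each timing out.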

I navigated to the Preclear plugin page and clicked its 'fix preclear' button, which seems to have stopped those calls. Will keep watching.

 

UPDATE 2:

 

Preclear isn't causing the problem. I tried Fix Common Problems, with no luck. I'm still getting a lot of timeouts on DashUpdate.

 

 

UPDATE 3:

 

OK, I uninstalled and reinstalled the Preclear plugin and reinstalled the Unraid Nvidia build 6.8.2, rebooting a few times along the way. The issue now seems to be gone, though I'm not sure which particular step fixed it.

 

UPDATE 4:

 

I'm marking this as solved in case anyone else gets caught by a similar issue.

Edited by Elvin