  • [6.12.13] Crashing multiple times per week


    62165
    • Minor

    My Unraid server is crashing multiple times per week, as reported by Plex users. At least two of the times I've seen it happen personally, I was watching Plex Live TV from my HDHomeRun. When it becomes unresponsive, all my containers and VMs stop "working": AdGuard DNS is unresponsive, Plex stops, and my Home Assistant VM is also unresponsive. I can't reach Unraid over HTTPS or SSH, but I can ping the server. Both times the logs show php-fpm errors for pool www, which appears to be the PHP backend behind Unraid's web server (NGINX).

     

    I also see a couple of I/O errors in my logs for /dev/sda, which is my flash drive. I'm looking into those now, but they don't occur at the same time as the issue; they seem to happen during the flash drive backup. Could they be related?

     

    Oct 10 05:06:31 DUNDERMIFFLIN kernel: I/O error, dev sda, sector 9940182 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
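To compare the timing of I/O errors like the one above against the unresponsive periods, it can help to pull the timestamp, device, and sector out of each kernel line. The following is a minimal sketch, not an Unraid tool: the function name `parse_io_error` and the `year` parameter are assumptions made for illustration (syslog lines carry no year, so 2024 is taken from the nginx entries in this report), and the pattern only covers the line format shown here.

```python
import re
from datetime import datetime

# Matches kernel block-layer I/O error lines in the format seen in this syslog.
IO_ERROR_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)\s+op\s+\S+:\((?P<op>\w+)\)"
)


def parse_io_error(line, year=2024):
    """Return a dict describing one I/O error line, or None if it does not match.

    The year is an assumption: classic syslog timestamps omit it.
    """
    m = IO_ERROR_RE.match(line.strip())
    if m is None:
        return None
    # Collapse the double space syslog uses for single-digit days ("Oct  7").
    stamp = " ".join(m["ts"].split())
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return {
        "time": when,
        "dev": m["dev"],
        "sector": int(m["sector"]),
        "op": m["op"],
    }
```

Running this over the whole syslog and listing the resulting times next to the flash backup schedule would show whether the errors really line up with the backup.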

     

    Oct  7 20:10:00 DUNDERMIFFLIN rsyslogd: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="9531" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
    Oct  7 21:07:11 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 525 exited on signal 9 (SIGKILL) after 15.221162 seconds from start
    Oct  7 21:07:25 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 557 exited on signal 9 (SIGKILL) after 12.457896 seconds from start
    Oct  7 21:07:38 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 585 exited on signal 9 (SIGKILL) after 13.360187 seconds from start
    Oct  7 21:12:43 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 624 exited on signal 9 (SIGKILL) after 302.225506 seconds from start
    Oct  7 21:13:51 DUNDERMIFFLIN webGUI: Successful login user root from 192.168.34.7
    Oct  7 21:23:21 DUNDERMIFFLIN webGUI: Unsuccessful login user root from 192.168.34.7
    Oct  7 21:23:25 DUNDERMIFFLIN webGUI: Successful login user root from 192.168.34.7

     

    Oct 14 20:52:05 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 22752 exited on signal 9 (SIGKILL) after 12.648918 seconds from start
    Oct 14 20:52:17 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 22914 exited on signal 9 (SIGKILL) after 11.826932 seconds from start
    Oct 14 20:52:29 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23117 exited on signal 9 (SIGKILL) after 12.055986 seconds from start
    Oct 14 20:52:29 DUNDERMIFFLIN emhttpd: read SMART /dev/sdf
    Oct 14 20:52:41 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23272 exited on signal 9 (SIGKILL) after 12.191249 seconds from start
    Oct 14 20:52:54 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23398 exited on signal 9 (SIGKILL) after 12.778874 seconds from start
    Oct 14 20:53:08 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23444 exited on signal 9 (SIGKILL) after 13.847954 seconds from start
    Oct 14 20:55:30 DUNDERMIFFLIN nginx: 2024/10/14 20:55:30 [error] 9738#9738: *1201487 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 172.17.0.11, server: , request: "GET /Main HTTP/1.1", subrequest: "/auth-request.php", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "dundermifflin.domain.us"
    Oct 14 20:55:30 DUNDERMIFFLIN nginx: 2024/10/14 20:55:30 [error] 9738#9738: *1201487 auth request unexpected status: 502 while sending to client, client: 172.17.0.11, server: , request: "GET /Main HTTP/1.1", host: "dundermifflin.domain.us"
    Oct 14 20:57:14 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 24010 exited on signal 9 (SIGKILL) after 196.943122 seconds from start
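The php-fpm warnings above arrive in clusters, so grouping them into bursts gives a rough timeline of each incident. This is only a sketch under stated assumptions: `parse_sigkills`, `group_bursts`, the 10-minute `max_gap`, and the `year` default are illustrative choices rather than anything Unraid provides, and the pattern matches only the warning format shown in this log.

```python
import re
from datetime import datetime, timedelta

# Matches the php-fpm "exited on signal 9 (SIGKILL)" warnings shown in this log.
SIGKILL_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+php-fpm\[\d+\]:\s+"
    r"\[WARNING\]\s+\[pool (?P<pool>\S+)\] child (?P<child>\d+) "
    r"exited on signal 9 \(SIGKILL\) after (?P<secs>[\d.]+) seconds from start"
)


def parse_sigkills(lines, year=2024):
    """Return (time, child_pid, lifetime_seconds) for each SIGKILL warning.

    The year is an assumption: classic syslog timestamps omit it.
    """
    events = []
    for line in lines:
        m = SIGKILL_RE.match(line.strip())
        if m is None:
            continue
        stamp = " ".join(m["ts"].split())  # normalise "Oct  7" to "Oct 7"
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, int(m["child"]), float(m["secs"])))
    return events


def group_bursts(events, max_gap=timedelta(minutes=10)):
    """Group events whose timestamps are at most max_gap apart into bursts."""
    bursts = []
    for event in sorted(events):
        if bursts and event[0] - bursts[-1][-1][0] <= max_gap:
            bursts[-1].append(event)
        else:
            bursts.append([event])
    return bursts
```

Each burst's first and last timestamps can then be compared with when Plex users reported problems, or with other log entries such as the flash drive I/O errors.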

     

    dundermifflin-diagnostics-20241014-2121.zip




    User Feedback

    Recommended Comments

    JorgeB

    Posted

    I'm not seeing anything relevant logged. Is the syslog-previous from after a crash?

    scott47

    Posted

    Hi 62165,

     

    Is this still happening? And are you able to check the CPU usage on the Dashboard?

     

    I am having a real problem with Plex. Before I create my own post, I'm trying to rule out a few things. 

     

    In my case, Plex would "act up", and when I checked the CPU usage on the Dashboard, I saw that one or two CPUs were pegged at 100% and never dropped.

     

    Restarting the Plex docker "fixes" the problem, but not for long. Sometimes the CPU goes back to, and stays at, 100% a few minutes after I restart Plex, and sometimes it takes hours, but it always happens.

     

    Is that anything like what you're seeing?
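If the Dashboard is unreachable while SSH still works, the same per-core check can be approximated from two snapshots of `/proc/stat` taken a few seconds apart. The sketch below assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names `parse_proc_stat` and `busy_percent` are made up for illustration and are not part of Unraid.

```python
def parse_proc_stat(text):
    """Return {core: (total_jiffies, idle_jiffies)} for each per-core cpuN line."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the aggregate "cpu" line and every non-CPU line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # First eight counters: user nice system idle iowait irq softirq steal.
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        cores[fields[0]] = (sum(values), idle)
    return cores


def busy_percent(before, after):
    """Return {core: percent busy} between two /proc/stat snapshots."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    result = {}
    for core, (total_after, idle_after) in second.items():
        if core not in first:
            continue
        total_before, idle_before = first[core]
        elapsed = total_after - total_before
        if elapsed > 0:
            busy = elapsed - (idle_after - idle_before)
            result[core] = 100.0 * busy / elapsed
    return result
```

In practice, `before` and `after` would be the contents of `/proc/stat` read a few seconds apart; a core that stays near 100 across repeated samples matches the "pegged" behaviour described above.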





