[6.12.13] Crashing multiple times per week

Minor

My Unraid server is crashing multiple times per week, as reported by my Plex users. At least two of the times I've seen it happen personally, I was watching Plex Live TV from my HDHomeRun. When the server becomes unresponsive, all my containers and VMs stop working: AdGuard DNS stops answering, Plex stops, and my Home Assistant VM is unreachable. I can't reach Unraid over HTTPS or SSH, but I can still ping the server. Both times the logs show php-fpm errors for pool www, which appears to be the PHP backend behind Unraid's web server (NGINX); the nginx errors in the logs name the php-fpm socket as their upstream.

I also see a couple of I/O errors in my logs for /dev/sda, which is my flash drive. I'm looking into those now, but they don't occur at the same time as the crashes; they seem to happen during the flash drive backup. Could they be related?

Oct 10 05:06:31 DUNDERMIFFLIN kernel: I/O error, dev sda, sector 9940182 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2

Oct  7 20:10:00 DUNDERMIFFLIN rsyslogd: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="9531" x-info="https://www.rsyslog.com"] rsyslogd was HUPed
Oct  7 21:07:11 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 525 exited on signal 9 (SIGKILL) after 15.221162 seconds from start
Oct  7 21:07:25 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 557 exited on signal 9 (SIGKILL) after 12.457896 seconds from start
Oct  7 21:07:38 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 585 exited on signal 9 (SIGKILL) after 13.360187 seconds from start
Oct  7 21:12:43 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 624 exited on signal 9 (SIGKILL) after 302.225506 seconds from start
Oct  7 21:13:51 DUNDERMIFFLIN webGUI: Successful login user root from 192.168.34.7
Oct  7 21:23:21 DUNDERMIFFLIN webGUI: Unsuccessful login user root from 192.168.34.7
Oct  7 21:23:25 DUNDERMIFFLIN webGUI: Successful login user root from 192.168.34.7

Oct 14 20:52:05 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 22752 exited on signal 9 (SIGKILL) after 12.648918 seconds from start
Oct 14 20:52:17 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 22914 exited on signal 9 (SIGKILL) after 11.826932 seconds from start
Oct 14 20:52:29 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23117 exited on signal 9 (SIGKILL) after 12.055986 seconds from start
Oct 14 20:52:29 DUNDERMIFFLIN emhttpd: read SMART /dev/sdf
Oct 14 20:52:41 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23272 exited on signal 9 (SIGKILL) after 12.191249 seconds from start
Oct 14 20:52:54 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23398 exited on signal 9 (SIGKILL) after 12.778874 seconds from start
Oct 14 20:53:08 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23444 exited on signal 9 (SIGKILL) after 13.847954 seconds from start
Oct 14 20:55:30 DUNDERMIFFLIN nginx: 2024/10/14 20:55:30 [error] 9738#9738: *1201487 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 172.17.0.11, server: , request: "GET /Main HTTP/1.1", subrequest: "/auth-request.php", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "dundermifflin.domain.us"
Oct 14 20:55:30 DUNDERMIFFLIN nginx: 2024/10/14 20:55:30 [error] 9738#9738: *1201487 auth request unexpected status: 502 while sending to client, client: 172.17.0.11, server: , request: "GET /Main HTTP/1.1", host: "dundermifflin.domain.us"
Oct 14 20:57:14 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 24010 exited on signal 9 (SIGKILL) after 196.943122 seconds from start
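
To check whether the flash drive I/O errors line up with the php-fpm kills, a rough sketch like the one below could be run against the syslog. It is only an illustration under a few assumptions: the year is taken to be 2024 (syslog lines carry no year), kills less than 10 minutes apart are treated as one incident, and the function names (parse_events, group_bursts, nearest_gap) are my own and not part of any Unraid tool. The sample lines are copied from the excerpts above.

```python
import re
from datetime import datetime, timedelta

# Matches classic syslog lines: "Oct  7 21:07:11 HOSTNAME message..."
LINE_RE = re.compile(r"^(\w{3})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2})\s+\S+\s+(.*)$")

SAMPLE = """
Oct 10 05:06:31 DUNDERMIFFLIN kernel: I/O error, dev sda, sector 9940182 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
Oct  7 21:07:11 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 525 exited on signal 9 (SIGKILL) after 15.221162 seconds from start
Oct  7 21:12:43 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 624 exited on signal 9 (SIGKILL) after 302.225506 seconds from start
Oct 14 20:52:05 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 22752 exited on signal 9 (SIGKILL) after 12.648918 seconds from start
Oct 14 20:53:08 DUNDERMIFFLIN php-fpm[8156]: [WARNING] [pool www] child 23444 exited on signal 9 (SIGKILL) after 13.847954 seconds from start
"""


def parse_events(lines, year=2024):
    """Return (sigkills, io_errors) as sorted lists of datetimes.

    The year is an assumption, since syslog timestamps omit it.
    """
    sigkills, io_errors = [], []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or non-syslog lines
        mon, day, clock, msg = m.groups()
        ts = datetime.strptime(f"{year} {mon} {day} {clock}", "%Y %b %d %H:%M:%S")
        if "php-fpm" in msg and "SIGKILL" in msg:
            sigkills.append(ts)
        elif "I/O error" in msg:
            io_errors.append(ts)
    return sorted(sigkills), sorted(io_errors)


def group_bursts(times, gap=timedelta(minutes=10)):
    """Group sorted timestamps into bursts; a new burst starts after `gap` of quiet."""
    bursts = []
    for ts in times:
        if bursts and ts - bursts[-1][-1] <= gap:
            bursts[-1].append(ts)
        else:
            bursts.append([ts])
    return bursts


def nearest_gap(ts, others):
    """Smallest absolute time difference between ts and any entry in others."""
    return min((abs(ts - o) for o in others), default=None)


if __name__ == "__main__":
    kills, io_errors = parse_events(SAMPLE.splitlines())
    for burst in group_bursts(kills):
        print(f"burst starting {burst[0]}: {len(burst)} kills, "
              f"nearest I/O error {nearest_gap(burst[0], io_errors)} away")
```

To run it on the full log, replace SAMPLE.splitlines() with the lines of the syslog file from the diagnostics zip. A nearest I/O error that is hours or days away from every burst would support the impression that the flash drive errors are a separate problem.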

dundermifflin-diagnostics-20241014-2121.zip
