Sudden Daily Crashes (Unclean Shutdowns) 6.12.8


Solved by DaveDoesStuff

Since approximately last week I've suddenly started getting crashes (reboots ok) with no discernible pattern.

 

Originally I was on 6.12.5 when this issue started. Prior to this I'd been going strong for over a year with no issues (that couldn't have been attributed to user error anyway :D). I can't be sure exactly when the crashes started: the server recovered quickly and completely after each one, and I hadn't looked at my alerts history for a while. Once I noticed, I updated to 6.12.8, but the crashes persisted.

 

This morning I removed some old drives I had sitting in unassigned devices that had previously shown SMART errors, and updated my motherboard BIOS to the latest version. No effect.

 

No errors after running memtest overnight (so inconclusive) and RAM is set to default/stock in BIOS (learned this lesson a long time ago).

 

I was watching Plex with the syslog open beside it for this latest crash and didn't see any warning signs. There was nothing array-intensive running; the only real load was on my iGPU (Ryzen 5700G) doing transcoding in Unmanic. Nothing else of note bar the usual "*arrs" etc...

 

  • Diagnostics from after a graceful shutdown earlier today attached.
  • Diagnostics from after the crash I had a few minutes ago also attached.

 

I have syslogs from before and after the last crash available...but I'm concerned about posting any potentially sensitive data that might be contained in them 😕 
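For anyone with the same worry about posting logs, a rough sketch of masking the obvious identifiers before sharing is below. It is only an illustration: the `redact_line` helper and the `<ip>`, `<mac>` and `<host>` placeholders are made up for this example, and it only catches IPv4 addresses, MAC addresses and one hostname, so anything else (usernames, share names, paths) would still need checking by eye.

```python
import re

# Dotted-quad IPv4 addresses, e.g. 192.168.1.50
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Colon-separated MAC addresses, e.g. aa:bb:cc:dd:ee:ff
MAC = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def redact_line(line, hostname=""):
    """Mask MAC addresses, IPv4 addresses and (optionally) a hostname.

    Hypothetical helper for illustration; it does not know about
    any other kind of sensitive data.
    """
    line = MAC.sub("<mac>", line)
    line = IPV4.sub("<ip>", line)
    if hostname:
        line = line.replace(hostname, "<host>")
    return line


def redact_file(src_path, dst_path, hostname=""):
    """Write a redacted copy of a syslog file, leaving the original untouched."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(redact_line(line, hostname))
```

Timestamps such as `12:25:31` have too few groups to match the MAC pattern, but any four-part dotted number (some version strings, for instance) would be masked as if it were an IP address.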

 

In the pre crash log these are the only warnings/errors:

Mar 11 12:25:31 iBstorage kernel: Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
Mar 11 12:25:31 iBstorage kernel: ACPI: Early table checksum verification disabled
Mar 11 12:25:31 iBstorage kernel: floppy0: no floppy controllers found
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 5: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 5: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:25:31 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 12:26:41 iBstorage kernel: BTRFS info (device loop2): using crc32c (crc32c-intel) checksum algorithm
Mar 11 12:36:16 iBstorage root: Fix Common Problems: Warning: Syslog mirrored to flash

 

And the immediate entries before the crash don't have anything obvious:

Mar 11 12:56:33 iBstorage kernel: 
Mar 11 12:58:21 iBstorage emhttpd: spinning down /dev/sdg
Mar 11 13:19:48 iBstorage emhttpd: spinning down /dev/sdh
Mar 11 13:28:26 iBstorage emhttpd: read SMART /dev/sdh
Mar 11 13:43:27 iBstorage emhttpd: spinning down /dev/sdh
Mar 11 14:12:23 iBstorage emhttpd: read SMART /dev/sdh
Mar 11 14:13:01 iBstorage flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Mar 11 14:14:01 iBstorage flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Mar 11 14:27:22 iBstorage emhttpd: spinning down /dev/sdf
Mar 11 14:29:34 iBstorage emhttpd: spinning down /dev/sdh
Mar 11 14:50:15 iBstorage emhttpd: read SMART /dev/sdh
Mar 11 18:00:02 iBstorage root: Starting Mover
Mar 11 18:00:02 iBstorage root: Forcing turbo write on
Mar 11 18:00:02 iBstorage root: ionice -c 2 -n 7 nice -n 0 /usr/local/emhttp/plugins/ca.mover.tuning/age_mover start 0 0 0 '' '' '' '' '' 65 '' '' 50
Mar 11 18:00:02 iBstorage kernel: mdcmd (38): set md_write_method 1
Mar 11 18:00:02 iBstorage kernel: 
Mar 11 18:00:02 iBstorage root: Restoring original turbo write mode
Mar 11 18:00:02 iBstorage kernel: mdcmd (39): set md_write_method 1
Mar 11 18:00:02 iBstorage kernel: 
Mar 11 18:04:25 iBstorage emhttpd: read SMART /dev/sdc
Mar 11 18:04:56 iBstorage emhttpd: read SMART /dev/sdf
Mar 11 18:11:10 iBstorage emhttpd: read SMART /dev/sdg
Mar 11 18:11:10 iBstorage emhttpd: read SMART /dev/sdi
Mar 11 18:11:35 iBstorage kernel: mdcmd (40): set md_write_method 1
Mar 11 18:11:35 iBstorage kernel: 
Mar 11 18:27:11 iBstorage emhttpd: spinning down /dev/sdi
Mar 11 18:27:58 iBstorage emhttpd: spinning down /dev/sdg
Mar 11 18:28:05 iBstorage emhttpd: spinning down /dev/sdf
Mar 11 18:31:35 iBstorage kernel: mdcmd (41): set md_write_method 0
Mar 11 18:31:35 iBstorage kernel: 
Mar 11 18:33:38 iBstorage emhttpd: spinning down /dev/sdc
Mar 11 19:16:25 iBstorage emhttpd: spinning down /dev/sdh
Mar 11 19:33:16 iBstorage emhttpd: read SMART /dev/sdf
Mar 11 19:34:07 iBstorage flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Mar 11 19:39:28 iBstorage flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Mar 11 19:42:28 iBstorage flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Mar 11 19:59:18 iBstorage emhttpd: spinning down /dev/sdf
Mar 11 20:03:34 iBstorage emhttpd: read SMART /dev/sdf
Mar 11 20:04:28 iBstorage flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Mar 11 20:20:27 iBstorage emhttpd: spinning down /dev/sdf

 

In the post crash log these are the only warnings/errors:

Mar 11 20:23:54 iBstorage kernel: Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
Mar 11 20:23:54 iBstorage kernel: ACPI: Early table checksum verification disabled
Mar 11 20:23:54 iBstorage kernel: floppy0: no floppy controllers found
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 5: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 5: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:23:54 iBstorage rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 11 20:24:52 iBstorage root: Error response from daemon: network with name br0 already exists
Mar 11 20:24:53 iBstorage kernel: BTRFS info (device loop2): using crc32c (crc32c-intel) checksum algorithm
Mar 11 20:34:15 iBstorage root: Fix Common Problems: Warning: Syslog mirrored to flash

 

I can't pinpoint anything useful in the above...but after staring at logs for so many hours this week I might just be blind to it. Hoping a fresh set of eyes might help.
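In case it helps anyone else staring at the same kind of log, here is a minimal sketch of how the repeated lines could be collapsed so the unique warnings stand out. It assumes the classic `Mon DD HH:MM:SS host message` syslog layout seen above; the `summarise` function and the keyword list are my own choices for illustration, not anything Unraid ships, and keyword matching will miss lines that were only flagged by log level.

```python
import re
from collections import Counter

# "Mar 11 12:25:31 iBstorage kernel: ..." -> timestamp, host, rest of line
LINE = re.compile(r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (\S+) (.*)$")

# Assumed keywords worth flagging; adjust to taste.
KEYWORDS = ("warning", "error", "segfault", "call trace", "oops")


def summarise(lines):
    """Count identical flagged messages, ignoring timestamp and hostname.

    Returns (message, count) pairs, most frequent first.
    """
    counts = Counter()
    for raw in lines:
        match = LINE.match(raw.strip())
        if not match:
            continue  # skip blank or non-syslog lines
        message = match.group(3)
        if any(keyword in message.lower() for keyword in KEYWORDS):
            counts[message] += 1
    return counts.most_common()


def summarise_file(path):
    """Convenience wrapper for a syslog file on disk."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarise(handle)
```

Run against the excerpts above, this would fold the dozens of `omfwd` "Network is unreachable" lines into a handful of entries with counts, leaving the one-off messages easier to spot.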

 

Any feedback/help/insight would be very welcome.

postcrash-diagnostics-20240311-2029.zip precrash-diagnostics-20240311-1109.zip

Edited by DaveDoesStuff
Forgot diagnostics

Unfortunately there's nothing relevant logged in the persistent syslog. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

2 hours ago, JorgeB said:

Unfortunately there's nothing relevant logged in the persistent syslog. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

I've been considering that and dreading the Mrs screaming at me that the recipe apps and Plex are down while she is in between our 11-month-old's naps :D

 

Decided to order another RAM kit when I saw Amazon could get it to me next day (here in Ireland that is very rare) without breaking my bank account too badly. If that doesn't help then I will fall back on safe mode for a few days.

 

Thanks for taking a look, sometimes it just helps to know I haven't gone totally mad :P 


Interesting development: I have a new RAM kit ready to go, but while waiting to install it I decided to turn everything off except the Plex server and Omada controller to see if it helped.

 

Haven't had a crash in close to 48 hours. Then this morning I re-enabled UrBackup and watched the container logs as it started. I observed it having issues with being unable to find a BTRFS filesystem or a ZFS dataset. In my case I have an encrypted XFS array and a ZFS cache, so it's possible this is causing some issues.

 

Decided to run a manual full image backup and started getting segfaults:

image.thumb.png.2b650d5774821b0ded9c14df18ff9045.png

 

Correction, the segfaults occurred before the manual backup started.

 

Whether this actually has anything to do with the crashes remains to be seen. Right now the backup is still running and Unraid appears fine. Also, my previous backups do not align with the times of the crashes.
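For what it's worth, checking whether events line up with crash times can be sketched roughly like this. The `events_before_crashes` function, the 30 minute window and the backup times are all hypothetical and only there for illustration; the one real timestamp is the last syslog entry before the crash on Mar 11 (20:20:27).

```python
from datetime import datetime, timedelta


def events_before_crashes(crashes, events, window=timedelta(minutes=30)):
    """Return (event, crash) pairs where the event started within
    `window` before the crash. Events after the crash are ignored."""
    hits = []
    for crash in crashes:
        for event in events:
            if timedelta(0) <= crash - event <= window:
                hits.append((event, crash))
    return hits


# Last syslog entry before the Mar 11 crash, taken from the log above.
crashes = [datetime(2024, 3, 11, 20, 20, 27)]

# Hypothetical backup start times, NOT taken from my server.
backups = [datetime(2024, 3, 11, 3, 0, 0), datetime(2024, 3, 10, 3, 0, 0)]

suspects = events_before_crashes(crashes, backups)
```

An empty `suspects` list would match what I saw: nothing backup-related starting shortly before a crash.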

 

I know segfaults like that can sometimes come down to XMP profiles, but my RAM is running stock. Again, it could be dodgy RAM, but I would have expected a crash tbh.

 

I'm now wondering if this XFS array versus ZFS cache setup is causing issues with my dockers in general (they run on the cache in folder mode). I'm considering moving docker.img back to a BTRFS image on the ZFS cache pool as opposed to the current folder setup.

 

Will update the thread with more info as I find it in the hopes it helps someone else.

  • Solution

Been back to running all of my dockers (the same ones I had running while getting the crashes) for 7 days now, with zero crashes or unclean shutdowns.

 

No further system changes from my side; the only things that would have changed are docker versions and plugin versions. Must have been something in there causing it. Whatever it was, it seems to have resolved itself. Weird.
