I've searched, and I've tried, and now I have to ask for help.
I built the NSFW Anniversary server maybe 9 months ago. It's a monster, and I love it, but now it stops responding on the network. Most times the console is still up, but any shutdown command just hangs. I have to cold boot it to get it to come back, which triggers a parity check. Unless I let the parity check finish (11 hours) or pause it, Docker is nearly unresponsive. As soon as I pause the parity check, the Docker containers spring to life immediately.
I suspected cache issues, so I changed the format of the cache disk to vfat because of btrfs issues. (SMART was clean, last I checked.)
I've deleted and reinstalled Docker and all of my containers a number of times, suspecting that it was a specific container locking it up.
Changed the Unraid USB and rebuilt the install from scratch.
Pulled the 10Gb NIC and ran 1Gb for a while.
Stopped using VLANs and put all of the containers directly on br.0 and br.1
Ran memtest about 10 times, rotating other (identical) DIMMs through the system to eliminate an ECC error.
I thought I was good after memtest ran clean twice. I was optimistic that I had finally found the error and that replacing the RAM had fixed it!
And it's a brick again. Yesterday and this morning. I can't keep it running for 24 hours.
At least this time I managed to get syslog copied to /boot. I've attached it here. I suspect hardware, but I sure haven't been able to pin it down.
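For anyone who wants to grab the log the same way before a hard reset, this is roughly how the copy can be done. It's a minimal sketch, not anything official: the function name `save_syslog`, the `/var/log/syslog` source path, and the `/boot/logs` destination are assumptions, so adjust them for your setup.

```shell
#!/bin/bash
# Hypothetical helper: copy the live syslog to persistent storage.
# Assumptions: the log lives at /var/log/syslog (in RAM, lost on reboot)
# and /boot is the USB flash drive, which survives a cold boot.
save_syslog() {
    local src="${1:-/var/log/syslog}"
    local dest_dir="${2:-/boot/logs}"
    local stamp
    # Timestamp without colons, since the flash drive is FAT-formatted.
    stamp="$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/syslog-$stamp.txt" || return 1
    # Flush buffers so the copy is on the flash drive before a hard reset.
    sync
    # Print the path of the saved copy.
    echo "$dest_dir/syslog-$stamp.txt"
}
```

Calling `save_syslog` with no arguments uses the assumed defaults; the two optional arguments override the source file and the destination directory.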
Pretty vanilla install with the common plugins and Docker: radarr - sonarr - sabnzbd - plex - zoneminder
syslog