Thanks again for your help!
The issue was twofold. First, my BIOS was getting reset: despite no warnings in IPMI, the CMOS battery was bad. Replacing it seems to have solved the hard crashes I was seeing (now that the BIOS remembers C-states are supposed to be disabled). I got a little over 6 days of uptime before I re-enabled TDarr (more on that below).
The other issue was that sometimes the system wouldn't crash and reset; it would just hit 100% CPU usage and become unresponsive. One time this happened, I was able to kill Docker, and while the system didn't recover, I did see CPU usage drop to a more normal level. From there, I started experimenting with the containers I'm running, and long story short, it's 100% TDarr.
For whatever reason, on this hardware, TDarr just saturates the CPU while I'm using my GPU to re-encode videos. There's a setting in the application to set ffmpeg priority to low, and that seems to have fixed the issue.
If someone somehow stumbles on this from Google, the TDarr setting is under GPU > Options > Low FFmpeg/HandBrake process priority.
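For anyone hitting the same thing outside TDarr: the general idea behind a "low process priority" option is to start the encoder with a higher niceness so the scheduler favors everything else. Below is a rough Python sketch of that idea for a Linux/POSIX host. I don't know how TDarr implements its setting internally, and `run_low_priority` is a made-up helper name, so treat this as an illustration only.

```python
import os
import subprocess
import sys


def run_low_priority(cmd, niceness=19):
    """Run cmd as a child process with raised niceness (lower CPU priority).

    Hypothetical helper for illustration; POSIX only.
    """
    def lower_priority():
        # Runs in the child just before exec; adds to the inherited niceness.
        os.nice(niceness)

    return subprocess.run(
        cmd, preexec_fn=lower_priority, capture_output=True, text=True
    )


if __name__ == "__main__":
    # Demo: the child reports its own niceness (os.nice(0) reads it unchanged).
    result = run_low_priority(
        [sys.executable, "-c", "import os; print(os.nice(0))"]
    )
    print("child niceness:", result.stdout.strip())
```

In practice you'd pass your actual ffmpeg command list instead of the demo command. Lower priority doesn't reduce the work being done; it just lets other processes get CPU time first, which is presumably why the box stays responsive.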