Whoa, I think you may have diagnosed my issue. I'm so glad I asked for clarification (though apologies, I should have done some legwork and found that post you linked first - I just read it over). Truthfully, I've made so many changes, both to diagnose/resolve my locking issue and to add functionality to my UnRAID server, that it was a bit of a nightmare to diagnose. Just to (hopefully) close this topic and maybe help others:
In early September 2021, I moved from a Supermicro X10 + Xeon 2600 v3 platform to the ASRock E3C246D4U + Xeon E-2126G setup I have now. I upgraded the BIOS to L2.34 at that point, but had no immediate issues with lockups. Unraid was at 6.9.3.
In late September 2021, I installed the FileBrowser docker (testing out a potential Krusader alternative) - it was and is the only docker using br0.
From late September to now, I was experiencing random lockups, first every few days, then eventually daily.
At this point, I started down the CPU_CATERR, "is my BIOS version doing this", "is it some hardware failure" path.
My review of the syslogs did show this call trace error activity, but I thought very little of it since processes still seemed to run after these log entries, often for several hours before I would notice a lockup. It didn't help that most lockups seemed to happen overnight or while I was at work.
This past weekend, I went into full "I need to fix this" mode when dealing with my kid wanting to watch Frozen II and my wife wanting to catch up on some old series in our waitlist. I then:
Posted on this thread
Gutted my UnRAID server, including: converting my mirrored cache pools to XFS (out of fear of some newly developed BTRFS issues) [totalling my tiered cache setup], removing a SAS expander, disabling Turbo Boost despite never seeing any CPU_CATERR messages, and removing a host of plugins I had installed over the last month.
I factory reset the BIOS and IPMI, and set my docker network to ipvlan.
Presently, my system has been up and running for 1.5 days, FileBrowser remains installed and functional, and a scan of my syslog reveals no call trace errors over this period. Like you and your PSU situation, I consider this all still in testing, but if stability persists over the week/weekend I'll slowly reinstall my cache pools and plugins.
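For anyone who wants to repeat the syslog check, here is a rough sketch of how it could be scripted. This is not what I actually ran, just an illustration: the function names are made up, the /var/log/syslog path is an assumption about where your log lives, and it assumes the kernel marks each trace with a "Call Trace:" line.

```python
import re

# Assumption: each kernel call trace starts with a "Call Trace:" marker line.
CALL_TRACE = re.compile(r"Call Trace:", re.IGNORECASE)


def find_call_traces(lines):
    """Return (line_number, text) for every line containing the marker."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if CALL_TRACE.search(line)
    ]


def scan_syslog(path="/var/log/syslog"):
    """Scan a syslog file for call trace markers.

    Assumption: the default path is only a guess; pass your own if it differs.
    """
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        return find_call_traces(handle)
```

Calling scan_syslog() returns an empty list when there are no call trace entries, which is the result I'm hoping to keep seeing.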
Fingers crossed, and thanks!!!