I'm looking for help diagnosing the problem(s) and figuring out the next steps to fix them and get back to a stable system.
The problems started a few weeks ago with what seemed like random crashes every few days. These were hard crashes that left the system completely inaccessible and required a power cycle to get it running again. After a few crashes I thought I had isolated the problem to media transcoding: any time CPU/GPU usage was high for an extended period, the system would crash. I tried both Tdarr and Unmanic, transcoding with both my Quadro GPU and the iGPU of my Intel CPU, and got the same crashes. For a while, if I disabled those programs, the system would be stable for days, even a week.
I also noticed that when I restarted the system after a crash, I would have SMART errors in the array. They were always on Disk 4 and sometimes on Disk 2. I would acknowledge them to clear them, and I don't recall any additional ones popping up while the system was running. I assumed they were caused by the system crashing while the drives were being read during transcoding.
Believing the problem was limited to Tdarr/Unmanic transcoding, I left those stopped and continued to use my server as normal. That lasted until a few days ago, when I started getting crashes whenever Plex caused heavy CPU usage during tasks such as intro/credit detection, audio analysis, and chapter image generation. I went into Plex and set all of those tasks to Never, hoping this would stabilize the system while I traveled for a couple of days this week. Everything seemed to be working fine, and I was able to access Plex while traveling.
I got home late last night and the system was still running, but when I logged into the GUI this morning I saw that Disk 2 had entered an error state and was disabled. I am coming here for help figuring out what the actual issue(s) is/are and what I need to do to fix them.
I ran the quick SMART tests on the two drives that were giving errors after crashes and didn't find anything. I have also booted into memtest and run several cycles with no errors. I have attached the diagnostics ZIP file from this morning. The system status has not changed since I generated it, nor have I made any changes since then. Any assistance is greatly appreciated. Please let me know if any other information is needed to properly diagnose the issues. Thank you.
cerealkiller-diagnostics-20250911-0911.zip