January 21, 2025
I keep getting MCEs that I don't know how to fix, along with OS crashes.

Jan 20 22:22:37 Media kernel: [Hardware Error]: Corrected error, no action required.
Jan 20 22:22:37 Media kernel: [Hardware Error]: CPU:0 (17:71:0) MC27_STATUS[-|CE|MiscV|-|-|-|SyndV|-|-|-]: 0x982000000002080b
Jan 20 22:22:37 Media kernel: [Hardware Error]: IPID: 0x0001002e00000500, Syndrome: 0x000000005a020005
Jan 20 22:22:37 Media kernel: [Hardware Error]: Power, Interrupts, etc. Ext. Error Code: 2, Link Error.
Jan 20 22:22:37 Media kernel: [Hardware Error]: cache level: L3/GEN, mem/io: IO, mem-tx: GEN, part-proc: SRC (no timeout)

The server is also prone to intermittent crashes, even when nothing is running on it. At the very beginning, just to make sure, I ran memtest for 48 hours and it finished without any problems. I even swapped my Nvidia RTX 4000 card for an RTX 5000 to see if it would make any difference, and the problem persists.

The "Fix Common Problems" plugin also keeps listing this:

Jan 21 00:51:02 Media root: Fix Common Problems Version 2024.12.19a
Jan 21 00:51:05 Media root: Fix Common Problems: Error: Machine Check Events detected on your server
Jan 21 00:51:06 Media root: mcelog: ERROR: AMD Processor family 23: mcelog does not support this processor. Please use the edac_mce_amd module instead.

Attachment: media-diagnostics-20250121-0956.zip
January 21, 2025 Author
The odd thing is, this machine and its hardware ran Windows 10 & 11 Pro for years before this without any similar issues. The crashes didn't start until I switched to running UnRaid 🤔
Edited January 21, 2025 by Xarien
January 21, 2025 Community Expert
The hardware errors are being detected by Linux, not Unraid, but they could be unrelated to the crashes, since at least that one was corrected. Make sure this was taken care of: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/#findComment-819173
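For anyone who wants to check what the kernel printed, the status word from the first post can be decoded by hand. The sketch below is only an illustration: the bit positions and lookup tables are the ones I understand the Linux `edac_mce_amd` decoder to use for Scalable MCA banks, and the function names (`decode_status`, `decode_bus_error`, `decode_ipid`) are made up for this example. Treat it as a reading aid and not as an authoritative decoder.

```python
# Hypothetical helper names; bit layout assumed from the Linux MCE headers
# and the AMD Scalable MCA register format.
STATUS = 0x982000000002080B  # MC27_STATUS from the log in the first post
IPID = 0x0001002E00000500    # IPID from the same log entry

FLAG_BITS = {
    "Val": 63, "Overflow": 62, "UC": 61, "En": 60, "MiscV": 59,
    "AddrV": 58, "PCC": 57, "SyndV": 53, "Deferred": 44, "Poison": 43,
}

LL = ["resv", "L1", "L2", "L3/GEN"]        # cache level
II = ["MEM", "RESV", "IO", "GEN"]          # memory or I/O
RRRR = ["GEN", "RD", "WR", "DRD", "DWR", "IRD", "PRF", "EV", "SNP"]
PP = ["SRC", "RES", "OBS", "GEN"]          # participating processor


def decode_status(status):
    """Split a status word into set flags, extended code and error code."""
    flags = [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1]
    return {
        "flags": flags,
        "corrected": "UC" not in flags and "Deferred" not in flags,
        "ext_error_code": (status >> 16) & 0x3F,
        "error_code": status & 0xFFFF,
    }


def decode_bus_error(code):
    """Decode a bus-type error code (pattern 0000 1PPT RRRR IILL)."""
    if (code & 0xF800) != 0x0800:
        return None  # not a bus error
    r4 = (code >> 4) & 0xF
    return {
        "cache_level": LL[code & 0x3],
        "mem_io": II[(code >> 2) & 0x3],
        "mem_tx": RRRR[r4] if r4 < len(RRRR) else "unknown",
        "part_proc": PP[(code >> 9) & 0x3],
        "timed_out": bool((code >> 8) & 0x1),
    }


def decode_ipid(ipid):
    """Extract the hardware ID and MCA type that identify the bank."""
    return {"hwid": (ipid >> 32) & 0xFFF, "mca_type": (ipid >> 48) & 0xFFFF}


if __name__ == "__main__":
    info = decode_status(STATUS)
    print(info)
    print(decode_bus_error(info["error_code"]))
    print({k: hex(v) for k, v in decode_ipid(IPID).items()})
```

With the values from the log, the UC flag is clear (which matches "Corrected error"), the extended error code is 2 (which the kernel reports as "Link Error"), and the low 16 bits decode to the same L3/GEN, IO, GEN, SRC, no-timeout fields shown in the last log line.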
January 24, 2025 Author
Disabling C-State in BIOS caused a crash in less than 8 hours... which is a new record 😅
Edited January 24, 2025 by Xarien
January 24, 2025 Community Expert
If you have multiple RAM sticks, try using the server with just one. If the result is the same, try with a different one. That will basically rule out bad RAM.
January 26, 2025 Author
On 1/24/2025 at 12:56 PM, JorgeB said: If you have multiple RAM sticks, try using the server with just one. If the result is the same, try with a different one. That will basically rule out bad RAM.

Yes, I have. Thanks for the tip though, as most people miss this simple 'trick' during troubleshooting 🫡

Also, as I mentioned earlier, I ran memtest86+ for 48 hours straight, which would have revealed errors as well if it was a RAM issue.
I've tried turning off XMP, with no change in results.
I've tried another identical pair of sticks from another PC, a Ryzen 9 5900X that runs 24/7/365 without issues, and yet the same thing still happens with this Ryzen 9 3900X.
Both computers run the same BIOS version, F39D, on their X570 AORUS mainboards. The only difference is that the 5900X is on the Ultra edition and the 3900X is on the PRO edition.
Both also use Intel X710-2 (dual SFP+) 10GbE cards.
On-board LAN is therefore turned off, as it's not in use.

So I'm hoping someone can read something concrete from the logs I posted that can put me on track as to why this keeps happening 🤔
Edited January 26, 2025 by Xarien
January 26, 2025 Community Expert
3 hours ago, Xarien said: which would have revealed errors as well if it was a RAM issue.

Memtest is only definitive if it finds errors. There are many examples on the forum of it not finding anything and RAM still being the issue.
If you have 2 CPUs, swapping them would also be a good test.
January 27, 2025 Author
No, unfortunately I don't have any AM4 socket CPUs just lying around unused to test with.
February 7, 2025 Author
After trying various BIOS versions as well, I ended up ordering an AM4 socket 5900XT to test with. That should show whether the system becomes stable or the problem persists, and so determine whether it's the CPU or the mainboard. We'll see in 10-12 days' time...
February 21, 2025 Author Solution
Well, I think we can safely say the CPU is fishy. The new one has UnRaid purring like a kitten...