-
MCE event | seeking assistance | logs attached
Hello all,

My Unraid server showed a popup this morning: "Machine Check Events detected on your server". I ran into this issue a few months ago, but it seemed to go away once I patched the BIOS.

Diagnostics attached. I'd appreciate your insights on this.

Thank you,
-Adam

fx8350-diagnostics-20250901-1747.zip
-
MCE event | seeking assistance | logs attached
Quick update:
Disabled C-states in the BIOS.
Re-ran the parity check. The previously flagged 168 errors were fixed.

Thank you all for your input and assistance in resolving this issue! This community is awesome 😁
-
MCE event | seeking assistance | logs attached
I see, so does that mean I should disable the VM manager while the parity check is running? Docker isn't used on this node, so it's already disabled.
-
MCE event | seeking assistance | logs attached
Is it safe to run the parity check while a VM is running?
-
MCE event | seeking assistance | logs attached
Thank you, @trurl. I will read up on the Ryzen thread.
-
MCE event | seeking assistance | logs attached
Are these results within an acceptable range? The canceled check ran with auto-correct enabled; the latest scan ran without auto-correct.
-
MCE event | seeking assistance | logs attached
Yep, upgraded the BIOS by two versions. Re-running the parity check now. So far so good.
-
MCE event | seeking assistance | logs attached
root@FX8350:~# mcelog
mcelog: ERROR: AMD Processor family 25: mcelog does not support this processor. Please use the edac_mce_amd module instead.
CPU is unsupported

If it makes a difference, the hardware on this machine was upgraded a few months ago, from an FX8350 to a Ryzen 5800x.
-
MCE event | seeking assistance | logs attached
I found two pending BIOS patches for the board. Installed both before running memtest. Lastly, Docker is not enabled on this node. Thanks again for sharing your insight.
-
MCE event | seeking assistance | logs attached
Memtest report attached. Where would I retrieve the MCE log after a reboot? Thanks!

MemTest86-Report-20250325-044002.html
-
MCE event | seeking assistance | logs attached
Mar 24 05:19:32 FX8350 kernel: microcode: microcode updated early to new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU0: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU1: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU2: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU4: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU3: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU6: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU8: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU7: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU12: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: mce: [Hardware Error]: Machine check events logged
Mar 24 05:19:32 FX8350 kernel: microcode: CPU13: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU1: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU3: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU9: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU11: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU6: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU8: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU12: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU14: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU7: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU13: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000001000108
Mar 24 05:19:32 FX8350 kernel: mce: [Hardware Error]: TSC 0 ADDR ffffff81a0a012 MISC d012000100000000 SYND 4d000000 IPID 500b000000000
Mar 24 05:19:32 FX8350 kernel: mce: [Hardware Error]: PROCESSOR 2:a20f12 TIME 1742818730 SOCKET 0 APIC 4 microcode a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU5: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU2: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU10: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU15: patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU9: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU11: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU14: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU5: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU15: new patch_level=0x0a201210
Mar 24 05:19:32 FX8350 kernel: microcode: CPU10: new patch_level=0x0a201210
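
Since mcelog does not support this CPU, the status word on the "Machine Check: 0 Bank 5" line (bea0000001000108) can be read by hand. Below is a minimal Python sketch that decodes only the top architectural flag bits (63 to 57) of a machine-check status register. The function name `decode_mci_status` is just an illustrative choice, and the sketch deliberately ignores the vendor-specific lower bits, so it does not identify which hardware unit reported the error.

```python
# Architectural flag bits of an x86 machine-check MCi_STATUS register.
# Only bits 63-57 are decoded here; the lower, vendor-specific fields
# (error codes, extended codes) are intentionally left out of this sketch.
STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled for this bank
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds valid data
    57: "PCC",    # processor context may be corrupt
}


def decode_mci_status(status_hex: str) -> list[str]:
    """Return the names of the flag bits set in a hex status word."""
    status = int(status_hex, 16)
    return [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]


if __name__ == "__main__":
    # Status word copied from the syslog excerpt above.
    print(decode_mci_status("bea0000001000108"))
    # -> ['VAL', 'UC', 'EN', 'MISCV', 'ADDRV', 'PCC']
```

With UC and PCC both set, this entry describes an uncorrected error, which would be consistent with the crash rather than with a harmless corrected event.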
-
MCE event | seeking assistance | logs attached
Hello all,

It appears my Unraid (v7) server crashed this morning. I ran the "Fix Common Problems" scan, which reported the following error:

Machine Check Events detected on your server

Diagnostics attached. I'd appreciate your help in identifying the root cause.

Thank you,
-Adam

fx8350-diagnostics-20250324-1310.zip
-
MCE event | seeking assistance | logs attached
Hello all,

I happened to log in to my Unraid (v6.11.5) node today and saw a red notification stating "Errors have been found with your server". Under the Settings tab, I see the following message:

"Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the Unraid forums. The output of mcelog (if installed) has been logged"

Syslog excerpt below. I'm also attaching the full log and diagnostics zip files.

May 27 15:48:45 FX8350 kernel: mce: [Hardware Error]: Machine check events logged
May 27 15:48:45 FX8350 kernel: [Hardware Error]: Corrected error, no action required.
May 27 15:48:45 FX8350 kernel: [Hardware Error]: CPU:0 (15:2:0) MC4_STATUS[-|CE|MiscV|AddrV|-|CECC|-]: 0x9d0cc0f2001d011b
May 27 15:48:45 FX8350 kernel: [Hardware Error]: Error Addr: 0x0000000162934700
May 27 15:48:45 FX8350 kernel: [Hardware Error]: MC4 Error (node 0): L3 cache tag error.
May 27 15:48:45 FX8350 kernel: [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
Jun 1 04:20:05 FX8350 root: Fix Common Problems: Error: Machine Check Events detected on your server
Jun 1 04:20:05 FX8350 root: mcelog: ERROR: AMD Processor family 21: mcelog does not support this processor. Please use the edac_mce_amd module instead.

Any help or suggestions would be greatly appreciated.

Hardware Info:
MBoard: ASrock 990FX Extreme9
CPU: AMD FX8350

Thank you,
-Adam
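
For anyone wanting to pull just the machine-check lines out of a long syslog before posting, here is a rough Python sketch. The regular expression is an assumption based only on the line format in the excerpt above (timestamp, hostname, "kernel:", optional "mce:" prefix, then "[Hardware Error]:"), and `hardware_errors` is a hypothetical helper name, not part of any Unraid tool.

```python
import re

# Pattern modelled on the syslog excerpt in this post; other systems may
# format their lines differently, so treat this as an assumption.
HW_ERR = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"(?:mce: )?\[Hardware Error\]: (?P<msg>.*)$"
)


def hardware_errors(lines):
    """Yield (timestamp, message) for each kernel hardware-error line."""
    for line in lines:
        match = HW_ERR.match(line.strip())
        if match:
            yield match["ts"], match["msg"]


if __name__ == "__main__":
    # Sample lines copied from the excerpt above.
    sample = [
        "May 27 15:48:45 FX8350 kernel: mce: [Hardware Error]: Machine check events logged",
        "May 27 15:48:45 FX8350 kernel: [Hardware Error]: MC4 Error (node 0): L3 cache tag error.",
        "Jun 1 04:20:05 FX8350 root: Fix Common Problems: Error: Machine Check Events detected on your server",
    ]
    for ts, msg in hardware_errors(sample):
        print(ts, "|", msg)
```

To run it against a real log, pass an open file object instead of the sample list; the path of the syslog file depends on the system and is not assumed here.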
adam5622