January 2

I have an old machine that I recently brought back online after it had been dying on me intermittently. I believe the original issue was either power related or due to C-states, and both were addressed (updated the BIOS and changed to typical idle, along with installing a new power supply). After about 7 days of uptime, the server died today and restarted with machine check events in the syslog:

Jan 1 21:32:21 Tower kernel: microcode: CPU0: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU1: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU2: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU3: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU4: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU5: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: mce: [Hardware Error]: Machine check events logged
Jan 1 21:32:21 Tower kernel: microcode: CPU6: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: mce: [Hardware Error]: CPU 9: Machine Check: 0 Bank 0: baa0000000060135
Jan 1 21:32:21 Tower kernel: microcode: CPU7: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: mce: [Hardware Error]: TSC 0 MISC d012000100000000 SYND 2d030000 IPID b000000000
Jan 1 21:32:21 Tower kernel: microcode: CPU8: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1767321109 SOCKET 0 APIC 3 microcode 8001139
Jan 1 21:32:21 Tower kernel: microcode: CPU9: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU10: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU11: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU12: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU13: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU14: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: CPU15: patch_level=0x08001139
Jan 1 21:32:21 Tower kernel: microcode: Microcode Update Driver: v2.2.

Attached the full syslog for reference.

According to the help thread, this MCE error could be an issue since it's not at the beginning of start-up. Looks like I have 3 possible cores that are dead/dying. Or it could be nothing? The processor is a Ryzen 1700X. I ran it in my gaming desktop since release and then repurposed it into the Unraid server back in 2021.

I would appreciate it if someone could help point me in the right direction. I'm already considering replacing the core hardware (motherboard, RAM, CPU) but would love to get some more life out of the machine instead of spending a couple grand to start the new year.

EDIT: Not really solved, but I cut my losses and replaced the hardware.

tower-syslog-20260102-0459.zip

Edited January 29 by fdoteris
Question is no longer relevant.
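
For anyone reading a similar log: the Bank 0 status value in the syslog (baa0000000060135) can be unpacked into its flag bits. The sketch below is only a rough aid, assuming the common architectural MCA status layout (bit 63 VAL, 62 OVER, 61 UC, 60 EN, 59 MISCV, 58 ADDRV, 57 PCC, low 16 bits as the error code) that the Linux kernel uses in its MCE header. The function name `decode_mca_status` is made up for this example, and AMD-specific fields are not decoded. It does not say which component is at fault.

```python
# Minimal sketch: decode the architectural flag bits of an MCA status value.
# Assumption: bit positions follow the common MCi_STATUS layout
# (VAL/OVER/UC/EN/MISCV/ADDRV/PCC). Vendor-specific fields are ignored.

FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # error reporting was enabled for this bank
    59: "MISCV",  # MISC register holds valid data
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context may be corrupt
}


def decode_mca_status(status_hex):
    """Return the set flags and the low 16-bit error code (hypothetical helper)."""
    status = int(status_hex, 16)
    set_flags = [name for bit, name in FLAGS.items() if (status >> bit) & 1]
    return {"flags": set_flags, "error_code": status & 0xFFFF}


if __name__ == "__main__":
    result = decode_mca_status("baa0000000060135")
    print(result["flags"], hex(result["error_code"]))
```

For the value in this log, the UC and PCC bits are set and ADDRV is not, which would mean the error was uncorrected and no address was recorded.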