SeveNx7 Posted April 6, 2021

Hey everyone! My server's been pretty upset lately, having a bunch of freezes, and I finally took a look at the MCE log and saw this:

Apr 5 08:47:22 kernel: smpboot: CPU0: Intel(R) Xeon(R) CPU L5640 @ 2.27GHz (family: 0x6, model: 0x2c, stepping: 0x2)
Apr 5 08:47:22 kernel: mce: [Hardware Error]: Machine check events logged
Apr 5 08:47:22 kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 8: 8c0000400001009f
Apr 5 08:47:22 kernel: mce: [Hardware Error]: TSC 0 ADDR 34dbbd500 MISC 3a40080100021083
Apr 5 08:47:22 kernel: mce: [Hardware Error]: PROCESSOR 0:206c2 TIME 1617626815 SOCKET 0 APIC 0 microcode 1f
Apr 5 08:47:22 kernel: Performance Events: PEBS fmt1+, Westmere events, 16-deep LBR, Intel PMU driver.
Apr 5 08:47:22 kernel: core: CPUID marked event: 'bus cycles' unavailable

Now, I always had a suspicion that the CPU was a little weird. I'd have issues every once in a while, and the freezes have happened before; just recently it's been about every other day. This happened at startup. Would it point to a faulty CPU? I ran a memory test a while ago and everything came up good. I originally thought it meant bank 8 of the memory, but now realize it's talking about the CPU itself (CPU 0 core)! When it brings up the other CPU, it just passes through like a champ in the logs.

L5640s are pretty cheap, so I don't feel bad scooping up another and giving it a go! Thanks!
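For reference, the status word in the log above (8c0000400001009f) can be unpacked by hand. This is a minimal sketch, assuming the architectural IA32_MCi_STATUS layout described in the Intel SDM (flag bits 63 to 57, corrected error count in bits 52:38, model-specific code in bits 31:16, MCA error code in bits 15:0). The function and table names are made up for illustration, the corrected-count field is only meaningful on CPUs that implement it, and model-specific fields are not interpreted.

```python
# Hypothetical helper: unpack an IA32_MCi_STATUS value using the
# architectural bit layout (assumption: Intel SDM machine-check chapter).
FLAG_BITS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
             59: "MISCV", 58: "ADDRV", 57: "PCC"}

# MMM field of a memory-controller error code (0000 0000 1MMM CCCC).
MEMORY_OPS = {0: "generic", 1: "read", 2: "write",
              3: "address/command", 4: "scrub"}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine-check status word into its main fields."""
    flags = [name for bit, name in sorted(FLAG_BITS.items(), reverse=True)
             if (status >> bit) & 1]
    mca_code = status & 0xFFFF
    result = {
        "flags": flags,
        "corrected_count": (status >> 38) & 0x7FFF,   # bits 52:38
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": mca_code,
    }
    # Memory-controller errors match 000F 0000 1MMM CCCC (F = filter bit 12).
    if (mca_code & 0xEF80) == 0x0080:
        channel = mca_code & 0xF
        result["memory_op"] = MEMORY_OPS.get((mca_code >> 4) & 0x7, "reserved")
        # CCCC == 1111 means the channel is not specified.
        result["channel"] = None if channel == 0xF else channel
    return result


if __name__ == "__main__":
    for key, value in decode_mci_status(0x8C0000400001009F).items():
        print(f"{key}: {value}")
```

For the value in the log this gives the flags VAL, MISCV and ADDRV (UC is clear, so the event was corrected), a corrected count of 1, and MCA code 0x009f, which under this layout reads as a memory-controller read error with no channel specified.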
trurl Posted April 6, 2021

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.
SeveNx7 Posted April 6, 2021

Don't mind the parity rebuild from a failed drive!

barkbox-diagnostics-20210405-2027.zip
trurl Posted April 6, 2021

What makes you think you had a failed drive?
SeveNx7 Posted April 6, 2021 (edited)

Had about 1500 read errors. Tried swapping cables and PSU, and the error counts continued to grow. Unraid disabled the drive around 3 times.
trurl Posted April 6, 2021

Did you examine the SMART report for the disk? Do you still have the disk? Unraid disables a disk when a write to it fails. There are several reasons a write can fail, and most often it isn't due to a bad disk.
Squid Posted April 6, 2021

Apr 5 19:27:20 barkbox root: Memory ECC error occurred during scrub
Apr 5 19:27:20 barkbox root: Memory corrected error count (CORE_ERR_CNT): 1
Apr 5 19:27:20 barkbox root: Memory transaction Tracker ID (RTId): 83
Apr 5 19:27:20 barkbox root: Memory DIMM ID of error: 2
Apr 5 19:27:20 barkbox root: Memory channel ID of error: 0

You have memory issues. The other MCE that you referenced is semi-normal; it happens to a fair number of users when the OS initializes the cores. That one can be safely ignored.
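To pull the location fields out of a longer syslog, something like the following could be used. It is only a sketch: the regular expression is written against the exact wording of the lines quoted above, `error_location` is a hypothetical helper name, and mapping the reported DIMM and channel IDs to a physical slot depends on the motherboard, so check the board manual or the BIOS event log for that.

```python
import re

# Matches the two location lines as they appear in the log quoted above
# (assumption: the wording is stable across the log being searched).
LOCATION_LINE = re.compile(r"Memory (DIMM|channel) ID of error: (\d+)")

SAMPLE = """\
Apr 5 19:27:20 barkbox root: Memory ECC error occurred during scrub
Apr 5 19:27:20 barkbox root: Memory corrected error count (CORE_ERR_CNT): 1
Apr 5 19:27:20 barkbox root: Memory transaction Tracker ID (RTId): 83
Apr 5 19:27:20 barkbox root: Memory DIMM ID of error: 2
Apr 5 19:27:20 barkbox root: Memory channel ID of error: 0
"""


def error_location(log_text: str) -> dict:
    """Return the DIMM and channel IDs reported in the log text.

    If the same field appears more than once, the last value wins.
    """
    found = {}
    for kind, value in LOCATION_LINE.findall(log_text):
        found[kind.lower()] = int(value)
    return found


if __name__ == "__main__":
    print(error_location(SAMPLE))
```

On the sample above this reports DIMM 2 on channel 0.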
SeveNx7 Posted April 6, 2021

So if I'm reading that correctly, it would be CPU 0's bank of memory, slot 2?
Squid Posted April 6, 2021

Your System Event Log in the BIOS should also have more info.
SeveNx7 Posted April 6, 2021

Yeah. Seems bank 2 was the culprit. As a bonus, when I start up now I don't get the first MCE error I used to get about CPU bank 8. I'll run some more tests, but hopefully that was it! Thanks again everyone!