MCE Errors, Just trying to confirm my suspicions!

April 6, 2021

Hey everyone! My server's been pretty upset lately, with a bunch of freezes, so I finally took a look at the MCE log and saw this:

Apr 5 08:47:22 kernel: smpboot: CPU0: Intel(R) Xeon(R) CPU L5640 @ 2.27GHz (family: 0x6, model: 0x2c, stepping: 0x2)
Apr 5 08:47:22 kernel: mce: [Hardware Error]: Machine check events logged
Apr 5 08:47:22 kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 8: 8c0000400001009f
Apr 5 08:47:22 kernel: mce: [Hardware Error]: TSC 0 ADDR 34dbbd500 MISC 3a40080100021083
Apr 5 08:47:22 kernel: mce: [Hardware Error]: PROCESSOR 0:206c2 TIME 1617626815 SOCKET 0 APIC 0 microcode 1f
Apr 5 08:47:22 kernel: Performance Events: PEBS fmt1+, Westmere events, 16-deep LBR, Intel PMU driver.
Apr 5 08:47:22 kernel: core: CPUID marked event: 'bus cycles' unavailable

Now, I always had a suspicion that this CPU was a little weird. I'd have issues every once in a while, and the freezes have happened before, but recently it's been about every other day. This error happened at startup, so would it point to a faulty CPU? I ran a memory test a while ago and everything came up good. I originally thought it meant bank 8 of the memory, but now I realize it's talking about the CPU itself! (CPU 0 core

When it brings up the other CPU, it just passes through like a champ in the logs! L5640s are pretty cheap, so I don't feel bad scooping up another and giving it a go!

Thanks!
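A note on reading the `Bank 8: 8c0000400001009f` line in the log above: the "bank" there is a machine-check register bank on the CPU, not a bank of memory, and the 64-bit hex value is that bank's status register. Here is a minimal Python sketch that decodes it, assuming the architectural IA32_MCi_STATUS bit layout described in Intel's Software Developer's Manual. The function and table names are made up for illustration, and the model-specific bits are left undecoded.

```python
# Sketch: decode an IA32_MCi_STATUS value as printed in "Bank N: <hex>" lines.
# Assumes the architectural bit layout from Intel's SDM; all names here are
# illustrative and not part of any existing tool.

STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was NOT corrected by hardware
    60: "EN",     # error reporting was enabled
    59: "MISCV",  # the MISC register holds extra information
    58: "ADDRV",  # the ADDR register holds an address
    57: "PCC",    # processor context may be corrupt
}

# MMM field of a memory-controller error code (binary 0000 0000 1MMM CCCC).
MEMORY_OPS = {
    0: "generic undefined request",
    1: "memory read error",
    2: "memory write error",
    3: "address/command error",
    4: "memory scrubbing error",
}


def decode_mci_status(status: int) -> dict:
    """Split a machine-check status word into its architectural fields."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    code = status & 0xFFFF  # bits 15:0, MCA error code
    decoded = {
        "flags": flags,
        # Bits 52:38; only meaningful on CPUs that report corrected counts.
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_error_code": code,
    }
    if (code & 0xFF80) == 0x0080:  # memory-controller error pattern
        channel = code & 0xF
        decoded["kind"] = "memory controller"
        decoded["operation"] = MEMORY_OPS.get((code >> 4) & 0x7, "reserved")
        decoded["channel"] = None if channel == 0xF else channel  # 0xF: not specified
    return decoded


if __name__ == "__main__":
    print(decode_mci_status(0x8C0000400001009F))
```

For the value in the log, this reports a valid error with the UC flag clear (so it was corrected), a corrected count of 1, and a memory-controller style error code (memory read error, channel not specified). Treat that as a hint about where to look rather than a diagnosis.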

April 6, 2021

Community Expert

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.

April 6, 2021

Author

Don't mind the parity rebuild from a failed drive!

barkbox-diagnostics-20210405-2027.zip

April 6, 2021

Community Expert

What makes you think you had a failed drive?

April 6, 2021

Author

Had about 1500 read errors. I tried swapping cables and the PSU, but the error counts kept growing. Unraid disabled the drive around 3 times.

Edited April 6, 2021 by SeveNx7

April 6, 2021

Community Expert

Did you examine the SMART report for the disk?

Do you still have the disk?

Unraid disables a disk when a write to it fails. There are several reasons a write can fail, and most often it isn't due to a bad disk.

April 6, 2021

Apr  5 19:27:20 barkbox root: Memory ECC error occurred during scrub
Apr  5 19:27:20 barkbox root: Memory corrected error count (CORE_ERR_CNT): 1
Apr  5 19:27:20 barkbox root: Memory transaction Tracker ID (RTId): 83
Apr  5 19:27:20 barkbox root: Memory DIMM ID of error: 2
Apr  5 19:27:20 barkbox root: Memory channel ID of error: 0

You have memory issues.

The other MCE that you referenced is semi-normal; it happens to a fair number of users when the OS initializes the cores. That one can be safely ignored.
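For anyone who wants to pull the DIMM and channel out of a syslog automatically, here is a rough Python sketch that scans for the ECC lines quoted above. It assumes the messages are worded exactly as in this log, and the function and field names are made up for illustration.

```python
# Sketch: extract the ECC error fields from syslog text.
# Assumes the exact message wording shown in this thread; names are illustrative.
import re

FIELD_PATTERNS = {
    "corrected_count": re.compile(
        r"Memory corrected error count \(CORE_ERR_CNT\): (\d+)"
    ),
    "dimm": re.compile(r"Memory DIMM ID of error: (\d+)"),
    "channel": re.compile(r"Memory channel ID of error: (\d+)"),
}


def parse_ecc_report(log_text: str) -> dict:
    """Return the last-seen value of each ECC field found in the log text."""
    report = {}
    for line in log_text.splitlines():
        for field, pattern in FIELD_PATTERNS.items():
            match = pattern.search(line)
            if match:
                report[field] = int(match.group(1))
    return report


SAMPLE = """\
Apr  5 19:27:20 barkbox root: Memory ECC error occurred during scrub
Apr  5 19:27:20 barkbox root: Memory corrected error count (CORE_ERR_CNT): 1
Apr  5 19:27:20 barkbox root: Memory transaction Tracker ID (RTId): 83
Apr  5 19:27:20 barkbox root: Memory DIMM ID of error: 2
Apr  5 19:27:20 barkbox root: Memory channel ID of error: 0
"""

if __name__ == "__main__":
    print(parse_ecc_report(SAMPLE))
```

Which physical slot a given DIMM ID and channel ID correspond to depends on the motherboard, so check the board manual or the System Event Log before pulling a stick.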

April 6, 2021

Author

39 minutes ago, Squid said:
Apr  5 19:27:20 barkbox root: Memory ECC error occurred during scrub
Apr  5 19:27:20 barkbox root: Memory corrected error count (CORE_ERR_CNT): 1
Apr  5 19:27:20 barkbox root: Memory transaction Tracker ID (RTId): 83
Apr  5 19:27:20 barkbox root: Memory DIMM ID of error: 2
Apr  5 19:27:20 barkbox root: Memory channel ID of error: 0
You have memory issues.

The other mce that you referenced is semi-normal, happens to a fair amount of users when the OS initializes the cores. That one can be safely ignored.

So if I’m reading that correctly, it would be CPU 0’s bank of memory, slot 2?

April 6, 2021

Your System Event Log in the BIOS should also have more info.

April 6, 2021

Author

Yeah, it seems bank 2 was the culprit. As a bonus, when I start up now I no longer get the first MCE error I used to get about CPU bank 8.

I'll run some more tests but hopefully that was it!

Thanks again everyone!
