hawnkey Posted November 29, 2022

When this crash occurs, it seems to try to reboot the system, but never does successfully. It will show LOGIN: but the UI never starts and I can't connect via the IP or via PuTTY. Here is a subset of the error reporting from syslog immediately after the event.

Nov 27 07:38:06 Tower kernel: mce: CMCI storm detected: switching to poll mode
Nov 27 07:38:06 Tower kernel: mce: [Hardware Error]: Machine check events logged
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 7: cc04200000010090
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: TSC 3ee1e59de211c
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: ADDR 24be480
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: MISC 142184e86
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: PROCESSOR 0:306e4 TIME 1669556286 SOCKET 0 APIC 0
Nov 27 07:38:06 Tower kernel: EDAC MC0: 4224 CE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x24be offset:0x480 grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:1)
Nov 27 07:38:06 Tower kernel: mce: [Hardware Error]: Machine check events logged
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 7: cc000f8000010090
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: TSC 3ee1e59e32afc
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: ADDR 2519e00
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: MISC 40181486
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: PROCESSOR 0:306e4 TIME 1669556286 SOCKET 0 APIC 0
Nov 27 07:38:06 Tower kernel: EDAC MC0: 62 CE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x2519 offset:0xe00 grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:0)
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 7: cc00064000010090
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: TSC 3ee1e59e4c06c
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: ADDR 12b60bb40
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: MISC 4218ca86
Nov 27 07:38:06 Tower kernel: EDAC sbridge MC0: PROCESSOR 0:306e4 TIME 1669556286 SOCKET 0 APIC 0

I've attached a file with all of the events immediately prior, during, and after this error starts.

Attachment: Crash Log 27 Nov.txt
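For anyone reading the raw "Bank 7" values in the excerpt above, here is a minimal Python sketch of how the status word and ADDR value appear to break down. It assumes the standard Intel IA32_MCi_STATUS bit layout (VAL bit 63, OVER bit 62, UC bit 61, corrected error count in bits 52:38, model-specific code in bits 31:16, MCA code in bits 15:0) and a 4 KiB page size; the function names are made up for illustration and are not part of any kernel or EDAC tool.

```python
def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine check status word into a few fields.

    Bit positions follow the Intel IA32_MCi_STATUS layout (assumed here).
    """
    return {
        "valid": bool((status >> 63) & 1),        # VAL: register holds an error
        "overflow": bool((status >> 62) & 1),     # OVER: more errors arrived than were logged
        "uncorrected": bool((status >> 61) & 1),  # UC: 0 means the error was corrected
        "corrected_count": (status >> 38) & 0x7FFF,
        "mscod": (status >> 16) & 0xFFFF,         # model-specific error code
        "mcacod": status & 0xFFFF,                # architectural MCA error code
    }


def split_address(addr: int, page_shift: int = 12) -> tuple:
    """Split a physical address into (page, offset), assuming 4 KiB pages."""
    return addr >> page_shift, addr & ((1 << page_shift) - 1)


# Values copied from the first event in the syslog excerpt.
first = decode_mci_status(0xCC04200000010090)
page, offset = split_address(0x24BE480)
print(first["corrected_count"], first["overflow"], first["uncorrected"])
print(hex(page), hex(offset))
```

On the first event this gives a corrected count of 4224 with the overflow flag set and the uncorrected flag clear, and page 0x24be with offset 0x480, which lines up with the "4224 CE ... page:0x24be offset:0x480 ... OVERFLOW ... err_code:0001:0090" line that EDAC prints right after it.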
Squid Posted November 29, 2022 (Solution)

1 minute ago, hawnkey said:
    CE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0

Bad stick. Your system event log may have more information beyond channel 0, DIMM 0.
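If the full crash log is long, a quick way to confirm that every corrected error points at the same slot is to tally the "CE" lines per memory controller, channel, and slot. The following is only a sketch: the regular expression is written against the two EDAC lines quoted in this thread, so other kernels or EDAC drivers may format the line differently, and `tally_corrected_errors` is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern modelled on the "EDAC MC0: <count> CE ..." lines quoted in this thread.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE (?P<msg>.+?) on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+) .*?rank:(?P<rank>\d+)\)"
)


def tally_corrected_errors(lines):
    """Sum reported corrected-error counts per (controller, channel, slot)."""
    totals = Counter()
    for line in lines:
        m = CE_LINE.search(line)
        if m:
            key = (int(m["mc"]), int(m["channel"]), int(m["slot"]))
            totals[key] += int(m["count"])
    return totals


# The two CE lines from the original post.
SAMPLE = [
    "Nov 27 07:38:06 Tower kernel: EDAC MC0: 4224 CE memory read error on "
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x24be offset:0x480 "
    "grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0090 socket:0 "
    "ha:0 channel_mask:1 rank:1)",
    "Nov 27 07:38:06 Tower kernel: EDAC MC0: 62 CE memory read error on "
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x2519 offset:0xe00 "
    "grain:32 syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0090 socket:0 "
    "ha:0 channel_mask:1 rank:0)",
]

for (mc, channel, slot), count in tally_corrected_errors(SAMPLE).items():
    print(f"MC{mc} channel {channel} slot {slot}: {count} corrected errors")
```

To run it against a saved log instead of the sample, pass an open file object (for example `open("Crash Log 27 Nov.txt")`) to `tally_corrected_errors`. Since both lines carry the OVERFLOW flag, the totals are probably best read as a lower bound rather than an exact count.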
hawnkey Posted November 29, 2022 (Author)

6 minutes ago, Squid said:
    Bad stick. Your system event log may have more information beyond channel 0, dimm 0

Thanks. Where can I go to find more information? This is pulled directly from the syslog.
Squid Posted November 29, 2022

System Event Log in your BIOS
hawnkey Posted November 29, 2022 (Author)

Perfect. Thanks so much!