
Machine Check Events detected on your server



Posted (edited)

Version: 6.9.2

 

Plugins: Community applications 2021.03.28

Fix Common Problems 2021.04.28

Nerd Tools 2021.01.08

Unassigned Devices 2021.05.1

Unassigned Devices Plus  2021.05.01a

 

Supermicro X8DTL-iF with dual Xeon L5640

16GB ECC Mem

2x 4TB

4x 8TB

1x 12TB

1x 14TB

256GB Cache

1TB unassigned device

 

Running a Fix Common Problems scan, I get "Machine Check Events detected on your server. Your server has detected hardware errors." I had a connection issue earlier, but I fixed it with a full reboot of the server plus a modem and router reset. Ethernet speed is down to 100/100, which I suspect is caused by a faulty Cat6 cable. I don't think that last issue is related; I'm mentioning it for full disclosure.

 

Here is the syslog

 

Edit: Added tower diagnostics, not sure if both are required.

tower-syslog-20210502-1648.zip

tower-diagnostics-20210502-1301.zip

Edited by thedoors55
added info
Posted

Looks like Memory issues

May  2 09:13:13 Tower kernel: EDAC MC0: 1 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
May  2 09:13:13 Tower kernel: mce: CMCI storm detected: switching to poll mode
May  2 09:13:13 Tower kernel: EDAC MC0: 37 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:14 Tower kernel: EDAC MC0: 32763 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:15 Tower kernel: EDAC MC0: 39 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:16 Tower kernel: EDAC MC0: 32707 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:17 Tower kernel: EDAC MC0: 34 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:18 Tower kernel: EDAC MC0: 5 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
### [PREVIOUS LINE REPEATED 1 TIMES] ###
May  2 09:13:21 Tower kernel: EDAC MC0: 1 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:23 Tower kernel: EDAC MC0: 25 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:28 Tower kernel: EDAC MC0: 1 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:30 Tower kernel: EDAC MC0: 2 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
May  2 09:13:43 Tower kernel: EDAC MC0: 1 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
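
To see how many corrected errors each DIMM slot is logging, the EDAC lines can be tallied straight from the syslog. Below is a minimal Python sketch of that idea, not an official tool. The regex and the `tally_corrected_errors` helper are my own assumptions based only on the line format shown above, and the sample uses two of those lines plus the CMCI line. It ignores the `### [PREVIOUS LINE REPEATED ...] ###` markers, so the totals are a lower bound.

```python
import re
from collections import Counter

# Assumed pattern, derived from the syslog lines in this thread, e.g.
# "EDAC MC0: 37 CE error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 ...)"
EDAC_CE = re.compile(r"EDAC (MC\d+): (\d+) CE error on (\S+)")


def tally_corrected_errors(lines):
    """Sum corrected-error (CE) counts per (memory controller, DIMM label)."""
    totals = Counter()
    for line in lines:
        match = EDAC_CE.search(line)
        if match:
            controller, count, dimm = match.groups()
            totals[(controller, dimm)] += int(count)
    return totals


# Sample lines copied from the syslog excerpt above.
sample = [
    "May  2 09:13:13 Tower kernel: mce: CMCI storm detected: switching to poll mode",
    "May  2 09:13:13 Tower kernel: EDAC MC0: 37 CE error on CPU#0Channel#0_DIMM#0 "
    "(channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)",
    "May  2 09:13:14 Tower kernel: EDAC MC0: 32763 CE error on CPU#0Channel#0_DIMM#0 "
    "(channel:0 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)",
]

for (controller, dimm), total in tally_corrected_errors(sample).items():
    print(f"{controller} {dimm}: {total} corrected errors")
```

To run it on a real log, pass an open file instead of `sample`, for example `tally_corrected_errors(open("syslog.txt"))`, where the file name is a placeholder for wherever the extracted syslog lives. In the excerpt above, every error is reported against the same slot (channel 0, slot 0 on CPU 0).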

 

Posted (edited)
6 minutes ago, Squid said:

Looks like Memory issues




 

Yeah, from a quick look that's what I figured. Is a stick possibly dying? Would the best solution be to replace them?

Edited by thedoors55
bad format
