
Machine check errors and segfault


david11129


I recently upgraded my server from an E3-1285lv3 system to a dual E5-2680 system with 128 GB of RAM. The RAM is new to the server, not carried over. Over the last couple of days I got a warning for machine check errors, and then today I got a segfault warning from Plex. I suspect a bad RAM stick, but surprisingly the BIOS doesn't have any ECC-related logs. I am going to pull half the RAM and try to narrow it down. Does this sound like the right direction? I am attaching diagnostics. Thanks in advance! The problems started yesterday at approximately 1 am.

tower-diagnostics-20190719-0146.zip

syslog.txt

Jul 17 01:57:17 Tower kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x1e90c6a offset:0x940 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0091 socket:1 ha:0 channel_mask:2 rank:1)
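The EDAC line above already names the memory controller, channel, and slot that reported the corrected error (CE), so tallying these lines per DIMM label is one way to narrow down a suspect stick before pulling RAM. Below is a minimal sketch, assuming the syslog lines follow the same format as the one quoted here. The regex and the `tally_edac_errors` helper are illustrative and not part of any Unraid or kernel tool, and other EDAC drivers may format their messages differently.

```python
import re
from collections import Counter

# Assumed format: matches EDAC lines shaped like the one quoted in this thread.
# Other EDAC drivers may word their messages differently.
EDAC_RE = re.compile(
    r"EDAC (?P<mc>MC\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
)

def tally_edac_errors(lines):
    """Sum reported error counts per (controller, error kind, DIMM label)."""
    counts = Counter()
    for line in lines:
        m = EDAC_RE.search(line)
        if m:
            key = (m["mc"], m["kind"], m["label"])
            counts[key] += int(m["count"])
    return counts

if __name__ == "__main__":
    # Hypothetical path: point this at the syslog.txt from the diagnostics zip.
    with open("syslog.txt", errors="replace") as fh:
        for (mc, kind, label), n in tally_edac_errors(fh).most_common():
            print(f"{mc} {kind} {label}: {n}")
```

If one label accounts for most of the corrected errors, that slot is a reasonable first candidate to swap or reseat.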

 

1 hour ago, david11129 said:

bad RAM stick

 

13 minutes ago, Squid said:

Jul 17 01:57:17 Tower kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x1e90c6a offset:0x940 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0091 socket:1 ha:0 channel_mask:2 rank:1)

 

 

Would this explain the segfault as well? Any idea why the event wasn't detected by the BIOS? I pulled the first CPU's RAM and replaced it with 4x8 GB sticks. I have had ECC errors with other RAM sticks, and those were all reported in the BIOS event log. Also, this machine was my ESXi host for a while. Is ESXi just not as in-your-face with system warnings? I consider this a good thing.


Archived

This topic is now archived and is closed to further replies.
