mhowland24 Posted June 1, 2021
Please see the attached diagnostics. I've had memory issues in the past, but the seller just keeps telling me to reseat all the RAM. I would really appreciate some direction on fixing this issue. Thanks.
tower-diagnostics-20210531-2337.zip
6of6 Posted June 1, 2021
It's my understanding that Unraid has a memory check that's available when you log in with a monitor attached to the physical computer running Unraid. I know it's really there because I've seen it. Try running that. I'm scared to run it on my ECC RAM.
PS: I did not look at the files you attached.
ChatNoir Posted June 1, 2021
@mhowland24, did you run a Memtest to confirm? Reseating the DIMMs is good advice in general, but if the test still detects errors, that is something you would need to fix anyway.
The Memtest that ships with Unraid does not detect errors on ECC; you should make a boot drive from https://www.memtest86.com/
These look like real errors that are being corrected by the ECC.
May 25 20:28:16 Tower kernel: mce: [Hardware Error]: Machine check events logged
May 25 20:28:16 Tower kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
May 25 20:28:16 Tower kernel: EDAC sbridge MC1: CPU 10: Machine Check Event: 0 Bank 7: 8c00004000010092
May 25 20:28:16 Tower kernel: EDAC sbridge MC1: TSC a0e86f108cc10
May 25 20:28:16 Tower kernel: EDAC sbridge MC1: ADDR fec0de980
May 25 20:28:16 Tower kernel: EDAC sbridge MC1: MISC 1407ed086
May 25 20:28:16 Tower kernel: EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1621985296 SOCKET 1 APIC 20
May 25 20:28:16 Tower kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0xfec0de offset:0x980 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0092 socket:1 ha:0 channel_mask:4 rank:1)
May 26 04:40:11 Tower root: Fix Common Problems: Error: Machine Check Events detected on your server
May 26 04:40:11 Tower root: Hardware event. This is not a software error.
May 26 04:40:11 Tower root: MCE 0
May 26 04:40:11 Tower root: CPU 10 BANK 7 TSC a0e86f108cc10
May 26 04:40:11 Tower root: MISC 1407ed086 ADDR fec0de980
May 26 04:40:11 Tower root: TIME 1621985296 Tue May 25 20:28:16 2021
May 26 04:40:11 Tower root: MCG status:
May 26 04:40:11 Tower root: MCi status:
May 26 04:40:11 Tower root: Corrected error
May 26 04:40:11 Tower root: MCi_MISC register valid
May 26 04:40:11 Tower root: MCi_ADDR register valid
May 26 04:40:11 Tower root: MCA: MEMORY CONTROLLER RD_CHANNEL2_ERR
May 26 04:40:11 Tower root: Transaction: Memory read error
May 26 04:40:11 Tower root: STATUS 8c00004000010092 MCGSTATUS 0
May 26 04:40:11 Tower root: MCGCAP 1000c1b APICID 20 SOCKETID 1
May 26 04:40:11 Tower root: PPIN abdb2681abd60c88
May 26 04:40:11 Tower root: MICROCODE 42e
May 26 04:40:11 Tower root: CPUID Vendor Intel Family 6 Model 62
May 26 04:40:11 Tower root: mcelog: warning: 8 bytes ignored in each record
May 26 04:40:11 Tower root: mcelog: consider an update
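For anyone wondering how the STATUS value 8c00004000010092 in the log maps to "Corrected error" and "RD_CHANNEL2_ERR", here is a rough Python sketch that decodes it by hand. The bit positions are taken from the architectural IA32_MCi_STATUS layout in the Intel SDM as I understand it; the function name `decode_mci_status` and the dictionary keys are made up for illustration, and the corrected-error count field is only meaningful on CPUs that report it. Treat it as a reading aid, not a replacement for mcelog.

```python
# Hypothetical helper: decodes a few architectural fields of an
# IA32_MCi_STATUS value. Field positions are assumed from the Intel SDM.
MEMORY_TRANSACTIONS = {
    0b000: "generic",
    0b001: "read",
    0b010: "write",
    0b011: "address/command",
    0b100: "scrub",
}


def decode_mci_status(status: int) -> dict:
    mca_code = status & 0xFFFF            # bits 15:0, MCA error code
    result = {
        "valid": bool((status >> 63) & 1),        # VAL
        "overflow": bool((status >> 62) & 1),     # OVER
        "uncorrected": bool((status >> 61) & 1),  # UC (0 means corrected)
        "misc_valid": bool((status >> 59) & 1),   # MISCV
        "addr_valid": bool((status >> 58) & 1),   # ADDRV
        "corrected_count": (status >> 38) & 0x7FFF,  # bits 52:38
        "model_code": (status >> 16) & 0xFFFF,    # bits 31:16
        "mca_code": mca_code,
        "memory_transaction": None,
        "channel": None,
    }
    # Memory controller errors use the pattern 0000 0000 1MMM CCCC.
    if (mca_code & 0xFF80) == 0x0080:
        result["memory_transaction"] = MEMORY_TRANSACTIONS.get(
            (mca_code >> 4) & 0x7, "reserved"
        )
        channel = mca_code & 0xF
        # 0b1111 means the channel is not specified.
        result["channel"] = None if channel == 0xF else channel
    return result


decoded = decode_mci_status(0x8C00004000010092)
print(decoded)
```

For the value in the log, this should come out as a valid, corrected (UC clear) memory read error on channel 2 with the MISC and ADDR registers valid, which agrees with what mcelog printed.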
JorgeB Posted June 1, 2021
You can also check the system event log in the BIOS (or the IPMI event log), as it should have more information on which DIMM slot is affected. Then replace or remove that DIMM.
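As a cross-check against the BIOS or IPMI event log, the syslog itself already names a DIMM label in the EDAC "CE" lines. Below is a minimal Python sketch that tallies corrected errors per label. The regular expression is written only against the single line format seen in this thread, and the names `CE_LINE` and `tally_corrected_errors` are hypothetical; other EDAC drivers may format the line differently, and the label still has to be matched to a physical slot using the motherboard manual.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... kernel: EDAC MC1: 1 CE memory read error on <label> (channel:2 slot:0 ..."
# Assumed format, based only on the log excerpt in this thread.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE .*? on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
)


def tally_corrected_errors(lines):
    """Return a Counter mapping DIMM label -> total corrected errors."""
    totals = Counter()
    for line in lines:
        match = CE_LINE.search(line)
        if match:
            totals[match.group("label")] += int(match.group("count"))
    return totals


sample = [
    "May 25 20:28:16 Tower kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR",
    "May 25 20:28:16 Tower kernel: EDAC MC1: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0xfec0de "
    "offset:0x980 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0092 "
    "socket:1 ha:0 channel_mask:4 rank:1)",
]

for label, count in tally_corrected_errors(sample).items():
    print(label, count)
```

To run it against a real log, pass an open file instead of `sample`, for example the syslog inside the diagnostics zip. If one label accumulates errors over time, that DIMM is the first candidate to swap or remove.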