Dovy6 Posted August 27, 2021
FCP is telling me about a Hardware Problem I apparently have. I've attached Diagnostics as recommended, below. Can anyone help me and tell me if everything is about to die on me? No noticeable symptoms from the machine, other than being flagged for this error. Thank you
unraid-diagnostics-20210827-1635.zip
Squid Posted August 28, 2021
Bad memory
Aug 26 18:29:11 unraid kernel: mce: [Hardware Error]: Machine check events logged
Aug 26 18:29:11 unraid kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Aug 26 18:29:11 unraid kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 7: 8c00004000010090
Aug 26 18:29:11 unraid kernel: EDAC sbridge MC0: TSC 17b040b44b483
Aug 26 18:29:11 unraid kernel: EDAC sbridge MC0: ADDR 1abf19d00
Aug 26 18:29:11 unraid kernel: EDAC sbridge MC0: MISC 50020286
Aug 26 18:29:11 unraid kernel: EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1630016951 SOCKET 0 APIC 0
Aug 26 18:29:11 unraid kernel: EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x1abf19 offset:0xd00 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:0 ha:0 channel_mask:1 rank:1)
The system event log in the BIOS will hopefully identify the stick further
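For anyone who wants to pull the DIMM location out of a line like the last one in that log, here is a minimal Python sketch. It is not an official tool: the regular expression and the names `parse_edac_line` and `physical_address` are made up for illustration, the pattern is only fitted to the one line quoted in this thread, and the address calculation assumes 4 KiB pages.

```python
import re

# Sample line copied from the syslog excerpt in this thread.
LINE = (
    "Aug 26 18:29:11 unraid kernel: EDAC MC0: 1 CE memory read error on "
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x1abf19 "
    "offset:0xd00 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 "
    "socket:0 ha:0 channel_mask:1 rank:1)"
)

# Hypothetical pattern, fitted to the line above; other kernels or EDAC
# drivers may format this message differently.
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on "
    r"(?P<label>\S+) \((?P<details>.*)\)"
)


def parse_edac_line(line):
    """Return a dict of fields from an EDAC error line, or None if no match."""
    m = EDAC_RE.search(line)
    if m is None:
        return None
    info = {
        "mc": int(m["mc"]),        # memory controller index
        "count": int(m["count"]),  # number of errors reported
        "kind": m["kind"],         # CE = corrected, UE = uncorrected
        "label": m["label"],       # DIMM label as reported by the driver
    }
    # The parenthesised part is a list of key:value tokens; tokens without
    # a colon (the lone "-") are skipped.
    for token in m["details"].split():
        key, sep, value = token.partition(":")
        if sep:
            info[key] = value
    return info


def physical_address(info, page_shift=12):
    """Combine page and offset into an address (assumes 4 KiB pages)."""
    return (int(info["page"], 16) << page_shift) | int(info["offset"], 16)


if __name__ == "__main__":
    info = parse_edac_line(LINE)
    print(info["kind"], info["label"], "channel", info["channel"], "slot", info["slot"])
    print(hex(physical_address(info)))
```

For the quoted line, the page and offset combine to 0x1abf19d00, which agrees with the ADDR value logged a few lines earlier. The "CE" field is the part that matters for the question of severity: it marks the error as one that ECC corrected.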
Dovy6 Posted August 29, 2021 (Author)
How critical is this error? Is this "the machine may crash and burn at any moment" critical, or "You need to replace that RAM stick before further issues arise" type critical? Thank you so much for your help.
ChatNoir Posted August 29, 2021
Dovy6 said: How critical is this error? Is this "the machine may crash and burn at any moment" critical, or "You need to replace that RAM stick before further issues arise" type critical?
It will probably not burn down the house, but you might experience data or FS corruption, system crash, etc.
Squid Posted August 29, 2021
Dovy6 said: How critical is this error? Is this "the machine may crash and burn at any moment" critical, or "You need to replace that RAM stick before further issues arise" type critical? Thank you so much for your help
My opinion: Presumably, you bought ECC memory so that once errors started occurring you would replace the stick(s). While the errors are being corrected now (thanks to the ECC), at some point (maybe never) the stick will produce an error that ECC can't correct. At that point many systems will simply halt all execution (making it look like a hard lockup), which is a pain to diagnose unless you think to check the System Event Log to see what happened.
Or, to put it another way, why did you buy ECC memory in the first place?