Spartacus09 Posted July 16, 2019

Is there a command to identify which memory slot this error refers to, or to list all of the slots? I'm assuming it's likely A1 of the 8 slots. I'm not sure whether the numbering starts at channel #1 or channel #0, though, so it might be A2 (the motherboard manual calls A1 channel A).

Jul 14 21:33:07 unRAID kernel: mce: [Hardware Error]: Machine check events logged
Jul 14 21:33:07 unRAID kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Jul 14 21:33:07 unRAID kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#1_Chan#1_DIMM#0 (channel:1 slot:0 page:0x109a826 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:0 ha:1 channel_mask:2 rank:0)
Jul 15 04:21:17 unRAID kernel: mce: [Hardware Error]: Machine check events logged
Jul 15 04:21:17 unRAID kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Jul 15 04:21:17 unRAID kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#1_Chan#1_DIMM#0 (channel:1 slot:0 page:0x109a826 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:0 ha:1 channel_mask:2 rank:0)
Jul 15 04:40:06 unRAID root: Fix Common Problems: Error: Machine Check Events detected on your server
Jul 15 11:31:07 unRAID kernel: mce: [Hardware Error]: Machine check events logged
Jul 15 11:31:07 unRAID kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Jul 15 11:31:07 unRAID kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#1_Chan#1_DIMM#0 (channel:1 slot:0 page:0x109a826 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:0 ha:1 channel_mask:2 rank:0)
Jul 15 18:31:29 unRAID kernel: mce: [Hardware Error]: Machine check events logged
Jul 15 18:31:29 unRAID kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Jul 15 18:31:29 unRAID kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#1_Chan#1_DIMM#0 (channel:1 slot:0 page:0x109a826 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:0 ha:1 channel_mask:2 rank:0)

Edited July 17, 2019 by Spartacus09
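The fields in an EDAC line like the ones above can be pulled apart with a short script, which makes it easier to compare the reported memory controller, channel, and slot against the board manual. This is only a minimal sketch: the regular expression and the `parse_edac_line` helper are illustrative names written for this log format, not part of any Unraid or kernel tool, and the script does not by itself say which physical slot is affected.

```python
import re

# One EDAC line copied from the syslog excerpt in this thread.
SAMPLE = (
    "Jul 14 21:33:07 unRAID kernel: EDAC MC1: 1 CE memory scrubbing error on "
    "CPU_SrcID#0_Ha#1_Chan#1_DIMM#0 (channel:1 slot:0 page:0x109a826 offset:0x0 "
    "grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:0 ha:1 "
    "channel_mask:2 rank:0)"
)

# Hypothetical pattern, written only for the line format shown above.
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on (?P<label>\S+) "
    r"\((?P<details>.*)\)"
)


def parse_edac_line(line):
    """Return a dict of fields from an EDAC error line, or None if no match."""
    m = EDAC_RE.search(line)
    if m is None:
        return None
    info = {
        "mc": int(m.group("mc")),        # memory controller index (MC1 -> 1)
        "count": int(m.group("count")),  # number of errors in this event
        "kind": m.group("kind"),         # CE = corrected, UE = uncorrected
        "label": m.group("label"),       # DIMM label as printed by the kernel
    }
    # The parenthesised part is a list of key:value tokens; keep them as strings.
    for token in m.group("details").split():
        key, sep, value = token.partition(":")
        if sep:  # skips the lone "-" separator
            info[key] = value
    return info


info = parse_edac_line(SAMPLE)
print(info["mc"], info["channel"], info["slot"], info["label"])
```

Lines that are not per-DIMM reports, such as the `EDAC sbridge MC0: HANDLING MCE MEMORY ERROR` line, do not match and return `None`.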
Frank1940 Posted July 16, 2019

You are not alone in this problem. I would suggest you start by googling

CE memory scrubbing error on CPU_SrcID#0_Ha#1_Chan#1_DIMM#0

That will get you started. Apparently, you already have your MB manual for reference. You could try modifying the search terms to include your MB model and see if that gives you more specific help. By applying logic and knowledge to your particular situation, some type of pattern will probably become obvious. (Nobody intentionally confused how the information is displayed in the syslog, but each manufacturer seems to have its own slot numbering nomenclature ...)
Spartacus09 (Author) Posted July 16, 2019

57 minutes ago, Frank1940 said:
You are not alone in this problem. I would suggest you start by googling CE memory scrubbing error on CPU_SrcID#0_Ha#1_Chan#1_DIMM#0 That will get you started. Apparently, you already have your MB manual for reference. You could try modifying the search terms to include your MB model and see if that gives you more specific help. By applying logic and knowledge to your particular situation, some type of pattern will probably become obvious. (Nobody intentionally confused how the information is displayed in the syslog, but each manufacturer seems to have its own slot numbering nomenclature ...)

Thanks. A user here was receiving a channel 0 DIMM 0 error, also with a Supermicro motherboard, and it sounds like Supermicro labels start at 0: https://serverfault.com/questions/792225/how-to-find-which-memory-has-ce-error

It looks like it's likely slot A2 then, so I'll give that a shot.

What are the steps to clear that hardware error out of the logs so I can see if it comes back? (I updated Unraid versions previously and that cleared it, but the error did not recur until a week or so later.)

Edited July 16, 2019 by Spartacus09
Spartacus09 (Author) Posted July 19, 2019

On 7/16/2019 at 11:10 AM, Spartacus09 said:
What are the steps to clear that hardware error out of the logs so I can see if it comes back? (I updated Unraid versions previously and that cleared it, but the error did not recur until a week or so later.)

Restarting apparently clears the associated errors, or at least there are no errors now after replacing the RAM in A1/A2.
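On the original question of listing the slots and checking whether errors come back: on Linux systems with an EDAC driver loaded, the kernel exposes per-DIMM entries under `/sys/devices/system/edac/mc/`. The sketch below reads the label, location, and corrected-error count for each entry. It assumes the `dimm_label`, `dimm_location`, and `dimm_ce_count` files are present on the running kernel (they may not be on every version), and `list_dimms` is a hypothetical helper name. The label is often the same `CPU_SrcID#..._DIMM#...` string seen in the syslog rather than the silkscreen name on the board, so the manual is still needed for the final mapping.

```python
from pathlib import Path

# Standard EDAC sysfs location; assumed to exist only when an EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_text(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def list_dimms(root=EDAC_ROOT):
    """Collect label, location and corrected-error count for every mc*/dimm* entry."""
    rows = []
    for dimm_dir in sorted(Path(root).glob("mc*/dimm*")):
        rows.append({
            "mc": dimm_dir.parent.name,                        # e.g. "mc1"
            "dimm": dimm_dir.name,                             # e.g. "dimm0"
            "label": read_text(dimm_dir / "dimm_label"),
            "location": read_text(dimm_dir / "dimm_location"),
            "ce_count": read_text(dimm_dir / "dimm_ce_count"),
        })
    return rows


for row in list_dimms():
    print(row)
```

These counters are kept by the running kernel, so they start again from zero after a reboot, which is consistent with the restart clearing the reported errors. Running the script again after a week or so shows whether any count has gone above zero.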