ClintE Posted November 8, 2018 (edited) This is interesting. I started out on 6.6.2 and upgraded to 6.6.3 the other day after backing up the flash drive, with no problems at all. Today I updated to 6.6.4 (again after backing up the flash), and now I get random missing disks after a reboot. The disks are there; the BIOS and controller see them fine. The GUI reports a stale configuration. Thoughts, anyone? Edit: Now it's running fine. I restarted the computer 3 or 4 times and all the disks finally showed up. Started the array. All is good. So far.
JorgeB Posted November 8, 2018 Next time it happens, grab and post the diagnostics.
ClintE Posted November 8, 2018 Will do, thanks!
ClintE Posted November 8, 2018
14 hours ago, johnnie.black said: Next time it happens grab and post the diagnostics
I know some errors in log files can safely be ignored, but this probably isn't one of them:
syndrome:0x0 - OVERFLOW area:DRAM err_code:0001:0091 socket:1 ha:0 channel_mask:2 rank:0)
kernel: mce: [Hardware Error]: Machine check events logged
kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
kernel: EDAC sbridge MC1: CPU 8: Machine Check Event: 0 Bank 5: 8c00004000010091
kernel: EDAC sbridge MC1: TSC 0
kernel: EDAC sbridge MC1: ADDR 20227e0340
kernel: EDAC sbridge MC1: MISC 20423a1a86
kernel: EDAC sbridge MC1: PROCESSOR 0:206d7 TIME 1541673606 SOCKET 1 APIC 20
kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x20227e0 offset:0x340 grain:32
I started going through the log files and found this. It seems to repeat depending on system activity, and the errors look the same each time. The system is running properly, though; a parity check ran with zero errors. Thanks for any insight into what this might mean.
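For anyone reading a similar log, the fields in the last line can be pulled apart programmatically. The following is a minimal sketch, not an official tool: the function names are hypothetical, the regular expression is written only against the line quoted above, and the 4 KiB page size is an assumption. Under those assumptions, the page and offset combine into the same address the log reports as ADDR, and the TIME field reads as a Unix timestamp.

```python
import re
from datetime import datetime, timezone

# The corrected-error (CE) line exactly as quoted in the post above.
CE_LINE = (
    "kernel: EDAC MC1: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 "
    "(channel:1 slot:0 page:0x20227e0 offset:0x340 grain:32"
)

PAGE_SHIFT = 12  # assumption: 4 KiB pages, so the page number is shifted by 12 bits

# Pattern written against the quoted line only; other drivers may format differently.
CE_PATTERN = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE .*? on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+) "
    r"page:(?P<page>0x[0-9a-fA-F]+) offset:(?P<offset>0x[0-9a-fA-F]+)"
)


def parse_ce_line(line):
    """Return the memory controller, DIMM label, and location fields of a CE line."""
    match = CE_PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognised EDAC CE line")
    fields = match.groupdict()
    return {
        "mc": int(fields["mc"]),
        "count": int(fields["count"]),
        "label": fields["label"],
        "channel": int(fields["channel"]),
        "slot": int(fields["slot"]),
        "page": int(fields["page"], 16),
        "offset": int(fields["offset"], 16),
    }


def physical_address(info):
    """Combine page number and in-page offset into one address."""
    return (info["page"] << PAGE_SHIFT) | info["offset"]


def event_time(unix_seconds):
    """Interpret the TIME field of the log as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


if __name__ == "__main__":
    info = parse_ce_line(CE_LINE)
    print(info["label"], "channel", info["channel"], "slot", info["slot"])
    print("address:", hex(physical_address(info)))
    print("logged at:", event_time(1541673606).isoformat())
```

Because the same label, channel, and slot appear each time the error repeats, grouping parsed lines by `label` is a quick way to confirm that a single DIMM is responsible.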
John_M Posted November 8, 2018
24 minutes ago, ClintE said: kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x20227e0 offset:0x340 grain:32
You have some bad memory.
JorgeB Posted November 8, 2018 Check the board's system event log; it should identify the problem DIMM.
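As a complement to the board's event log, the Linux EDAC subsystem exposes running error counters in sysfs. The sketch below reads the per-controller `ce_count` (corrected) and `ue_count` (uncorrected) files under `/sys/devices/system/edac/mc`. The function name is hypothetical, and whether these files are populated on a given Unraid install depends on the EDAC driver being loaded, which is an assumption here.

```python
from pathlib import Path

# Standard sysfs location for EDAC memory controllers; it may be empty or absent
# if no EDAC driver is loaded (assumption: the driver is present on this system).
EDAC_ROOT = "/sys/devices/system/edac/mc"


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: {"ce_count": n, "ue_count": n}} for each mcN directory."""
    counts = {}
    for mc_dir in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter_file = mc_dir / name
            if counter_file.is_file():
                entry[name] = int(counter_file.read_text().strip())
        counts[mc_dir.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, entry)
```

A `ce_count` that keeps climbing on one controller (MC1 in the log above) points in the same direction as the repeated CE messages, while the event log remains the place to confirm the physical slot.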