SDguy Posted March 29, 2017

I've gotten this error message a few times now and haven't been able to work out what it actually pertains to. I've had a couple of machine lock-ups in the past and would really love some help figuring out what might be causing this error report. Thanks in advance for any help!

towerofpower-diagnostics-20170326-2215.zip
Squid Posted March 30, 2017

Looks like a memory issue:

Mar 26 04:30:31 TowerOfPower root: Fix Common Problems: Error: Machine Check Events detected on your server
Mar 26 04:30:31 TowerOfPower root: Hardware event. This is not a software error.
Mar 26 04:30:31 TowerOfPower root: MCE 0
Mar 26 04:30:31 TowerOfPower root: CPU 6 BANK 8
Mar 26 04:30:31 TowerOfPower root: MISC 5cfbfb0000006040 ADDR c2c340b40
Mar 26 04:30:43 TowerOfPower root: TIME 1490441889 Sat Mar 25 04:38:09 2017
Mar 26 04:30:43 TowerOfPower root: MCG status:
Mar 26 04:30:43 TowerOfPower root: MCi status:
Mar 26 04:30:43 TowerOfPower root: Corrected error
Mar 26 04:30:43 TowerOfPower root: MCi_MISC register valid
Mar 26 04:30:43 TowerOfPower root: MCi_ADDR register valid
Mar 26 04:30:43 TowerOfPower root: MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Mar 26 04:30:43 TowerOfPower root: Transaction: Memory read error
Mar 26 04:30:43 TowerOfPower root: Memory read ECC error
Mar 26 04:30:43 TowerOfPower root: Memory corrected error count (CORE_ERR_CNT): 1
Mar 26 04:30:43 TowerOfPower root: Memory transaction Tracker ID (RTId): 40
Mar 26 04:30:43 TowerOfPower root: Memory DIMM ID of error: 0
Mar 26 04:30:43 TowerOfPower root: Memory channel ID of error: 0
Mar 26 04:30:43 TowerOfPower root: Memory ECC syndrome: 5cfbfb00
Mar 26 04:30:43 TowerOfPower root: STATUS 8c0000400001009f MCGSTATUS 0
Mar 26 04:30:43 TowerOfPower root: MCGCAP 1c09 APICID 20 SOCKETID 1
Mar 26 04:30:43 TowerOfPower root: CPUID Vendor Intel Family 6 Model 44

( @RobJ - FYI, FCP added output last month from mcelog if it's installed, and I'm impressed with the detail that it gives )

Edited March 30, 2017 by Squid
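For anyone who wants to pull the key fields out of a report like the one quoted above, here is a minimal Python sketch. It is not part of Fix Common Problems or mcelog; the function name `summarize_mce` and the regular expressions are assumptions written against the exact line wording in this one log, so other mcelog versions or other error types may need different patterns.

```python
import re

# A few of the syslog lines quoted above, copied verbatim.
LOG = """\
Mar 26 04:30:31 TowerOfPower root: CPU 6 BANK 8
Mar 26 04:30:43 TowerOfPower root: Corrected error
Mar 26 04:30:43 TowerOfPower root: Memory corrected error count (CORE_ERR_CNT): 1
Mar 26 04:30:43 TowerOfPower root: Memory DIMM ID of error: 0
Mar 26 04:30:43 TowerOfPower root: Memory channel ID of error: 0
Mar 26 04:30:43 TowerOfPower root: MCGCAP 1c09 APICID 20 SOCKETID 1
"""

# Strips the "Mon DD HH:MM:SS host root: " syslog prefix (assumed layout).
PREFIX = re.compile(r"^\w{3}\s+\d+ [\d:]+ \S+ root: ")


def summarize_mce(log_text):
    """Collect the location and severity fields from mcelog syslog lines."""
    summary = {"corrected": False}
    for line in log_text.splitlines():
        msg = PREFIX.sub("", line).strip()
        if msg == "Corrected error":
            summary["corrected"] = True
        elif m := re.match(r"CPU (\d+) BANK (\d+)", msg):
            summary["cpu"], summary["bank"] = int(m[1]), int(m[2])
        elif m := re.match(r"Memory corrected error count \(CORE_ERR_CNT\): (\d+)", msg):
            summary["count"] = int(m[1])
        elif m := re.match(r"Memory DIMM ID of error: (\d+)", msg):
            summary["dimm"] = int(m[1])
        elif m := re.match(r"Memory channel ID of error: (\d+)", msg):
            summary["channel"] = int(m[1])
        elif m := re.search(r"SOCKETID (\d+)", msg):
            summary["socket"] = int(m[1])
    return summary


print(summarize_mce(LOG))
```

The socket, channel and DIMM numbers are the ones that help narrow down which module to look at; how they map to physical slots depends on the motherboard, which this sketch does not try to guess.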
RobJ Posted March 30, 2017

25 minutes ago, Squid said:
( @RobJ - FYI, FCP added output last month from mcelog if it's installed, and I'm impressed with the detail that it gives )

I saw you had added it, was glad to see it, and yes, I've been very pleased with the reporting it gives, most of the time. I can't recall the specifics, but I believe I've seen a report or two of MCE issues in other hardware subsystems that were too cryptic for me, and I couldn't find good advice online. But mostly it's great.

SDguy, I believe it's reporting that your ECC RAM detected and corrected a memory error, so this particular event looks harmless. But the fact that it's having errors at all could be related to your past lockups: those may have been caused by a memory error that *couldn't* be corrected. You may want to try the PassMark Memtest on your RAM (it has ECC RAM support).
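The "detected and corrected" reading can be cross-checked against the raw `STATUS 8c0000400001009f` value in the quoted log. The sketch below decodes only the architectural bits of an IA32_MCi_STATUS register as I understand them from the Intel Software Developer's Manual (VAL, OVER, UC, EN, MISCV, ADDRV, PCC, the corrected error count in bits 52:38, and the MCA error code in bits 15:0). The function name `decode_mci_status` is made up for illustration, and the model-specific bits are deliberately left alone.

```python
STATUS = 0x8C0000400001009F  # "STATUS 8c0000400001009f" from the quoted log


def bit(value, n):
    """Return bit n of value as 0 or 1."""
    return (value >> n) & 1


def decode_mci_status(status):
    """Decode the architectural fields of an IA32_MCi_STATUS value."""
    code = status & 0xFFFF  # MCA error code, bits 15:0
    fields = {
        "valid": bool(bit(status, 63)),            # VAL
        "overflow": bool(bit(status, 62)),         # OVER
        "uncorrected": bool(bit(status, 61)),      # UC
        "enabled": bool(bit(status, 60)),          # EN
        "misc_valid": bool(bit(status, 59)),       # MISCV
        "addr_valid": bool(bit(status, 58)),       # ADDRV
        "context_corrupt": bool(bit(status, 57)),  # PCC
        "corrected_count": (status >> 38) & 0x7FFF,  # bits 52:38
        "mca_error_code": code,
    }
    # Memory controller errors use the pattern 0000 0000 1MMM CCCC.
    if code & 0xFF80 == 0x0080:
        fields["mem_transaction"] = (code >> 4) & 0x7  # MMM, 1 = memory read
        fields["mem_channel"] = code & 0xF             # CCCC, 15 = unspecified
    return fields


for name, value in decode_mci_status(STATUS).items():
    print(f"{name}: {value}")
```

For this value the UC and PCC bits are clear and the corrected count is 1, which agrees with the "Corrected error" and "CORE_ERR_CNT: 1" lines mcelog printed; the memory read transaction type and unspecified channel agree with "RD_CHANNELunspecified_ERR".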
SDguy Posted March 30, 2017

Thank you both! I ran a memory check twice without errors when first setting up the server, but I suspected that this might be the case... I'll pull the suspect memory module and see if that fixes my problem. Thanks again!