Moussa Posted March 6, 2014 Hi. I recently built my first unRAID server from these parts https://uk.pcpartpicker.com/user/moussekateer/saved/3DGU, and everything has been fine except for one issue. I keep seeing 'Blackbox kernel: mce: [Hardware Error]: Machine check events logged' in the syslog. Sometimes I go a few days without seeing anything, and sometimes an error pops up every few minutes or seconds. My array currently consists of two 3TB drives (no parity drive yet), both of which I have precleared with 3 cycles without seeing any issues. I believe the errors sometimes correlate with heavy writing to the drives, but I cannot reproduce them on demand, so it may just be a coincidence. I have made sure all the cables and parts inside the server are connected and seated properly, so I don't believe it's a connection issue. I have also updated my BIOS to the latest version. I ran mcelog from the flash drive to investigate (before I eventually rebooted my server, when I had 20 or so of these errors), and I believe it's an internal hardware issue with the CPU? Please find the output attached, along with my syslog. I am running unRAID version 5.0.5. Thank you in advance for your help. syslog-2014-03-06.txt, mcelog_output.txt
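For anyone wanting to check whether these messages really cluster around heavy writes, a minimal sketch like the following could tally them per hour from a saved syslog. It assumes the traditional syslog line shape ("Mar  6 12:34:56 Blackbox kernel: ..."), which is not confirmed by the attached file, and the function name `count_mce_per_hour` is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape (an assumption, not taken from the attached syslog):
#   "Mar  6 12:34:56 Blackbox kernel: mce: [Hardware Error]: Machine check events logged"
MCE_LINE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r".*Machine check events logged"
)


def count_mce_per_hour(lines):
    """Return a Counter mapping (month, day, hour) to the number of MCE messages."""
    counts = Counter()
    for line in lines:
        match = MCE_LINE.match(line)
        if match:
            key = (match["month"], int(match["day"]), int(match["hour"]))
            counts[key] += 1
    return counts


def report(path):
    """Print one line per hour that contains at least one MCE message, in log order."""
    with open(path, errors="replace") as handle:
        for (month, day, hour), n in count_mce_per_hour(handle).items():
            print(f"{month} {day:2d} {hour:02d}:00  {n}")
```

Calling `report("syslog-2014-03-06.txt")` on a saved copy of the log would print the hourly counts, which can then be compared by hand against the times of large copies to the array.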
dlandon Posted March 6, 2014 It's an issue with the version of Linux in 5.0.5. Running v6 makes those go away. I used to get them as well. I don't believe they mean anything.
Moussa Posted March 6, 2014 Author I've read that they indicate a hardware fault in one of the CPU caches, and that these messages are the CPU reporting that it has successfully recovered from the fault with a parity check. If so, wouldn't that indicate a long-term problem, and shouldn't I RMA the CPU? I'd like to stay on v5.* until the plugins are updated if you say this is harmless, but I will try running v6 for a few days to see if any more errors pop up.
DaleWilliams Posted March 6, 2014 A forum search for MCE will show a lot of discussion around this question. The consensus seems to be that it's harmless and will go away in v6.
Moussa Posted March 10, 2014 Author Just to let others who are wondering know: I've been running the 6.0 beta for a few days now and the error has disappeared. Thanks.
Archived
This topic is now archived and is closed to further replies.