Error message: Your server has detected hardware errors



I got the message "Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged". I installed mcelog and am posting here.

A little background: I had an Unraid server on an Asus Z10PE-D8 motherboard in a Supermicro SC846 24-bay chassis that had been running well for a few years. I was able to pick up a 36-bay Supermicro SC847-12 chassis from work with the same 4U footprint. The new chassis included a Supermicro X10DRH-iT motherboard, two Platinum power supplies and the three RAID cards that originally came with it. I added 64 GB of new memory and two used (known good) E5-2630 v3 processors. On first boot I did have a slight memory problem: the motherboard only saw 57 GB of the 64 GB. I reseated all the memory sticks, and after that the motherboard read all 64 GB. I mainly use this server for Emby to host my media library, Plex for recording TV and cable and then processing the recordings into media folders, motionEye for security camera recording, and hosting a small web server.

Yesterday I shut down the old chassis, removed it from the rack, installed the new chassis, installed the drives and fired up the server. There were no error messages or issues, other than that I needed to clear an old RAID configuration on one of the RAID cards. I rebooted and Unraid came up fine. I verified that all the Docker containers came up and were working, that all the data was there, and that parity was valid. I updated Unraid to 6.9.1 so I could use the Nvidia plugin for transcoding in Emby, and everything seemed to be working well.

This morning I got a Fix Common Problems popup with the error message above. Looking through the logs, it looks like it might be a memory (RAM) issue, but I want someone else to take a look in case I missed anything. I have attached the diagnostics zip.

Thanks,

Mike B.

bbtower1-diagnostics-20210314-1048.zip

Mar 14 07:35:17 BBTower1 kernel: EDAC MC0: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x931565 offset:0xac0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0093 socket:1 ha:0 channel_mask:8 rank:0)

That line is a corrected (CE) memory read error on CPU socket 1, channel 3, DIMM slot 0. Your System Event Log will hopefully have more info to help find the bad DIMM.
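For anyone who finds this thread with a similar line in their syslog, here is a minimal Python sketch that pulls the socket, channel and slot out of an EDAC message. The regex is modeled only on the line quoted above, so treat the pattern as an assumption: other EDAC drivers format their messages differently. `parse_edac_line` is a hypothetical helper name, not part of mcelog or Unraid.

```python
import re

# Pattern modeled on the single EDAC line quoted in this thread (assumption:
# other memory-controller drivers may use a different message layout).
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) (?P<msg>.+?) on "
    r"(?P<label>\S+) \(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
    r".*?socket:(?P<socket>\d+)"
)


def parse_edac_line(line):
    """Return the fields of an EDAC memory error line, or None if no match."""
    m = EDAC_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    # Convert the numeric fields so they can be compared or counted.
    for key in ("mc", "count", "channel", "slot", "socket"):
        info[key] = int(info[key])
    return info


line = (
    "Mar 14 07:35:17 BBTower1 kernel: EDAC MC0: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x931565 "
    "offset:0xac0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0093 "
    "socket:1 ha:0 channel_mask:8 rank:0)"
)
info = parse_edac_line(line)
print(info["kind"], "socket", info["socket"], "channel", info["channel"], "slot", info["slot"])
```

CE means the error was corrected by ECC; UE would mean uncorrected. How socket/channel/slot maps to the slot labels printed on the board depends on the motherboard, so check the manual rather than guessing.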


I took all the memory out, moved the sticks to different slots and checked again, and that seems to have fixed the issue. The board is reading the correct amount of RAM, Fix Common Problems reports no errors, and there are no memory errors in the logs now. Thanks.
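If you want to keep an eye on whether corrected errors come back after reseating, the Linux EDAC driver exposes per-controller counters in sysfs. A rough sketch, assuming the EDAC driver for your platform is loaded and that Python is available on the box (it is not part of stock Unraid); `read_edac_counts` is a hypothetical helper name:

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Read corrected (ce_count) and uncorrected (ue_count) totals per controller."""
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    # Prints nothing if no EDAC memory controllers are registered.
    for mc, entry in read_edac_counts().items():
        print(mc, entry)
```

The counters reset on reboot, so a `ce_count` that stays at 0 over a few days of normal use is a reasonable sign the reseat helped. A count that keeps climbing on the same controller points back at a DIMM or slot.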

