Issues with RAM? Processor Overheat? Help :(



Hi! First time posting here since I have never experienced issues with my setup, but here is the story:

I wanted to upgrade my server, and since I was on a low budget I went to AliExpress and bought one of the packages there. I found an Intel® Xeon® CPU E5-2665 0 @ 2.40GHz with 16GB of RAM at a reasonable price. The server also has 12 TB of storage (three 4TB WD Red data drives plus a 4TB parity drive). The issues started once I put everything together: first the server started to crash. I found this odd, and when I checked the logs I saw that a memory module was causing the issue, so I took out one of the modules and then everything became more or less stable...

 

The server starts normally, and with mild usage everything appears to be fine, but when I stress the system it starts logging the following error:

 

Jun  2 19:41:57 Manati kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 5: cc1831c000010093
Jun  2 19:41:57 Manati kernel: EDAC sbridge MC0: TSC 0 
Jun  2 19:41:57 Manati kernel: EDAC sbridge MC0: ADDR 302de00 
Jun  2 19:41:57 Manati kernel: EDAC sbridge MC0: MISC 4054d486 
Jun  2 19:41:57 Manati kernel: EDAC sbridge MC0: PROCESSOR 0:206d7 TIME 1685760117 SOCKET 0 APIC 0
Jun  2 19:41:57 Manati kernel: EDAC MC0: 24775 CE memory read error on CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x302d offset:0xe00 grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0093 socket:0 ha:0 channel_mask:8 rank:0)
Jun  2 19:41:57 Manati kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
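
For anyone reading a log like the one above, here is a minimal sketch for pulling the reported DIMM location and error count out of the "CE memory read error" lines, so you can see whether the same channel and slot keep being blamed after swapping modules. It assumes the line format matches the excerpt above and that the number right after "EDAC MC0:" is the corrected-error count for that event; the names `CE_LINE` and `tally_ce_errors` are made up for this example.

```python
import re
from collections import Counter

# Matches the EDAC summary line as it appears in the excerpt above.
# Assumption: the leading number is the corrected-error (CE) count.
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE .*? on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
)


def tally_ce_errors(lines):
    """Sum corrected-error counts per (memory controller, channel, slot)."""
    totals = Counter()
    for line in lines:
        match = CE_LINE.search(line)
        if match:
            key = (int(match["mc"]), int(match["channel"]), int(match["slot"]))
            totals[key] += int(match["count"])
    return totals


# Two lines copied from the log excerpt; only the second is a CE summary line.
sample = [
    "Jun  2 19:41:57 Manati kernel: EDAC sbridge MC0: ADDR 302de00 ",
    "Jun  2 19:41:57 Manati kernel: EDAC MC0: 24775 CE memory read error on "
    "CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x302d offset:0xe00 "
    "grain:32 syndrome:0x0 -  OVERFLOW area:DRAM err_code:0001:0093 socket:0 "
    "ha:0 channel_mask:8 rank:0)",
]

if __name__ == "__main__":
    for (mc, channel, slot), count in tally_ce_errors(sample).items():
        print(f"MC{mc} channel {channel} slot {slot}: {count} corrected errors")
```

To run it against a full log, pass the open file instead of `sample`, for example `tally_ce_errors(open("syslog.txt"))`. If the same channel and slot show up no matter which module is installed, that would point away from a single bad stick, though this is only a hint and not a diagnosis.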

 

The weird part is that the system doesn't crash anymore, but the log keeps growing and after a while it fills up... I am worried about my data, which is why I decided to come to the forum and ask for help.

 

I started testing the RAM modules one by one, and all of them do the same thing. I have read that this may be caused by overheating, but according to the hardware monitor the processor stays between 50-55 Celsius when stressed. The weird part is that one of the sensors reports SYSTIN at 117 degrees. I also read that this may be a faulty sensor, but at this point I don't know.

 

Please help. 

 

I have attached the complete log.

syslog.txt


You shouldn't even attempt to run any computer unless memory is working perfectly. Everything goes through RAM, the OS and other executable code, your data, everything. The CPU can't do anything with anything until it is loaded into RAM. 

 

Go to memtest86.com and test your memory before doing anything else with this computer.


Thanks! I downloaded memtest86 and ran it. To my surprise, after 12 hours and 4 passes, it finished without errors... but once I started Unraid, the error came back. At this point I am clueless. The server apparently runs, but after a while the log usage reaches 100%.

 

Do you think it could be the PSU? Now that I think about it, I have a 450W supply, and according to a PSU calculator the PC needs 400W - 499W, so maybe I have too little power for all the HDDs and the new CPU.
