
[Hardware error] CPU, Memory, PCIe



Good evening

 

I've been getting the errors below since upgrading from 6.8.3 to 6.9.2 recently. The server appears to be running fine, but I'm worried because I did not see these errors before the upgrade.

 

I have an AMD Radeon R7 passed through to a Windows 10 VM. I also recently added more Kingston ECC RAM (4 x 16 GB in total).

 

MB: ASRock X570 Steel Legend

CPU: AMD Ryzen 7 3700X

 

Recent diagnostics attached. Any help is much appreciated.

 

Thanks

 

 

dhaka-diagnostics-20210813-2049.zip

4 hours ago, Squid said:
Aug 13 06:05:24 Dhaka kernel: [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
Aug 13 06:05:24 Dhaka kernel: EDAC MC0: 1 CE on mc#0csrow#3channel#0 (csrow:3 channel:0 page:0x5b7962 offset:0x880 grain:64 syndrome:0x1000)

Looks like a bad stick

I'll remove the new sticks and see. Could there be any long-term consequences if I leave them in, do you think? The server appears to be running fine.
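To work out which slot to pull first, it can help to tally the corrected errors per csrow and channel from the syslog. The following is only a rough sketch: it assumes the EDAC lines look like the one quoted above, and the function name `count_corrected_errors` is made up for illustration. How a csrow/channel pair maps to a physical DIMM slot depends on the board, so treat the result as a hint rather than a diagnosis.

```python
import re
from collections import Counter

# Matches corrected-error (CE) lines in the format quoted in this thread, e.g.
# "EDAC MC0: 1 CE on mc#0csrow#3channel#0 (csrow:3 channel:0 page:...)".
EDAC_CE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE .*?"
    r"\(csrow:(?P<csrow>\d+) channel:(?P<channel>\d+)"
)


def count_corrected_errors(lines):
    """Sum corrected errors per (memory controller, csrow, channel)."""
    totals = Counter()
    for line in lines:
        match = EDAC_CE.search(line)
        if match:
            key = (int(match["mc"]), int(match["csrow"]), int(match["channel"]))
            totals[key] += int(match["count"])
    return totals


# The two log lines quoted above, used as sample input.
sample = [
    "Aug 13 06:05:24 Dhaka kernel: [Hardware Error]: Unified Memory Controller "
    "Ext. Error Code: 0, DRAM ECC error.",
    "Aug 13 06:05:24 Dhaka kernel: EDAC MC0: 1 CE on mc#0csrow#3channel#0 "
    "(csrow:3 channel:0 page:0x5b7962 offset:0x880 grain:64 syndrome:0x1000)",
]

for (mc, csrow, channel), count in count_corrected_errors(sample).items():
    print(f"MC{mc} csrow {csrow} channel {channel}: {count} corrected error(s)")
```

To run it against a real log, pass an open file instead of `sample`, for example `count_corrected_errors(open("syslog.txt"))`, where `syslog.txt` stands for whatever syslog file was extracted from the diagnostics zip. If every hit lands on the same csrow and channel, that points at a single stick.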

On 8/16/2021 at 6:54 PM, sfaruque said:

Could there be any long-term consequences if I leave them in, do you think? The server appears to be running fine.

My opinion:

 

You purchased ECC memory so that when a stick starts going bad you find out and can replace it.  Sure, the errors are currently being corrected, but that is simply the stick telling you "replace me".  If you have ECC memory and choose not to replace a stick when errors begin to appear, then I'd question why you bought ECC memory and a compatible CPU and motherboard in the first place.
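If you want to watch whether the corrected-error count keeps climbing before the replacement arrives, the kernel's EDAC driver normally exposes running counters in sysfs. This is a minimal sketch, assuming the standard EDAC tree is present at `/sys/devices/system/edac/mc` on your system; the helper name `read_ce_counts` is hypothetical.

```python
from pathlib import Path


def read_ce_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller name: corrected-error count} from the EDAC sysfs tree.

    Assumes each memory controller directory (mc0, mc1, ...) contains a
    ce_count file; controllers without that file are skipped.
    """
    counts = {}
    for mc_dir in sorted(Path(edac_root).glob("mc[0-9]*")):
        counter = mc_dir / "ce_count"
        if counter.is_file():
            counts[mc_dir.name] = int(counter.read_text().strip())
    return counts


if __name__ == "__main__":
    # Prints nothing if the EDAC driver is not loaded or the path is absent.
    for name, count in read_ce_counts().items():
        print(f"{name}: {count} corrected error(s) since boot")
```

A count that stays flat for weeks is less alarming than one that grows daily, but either way a stick that has started logging errors is the one to replace.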

8 hours ago, Squid said:

My opinion:

 

You purchased ECC memory so that when a stick starts going bad you find out and can replace it.  Sure, the errors are currently being corrected, but that is simply the stick telling you "replace me".  If you have ECC memory and choose not to replace a stick when errors begin to appear, then I'd question why you bought ECC memory and a compatible CPU and motherboard in the first place.

Very valid point. Noted. Thanks.
