unJack Posted July 7, 2021

Hi guys,

After setting up my new hardware with Unraid over the last few days, I'm pretty happy with everything! But today I discovered by accident that a lot of hardware errors are being reported in the syslog (by the way, can those be reported via Notifications?). I'm concerned because I plan to store data on unRAID that must not be corrupted.

Can somebody derive a possible root cause from those log messages? Is it a faulty CPU, motherboard, or the ECC RAM?

Diagnostics: unraid-diagnostics-20210707-0906.zip

Hardware:
- be quiet! Pure Power 11 400W (80Plus Gold)
- AORUS B550 AORUS ELITE V2, latest BIOS
- AMD Ryzen™ 5 3600 (6c/12t)
- MSI GeForce GT 710 1GD3H LP
- 32GB ECC RAM (2x Mushkin DIMM 16 GB DDR4-2133 ECC)
- M.2 NVMe 1TB (Apacer AS2280P4 1 TB, PCIe 3.0 x4, NVMe 1.3, M.2 2280)
- M.2 NVMe 1TB (Apacer AS2280P4 1 TB, PCIe 3.0 x4, NVMe 1.3, M.2 2280)
- Seagate BarraCuda 4 TB ST4000DM004 (SATA 6 Gb/s, 3,5")
- Seagate BarraCuda 4 TB ST4000DM004 (SATA 6 Gb/s, 3,5")
- GIGABYTE SSD 240 GB (SATA 6 Gb/s, 2,5")

I could buy another 32GB (the same 2x16 modules) on short notice to test things out (and to keep if all 4 modules are fine).

JorgeB Posted July 7, 2021

1 hour ago, unJack said:
    I could buy another 32GB (same 2x16 modules) on short notice, to test things out (and to keep if all 4 modules are fine).

Looks like a RAM problem. You can also test the current DIMMs by using one at a time.

unJack Posted July 8, 2021 (Author)

And again you were right. In the end I replaced one module after closely watching the output of

grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count

and that fixed it. Thanks, Jorge!
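
For anyone who wants to watch the same counters without re-running grep by hand, here is a minimal Python sketch that reads the EDAC corrected-error files matched by the glob in the command above. The function names `read_ce_counts` and `nonzero` are made up for illustration, and the sketch assumes the kernel's EDAC driver is loaded and exposes the `mc*/csrow*/ch*_ce_count` layout; on systems without it, the result is simply empty.

```python
from pathlib import Path

# Same sysfs location the grep command in this thread reads from.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_ce_counts(root=EDAC_ROOT):
    """Return {relative file path: corrected-error count} for every channel file found."""
    root = Path(root)
    counts = {}
    for path in sorted(root.glob("mc*/csrow*/ch*_ce_count")):
        try:
            value = int(path.read_text().strip())
        except (OSError, ValueError):
            # Skip files that are unreadable or do not contain a plain integer.
            continue
        counts[path.relative_to(root).as_posix()] = value
    return counts


def nonzero(counts):
    """Keep only the channels that have logged at least one corrected error."""
    return {name: value for name, value in counts.items() if value > 0}


if __name__ == "__main__":
    for name, value in read_ce_counts().items():
        marker = "  <-- corrected errors logged" if value else ""
        print(f"{name}: {value}{marker}")
```

Running it once with each DIMM installed on its own, as suggested above, and comparing which channel's counter climbs is one way to narrow down the faulty module. Which csrow/channel corresponds to which physical slot depends on the board, so treat that mapping as something to verify rather than assume.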