Auxilium Posted October 12, 2019
unRaid already has EDAC, but for some reason it is missing edac-util. It would provide a much easier way to troubleshoot RAM and PCIe issues.
flaggart Posted October 16, 2019
I would also like this. In its absence I have been referring to this post, which suggests the same information is available without the tool.
realies Posted September 22, 2022
On 10/16/2019 at 12:54 PM, flaggart said: I would also like this. In its absence I have been referring to this post which suggests the same information is available without the tool.

# grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
grep: /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count: No such file or directory

I'm not seeing anything about EDAC in dmesg either. Does Unraid still support ECC?
ich777 Posted September 29, 2022
On 9/22/2022 at 8:30 PM, realies said: Not seeing anything about EDAC in dmesg too. Does Unraid still support ECC?

EDAC should usually load automatically if your CPU/motherboard/memory controller supports it. However, you can load it manually. For AMD CPUs:

modprobe amd64_edac

For Intel 10th, 11th, and 12th Gen processors:

modprobe igen6_edac

If you get an error like "modprobe: ERROR: could not insert 'MODULNAME': No such device", then it's most likely the wrong module, since this error means the module didn't find a compatible hardware device.

BTW, you can list all available modules with this command:

ls -la /lib/modules/*-Unraid/kernel/drivers/edac/

The main issue with EDAC is that it can be really noisy at times, can give you a lot of false positives, and can ultimately drive you crazy... As far as I know, EDAC also reports PCIe errors, and since not all PCIe devices follow the entire PCIe standard, there can be many, many, many, maaaannnny issues at certain times and with certain hardware combinations.
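The module listing described above can be turned into a plain list of names suitable for `modprobe`. This is only a minimal sketch: the helper `edac_module_names` is a hypothetical name, not part of Unraid, and it assumes module files carry a `.ko` extension, optionally followed by a compression suffix such as `.xz`.

```shell
#!/bin/sh
# edac_module_names DIR (hypothetical helper):
# print the module name for each kernel object file in DIR,
# e.g. "sb_edac.ko.xz" -> "sb_edac".
edac_module_names() (
    for f in "$1"/*.ko*; do
        [ -e "$f" ] || continue        # glob did not match: nothing to list
        name=${f##*/}                  # strip the directory part
        printf '%s\n' "${name%%.ko*}"  # strip ".ko" and any compression suffix
    done
)

# Path pattern taken from the post above; on other systems the loop prints nothing.
for dir in /lib/modules/*-Unraid/kernel/drivers/edac; do
    edac_module_names "$dir"
done
```

Each printed name can then be tried with `modprobe NAME`. As noted in the post, a "No such device" error just means that driver found no matching hardware.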
pixeldoc81 Posted January 9
Here is an example from my server.

Mainboard: Supermicro X10DRI-F
CPU: Intel Xeon E5-2630 v4
RAM: 4x 8GB ECC Memory
UNRAID: 6.12.6

# dmesg | grep EDAC
[ 129.472264] EDAC MC: Ver: 3.0.0
[ 129.482175] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
[ 129.482194] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
[ 129.482213] EDAC sbridge: Seeking for: PCI ID 8086:6f60
[ 129.482226] EDAC sbridge: Seeking for: PCI ID 8086:6fa8
[ 129.482234] EDAC sbridge: Seeking for: PCI ID 8086:6fa8
[ 129.482244] EDAC sbridge: Seeking for: PCI ID 8086:6f71
[ 129.482251] EDAC sbridge: Seeking for: PCI ID 8086:6f71
[ 129.482261] EDAC sbridge: Seeking for: PCI ID 8086:6faa
[ 129.482269] EDAC sbridge: Seeking for: PCI ID 8086:6faa
[ 129.482279] EDAC sbridge: Seeking for: PCI ID 8086:6fab
[ 129.482286] EDAC sbridge: Seeking for: PCI ID 8086:6fab
[ 129.482292] EDAC sbridge: Seeking for: PCI ID 8086:6fac
[ 129.482296] EDAC sbridge: Seeking for: PCI ID 8086:6fac
[ 129.482302] EDAC sbridge: Seeking for: PCI ID 8086:6fad
[ 129.482307] EDAC sbridge: Seeking for: PCI ID 8086:6fad
[ 129.482312] EDAC sbridge: Seeking for: PCI ID 8086:6f68
[ 129.482317] EDAC sbridge: Seeking for: PCI ID 8086:6f79
[ 129.482325] EDAC sbridge: Seeking for: PCI ID 8086:6f6a
[ 129.482333] EDAC sbridge: Seeking for: PCI ID 8086:6f6b
[ 129.482341] EDAC sbridge: Seeking for: PCI ID 8086:6f6c
[ 129.482365] EDAC sbridge: Seeking for: PCI ID 8086:6f6d
[ 129.482374] EDAC sbridge: Seeking for: PCI ID 8086:6ffc
[ 129.482377] EDAC sbridge: Seeking for: PCI ID 8086:6ffc
[ 129.482384] EDAC sbridge: Seeking for: PCI ID 8086:6ffd
[ 129.482387] EDAC sbridge: Seeking for: PCI ID 8086:6ffd
[ 129.482394] EDAC sbridge: Seeking for: PCI ID 8086:6faf
[ 129.482399] EDAC sbridge: Seeking for: PCI ID 8086:6faf
[ 129.482617] EDAC MC0: Giving out device to module sb_edac controller Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0 (INTERRUPT)
[ 129.482623] EDAC sbridge: Ver: 1.1.2

# lsmod | grep edac
sb_edac 24576 0
edac_core 65536 1 sb_edac

# grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch2_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch3_ce_count:0

# lshw -class memory | grep ecc
capabilities: ecc
configuration: errordetection=multi-bit-ecc

# mcelog --client
Memory errors
SOCKET 0 CHANNEL 0 DIMM 0
DMI_NAME "P1-DIMMA1"
DMI_LOCATION "P0_Node0_Channel0_Dimm0"
corrected memory errors:
  0 total
  0 in 24h
uncorrected memory errors:
  0 total
  0 in 24h
SOCKET 0 CHANNEL 1 DIMM 0
DMI_NAME "P1-DIMMB1"
DMI_LOCATION "P0_Node0_Channel1_Dimm0"
corrected memory errors:
  0 total
  0 in 24h
uncorrected memory errors:
  0 total
  0 in 24h
SOCKET 0 CHANNEL 2 DIMM 0
DMI_NAME "P1-DIMMC1"
DMI_LOCATION "P0_Node0_Channel2_Dimm0"
corrected memory errors:
  0 total
  0 in 24h
uncorrected memory errors:
  0 total
  0 in 24h
SOCKET 0 CHANNEL 3 DIMM 0
DMI_NAME "P1-DIMMD1"
DMI_LOCATION "P0_Node0_Channel3_Dimm0"
corrected memory errors:
  0 total
  0 in 24h
uncorrected memory errors:
  0 total
  0 in 24h

One can get the information without `edac-util`, but maybe it would still be nice to have. The EDAC kernel module does load automatically. I don't know if UNRAID will trigger a warning via the WebUI if ECC errors are detected, but this may be OT.

+1

Edited January 9 by pixeldoc81
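Since the counters are readable without `edac-util`, the `grep` over sysfs shown above can be condensed into a single total. The following is a minimal sketch, not a replacement for `edac-util`: the function name `edac_ce_total` and its optional directory argument are hypothetical, and it only reads the same `ch*_ce_count` files used in the thread.

```shell
#!/bin/sh
# edac_ce_total [BASE] (hypothetical helper):
# sum the per-channel corrected-error counters under an EDAC sysfs tree.
# BASE defaults to the path used in the thread; the argument exists only
# so the function can be pointed at another directory.
edac_ce_total() (
    base=${1:-/sys/devices/system/edac/mc}
    total=0
    for f in "$base"/mc*/csrow*/ch*_ce_count; do
        [ -r "$f" ] || continue          # unmatched glob: no EDAC driver loaded
        n=$(cat "$f")
        case "$n" in
            ''|*[!0-9]*) continue ;;     # skip anything that is not a plain number
        esac
        total=$((total + n))
    done
    echo "$total"
)

echo "corrected errors: $(edac_ce_total)"
```

A result of 0 can mean either that no errors were counted or that no EDAC driver is loaded and the files do not exist, so it is worth checking `lsmod | grep edac` first, as in the post above.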