October 12, 2019
Unraid already has EDAC, but for some reason edac-util is missing. It would provide a much easier way to troubleshoot RAM and PCIe issues.
October 16, 2019
I would also like this. In its absence I have been referring to this post which suggests the same information is available without the tool.
September 22, 2022
On 10/16/2019 at 12:54 PM, flaggart said: I would also like this. In its absence I have been referring to this post which suggests the same information is available without the tool.

# grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
grep: /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count: No such file or directory

Not seeing anything about EDAC in dmesg too. Does Unraid still support ECC?
September 29, 2022
On 9/22/2022 at 8:30 PM, realies said: Not seeing anything about EDAC in dmesg too. Does Unraid still support ECC?

EDAC should usually load automatically if your CPU/motherboard/memory controller supports it. However, you can load it manually using:

modprobe amd64_edac

for AMD CPUs, or for Intel 10th, 11th, and 12th Gen processors with:

modprobe igen6_edac

If you get an error like "modprobe: ERROR: could not insert 'MODULENAME': No such device", then it's most likely the wrong module, since this means that the module didn't find a compatible hardware device.

BTW, you can list all available modules with this command:

ls -la /lib/modules/*-Unraid/kernel/drivers/edac/

The main issue with EDAC is that it can be really noisy at times, can give you a lot of false positives, and can ultimately drive you crazy... From what I know, EDAC also reports PCIe errors, and since not all PCIe devices follow the entire PCIe standard, there can be many, many issues at certain times and with certain hardware combinations.
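To make the "which module do I try?" step above a little less manual, here is a minimal sketch that turns the module directory listing into candidate `modprobe` commands. It only prints the commands and does not load anything. The helper name `list_edac_modules` is hypothetical, and the assumption that module files end in `.ko` or a compressed variant such as `.ko.xz` may not hold on every Unraid release.

```shell
#!/bin/sh
# Sketch only: list EDAC driver module names found in a kernel module directory.
# list_edac_modules is a hypothetical helper, not part of Unraid or edac-utils.
list_edac_modules() {
    dir="$1"
    for f in "$dir"/*.ko*; do
        # Skip the unexpanded glob when the directory has no module files.
        [ -e "$f" ] || continue
        name="${f##*/}"        # drop the directory part
        name="${name%%.ko*}"   # drop .ko, .ko.xz, .ko.gz, ...
        printf '%s\n' "$name"
    done
}

# Dry run: print the modprobe commands one could try, without running them.
# The path is the one quoted in the post above.
for d in /lib/modules/*-Unraid/kernel/drivers/edac; do
    [ -d "$d" ] || continue
    list_edac_modules "$d" | while read -r m; do
        echo "modprobe $m"
    done
done
```

Running the printed commands one at a time and watching for the "No such device" error is then the same trial-and-error process described in the post, just with the candidate list generated for you.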
January 9, 2024
Here is an example from my server.

Mainboard: Supermicro X10DRI-F
CPU: Intel Xeon E5-2630 v4
RAM: 4x 8GB ECC Memory
UNRAID: 6.12.6

# dmesg | grep EDAC
[ 129.472264] EDAC MC: Ver: 3.0.0
[ 129.482175] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
[ 129.482194] EDAC sbridge: Seeking for: PCI ID 8086:6fa0
[ 129.482213] EDAC sbridge: Seeking for: PCI ID 8086:6f60
[ 129.482226] EDAC sbridge: Seeking for: PCI ID 8086:6fa8
[ 129.482234] EDAC sbridge: Seeking for: PCI ID 8086:6fa8
[ 129.482244] EDAC sbridge: Seeking for: PCI ID 8086:6f71
[ 129.482251] EDAC sbridge: Seeking for: PCI ID 8086:6f71
[ 129.482261] EDAC sbridge: Seeking for: PCI ID 8086:6faa
[ 129.482269] EDAC sbridge: Seeking for: PCI ID 8086:6faa
[ 129.482279] EDAC sbridge: Seeking for: PCI ID 8086:6fab
[ 129.482286] EDAC sbridge: Seeking for: PCI ID 8086:6fab
[ 129.482292] EDAC sbridge: Seeking for: PCI ID 8086:6fac
[ 129.482296] EDAC sbridge: Seeking for: PCI ID 8086:6fac
[ 129.482302] EDAC sbridge: Seeking for: PCI ID 8086:6fad
[ 129.482307] EDAC sbridge: Seeking for: PCI ID 8086:6fad
[ 129.482312] EDAC sbridge: Seeking for: PCI ID 8086:6f68
[ 129.482317] EDAC sbridge: Seeking for: PCI ID 8086:6f79
[ 129.482325] EDAC sbridge: Seeking for: PCI ID 8086:6f6a
[ 129.482333] EDAC sbridge: Seeking for: PCI ID 8086:6f6b
[ 129.482341] EDAC sbridge: Seeking for: PCI ID 8086:6f6c
[ 129.482365] EDAC sbridge: Seeking for: PCI ID 8086:6f6d
[ 129.482374] EDAC sbridge: Seeking for: PCI ID 8086:6ffc
[ 129.482377] EDAC sbridge: Seeking for: PCI ID 8086:6ffc
[ 129.482384] EDAC sbridge: Seeking for: PCI ID 8086:6ffd
[ 129.482387] EDAC sbridge: Seeking for: PCI ID 8086:6ffd
[ 129.482394] EDAC sbridge: Seeking for: PCI ID 8086:6faf
[ 129.482399] EDAC sbridge: Seeking for: PCI ID 8086:6faf
[ 129.482617] EDAC MC0: Giving out device to module sb_edac controller Broadwell SrcID#0_Ha#0: DEV 0000:ff:12.0 (INTERRUPT)
[ 129.482623] EDAC sbridge: Ver: 1.1.2

# lsmod | grep edac
sb_edac 24576 0
edac_core 65536 1 sb_edac

# grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch2_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch3_ce_count:0

# lshw -class memory | grep ecc
capabilities: ecc
configuration: errordetection=multi-bit-ecc

# mcelog --client
Memory errors
SOCKET 0 CHANNEL 0 DIMM 0 DMI_NAME "P1-DIMMA1" DMI_LOCATION "P0_Node0_Channel0_Dimm0"
corrected memory errors: 0 total 0 in 24h
uncorrected memory errors: 0 total 0 in 24h
SOCKET 0 CHANNEL 1 DIMM 0 DMI_NAME "P1-DIMMB1" DMI_LOCATION "P0_Node0_Channel1_Dimm0"
corrected memory errors: 0 total 0 in 24h
uncorrected memory errors: 0 total 0 in 24h
SOCKET 0 CHANNEL 2 DIMM 0 DMI_NAME "P1-DIMMC1" DMI_LOCATION "P0_Node0_Channel2_Dimm0"
corrected memory errors: 0 total 0 in 24h
uncorrected memory errors: 0 total 0 in 24h
SOCKET 0 CHANNEL 3 DIMM 0 DMI_NAME "P1-DIMMD1" DMI_LOCATION "P0_Node0_Channel3_Dimm0"
corrected memory errors: 0 total 0 in 24h
uncorrected memory errors: 0 total 0 in 24h

One can get the information without `edac-util`, but maybe it would still be nice to have. The EDAC kernel module does load automatically. I don't know whether UNRAID will trigger a warning via the WebUI if ECC errors are detected, but this may be off-topic.

+1

Edited January 9, 2024 by pixeldoc81
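Since the per-channel counters shown above are plain files in sysfs, a small script can stand in for part of what `edac-util` would report. The following is a minimal sketch, assuming the `mc*/csrow*/ch*_ce_count` layout from the `grep` command in this thread; the function name `edac_ce_total` is hypothetical, and some kernels or drivers may expose the counters under a different layout, in which case the function reports that nothing was found rather than printing a misleading zero.

```shell
#!/bin/sh
# Sketch only: sum the correctable-error counters that EDAC exposes in sysfs.
# edac_ce_total is a hypothetical helper; the default path is the one used
# by the grep command in this thread.
edac_ce_total() {
    base="${1:-/sys/devices/system/edac/mc}"
    total=0
    found=0
    for f in "$base"/mc*/csrow*/ch*_ce_count; do
        # Skip the unexpanded glob when no counter files exist.
        [ -r "$f" ] || continue
        found=1
        n=$(cat "$f")
        total=$((total + ${n:-0}))
    done
    if [ "$found" -eq 0 ]; then
        echo "no EDAC counters found under $base" >&2
        return 1
    fi
    echo "$total"
}

# Print the total, or a hint when the EDAC driver is not loaded.
edac_ce_total || echo "EDAC module may not be loaded (check: lsmod | grep edac)"
```

A non-zero total only says that corrected errors occurred somewhere; the per-file `grep` output or `mcelog --client` is still needed to see which channel or DIMM they came from.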