November 3, 2018

I built a new unraid system 12 months ago. Apart from losing network connection occasionally, it's been really good and I don't want to change to another system unless I can solve my problem. Freenas/Ubuntu Server etc don't really appeal.

I installed 2 SSD drives as a cache pool but the second drive gave me errors so I am using it on my Windows PC. Since then, whatever I do I can not install another parity drive, cache drive or add a drive to the array. I have tried 2 new SSD drives, a WD 4TB Red drive, a WD 2TB Purple drive but all of them never get past the clear/pre clear stage. I have also tried changing the SATA cables several times. If I do a pre-clear a second time it finishes after a few minutes and says completed.

I have the same problem as a previous poster - https://forums.unraid.net/topic/74099-where-is-the-mce-log/

My system is as follows:-
ASRock - EP2C602-4L/D16
Intel® Xeon® CPU E5-2670 0 @ 2.60GHz
128 GB Multi-bit ECC

From the System log:-
Nov 3 10:58:39 Tower kernel: EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x33e2c offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c1 socket:0 ha:0 channel_mask:1 rank:4)
Nov 3 11:01:48 Tower kernel: mce: [Hardware Error]: Machine check events logged
Nov 3 11:01:48 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR

tower-diagnostics-20181103-1116.zip

Really appreciate any help. Thanks
November 3, 2018 Community Expert
The first thing you need to do is resolve the MCE errors, which are possibly caused by a bad DIMM. Check the system event log; it might identify the problem one. Also check for a BIOS update to see if it gets rid of these constant warnings spamming the log:
Nov 3 00:09:42 Tower root: ACPI group processor / action LNXCPU:04 is not defined
Nov 3 00:09:42 Tower root: ACPI group processor / action LNXCPU:05 is not defined
Nov 3 00:09:42 Tower root: ACPI group processor / action LNXCPU:06 is not defined
Nov 3 00:09:42 Tower root: ACPI group processor / action LNXCPU:07 is not defined
Nov 3 00:09:42 Tower root: ACPI group processor / action LNXCPU:08 is not defined
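For anyone following along: the EDAC line quoted in the first post already names a location (MC0, channel 0, slot 1), so one way to see whether a single DIMM keeps coming up is to tally corrected errors per slot from the syslog. The Python below is only a rough sketch. The regular expression is written against the one log line quoted in this thread, and the function name `count_edac_errors` is made up for illustration; other kernels or EDAC drivers may format the line differently.

```python
import re
from collections import Counter

# Assumption: the pattern mirrors the single EDAC line quoted in this thread.
# It captures the memory controller, the error count, the error kind
# (CE = corrected, UE = uncorrected) and the channel/slot pair.
EDAC_RE = re.compile(
    r"EDAC (?P<mc>MC\d+): (?P<count>\d+) (?P<kind>CE|UE) .*?"
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
)


def count_edac_errors(lines):
    """Tally EDAC errors per (controller, channel, slot, kind)."""
    counts = Counter()
    for line in lines:
        match = EDAC_RE.search(line)
        if match:
            key = (
                match["mc"],
                int(match["channel"]),
                int(match["slot"]),
                match["kind"],
            )
            counts[key] += int(match["count"])
    return counts


# The two EDAC-related lines from the first post; only the first one
# carries a channel/slot location, so only it is counted.
SAMPLE = [
    "Nov 3 10:58:39 Tower kernel: EDAC MC0: 1 CE memory scrubbing error on "
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x33e2c offset:0x0 "
    "grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c1 socket:0 ha:0 "
    "channel_mask:1 rank:4)",
    "Nov 3 11:01:48 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR",
]

if __name__ == "__main__":
    for key, total in count_edac_errors(SAMPLE).items():
        print(key, total)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `count_edac_errors(open("syslog.txt"))` where `syslog.txt` is a placeholder for the syslog inside the diagnostics zip. If one channel/slot pair dominates the tally, that DIMM is the first one worth reseating or swapping.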
November 5, 2018 Author
Thanks for the info. I have installed the latest BIOS for the mobo.
tower-diagnostics-20181105-1213.zip
November 5, 2018 Community Expert
You have a disk connected to one of the Marvell ports. Those ports are a known problem on these boards and shouldn't be used.
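If it is not obvious which SATA ports on the board are the Marvell ones, one possible way to check from the running system is to follow each disk's sysfs path back to its PCI controller and read that controller's vendor ID. The sketch below is a hedged illustration, not an official Unraid tool: the function names are hypothetical, it assumes a standard Linux sysfs layout, and it assumes the Marvell controller reports PCI vendor ID 0x1b4b (worth confirming against `lspci -nn` on the actual machine).

```python
import os
import re

# Assumption: Marvell SATA controllers report this PCI vendor ID.
MARVELL_VENDOR_IDS = {"0x1b4b"}

# PCI addresses look like 0000:05:00.0 (domain:bus:device.function).
PCI_ADDR_RE = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_pci_address(sysfs_path):
    """Return the last PCI address in a resolved sysfs block path.

    The last address is the device closest to the disk, i.e. the
    storage controller rather than an upstream bridge.
    """
    matches = PCI_ADDR_RE.findall(sysfs_path)
    return matches[-1] if matches else None


def disks_on_marvell(sys_root="/sys"):
    """List (disk, pci_address) pairs for sdX disks on a Marvell controller."""
    block_dir = os.path.join(sys_root, "block")
    if not os.path.isdir(block_dir):
        return []
    found = []
    for name in sorted(os.listdir(block_dir)):
        if not name.startswith("sd"):
            continue
        # /sys/block/sdX is a symlink into the PCI device tree.
        real_path = os.path.realpath(os.path.join(block_dir, name))
        addr = controller_pci_address(real_path)
        if addr is None:
            continue
        vendor_file = os.path.join(sys_root, "bus", "pci", "devices", addr, "vendor")
        try:
            with open(vendor_file) as handle:
                vendor = handle.read().strip().lower()
        except OSError:
            continue
        if vendor in MARVELL_VENDOR_IDS:
            found.append((name, addr))
    return found


if __name__ == "__main__":
    for disk, addr in disks_on_marvell():
        print(f"{disk} is attached to Marvell controller {addr}")
```

Any disk it lists would be a candidate for moving to one of the Intel chipset ports. The board manual remains the authoritative source for which physical connectors belong to which controller.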
November 5, 2018 Author
Thanks for the help. I'll go and find out now what a Marvell port is. Cheers
November 6, 2018 Author
I love you - can I have your babies!! I did a pre-clear and a clear and it's worked after 3 weeks of pushing and shoving. Fantastic. Who would have thought the problem was with the Marvell thingy.
One question. Because of all the messing with the drives, my new 4 TB drive has been added to the array, but it remains in the Historical Devices section as 'missing'. Am I safe to select Remove?
November 6, 2018 Community Expert
bramley4600 said: "Am I safe to select Remove?"
Yes