
Disk problems


bramley4600


I built a new unraid system 12 months ago. Apart from losing network connection occasionally, it's been really good and I don't want to change to another system unless I can solve my problem. Freenas/Ubuntu Server etc don't really appeal.

I installed 2 SSD drives as a cache pool but the second drive gave me errors so I am using it on my Windows PC.

Since then, whatever I do, I cannot install another parity drive or cache drive, or add a drive to the array. I have tried 2 new SSD drives, a WD 4TB Red drive and a WD 2TB Purple drive, but none of them gets past the clear/pre-clear stage.

I have also tried changing the SATA cables several times.

If I do a pre-clear a second time it finishes after a few minutes and says completed.

I have the same problem as a previous poster - https://forums.unraid.net/topic/74099-where-is-the-mce-log/

My system is as follows:-

ASRock - EP2C602-4L/D16

Intel® Xeon® CPU E5-2670 0 @ 2.60GHz

128 GB Multi-bit ECC

 

From the System log:-

 

Nov 3 10:58:39 Tower kernel: EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x33e2c offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c1 socket:0 ha:0 channel_mask:1 rank:4)
Nov 3 11:01:48 Tower kernel: mce: [Hardware Error]: Machine check events logged
Nov 3 11:01:48 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR

 

 

 

tower-diagnostics-20181103-1116.zip

 

Really appreciate any help.

 

Thanks

Link to comment

The first thing you need to do is resolve the MCE errors, which possibly point to a bad DIMM. Check the system event log; it might identify the faulty one.
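As a rough aid for narrowing down the DIMM, the EDAC line quoted in the first post already names a location (CPU_SrcID#0_Ha#0_Chan#0_DIMM#1). The sketch below tallies corrected (CE) and uncorrected (UE) errors per DIMM label from syslog lines. The function name `count_dimm_errors` and the regular expression are illustrative assumptions based only on the single log line in this thread; other kernel versions may format EDAC messages differently.

```python
import re
from collections import Counter

# Assumed pattern, modelled on the one EDAC line quoted in this thread.
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on "
    r"(?P<label>\S+) \(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
)


def count_dimm_errors(lines):
    """Sum reported error counts per (kind, DIMM label) pair."""
    counts = Counter()
    for line in lines:
        match = EDAC_RE.search(line)
        if match:
            counts[(match["kind"], match["label"])] += int(match["count"])
    return counts


# The three syslog lines from the first post.
sample = [
    "Nov 3 10:58:39 Tower kernel: EDAC MC0: 1 CE memory scrubbing error on "
    "CPU_SrcID#0_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x33e2c offset:0x0 "
    "grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c1 socket:0 ha:0 "
    "channel_mask:1 rank:4)",
    "Nov 3 11:01:48 Tower kernel: mce: [Hardware Error]: Machine check events logged",
    "Nov 3 11:01:48 Tower kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR",
]

for (kind, label), total in count_dimm_errors(sample).items():
    print(kind, label, total)
```

Run against a full syslog, a label that keeps accumulating errors is the first DIMM to reseat or swap. How the label maps to a physical slot depends on the motherboard manual.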

 

Also check for a BIOS update to see if it gets rid of these constant warnings spamming the log:

 

Nov  3 00:09:42 Tower root: ACPI group processor / action LNXCPU:04 is not defined
Nov  3 00:09:42 Tower root: ACPI group processor / action LNXCPU:05 is not defined
Nov  3 00:09:42 Tower root: ACPI group processor / action LNXCPU:06 is not defined
Nov  3 00:09:42 Tower root: ACPI group processor / action LNXCPU:07 is not defined
Nov  3 00:09:42 Tower root: ACPI group processor / action LNXCPU:08 is not defined

 

Link to comment

I love you - can I have your babies!! I did a pre-clear and a clear and it worked, after 3 weeks of pushing and shoving. Fantastic.

Who would have thought the problem was with the Marvell thingy.

One question. Because of all the messing with the drives, my new 4 TB drive has been added to the array, but it remains in the Historical Devices section as 'missing'. Am I safe to select Remove?

Link to comment

Archived

This topic is now archived and is closed to further replies.
