
unRAID Server crashes within hours of starting. Disk 2 marked red.


extremeaudio


Fresh install. Disk 2 keeps getting a red cross.

 

I thought one disk was bad, so I put in a new one. About an hour into the parity sync, this one shows a red cross too. At this stage I cannot write to the array any longer either. I get the message "an unexpected error is keeping you from creating the folder ... error 0x8007045D" from my Windows 7 machine.

 

Log attached.

syslog.zip

Link to comment

The server is using a Giada Mi NAS25 board. I found it ideal since it is a budget board with 6 SATA ports.

 

Another peculiar thing: after disk 2 red-balls, disks 3 and 4 also show as missing when I stop the array. If I reboot, disks 3 and 4 show up fine and only disk 2 is red-balled.

 

Even when the disk was green and parity was being synced, the drive was not showing any temperature, just a *.

 

I have tried reseating the SATA cables on the HDDs, etc.

 

The drives have no data. It is a fresh tower.

 

 

Link to comment

Need more details about the RAM, power supply, etc. It can be anything, from a bad cable up to a broken motherboard.

 

Is the disk that red-balls on the onboard Marvell 88SE9230 controller? It may be a compatibility issue.

 

But I am just guessing.

 

All disks are on the onboard SATA ports. The power supply is a 450W unit, and there is 4GB of RAM.

 

Just curious, why are you running a beta and not the latest release?

 

I am using v6 stable. The machine in my signature is a different server, though that one is on the official release now too; I just haven't updated the signature.

 

Link to comment

Sorry, I didn't know that a board can have 2 controllers (I don't know what the exact implication of that is either).

 

What do I need to set in the BIOS?

 

The disks show up just fine, work properly in the beginning and go bust a little later. I thought that if it was incompatible with the hardware I would probably face an issue right from the word go.

Link to comment

Reading from the syslog:

 

Nov  5 15:48:59 Server kernel: mdcmd (1): import 0 8,16 3907018532 ST4000VN000-1H4168_Z304AZ65

Nov  5 15:48:59 Server kernel: md: import disk0: [8,16] (sdb) ST4000VN000-1H4168_Z304AZ65 size: 3907018532

Nov  5 15:48:59 Server kernel: mdcmd (2): import 1 8,32 3907018532 ST4000VN000-1H4168_Z302F9AH

Nov  5 15:48:59 Server kernel: md: import disk1: [8,32] (sdc) ST4000VN000-1H4168_Z302F9AH size: 3907018532

Nov  5 15:48:59 Server kernel: mdcmd (3): import 2 0,0

Nov  5 15:48:59 Server kernel: md: disk2 removed

Nov  5 15:48:59 Server kernel: mdcmd (4): import 3 8,64 3907018532 ST4000VN000-1H4168_Z303T2YN

Nov  5 15:48:59 Server kernel: md: import disk3: [8,64] (sde) ST4000VN000-1H4168_Z303T2YN size: 3907018532

Nov  5 15:48:59 Server kernel: mdcmd (5): import 4 8,48 3907018532 ST4000VN000-1H4168_Z304AZ1H

Nov  5 15:48:59 Server kernel: md: import disk4: [8,48] (sdd) ST4000VN000-1H4168_Z304AZ1H size: 3907018532

 

It looks like disk2 is removed, long before the go script is executed.

Seems you have a problem there. A loose connector perhaps, either the power or the data cable?

To be honest I am not an expert when it comes to reading syslogs.
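
For anyone who wants to check the import lines mechanically rather than by eye, here is a minimal sketch in Python. It assumes the syslog lines look exactly like the excerpt quoted above; the function name `summarize_slots` and the regular expressions are made up for illustration and are not part of unRAID.

```python
import re

# Lines copied from the syslog excerpt quoted in this thread.
SYSLOG = """\
Nov  5 15:48:59 Server kernel: md: import disk0: [8,16] (sdb) ST4000VN000-1H4168_Z304AZ65 size: 3907018532
Nov  5 15:48:59 Server kernel: md: import disk1: [8,32] (sdc) ST4000VN000-1H4168_Z302F9AH size: 3907018532
Nov  5 15:48:59 Server kernel: mdcmd (3): import 2 0,0
Nov  5 15:48:59 Server kernel: md: disk2 removed
Nov  5 15:48:59 Server kernel: md: import disk3: [8,64] (sde) ST4000VN000-1H4168_Z303T2YN size: 3907018532
Nov  5 15:48:59 Server kernel: md: import disk4: [8,48] (sdd) ST4000VN000-1H4168_Z304AZ1H size: 3907018532
"""

# Patterns are an assumption based only on the lines shown above.
IMPORT_RE = re.compile(r"md: import disk(\d+): \[\d+,\d+\] \((\w+)\)")
REMOVED_RE = re.compile(r"md: disk(\d+) removed")


def summarize_slots(text):
    """Map each md slot number to its device name, or None if reported removed."""
    slots = {}
    for line in text.splitlines():
        imported = IMPORT_RE.search(line)
        if imported:
            slots[int(imported.group(1))] = imported.group(2)
            continue
        removed = REMOVED_RE.search(line)
        if removed:
            slots[int(removed.group(1))] = None
    return slots


slots = summarize_slots(SYSLOG)
missing = [slot for slot, device in sorted(slots.items()) if device is None]
print("slots:", slots)
print("missing slots:", missing)
```

Running the same function over the full syslog (read from the attached file) would show whether any other slot is reported as removed at import time.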

 

Can you provide a diagnostics report? Tools > Diagnostics > Download.

 

There is a problem reported with the Marvell chipset here:

http://lime-technology.com/forum/index.php?topic=43109.msg411491#msg411491

 

Different situation, but might be worth a read.

All controllers seem to be in AHCI mode?

 

I would try a different power connector from the PSU or a different data cable.

Link to comment

Oops. Spoke too soon. Disks 2, 3 and 4 are full of read errors. They are connected to three of the four differently colored SATA ports. So I assume you are right about the Marvell incompatibility.

 

So at least we have narrowed it down to what seems to be the problem: disk 2 was going offline because of a bad cable, and the others connected to the Marvell ports were problematic anyway!

 

Any ideas on how to fix this?

Link to comment

Would changing the board be the only way out of the situation? Disks 2, 3 and 4 are full of read errors in the log. I would have liked to keep this board.

 

You could try disabling the controller and adding a 4-port controller in the PCI-E slot.

 

You can also try the suggestions given here:

 

http://lime-technology.com/forum/index.php?topic=40683.msg384314#msg384314

 

You might also check the PSU. What brand and type is it? 450W doesn't tell us much.

 

Are there also BIOS updates for the board?

Link to comment

Thanks a lot. Will check out that thread and see if it helps. I'm just intrigued because the drives add just fine, so they aren't completely unsupported I guess.

 

I doubt the power supply is the problem. It's a CFI 5-bay case with the stock PSU. I have used it on higher-powered setups with no issues whatsoever.

Link to comment


The problem with power supplies is that unRAID is unlike most other computer setups. It does not require any type of performance from a GPU (which usually needs a lot of amperes on the 12 volt rail to run), but it does require a lot of power (amperes) on the 12 volt bus for the hard drives. Many power supplies have dual 12v rails (it's cheaper), one for the GPU and one for the hard drives. The recommendation for unRAID power supplies is to use a single 12v rail supply.
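
To put rough numbers on the 12 volt point, here is a back-of-the-envelope sketch in Python. Every figure in it is an assumption for illustration only (2.0 A per drive at spin-up, 3.0 A for the board and fans, 80% headroom); check the drive datasheet and the label on the PSU for real values. The function names are hypothetical.

```python
def required_12v_amps(num_drives, spinup_amps_per_drive, other_12v_amps=0.0):
    """Worst-case 12 V current if every drive spins up at the same moment."""
    return num_drives * spinup_amps_per_drive + other_12v_amps


def rail_is_sufficient(rail_amps, num_drives, spinup_amps_per_drive,
                       other_12v_amps=0.0, headroom=0.8):
    """True if the peak load stays within `headroom` (a fraction) of the rail rating."""
    peak = required_12v_amps(num_drives, spinup_amps_per_drive, other_12v_amps)
    return peak <= rail_amps * headroom


# Illustrative numbers only: 5 drives, an assumed 2.0 A each at spin-up,
# plus an assumed 3.0 A for the board and fans.
peak = required_12v_amps(5, 2.0, 3.0)
print(f"Peak 12 V load: {peak:.1f} A")
print("18 A rail OK:", rail_is_sufficient(18.0, 5, 2.0, 3.0))
print("14 A rail OK:", rail_is_sufficient(14.0, 5, 2.0, 3.0))
```

With a dual-rail supply, the rating to compare against is the rail that actually feeds the drive connectors, not the combined 12 V figure.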

Link to comment

Archived

This topic is now archived and is closed to further replies.
