
[SOLVED] Hard drive failure or worse?


Recommended Posts

I was trying to move files to disk17 within the array and it failed a couple of times. Then "the ball" went orange. I ran a parity check just a couple of days ago and everything was fine.

 

Shouldn't the parity be correct, so I could just replace the failed drive and rebuild?

 

I had similar problems with the drive before, but I could write to it after a few retries. The syslog mentions ata4; is that a port on the motherboard?

 

On another server I have migrated data to, which has only SATA ports in use, the parity check speed is about 100MB/s, while my older server with the HW as in my sig gets about 30-50MB/s. There are 22 drives including parity, mostly 2-3TB WD Green drives (parity too). The controllers are in PCI-e 16x slots that run at 8x each when both are connected.

 

Should I get a motherboard with more SATA ports if I want the parity check to be faster?

 

The newer server has an AT5NM10T-I with an Atom D525.

syslog-2014-03-19_DISK17_FAIL.zip

Link to comment

If a cabling problem causes a drive to be "red balled" and then you fix the underlying problem, the red ball does not go away. Read the wiki section again. It should explain the way to rebuild the disk and make your array happy again.

Link to comment

A parity check soon after the reconstruct is a good idea, but does not have to be done instantly. You can power down the server or whatever. I would recommend you grab a syslog before you shut down or reboot, though, as it might contain hints if the drive had trouble during the rebuild. Also, check its SMART report to make sure you're not seeing new issues.
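If it helps, something like the following can pull out the SMART attributes most worth eyeballing after a rebuild. This is just a rough sketch that assumes smartmontools is installed and that you run it as root; /dev/sdX is a placeholder for the rebuilt drive, not a real device name.

```python
# Sketch: print the SMART attributes most relevant after a rebuild.
# Assumes smartmontools ("smartctl") is installed; /dev/sdX is a placeholder.
import subprocess

WATCH = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",  # CRC errors usually point at cabling, not the drive
)

def smart_summary(device):
    """Return the SMART attribute lines we care about for a given device."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=False).stdout
    return [line for line in out.splitlines() if any(attr in line for attr in WATCH)]

for line in smart_summary("/dev/sdX"):  # substitute the real device
    print(line)
```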

Link to comment

Just one dumb question, but does unRAID do any checks during the rebuild, or is it just a pure write to the disk?

When rebuilding a disk, you will be writing to the disk being rebuilt, and reading from all the other disks.  No additional checks are done on the disk being written - unRAID expects the underlying OS to report a problem if a write fails.
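To picture what that means in practice: with single parity, the missing disk's contents are simply the XOR of the parity disk and every surviving data disk, so a rebuild is "read everything else, write the result". Here is a minimal Python sketch of the idea, using illustrative byte strings rather than unRAID's actual code or real block devices:

```python
# Minimal sketch of single-parity reconstruction (illustration only).
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical 4-drive array: parity = XOR of all data disks.
data_disks = [b"\x01\x02\x03", b"\x10\x20\x30", b"\x0a\x0b\x0c"]
parity = xor_blocks(data_disks)

# Simulate losing disk 1 and rebuilding it from parity + the other data disks.
surviving = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(surviving)

assert rebuilt == data_disks[1]  # the rebuilt disk matches the lost one
```

That is also why a read error on any of the surviving disks during a rebuild is the dangerous case: the XOR needs every other disk to be readable.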

Link to comment

Thanks itimpi for the clarification.

 

There is now about an hour left of the parity check, and this shows up in the syslog: "Mar 24 16:00:31 Tower kernel: NTFS driver 2.1.30 [Flags: R/W MODULE]".

 

What does that mean? I have never seen that before.

It was in the OP syslog. Since it says NTFS driver, I think it is safe to assume it is unrelated to your unRAID array, which uses ReiserFS.
Link to comment

Yes, thank you!  8)

 

BTW, if there were a write problem during the rebuild (a write error on a data disk), would the parity also become invalid?

 

Can you tell me why the parity check is so slow with the hardware in my sig? The drives are 2-3TB, if that matters. My new build with just 4 x 4TB drives (all on SATA II ports) gets 100MB/s, and this one does about half of that... why?

 

Is PCI-e x16 really so bad with two SASLPs?

Link to comment

A parity check is limited to the speed of the slowest drives involved at any point in the check.  It's not likely your controllers that are limiting the speed => you probably have a drive (or 2) with relatively low-density platters, whereas your new 4TB drives are probably all 1TB/platter units.

 

If you post the exact make/model of all of your drives, I can check the areal density of them and provide more specific details.

 

Link to comment

The older 2TB drives are WD EARS and EARX. The EARS, I think, are the problem. But that's great news, as I am replacing them with 4TB drives.

 

Agree the issue is likely the EARS units.  The early 2TB EARS drives were 500GB/platter (4 platters).  Later units were 667GB/platter (3 platters).    Either of those will be far slower than a modern 1TB/platter drive.

 

Note that when your parity check passes the 2TB point these drives are no longer involved.  If you're watching it at that point you should see a notable increase in the speed.
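To illustrate the effect with a toy model (the sizes and speeds below are made-up examples, not measurements of your EARS/EARX units): at any offset the check can only run as fast as the slowest drive that still spans that offset, and once the check passes the capacity of the smaller drives they drop out of the calculation entirely.

```python
# Toy model of parity-check speed: the slowest drive still "in play"
# at a given offset sets the pace. Numbers below are hypothetical.
drives = [
    # (size_in_TB, average_sequential_MBps)
    (2.0, 45),   # e.g. an older 500GB/platter 2TB drive
    (2.0, 60),   # e.g. a 667GB/platter 2TB drive
    (3.0, 110),  # e.g. a 1TB/platter 3TB drive
    (3.0, 110),
]

def check_speed_at(offset_tb):
    """Speed at a given offset: limited by the slowest drive spanning it."""
    active = [speed for size, speed in drives if size > offset_tb]
    return min(active) if active else None

for offset in (0.5, 1.5, 2.5):
    print(f"at {offset} TB: ~{check_speed_at(offset)} MB/s")
# at 0.5 TB: ~45 MB/s
# at 1.5 TB: ~45 MB/s
# at 2.5 TB: ~110 MB/s   <- the 2TB drives have dropped out
```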

 

Link to comment

Archived

This topic is now archived and is closed to further replies.
