ARGH! Double Drive Fail, one is the Parity Drive!



So I had two drives fail within a short time frame. Before I could replace my parity drive, a second drive failed during the reboot and started clicking. I tried so many things. I finally opened the drive, and it is not pretty at all: the platters were pretty badly scratched. I have not tested my parity drive yet; I am hoping that one will be salvageable. In the case of a double drive failure, which drive do you replace first? The parity, right? I have no idea what was on the failed drive. Is there a way of getting a list of the files on the failed drive?

Link to post

> I wonder, is it possible to back up the parity drive with CrashPlan? This would solve the issue of multiple drive failures.

 

That would only solve the issue of a dual disk failure, where one of the disks is the parity drive.  If you lost 2 data drives, you'd still be in the same place.

 

As Automatic mentioned, it's better to have a parity drive fail if you're going to have 2 drives fail at once.

Link to post

> I wonder, is it possible to back up the parity drive with CrashPlan? This would solve the issue of multiple drive failures.

 

 

There are no filesystems or files to back up on the parity drive.

ddrescue is the only chance of saving some of the data.

 

 

If the parity drive is readable with a few bad sectors, those bad sectors will not be restored correctly on the rebuilt drive.

If the parity drive is unusable in any form, you've lost one data drive. Feel fortunate: if it were RAID 5 you would have lost the whole array.

I know it doesn't help, but I lost 5 of 20 drives and I was still able to access 75% of my data.
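
For anyone wondering why a few bad parity sectors end up as bad data on the rebuilt drive: single parity in unRAID is a bitwise XOR across the data drives, so a rebuilt drive is simply parity XORed with every surviving data drive. Below is a toy sketch in shell, with one byte standing in for each drive. The byte values and variable names are made up for illustration, and nothing here touches a real disk.

```shell
#!/bin/sh
# Toy model: one byte per "drive". The values are arbitrary examples.
d1=$((0xA5))
d2=$((0x3C))
d3=$((0x0F))

# Parity is the XOR of all data drives.
parity=$(( d1 ^ d2 ^ d3 ))

# Pretend d2 died: rebuild it from parity plus the surviving drives.
rebuilt_d2=$(( parity ^ d1 ^ d3 ))

# Now pretend one bit of parity was read back wrong (a bad sector).
bad_parity=$(( parity ^ 0x01 ))
wrong_d2=$(( bad_parity ^ d1 ^ d3 ))

printf 'original d2 = 0x%02X\n' "$d2"
printf 'rebuilt  d2 = 0x%02X (good parity)\n' "$rebuilt_d2"
printf 'rebuilt  d2 = 0x%02X (one bad parity bit)\n' "$wrong_d2"
```

The rebuilt byte matches the lost one only when the parity byte is intact; a single flipped parity bit comes straight through as a flipped bit in the rebuilt data.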

Link to post

I have not tried ddrescue.

I did have another 1 TB drive and tried to port the platters over, but the platters are really in bad shape. I'm surprised the drive did not fail sooner. I'm seriously thinking of building a second unRAID server and porting the data over to the new server. If I did, I would want to create a hardware raid 5 for my parity drive, and then the rest of the drives as is. Has anyone attempted it?

 

I might be able to read the parity drive, as I can see it in ubuntu.

The data drive is gone gone gone, gone so long, it's gone gone gone so long! Reminds me of a song...

Is there a link to the steps for the ddrescue option?

That would be too cool!

Link to post

There might be light!

 

Test Option: QUICK TEST
Model Number: WDC WD20 EARS-00MVWB0
Unit Serial Number:
Firmware Number:
Capacity: 2000.40 GB
SMART Status: Not Available
Test Result: PASS
Test Time: 15:00:05, March 11, 2013

 


 

If the SMART test fails or does not report, but the short test from WD passes, I should be able to read the data, right?

Is there a way to force the parity drive back online if it does not have SMART enabled?

 

 

Link to post
> I would want to create a hardware raid 5 for my parity drive, and then the rest of the drives as is.
Genuinely curious here. Why? I can't think of a single reason to have the parity volume as a RAID 5. RAID 0 maybe, for the increased write speed, but RAID 5? What am I missing?
Link to post

There is no reason to have the parity drive as another RAID array. It is no more important than any other data drive. RAID 0 on parity yields something like a 2-3% performance boost, but that's it. However, it would help with a lot of random I/O.

 

As far as ddrescue goes, I've written a couple of posts about its use. It's saved my butt big time.
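
For reference, a rough sketch of the common two-pass GNU ddrescue approach. This is an assumption about a typical workflow, not the procedure from the posts mentioned above. The helper only prints the commands so that nothing runs by accident. `/dev/sdX` (failing source), `/dev/sdY` (replacement target) and `rescue.map` are placeholders, and `ddrescue_plan` is a made-up name. Getting source and target the wrong way round destroys the source, so check the device names before running anything.

```shell
#!/bin/sh
# Print (do not run) a two-pass GNU ddrescue plan.
# $1 = failing source device, $2 = target device, $3 = map file on a third disk.
ddrescue_plan() {
    src=$1
    dst=$2
    map=$3
    # Pass 1: grab everything that reads cleanly and skip the slow
    # scraping phase (-n). -f is needed to write to a block device.
    echo "ddrescue -f -n $src $dst $map"
    # Pass 2: return to the bad areas using direct disc access (-d)
    # and up to 3 retry passes (-r3). The map file lets it resume.
    echo "ddrescue -f -d -r3 $src $dst $map"
}

# Placeholder device names; substitute the real ones after checking them.
ddrescue_plan /dev/sdX /dev/sdY rescue.map
```

Keeping the map file on a disk that is neither source nor target means an interrupted run can pick up where it left off.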

 

Regarding the Western Digital diagnostics, I cannot comment.

I would surmise it is using the firmware's SMART information.

 

My confidence level would be increased with a SMART long test via smartctl.

 

If that passes, then the drive is probably good.
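
A minimal sketch of what that could look like with smartctl from smartmontools. As a precaution the helper only prints the commands. `/dev/sdX` is a placeholder for the parity drive and `smart_long_test_plan` is a made-up name; the `-s on` step is included on the assumption that SMART is merely switched off on the drive, which the "Not Available" status above does not confirm.

```shell
#!/bin/sh
# Print (do not run) the smartctl steps for an extended self-test.
# $1 = device to test, e.g. /dev/sdX (placeholder).
smart_long_test_plan() {
    dev=$1
    echo "smartctl -s on $dev"        # enable SMART on the drive if it is off
    echo "smartctl -t long $dev"      # start the extended (long) self-test
    echo "smartctl -l selftest $dev"  # later: read the self-test log for the result
    echo "smartctl -H $dev"           # overall health assessment
    echo "smartctl -A $dev"           # attribute table (reallocated/pending sectors)
}

smart_long_test_plan /dev/sdX
```

The long test runs inside the drive itself, so the self-test log has to be checked again once the estimated time has passed.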

 

As for using the parity drive should it prove good, see the "trust my parity" procedure to see how that works.

 

I tried it, it did not work for me because I tried to use it after a parity upgrade operation.

There's plenty of talent and knowledge on the board to assist.

 

There is hope if the parity drive is good. The SMART long test will take a couple of hours, possibly 3-4.

Link to post

I don't have all the answers on trusting the parity for the array.

 

Here is something from the wiki:

http://lime-technology.com/wiki/index.php/Make_unRAID_Trust_the_Parity_Drive,_Avoid_Rebuilding_Parity_Unnecessarily

I suggest you review the board for the procedure and possible gotchas from other members.

Have you pre-cleared a drive that will be used for rebuilding the failed drive?

 

Link to post

Please, for heaven's sake, STOP trying things with that array!!!  Even the simple things you're trying destroy more data.

Double drive failures are common because of the extra strain put on the good drives after the first one fails.  We can almost certainly recover the files, if you want to.

 

I'll even give you a special rate for the data recovery: www.data-medics.com

Link to post

Re: Data-Medics ==>  Your web site provides a good overview, and I've marked it for future reference.  But I suspect that the odds of recovering data from the 2nd (non-parity) drive are actually fairly low, based on this comment:  "... I finally opened the drive, it is not pretty at all. The platters were pretty badly scratched."

 

... opening a drive outside of a cleanroom environment is NOT a good idea !!  :)

 

r.e. the WD Diags => the short test is fairly useless.    Run the long test ... but the problem is that even if it passes, I don't know how to force UnRAID to treat it as a valid parity drive => that's what would need to happen to allow a rebuild of the failed data drive.

 

The best option is to simply replace both drives;  reset the array; rebuild parity;  and then restore your lost data from your backups.

 

Link to post

Can you provide the link to ddrescue procedure?

I am still stuck with the parity drive showing up as blue, and drive 1 as red.

I do not believe I can force the parity drive back online, since I have a failed drive in the mix, at least not from what I read in the forums. If I could get the parity drive online and then replace the failed drive for a rebuild, that would be amazing.

 

I'm looking at building my second unRAID server as we speak. I think I will port my data over and cut the number of drives in half, as I still have many 700GB drives. I'm running unRAID 4.6 (if it ain't broke, don't play with it). It has done well up until now.

I have replaced two drives per year on average so far. Understand that I started a long time ago, originally with 200GB and 400GB drives.

WeeboTech: can the Areca ARC-1222 PCIe x8 SATA controllers also handle 3-4TB drives? I saw some for sale at $200 used.

Link to post

> Can you provide the link to ddrescue procedure?

Please search around the forum and the internet. I had to read a lot of pages, and all the links I saved were lost when Hurricane Sandy destroyed all my belongings.

 

If you cannot 'trust your parity' with the failed drive,  then there is nothing you can do. ddrescue will do nothing except copy a parity drive that is useless.

 

I would suggest you contact Tom directly to see if there is some way to trust the parity drive and rebuild the failed drive. As for the ARC-1222, I just received mine. I believe it will require a firmware update; I do not know yet.

I remember the ARC-1200 being able to access 3TB drives. But this is the least of your worries.

The Areca would not have saved your situation.  You had a double drive failure.  No one drive is more important than the other.

I suppose you could RAID 1 the parity drive, but then it will slow down your writes.

In the event of a parity drive failure, you could have some insurance, but it's not always the parity drive that goes during a rebuild.

In my case it wasn't.

 

Do you do the monthly parity checks?

Link to post
  • 8 months later...

This won't get your data back, but this is what I have done to protect myself from a double disk failure, in addition to online backup.  Most of my data is replaceable, like my Blu-ray backups, and the only data I really care about losing is my pictures and home videos, so I have set a couple of drives to be excluded from all of my shares.  On those drives I created "pictures 2" and "software 2" shares that I don't broadcast over SMB.  I back up my pictures share to this one on a regular basis, or whenever I upload new photos.  This way I can lose 2 drives or more, depending on the drives, and still have all my important data.  I find this more practical for me than building 2 unRAID servers.

Link to post

> This won't get your data back, but this is what I have done to protect myself from a double disk failure, in addition to online backup.  Most of my data is replaceable, like my Blu-ray backups, and the only data I really care about losing is my pictures and home videos, so I have set a couple of drives to be excluded from all of my shares.  On those drives I created "pictures 2" and "software 2" shares that I don't broadcast over SMB.  I back up my pictures share to this one on a regular basis, or whenever I upload new photos.  This way I can lose 2 drives or more, depending on the drives, and still have all my important data.  I find this more practical for me than building 2 unRAID servers.

 

I do something with rsync for a lil extra backup.

I use rsync with the --link-dest="${BACKUPDIR}/${LAST_BACKUP_DATE}" option.

 

I back up my folders to dated directories (YYYYMMDD) and use a script to find the most recent backup folder.

Then I pass that dated folder name to the --link-dest= option.

 

With --link-dest, every unchanged file in the next backup folder (today) is hard-linked to its copy in the prior one instead of being copied again.

Files that are new or changed in the source directories are copied in as real files rather than links.

This gives you a running directory of a full backup plus the changes for that date.

What's cool is that you use 1x the source directory size once, and then your backups only grow by the incremental changes over the course of the backup period.

 

You can then delete older directories and still keep your most important dated backups in place.
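
A minimal sketch of the dated-folder approach described above, assuming backups live in YYYYMMDD directories under one root. This is not the rsync_linked_backup script itself; `last_backup` and `linked_backup` are made-up helper names and the paths in the usage line are placeholders.

```shell
#!/bin/sh
# Print the newest YYYYMMDD directory name under backup root $1,
# optionally skipping the name given in $2 (e.g. today's folder).
last_backup() {
    ls -1 "$1" 2>/dev/null \
        | grep -E '^[0-9]{8}$' \
        | grep -v -x -e "${2:-none}" \
        | sort \
        | tail -n 1
}

# Back up source dir $1 into "$2/<today>", hard-linking unchanged
# files against the previous dated backup when one exists.
# $2 should be an absolute path so --link-dest resolves as expected.
linked_backup() {
    src=$1
    root=$2
    today=$(date +%Y%m%d)
    last=$(last_backup "$root" "$today")
    if [ -n "$last" ]; then
        rsync -a --delete --link-dest="$root/$last" "$src/" "$root/$today/"
    else
        # First run: nothing to link against, so this is a plain full copy.
        rsync -a --delete "$src/" "$root/$today/"
    fi
}
```

Usage would be something like `linked_backup /mnt/user/pictures /mnt/disk5/backups` from a daily cron job. Because each dated folder is a complete tree of hard links, removing an old folder never breaks the newer ones.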

 

 

Because du takes the hard links into account, a du down the tree shows the first backup at its full size and, for each later backup, only the changed files that are not linked.  This example shows I update almost 100M a day.

For example:

# du -hs 201306*
7.0G    20130601
67M    20130602
67M    20130603
70M    20130604
71M    20130605
70M    20130606
70M    20130607
71M    20130608
67M    20130609
67M    20130610
71M    20130611
76M    20130612
83M    20130613

 

Now if I remove the oldest full backup:

# rm -r 20130601

du now shows the full size against 20130602:

 

# du -hs 201306*
7.0G    20130602
67M    20130603
70M    20130604
71M    20130605
70M    20130606
70M    20130607
71M    20130608
67M    20130609
67M    20130610
71M    20130611
76M    20130612
83M    20130613

 

But in reality each directory is its own full backup, as du shows when run on one directory at a time:

 

root@rgclws:/local/backups/npgvm3 # du -hs 20130604
7.0G    20130604
root@rgclws:/local/backups/npgvm3 # du -hs 20130605
7.0G    20130605
root@rgclws:/local/backups/npgvm3 # du -hs 20130606
7.0G    20130606
root@rgclws:/local/backups/npgvm3 # du -hs 20130608
7.0G    20130608

 

See my rsync_linked_backup script on my Google Code page for ideas on how to do this.

 

Link to post

> This way I can lose 2 drives or more, depending on the drives, and still have all my important data.  I find this more practical for me than building 2 unRAID servers.

 

Backing up is certainly better than not backing up ... but there is a major flaw in this process => the backups are on the SAME PC.  A "zap" that kills the PC and wipes out a bunch of the drives could easily cause lost data (and it CAN and does happen).  I do exactly the same thing with my automated backup utility that runs at 4:00am every day ... it backs up each of our PCs to a 2nd drive on the same PC.  But it ALSO backs the PCs up to each other: my PC is backed up to a 2nd drive internally, then to a spare drive on my wife's PC;  my wife's PC is backed up to a 2nd drive internally, then to the extra drive on my PC;  etc.  (In fact they're both also then backed up to an UnRAID server, but I'm a true backup fanatic ... that is a bit of overkill.)

 

I'd at least mount the extra drives in external enclosures, so they've got an independent power source; or better yet, connect them to a different PC => doesn't have to be one that's on all the time ... just when you're running the backup.

 

Link to post
> A "zap" that kills the PC and wipes out a bunch of the drives could easily cause lost data (and it CAN and does happen).

 

It happened to me.  Power supply failure.  One of the rails went to 24v.

Zapped everything in the machine. Almost a total loss.

The only thing that survived was the CPU.

Link to post

These pics are backups from my main PC, which is also backed up with CrashPlan and to an external drive that I keep in my locker at work.  So my unRAID box is a file server that I store one set of my backups on.  This way I have an onsite backup as well as 2 offsite backups, and I have quick access to my files if something goes wrong with my main PC.

 


 

Link to post
