HELP - Parity drive failure whilst rebuilding a drive



Hi,

 

I'm fearing the worst, but hoping for something salvageable...

 

Unraid 6.1.9:

 

I had a 5 disk array (+ parity + cache). (4Tb, 4Tb, 4Tb, 2Tb, 4Tb) + (4Tb + 750Gb)

 

Drive 5 failed (the SMART report shows reallocated sectors, etc.) and was completely inaccessible, so I bought a new replacement drive and popped it into the server.

 

Unraid started rebuilding the drive - and it was all going perfectly when I left it (it had done just over 1Tb of the 4Tb). (I made sure none of the dockers were running and nothing else would have written to the array)

 

I've just got back from work expecting everything to be perfect, only to find that the Parity drive was showing with a big red cross with lots of errors (lots of reads and *no* writes as you would expect during a disk rebuild).  I shut it down and reseated the cables in the hope that I'd dislodged one whilst replacing drive 5..

 

Unraid has restarted and can see the parity drive just fine - SMART report shows *ZERO* errors and in good health - so I suspect a simple cabling problem, although it obviously shows it with a big red X..

 

Drive 5 was still mid-rebuild.. and now shows as an orange triangle and as unmountable..

 

Is there any way of telling Unraid that the Parity drive *is ok* and getting it to try to rebuild Disk 5 again - or have I just lost everything that was on it?

 

Chris

Link to comment

...Parity drive was showing with a big red cross with lots of errors (lots of reads and *no* writes as you would expect during a disk rebuild).

 

There are no parity writes during a rebuild.
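
To illustrate why, here is a toy sketch of single-parity (XOR) reconstruction in bash. It assumes one byte per "disk" and made-up values, so it only shows the principle and is not how unRAID is implemented internally: parity and the surviving data disks are read, and the result is written only to the disk being rebuilt.

```shell
#!/bin/bash
# Toy single-parity array: one byte per "disk". All values are made up.
disks=(0x3C 0xA5 0x0F 0x5A 0xC3)   # data disks 1-5

# Parity is the XOR of every data disk.
parity=0
for d in "${disks[@]}"; do
    parity=$(( parity ^ d ))
done

# Rebuild disk 5: XOR parity with the surviving data disks (1-4).
# Parity and disks 1-4 are only read; the result goes to disk 5.
rebuilt=$parity
for i in 0 1 2 3; do
    rebuilt=$(( rebuilt ^ disks[i] ))
done

printf 'parity=0x%02X rebuilt_disk5=0x%02X\n' "$parity" "$rebuilt"
```

The same arithmetic shows why a bad read from parity matters: any wrong parity byte comes out as a wrong byte on the rebuilt disk.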

 

Assuming the parity disk is really ok you can do this:

 

1) take a screenshot of your array

2) go to tools and click new config

3) reassign all disks, double check parity disk is in the parity slot

4) very important, check the box "parity is already valid" before starting array

5) start array, disk5 will still appear unmountable, it's ok for now.

6) stop array, unassign disk5 (select "no device")

7) start array, disk5 is going to be emulated and should now mount, check that all data is there

8) if all looks ok, stop array, reassign disk5 and start array to begin rebuild.

 

If parity disk has issues again it's probably bad, post diagnostics.

Link to comment

Hi,

 

Thanks for the advice..

 

I followed the steps and it all started out ok..

 

I got to 5) - started the array and Unraid thinks that disk 5 *IS* ok.

 

From a telnet connection I can indeed see /mnt/disk5, and a du -hs * errors out with "permission denied" on about 20 files across 5 directories.

 

Should I still stop the array, unassign disk 5 and start it again? Should it be emulating drive 5 again at that point?

 

If I stop, add back drive 5 and start will it really think it's still valid?

 

Not doing anything else until I know the best answer :)

 

Link to comment

Disk5 didn't completely rebuild, so it's going to have filesystem issues, hence the permission denied errors. Assuming parity is ok, you should continue with the other steps.

 

On step 7 all data should be available without any permission errors. If so, reassigning the same disk in step 8 will start a rebuild of the data you see in step 7.

 

There's no problem using the same disk, when you unassign it it's "forgotten" by unRAID.

Link to comment

Hmm,

 

Doesn't look like it's going to be quite as simple as I'd like.

 

Removed Disk 5 from the array as you suggested and started the array - drive 5 shows up with an 'X', but it is being emulated.

 

Did the same du -hs * on /mnt/disk5 - and I get *more* (and different!) permission errors than I had before. So I'm thinking that  rebuilding the drive now is going to make matters worse..
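
For a less eyeball-based comparison, a rough sketch like the one below could count the error lines du prints for a mount point. count_du_errors is a made-up helper name, and the count is only a crude indicator of filesystem damage, not a measure of how much data is recoverable.

```shell
#!/bin/bash
# Hypothetical helper: count the error lines du prints for a directory tree.
count_du_errors() {
    # Send stderr into the pipe, discard the normal size report,
    # then count the remaining (error) lines.
    du -s "$1" 2>&1 >/dev/null | wc -l | tr -d ' '
}

# Example, using the path from this thread:
# count_du_errors /mnt/disk5
```

Running it once against the emulated disk and once against the partly rebuilt disk would give two numbers to compare.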

 

The original disk 5 was sent off today for a warranty repair - so at some point in a couple of weeks or so I'm going to have a spare 4Tb drive. 

 

I'm *wondering* whether my best course of action would be to 'shrink' the array to parity + disk1-4, rebuild parity from scratch, wait until I get the new drive, add that to the array, mount the 'possibly half built but perfectly happy' disk 5 as an 'unassigned drive', copy whatever copies from it onto the array and accept my losses? That might result in less than 4Tb of lost files.. Although might I be left with files that copy ok, but actually aren't "all there"?

 

If I wanted to do that (ie ditch drive 5 from the array), is that another 'New Config' (but don't say parity is ok)? 

 

Suggestions / Thoughts?

 

EDIT: Or do I put the 'rebuilt' drive 5 back in, run a reiserfsck (?) and accept that as good as it gets and rebuild parity again?

Link to comment

This means your parity is not 100% synced or it's damaged.

 

Can't guess which of the disks has more data. If you have the space, you can run reiserfsck on the emulated disk and copy all its data to another disk, then do the same on the incompletely rebuilt disk (using it inside or outside the array) and compare both.

 

Unless old disk5 was completely dead you should have kept it until the rebuild was done; most times it's possible to recover most data if needed.

 

If you want to do a new config with or without disk5 it's the same procedure up to step 3, don't check the "parity is already valid" box, and when you start the array parity sync will begin.

Link to comment

Hey Jonnie,

 

Old 5 was very dead - not recognised by BIOS most of the time or unRaid when the BIOS did find it..  SMART report for it on the one occasion at the end when I could run it was very very unhappy..

 

If the du is anything to go by the real disk is 'happier' - the emulated disk really wasn't playing ball with du :)

 

Can I assign the new disk5 back into the array (so it's no longer emulated) and run reiserfsck on it? Any hints on the correct way of running reiserfsck - I've not had to do it before..

 

If it repairs the filesystem enough I'll take my chances with the files that are on it - I know the drive itself is good (brand new and precleared)

 

Thanks for all the advice so far - most appreciated!

 

Chris

Link to comment

This is what I would do, so you'd still have the option to rebuild or run reiserfsck on the emulated disk if needed:

 

-do another new config

-assign all disks except parity, leave parity slot empty for now

-start array in maintenance mode

-on the console/putty type:

 

reiserfsck --check /dev/md5

 

-follow instructions (see here https://lime-technology.com/wiki/index.php/Check_Disk_Filesystems#Drives_formatted_with_ReiserFS_using_unRAID_v5_or_later)

-if asked by reiserfsck to use --rebuild-tree do it

-when done, start array in normal mode and check disk5 data, if you're happy stop array, assign parity and start array to begin parity sync.
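
As a rough sketch of the command sequence in the steps above: the helper below only prints the reiserfsck command for a given array slot, so nothing touches a disk until you paste the line yourself. reiserfsck_cmd is a made-up name. The --check and --rebuild-tree options are the ones mentioned in the steps; --fix-fixable is the other repair mode reiserfsck may ask for.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) the reiserfsck command for an array slot.
reiserfsck_cmd() {
    local mode="$1" slot="$2"
    case "$mode" in
        check|fix-fixable|rebuild-tree) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    case "$slot" in
        ''|*[!0-9]*) echo "slot must be a number" >&2; return 1 ;;
    esac
    # /dev/mdN is the array device used in the post above.
    echo "reiserfsck --$mode /dev/md$slot"
}

reiserfsck_cmd check 5          # read-only pass first
reiserfsck_cmd rebuild-tree 5   # only if the check output asks for it
```

Starting with the read-only check and only moving to a repair mode when reiserfsck itself asks for it keeps the first pass non-destructive.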

 

 

 

Unless your parity was really out of sync or was somehow damaged, I would expect better results with the emulated disk. So if the above results are not good and you want to try the emulated disk, repeat up to step 7 from the list in my 1st post and run reiserfsck on the emulated disk. The command is the same: start the array in maintenance mode with the emulated disk and run:

 

reiserfsck --check /dev/md5

 

-if this gives a better result then just stop array, reassign disk5 and start array to rebuild (with a spare disk you could try both options and then run a file compare util on both disks, all equal files on both disks should be ok, you'd need to check any different files).
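
For the file compare, one possible sketch uses plain diff, assuming both sets of data have been copied somewhere readable. compare_trees and the example paths are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: list files that differ between two copies,
# plus files that exist on only one side.
compare_trees() {
    # -r: recurse into subdirectories, -q: report only whether files differ
    diff -rq "$1" "$2"
}

# Example with made-up paths:
# compare_trees /mnt/disks/copy_emulated /mnt/disks/copy_rebuilt
```

Files that are not listed are byte-identical on both sides; anything reported as differing or present on only one side needs a manual check.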

 

 

 

 

 

Link to comment
