[SOLVED] Parity Failure - Read Errors



Good morning everyone.  

 

I am looking for advice. I have been using Unraid for the past 7 years. I love the product.

 

Last week I noticed that Unraid had disabled my parity drive. I purchased a new 10TB IronWolf drive, which arrived the next day. It's currently in the pre-clear stage before I install it.

 

Last night a parity check ran, which was unexpected, as the former parity drive is not actually in the system. I have the server run parity checks on the 1st of every month. This has worked out perfectly for me.

 

I have everything on my media server: my internet router (a pfSense VM) as well as Plex, my media back end. So far I am not experiencing any problems, but I do have a concern.

 

After the parity read ran last night, Unraid reported:

 

Last check completed on Tue 01 Mar 2022 08:48:10 AM PST (yesterday)
Finding 384 errors  Duration: 8 hours, 48 minutes, 9 seconds. Average speed: 94.7 MB/sec

Next check scheduled on Fri 01 Apr 2022 12:00:00 AM PDT
 Due in: 29 days, 16 hours, 56 minutes

 

So... for the size of the array, I felt that was not too bad. But before I commission the new parity drive into service, I want to make sure that the system is OK and that parity will not be incorrect when it's re-calculated.

Please find enclosed my diagnostics file.

 

Thank you for your help.

 

Sincerely,

 

SidebandSamurai

davyjones-diagnostics-20220302-0556.zip

Link to comment
26 minutes ago, SidebandSamurai said:

Last night a parity check ran, which was unexpected, as the former parity drive is not actually in the system.

Without parity, Unraid does a read check; the errors are not sync errors but read errors from disk2, which is failing. Since there's no parity, the best way to try to recover as much data as possible is to use ddrescue, then sync parity with the cloned disk.
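For reference, the basic clone would look something like this (sdX and sdY are placeholders for the failing disk2 and the target disk; run with the array stopped and neither disk mounted, and the map file location is just an example so it survives a reboot on the flash drive):

  ddrescue -f -n /dev/sdX /dev/sdY /boot/disk2.map       # first pass, skip the slow scraping of bad areas
  ddrescue -f -d -r3 /dev/sdX /dev/sdY /boot/disk2.map   # second pass with direct access, retrying bad sectors 3 times

Double-check the device names before running, since -f overwrites the target.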

Link to comment

Thank you very much for your quick reply.

 

So what you are saying is that before I install the new parity drive, I need to address the read errors first?

 

Your article on using ddrescue was excellent, by the way.

 

Right now as it stands, the system is up and running without parity and with Drive 2 failing with read errors.

Link to comment
27 minutes ago, SidebandSamurai said:

So what you are saying is that before I install the new parity drive, I need to address the read errors first?

 

You can add parity first, but there will be read errors from disk2 during the sync, so Unraid won't then be able to rebuild it correctly.

Link to comment

The way I see it, you have two options: attempt to build parity with the failing disk2 included, followed immediately by replacing disk2 and letting it rebuild from parity, or use ddrescue to clone as much as possible from disk2 to a new drive, then use that new drive as disk2 and build parity from that set.

 

Since both paths end up at the same place, I think ddrescue is a much better option since it's designed to handle failing media, whereas the parity-building process asks for the data and, if the drive doesn't immediately provide it, gives up and moves on. ddrescue utilizes a bunch of strategies to cajole the drive and extract every available bit.

 

How much would losing the data on drive2 hurt? If you have full backups and it's just a hassle, then I'd be tempted to give the parity build first a try. If you don't have good backups, ddrescue gives the best chance of intact recovery.

Link to comment
4 minutes ago, JonathanM said:

I think ddrescue is a much better option since it's designed to handle failing media, whereas the parity-building process asks for the data and, if the drive doesn't immediately provide it, gives up and moves on. ddrescue utilizes a bunch of strategies to cajole the drive and extract every available bit.

This, and with ddrescue it's possible to list the corrupt files, unlike with the other option, unless there are pre-existing checksums.
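If you want pre-existing checksums for next time, a minimal sketch (the paths are just placeholders for one data disk and the flash drive; adjust to your own shares):

  find /mnt/disk1 -type f -exec md5sum {} + > /boot/disk1.md5   # record a checksum for every file on disk1
  md5sum -c /boot/disk1.md5 | grep -v ': OK$'                   # later, list only the files that no longer match

ddrescue's map file records which areas it couldn't read, which is how the affected files can be identified after a clone.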

Link to comment

@SidebandSamurai

3 hours ago, JorgeB said:

disk2 which is failing

Since you seemed unaware of this, I assume you don't have Notifications set up to alert you immediately by email or another agent as soon as a problem is detected.

 

Or you have been ignoring the notifications. You must take care of these problems as soon as they appear. Don't let an ignored problem become multiple problems and data loss.

 

You can also see SMART warnings for the drive on the Dashboard page, and by clicking on the drive to get to its Attributes. You actually have 4 different SMART warnings for that disk.
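If you prefer the command line, the full SMART report is also available via smartctl, something like this (sdX is a placeholder for disk2's device):

  smartctl -a /dev/sdX

The attributes to watch are the same ones the Dashboard flags, e.g. reallocated, pending, and uncorrectable sector counts.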

 

Also note that it's possible the only thing wrong with your original parity disk was a bad connection, but since it wasn't in the diagnostics, I can't tell. If it is OK and not out-of-sync (nothing written to any array disks while it was disabled or missing), you could rebuild disk2 from that.

 

Link to comment

@trurl

Quote

Since you seemed unaware of this, I assume you don't have Notifications set up to alert you immediately by email or another agent as soon as a problem is detected.

Yes, guilty as charged. I have since corrected this issue, but it's like closing the door after the horses have already left the barn.

 

Quote

Or you have been ignoring the notifications. You must take care of these problems as soon as they appear. Don't let an ignored problem become multiple problems and data loss.

Yes, this is also the case. Until recently it was a money issue, which is why it went so long without being addressed; I have only now started to have a little discretionary income. It was only after the parity drive was disabled that I pushed for the replacement drive that is now testing. My wife has promised me a second 10TB drive next payday. She knows how important this system is to the family.

 

Quote

Also note that it's possible the only thing wrong with your original parity disk was a bad connection, but since it wasn't in the diagnostics, I can't tell. If it is OK and not out-of-sync (nothing written to any array disks while it was disabled or missing), you could rebuild disk2 from that.

No, this was not a connection issue; I am certain of that. This system has been rock solid for 7 years, which means the 3TB drives are the original drives in this system. It has been in continuous use for those 7 years without problems. I believe they are 3TB green drives, if I remember correctly. The cables I use all have locking tabs on them, and I am using 5x3 hot-swap bays; I have 3 of these bays in a mid-tower case. This was back when they made mid-tower cases with the ability to put that many drives in one case.

Link to comment

So I believe I will go the ddrescue route. Cloning the smaller drive to the bigger one is not an issue, but after the clone is done, will the drive show up as a 3TB drive or a 10TB drive in the array?

 

I will have to locate another drive to put in as a temporary replacement, because I only have one 10TB drive and that has to be the parity drive.

Link to comment
11 minutes ago, SidebandSamurai said:

Cloning the smaller drive to the bigger one is not an issue, but after the clone is done, will the drive show up as a 3TB drive or a 10TB drive in the array?

Unraid won't accept that, since it requires the partition to use the full device capacity, but you can always copy data from the clone to other drives after mounting it with, for example, the UD plugin.
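Copying off the mounted clone could be as simple as something like this (the source path assumes UD mounts it under /mnt/disks, and the destination is whichever array disk or share has space; both are placeholders):

  rsync -avX /mnt/disks/clone/ /mnt/disk3/rescued/

The trailing slash on the source copies the contents rather than the top-level directory itself.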

Link to comment

@JorgeB

 

Quote

Unraid won't accept that, since it requires the partition to use the full device capacity, but you can always copy data from the clone to other drives after mounting it with, for example, the UD plugin.

So this is the chicken-and-egg scenario. If I clone the drive, I will lose the whole array, because at that point two drives will have now "failed". Stick with me here. The reason I say this is that if I choose to clone this drive to a larger 10TB drive, the array will not accept the drive because the partition is not the full capacity of the drive; thus the array is now down.

 

It looks to me like the only solution is to rebuild parity just as it is, then replace the failing drive. Would this be correct? Or would I have to expand the partition on the new drive to occupy the entire disk, and THEN Unraid will be sort of happy? As I have not completely read your article on ddrescue, I might have missed a step.

 

I don't have space on the remaining good drives; they are all full.

Link to comment
23 minutes ago, SidebandSamurai said:

they are all full

Another bad place to arrive at.

 

25 minutes ago, SidebandSamurai said:

lose the whole array

New Config will let you make a new array out of the remaining disks. Since each disk is independent in Unraid, they aren't lost.

 

1 hour ago, SidebandSamurai said:

not a connection issue; I am certain of that.

Do you still have that original parity drive? Maybe it is in better condition than disk2.

3 hours ago, trurl said:

since it wasn't in the diagnostics, I can't tell

 

 

Link to comment
14 hours ago, SidebandSamurai said:

So this is the chicken-and-egg scenario.

If you don't have enough space on the array, you can do this:

 

-do a new config with the larger cloned disk and all the remaining disks

-start array to begin parity sync, the cloned disk will be unmountable due to unsupported partition layout

-wait for the parity sync to finish

-stop array

-unassign that disk

-start array, the emulated disk will now have the correct 10TB partition created and should mount immediately

-stop array

-re-assign the disk to rebuild on top, now using the new larger partition.

 

 

 

Link to comment
1 hour ago, SidebandSamurai said:

The method you stated above is only if the original parity drive is still good, right?

The method begins by syncing parity, and doesn't continue until after parity sync has completed.

 

Then the cloned disk is rebuilt from parity so it has the correct partition.

Link to comment

@trurl

Quote

The method begins by syncing parity, and doesn't continue until after parity sync has completed.

 

Then the cloned disk is rebuilt from parity so it has the correct partition.

Now this does not make any sense, so let me ask you a question. In Unraid, when you have a parity-protected array, the parity drive must be the largest drive in the array, correct?

Going on that assumption, I cannot use the 3TB disk cloned to a 10TB drive, because my parity drive is only 3TB (my apologies, as you can see in the original posting of my diagnostics, the drive was removed from the system). Its serial number is W1F12655.

 

My pre-clear has finished. My 10TB drive did 3 pre-clear cycles with no errors.

I have re-installed the failing 3TB parity drive in exactly the same slot it came out of and re-run the diagnostics. Please advise.

 

Thank you for the time you have spent with me. It is appreciated.

davyjones-diagnostics-20220305-2322.zip

Link to comment
5 hours ago, SidebandSamurai said:

serial number is W1F12655

Yes, that drive is failing.

 

In order to clone the data disk, you need another disk.

 

If the other disk is the same size as the original data disk, then you don't have the problem of needing to rebuild the cloned disk to fix its partition size.

 

If the other disk is not the same size as the original data disk, then you need to rebuild it from parity after cloning to fix the partition size. That requires that you already have parity built on a new disk.

 

So, you can't do this without having another disk besides the one you precleared to use as parity.

Link to comment

Or you could just New Config and rebuild parity without that data disk, and worry about its contents later.

 

On 3/2/2022 at 12:06 PM, JonathanM said:

How much would losing the data on drive2 hurt? If you have full backups and it's just a hassle, then I'd be tempted to give the parity build first a try. If you don't have good backups, ddrescue gives the best chance of intact recovery.

Do you have another copy of anything important and irreplaceable?

 

Link to comment

Thanks for all your responses. So the plan of action would be:

  • Pull the failing disk2.
  • Pull the bad parity drive.
  • Do a New Config.
  • Build an array with the new 10TB drive and the remaining drives.
  • Recover as much data as possible from disk2.
  • Add a new 10TB drive (yes, the minimum drive size will now be 10TB) to take the place of the old 3TB drive.
  • Copy the recovered data back to the array.

Would there be any other suggestions?

 

If I do a New Config, what would happen to my VMs, Dockers, plugins, and how everything is set up?

 

 

Link to comment
