
[Solved] Replacing data drive (dual parity)



One of my data drives started showing read errors in the dashboard and log, although the SMART data doesn't look too bad to me.  This drive had already seemed questionable, so I'm going to go ahead and replace it to be safe.

 

I have dual parity, and the replacement drives I have available are bigger than my parity 2 drive.  I'm trying to decide what steps to take to upgrade parity 2 and replace the data drive.

 

Here's what I'm thinking:

 

  1. Stop array
  2. Unassign parity 2 (6TB)
  3. Replace data drive with new 8TB drive (same size as parity 1)
  4. Start array in maintenance mode to rebuild data drive
    • Since no writes should occur to other data drives or parity 1, parity 2 would remain valid if something goes wrong with the rebuild, right?
  5. Stop array
  6. Assign another new 8TB drive as parity 2
  7. Start array normally to rebuild parity 2
    • The old 6TB parity 2 would be used to expand capacity in some manner after I verify checksums
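The reason parity 2 has to come out in step 2 is the size rule: a data drive can't be larger than any assigned parity drive. Here is a minimal sketch of that check using the sizes from this thread. The function name `can_use_as_data` is made up for illustration and is not part of Unraid.

```python
TB = 10**12  # drive vendors quote decimal terabytes

def can_use_as_data(new_data_size, parity_sizes):
    """Return True if a data disk of this size is no larger than every assigned parity disk.

    Hypothetical helper for illustration only; Unraid enforces this rule itself.
    """
    return all(new_data_size <= p for p in parity_sizes)

# Sizes from the thread: parity 1 is 8TB, the old parity 2 is 6TB.
with_old_parity2 = can_use_as_data(8 * TB, [8 * TB, 6 * TB])
without_parity2 = can_use_as_data(8 * TB, [8 * TB])

print(with_old_parity2)   # False: the 6TB parity 2 is too small
print(without_parity2)    # True: only the 8TB parity 1 is assigned
```

Real drives with the same nominal size can differ slightly in exact capacity, so treat the round numbers here as an approximation.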

 

Not sure if I would need to start & stop the array between steps 2 and 3.

 

The other option I can think of is to replace parity 2 first while leaving the questionable data drive in place.  A small number of read errors would be covered by parity 1, so the main risk would be another drive developing errors in the same places as the questionable data drive.

 

Any issues with my plan or alternative approaches that might be better?

sf-unraid-diagnostics-20201207-1611.zip

Edited by fritzdis
Update title to [Solved]
1 minute ago, trurl said:

Stop/Start with a disk unassigned is really only needed when you are trying to rebuild to the same disk. If you assign a different disk to the slot it will know to rebuild.

Thanks.  I figured it would probably be fine, but wasn't sure if removing the 2nd parity drive while replacing a data drive would confuse things.


Bonus question:

 

The replacement drives are untested (purchased used on ebay).  Would you run a preclear first to test them or just replace?

 

I figure the rebuilds will write the entire drive, and I can follow up with long SMART tests, so that will give them a decent workout.  I also have checksums for the data files that I can check afterward.
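For the checksum check after the rebuild, something along these lines would work. This is only a sketch: it assumes SHA-256 and a manifest with one "hexdigest, whitespace, relative path" entry per line, which may not match whatever tool produced the existing checksums. The names `sha256_of` and `verify_manifest` and the paths in the usage comment are placeholders.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files never sit in memory whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(manifest_path, root):
    """Return a list of (relative_path, reason) for missing or mismatched files.

    Assumed manifest format (not confirmed by the thread):
    one "<hexdigest> <relative path>" entry per line.
    """
    problems = []
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel = line.split(None, 1)
        target = Path(root) / rel
        if not target.is_file():
            problems.append((rel, "missing"))
        elif sha256_of(target) != expected.lower():
            problems.append((rel, "mismatch"))
    return problems

# Example call (both paths are placeholders):
# problems = verify_manifest("/path/to/manifest.sha256", "/mnt/disk7")
```

An empty result means every listed file is present and matches; anything else is worth looking at before the old drive is wiped or reused.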


If you are not sure of the state of the drives, it is probably a good idea to run a preclear to test them before trying to use them in Unraid. You are correct that the rebuild would write every sector, but it can be awkward to recover if the rebuild goes wrong because of problems with the new disk(s).   It is much easier to deal with such problems if you discover them during a preclear.


Yeah, you're right.  I was hoping to replace the data drive (disk 7) ASAP, but I'd rather make sure as best I can that the replacement is trustworthy.

 

I'm still running dual parity but with disk 7 emulated at the moment.  I'll keep writes to the array to a minimum.  Once the preclear finishes, I'll unassign parity 2 (too small for the replacement data drive) and assign the replacement to disk 7.

 

I just want to double-check - rebuilding disk 7 in maintenance mode will avoid all writes to the other data drives and parity 1, right?  That should mean the old parity 2 remains valid until the rebuild completes.

1 hour ago, fritzdis said:

I just want to double-check - rebuilding disk 7 in maintenance mode will avoid all writes to the other data drives and parity 1, right?  That should mean the old parity 2 remains valid until the rebuild completes.

Yes, in Maintenance mode the array disks are not mounted so there is no way new files can be written to them.
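To illustrate why a rebuild leaves the other drives untouched, here is a toy model of single XOR parity. It is not Unraid's code, and as far as I know the second parity uses a different calculation, so this only models the parity 1 style of reconstruction. The point is that the missing disk is computed purely by reading the surviving disks and parity, and the result is written only to the replacement.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three toy "data disks" and their XOR parity (illustrative values only).
disks = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(disks)

# Pretend the disk at index 1 failed: rebuild it from the survivors plus parity.
# The survivors and parity are only read here; nothing is written back to them.
survivors = [d for i, d in enumerate(disks) if i != 1]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == disks[1])  # True
```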


A single preclear pass went fine on one replacement drive (WD80EZAZ), but the other (WD80EMAZ) was not recognized.  I assumed it was a 3.3V issue and ordered Kapton tape to cover the pins.  After applying the tape, it still wasn't recognized, so I will need to test further to figure out what's going on.

 

In the meantime, I replaced the 6TB parity 2 with the WD80EZAZ, started in maintenance mode, and now I'm rebuilding parity 2 (with disk 7 still being emulated).  Once that's done, I can rebuild the disk 7 contents onto the 6TB drive.

 

I believe these steps allow me to always retain one level of parity protection.  I'll verify checksums after the rebuilds are complete, and I'll probably run a long SMART test on the WD80EZAZ after that.  But I expect everything to proceed without issue at this point, assuming no drive failures.

 

Thanks for the help.
