[SOLVED] Replace Data Drives with Larger Ones - Parity Check Errors


AgentXXL


I'm currently replacing 4 x 4TB drives (each 8.5 years old) with new 10TB drives (shucked WD Elements). I followed the 'safer procedure' listed here in the Wiki.

 

https://wiki.unraid.net/Replacing_Multiple_Data_Drives_with_a_Single_Larger_Drive

 

The 1st drive rebuilt with no issues, and since then I've been migrating data off the other 4TB drives to it and to other free storage on the array. I've been using a combination of the command line, Krusader and the unBalance plugin to clear the remaining 3 drives. Once these drives are clear, I plan to follow this procedure:

 

1. Stop the array.

2. Unassign the 3 x 4TB drives to be removed from the array - set to 'No device'.

3. Power down the unRAID system, pull the 3 x 4TB drives and replace with 3 x new 10TB drives (all precleared).

4. Power up (array autostart set to No before power down) and choose Tools -> New Config.

5. Re-assign all drives to their appropriate slots and then start the array. This will launch a parity rebuild.

 

I was just wondering if there was a quicker way to do it, but from what I can tell, there's no fast way to replace an empty smaller drive with a precleared larger drive without doing the parity rebuild. I don't have any more available SATA connections so I think the method I'm following is my only choice. Any suggestions or thoughts? Thanks!

 

Dale

Link to comment

I'm replying to my own topic as I went ahead with the procedure in the 1st post. All seemed to go well except that the post-replacement parity resync detected thousands (273K+) of UDMA CRC errors on one of the new drives, the 1st WD 10TB one that I installed and used to migrate data off the old 4TB drives. The main tab of the unRAID webgui shows that drive (drive 18) as having 94078 errors and the message in the parity check section says:

 

Last check completed on Tue 17 Dec 2019 11:50:04 PM MST (yesterday), finding 93267 errors.
Duration: 1 day, 3 hours, 1 minute, 56 seconds. Average speed: 102.8 MB/sec

 

The drive is still marked good (no red X) and I've tried accessing some of the files on it with no issues seen (i.e. movies/video play properly). When the parity check completed it seems to have been set to 'Write corrections to parity', I assume because of the errors. As I'm fairly certain the drive itself and the data written to it are fine, I suspect the SATA cable might have been damaged when I installed the remaining 3 new 10TB drives. All 4 new 10TB drives are connected with the same SFF-8087 mini-SAS to SATA forward breakout cable, so I'll replace it with a new one.
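
For anyone checking the same thing: the UDMA CRC counter is SMART attribute 199, and it is a lifetime total that does not reset when a cable is replaced, so what matters is whether it keeps climbing afterwards. Below is a minimal Python sketch of reading it, assuming smartmontools is installed and that `smartctl -A` prints its usual ten-column attribute table. The device path and the sample line are hypothetical placeholders, not values from the attached diagnostics.

```python
import subprocess

# Illustrative sample of one `smartctl -A` attribute row (made-up values).
SAMPLE = (
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
    "199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       1234\n"
)

def crc_error_count(smart_output):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the raw value is the 10th.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None

def read_crc_errors(device):
    """Run `smartctl -A` on a device (path is an assumption, e.g. /dev/sdX)."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return crc_error_count(result.stdout)

if __name__ == "__main__":
    print(crc_error_count(SAMPLE))
```

Recording the count before swapping the cable and again after the next parity run shows whether new CRC errors are still being logged.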

 

Here's my question/dilemma: After I replace the cable, is the best solution to just start the array with the 'Write corrections to parity' option checked, or should I attempt another Tools -> New Config to rebuild the parity from scratch?

 

Diagnostics attached.

animnas-diagnostics-20191218-1229.zip

Link to comment
  • AgentXXL changed the title to Replace Data Drives with Larger Ones - Parity Check Errors
On 12/15/2019 at 12:33 PM, AgentXXL said:

I was just wondering if there was a quicker way to do it, but from what I can tell, there's no fast way to replace an empty smaller drive with a precleared larger drive without doing the parity rebuild.

Everything you did was completely unnecessary. You can always replace a smaller drive with a larger drive, no need to preclear, and no need to move anything off the smaller drive to begin with.

 

The wiki you followed was specifically for the purpose of ENDING UP with FEWER drives in your array. If all you wanted to do was replace smaller drives with larger drives then all you had to do was replace each with a larger drive and rebuild, one at a time.

 

Replacing a disk is what parity is all about. You can't replace a disk with a smaller disk, but other than that, it doesn't matter if the replacement is larger, if the replacement is clear, if the original is empty.

 

With that many errors it may be faster to rebuild parity.

 

 

Link to comment
1 minute ago, trurl said:

Everything you did was completely unnecessary. [...] With that many errors it may be faster to rebuild parity.

While I realize I could have let each 4TB drive replacement rebuild from parity, it took less time to move data off the remaining 4TB drives than it would have to rebuild each one onto the new 10TB replacements. Plus I used the opportunity to use the unBalance plugin to gather certain folders so that all of their content is on one drive only (an OCD thing of mine).

 

As for the preclear, I mentioned doing it only as a way to do an initial test of the drives before shucking them. Regardless, the drives (and the data on them) appear to be fine. I'm certain that the issues reported after the new config are cabling related so I'll go ahead and replace it and then run another parity check.... I assume just leaving the 'Write corrections to parity' option checked? Or am I better to do the Tools -> New Config route again to completely rebuild parity?

 

Link to comment
On 12/15/2019 at 12:33 PM, AgentXXL said:

3. Power down the unRAID system, pull the 3 x 4TB drives and replace with 3 x new 10TB drives (all precleared).

4. Power up (array autostart set to No before power down) and choose Tools -> New Config.

5. Re-assign all drives to their appropriate slots and then start the array. This will launch a parity rebuild.

 

I was just wondering if there was a quicker way to do it, but from what I can tell, there's no fast way to replace an empty smaller drive with a precleared larger drive without doing the parity rebuild.

OK, starting from #3, what you did was probably the fastest. Parity rebuild is required because an empty drive and a clear drive are not the same thing. In fact, the empty drives still had all of the bits of those files you had moved off, but they were no longer part of the empty filesystem. And parity still matched those empty drives with all of those bits from the moved-off files. But a clear drive is all zeros, so not the same as the empty drives, and not in sync with parity.

 

But it was only one parity rebuild for 3 replacements.

 

Parity rebuild at this point might be somewhat faster than parity check, since it doesn't actually have to check.
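
The empty-versus-clear point above can be illustrated with a toy model. This is only a sketch: it assumes single parity computed as a byte-wise XOR across the data disks, uses 4-byte "disks" with made-up contents, and ignores the second parity drive entirely.

```python
from functools import reduce

def xor_parity(disks):
    """Byte-wise XOR across equally sized disks (single-parity model only)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))

# Hypothetical 4-byte disks, for illustration.
disk1 = b"\x10\x20\x30\x40"
disk2_empty = b"\xde\xad\xbe\xef"  # files moved off, but the old bits remain
disk2_clear = bytes(4)             # precleared replacement: all zeros

# Parity was computed while the "empty" disk still held its old bits.
parity = xor_parity([disk1, disk2_empty])

# Swapping in the clear disk leaves the existing parity out of sync.
in_sync_after_swap = xor_parity([disk1, disk2_clear]) == parity

# Only if the old disk had been all zeros would the swap be parity-neutral.
parity_if_zeroed = xor_parity([disk1, bytes(4)])
neutral_if_zeroed = xor_parity([disk1, disk2_clear]) == parity_if_zeroed

print(in_sync_after_swap, neutral_if_zeroed)
```

In this model the first comparison fails and the second succeeds, which is why swapping an emptied disk for a precleared one still needs parity to be rebuilt.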

Link to comment
3 minutes ago, trurl said:

Parity rebuild at this point might be somewhat faster than parity check, since it doesn't actually have to check.

That's my suspicion too... the full parity rebuild after doing a Tools -> New Config took about 27 hrs but I've had my monthly non-correcting parity checks take up to 45 hrs. I'll do the full parity rebuild again as soon as I shut down and replace the cabling. Thanks!

Link to comment
1 hour ago, AgentXXL said:

monthly non-correcting parity checks take up to 45 hrs.

Do you have port multipliers or something? I usually estimate 2-3 hours per TB of parity to check. My 6TB parity is a little over 14 hours to check.

 

Maybe it will improve now that you have those old small disks out of the way.
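
As a rough worked version of that rule of thumb, here is a small Python sketch. It assumes the parity drive is 10TB (the reported 102.8 MB/sec over about 27 hours implies roughly that size) and uses decimal units; the function names are just for illustration.

```python
SECONDS_PER_HOUR = 3600
MB_PER_TB = 1_000_000  # decimal units, as drive vendors count them

def check_hours_range(parity_tb, low_h_per_tb=2.0, high_h_per_tb=3.0):
    """Expected parity-check duration range from the 2-3 hours/TB estimate."""
    return parity_tb * low_h_per_tb, parity_tb * high_h_per_tb

def average_speed_mb_s(parity_tb, hours):
    """Average throughput implied by covering parity_tb in the given hours."""
    return parity_tb * MB_PER_TB / (hours * SECONDS_PER_HOUR)

PARITY_TB = 10  # assumption: parity is the same size as the new 10TB drives

print(check_hours_range(6))           # the 6TB example from this post
print(check_hours_range(PARITY_TB))   # expected range for 10TB parity
print(round(average_speed_mb_s(PARITY_TB, 45), 1))  # 45-hour check
print(round(average_speed_mb_s(PARITY_TB, 27), 1))  # 27-hour rebuild
```

With these inputs the estimate gives 20 to 30 hours for 10TB, and a 45-hour check works out to about 62 MB/sec against roughly 103 MB/sec for the 27-hour rebuild.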

Link to comment
1 hour ago, trurl said:

Do you have port multipliers or something? I usually estimate 2-3 hours per TB of parity to check. My 6TB parity is a little over 14 hours to check.

 

Maybe it will improve now that you have those old small disks out of the way.

No port multipliers: 6 x SATA from motherboard (all Intel SATA) and 16 from the LSI 9201-16i in IT mode. The old HGST 4TB drives were also 5400 rpm whereas the rest of the drives (and the 4 new 10TB replacements) are all 7200rpm. I know rotational speed doesn't always translate to higher performance, but having all drives the same won't hurt.

 

After replacing the cable I'm now 5% into the complete parity rebuild (used the Tools -> New Config method) with no errors (CRC or otherwise). As I said above, the data and the drives themselves are fine - it was just a bad SATA cable. That's one of the disadvantages of the LSI cards - you can't just replace the cable for a single drive, as you need the SFF-8087 miniSAS to SATA breakouts (4 drives per cable). At least I have spare new cables on hand.

 

Thanks again!

Dale

 

Link to comment
  • AgentXXL changed the title to [SOLVED] Replace Data Drives with Larger Ones - Parity Check Errors
On 12/18/2019 at 8:01 PM, trurl said:

Those larger drives should also perform better simply due to increased density.

Not sure they made a huge difference but the full parity rebuild on my dual parity drives took 24hrs, about 3 - 4 hrs less than previously. That's 18 data drives and 2 parity drives. Regardless of time, the more important issue is that there were zero errors after replacing the SATA cable. My 168TB+ unRAID array is running quite nicely now.

 
