Solved: Parity Swap nightmares


Edit: Thanks to everyone,

Edit2: tldr; disks are failing, parity swap procedure is losing configuration data before array is started. Parity rebuild takes 9 or more hours. My last post in this thread explains how I reduced parity rebuild times.

I've been having drives intermittently fail/recover (SMART errors) in my Unraid trial for over a week now. I ordered some 3TB drives to replace them, but they are set to come in at different times. I received the first on Saturday, but the others won't arrive until Tuesday. No big deal. I went to preclear the first drive and received errors that it is missing an Unraid MBR. I'm not sure what that's about or why it would be a requirement to test and wipe a drive, but I moved on.

 

After setting the drive to replace the currently emulated drive, I received an error that it is too large and I need to make it my parity drive. I looked up the parity swap, and it's a bit annoying because I'll have to be offline for a while, but it's only 80 GB, so I didn't think it would be too bad. It took 10 hours to copy the parity data.

 

Okay, so clearly it's doing some sort of imaging. At this point I quickly switched my drives back to the original configuration to make sure the old one still shows as failing before attempting to salvage it with a preclear. It shows as disabled and emulated, but not failing. When I switch back to the new configuration with my 3TB drive set as parity, it now displays that it is a new disk.

 

I decide to take advantage of the fact that I can preclear my new drive and call the 10 hours lost time. The preclear finishes Monday morning. I begin the parity copy again. 9 hours later, it completes. The parity drive I copied from says disabled/emulated, and my 3TB says it's a valid parity disk now.

 

I switch the old, now emulated parity disk in the UI, so I can try to preclear it, and immediately I get an error that the array is not valid. I put it back where it was, and both parity disks say new and it wants me to copy again. 

 

Does anyone have any ideas why this process has been so painful or what could be going on? I've had my servers and VMs down for going on 3 days now, just waiting on copies and disk maintenance. Is parity swap on Unraid usually this unreliable? I've got 10 days before purchasing, and I'm really leaning towards no after this experience.

Edited by herringms
Adding thanks at top, Adding tldr to help anyone searching
Link to comment

Parity swap isn't difficult if your hardware is working well and you follow the procedure exactly. It's very unclear from your description if any of that is true or in fact if parity swap was even appropriate. 

 

Do you have any important data on any of the involved disks? Note that parity itself has no data.
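To illustrate why parity itself holds no files: a minimal sketch, assuming single parity computed as a byte-wise XOR across the data disks. The disk contents and the `xor_parity` helper are made up for illustration and are not Unraid code.

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical contents of two tiny data disks (same length).
disk1 = b"VM image"
disk2 = b"old data"

# The parity disk stores only the XOR of the data disks, not a copy of any file.
parity = xor_parity([disk1, disk2])

# If disk2 goes missing, it can be emulated from the remaining disk plus parity.
emulated_disk2 = xor_parity([disk1, parity])
```

In this sketch, losing the parity disk alone loses no data, while a missing data disk can be recomputed as long as every other disk is readable.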

 

You probably should have asked for help before now. Maybe Diagnostics would give us something to go on.

 

Go to Tools-Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post. 

Link to comment

Edit: thank you for your time

 

Initial configuration here:

image.png.098febf02616b6f348f81a54c9600963.png

 

This would be nice

image.png.1efaf4ee3d888525346c4c0d822de3ae.png

 

But:

image.thumb.png.d2e7c7765e1aa6dece6508d2e5e28edf.png

 

So this is necessary:

image.png.57e32500ed2c82a1e45dd8dfaed44a7b.png

image.thumb.png.46a4c54ee9794aac1bc4bb7ff089ea49.png

 

However, after it's complete and everything says it's valid, if I attempt to change the disk configuration in any way, it goes back to the original configuration. Rebooting the system also reverts to the original configuration.

I can send you screenshots 10 hours from now showing this, but the goal was to avoid another 10 hour copy for 80GBs of data.

 

 

To answer the other questions. Yes, there are VMs I would like to keep. They are on Disk 1. Disk 1 has had several smart errors as well.

 

unraidmc-diagnostics-20200601-1906.zip

Edited by herringms
Link to comment

Why are your drives so hot? These disks seem pretty old. Did you test any of these disks before attempting to build an array with them?

 

The SMART attributes for disk1 don't look bad. What SMART errors were you referring to exactly?

 

Your VMs and dockers should be on cache for better performance.

 

Since each data disk in the array is an independent filesystem, your disk1 data should be OK. If disk2 doesn't have anything important on it, I would just New Config with disk1 assigned as is, assign any other disks as needed, and rebuild parity.

Link to comment
14 hours ago, herringms said:

avoid another 10 hour copy for 80GBs of data.

No that shouldn't happen. You must have been having hardware issues, if only bad connections.

 

After you New Config, if parity build is going slowly post new diagnostics before rebooting so we can take a look. The entire 3TB parity build shouldn't take more than 10 hours.

Link to comment
6 hours ago, herringms said:

However, after it's complete and everything says it's valid, if I attempt to change the disk configuration in any way, it goes back to the original configuration. Rebooting the system also reverts to the original configuration.

That's normal, parity swap procedure needs to be done from start to finish without any assignment changes, if there are any after the copy (or a reboot) it will be reset and you need to start over.

Link to comment
12 hours ago, trurl said:

No that shouldn't happen. You must have been having hardware issues, if only bad connections.

 

After you New Config, if parity build is going slowly post new diagnostics before rebooting so we can take a look. The entire 3TB parity build shouldn't take more than 10 hours.

Hey, I really appreciate your time and looking into this. I also appreciate the tips.

 

I rethought the problem and realized I could drop disk 2 from the array, let it rebuild, and then drop the parity disk and let parity rebuild on the 3TB. This worked and only took about 6 hours. My other drives should be here soon to replace the again failing disk 2. 

 

 

To answer the other questions and satisfy curiosity,

Both disks have had the array rebuilt multiple times. There have been several distinct SMART errors and notifications for disks 1 and 2, but not for the parity drive. I had a partner remoting into one of the VMs confirming the poor performance when I was seeing the notifications in the Unraid UI. (90-100% CPU utilization within a CPU-isolated VM, which seems almost certainly related to a bad sector saturating the processor cache for a couple of hours. Unraid reported that core at 12% while the VM reported 90-100%.)

 

They're hot because they've been writing for days, but more so because I live in Texas and my AC is out. The house gets to be around 85+, and even with ceiling, tower, and filter fans constantly moving air over them, it doesn't do a lot (warranties take forever to fulfill).

 

 

 

Link to comment
7 hours ago, johnnie.black said:

That's normal, parity swap procedure needs to be done from start to finish without any assignment changes, if there are any after the copy (or a reboot) it will be reset and you need to start over.

Appreciate the clarification, that makes sense. A warning or tooltip suggesting that would have been nice. It's good to know that's the only problem. I am slightly concerned that after the second copy, I got a green icon on the [new] 3TB parity drive, with the [old] 1TB parity drive in disk 2 saying its contents were emulated, which is what gave me the confidence to switch assignments without booting up. That's probably a symptom of having done the procedure twice without starting the array, and I doubt many people would experience the same set of conditions that led to that.

Link to comment

While tee-tee jorge answered my questions (thanks again), I've come across an option that would have helped in reducing parity rebuild times. This is for anyone viewing this later and looking for a way to reduce that time.

 

I have had reason to rebuild my parity several more times since the drives came in. This was primarily due to testing performance and isolation of different methods of sharing data (direct disk access, disk shares, user shares, raw disk pass through using vbox commands, etc.). 

 

Using the feature below has reduced parity build to around 4-5 hours. Edit: around 160 MB/s, which is about as close to the theoretical max as I could hope for.
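As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), a constant transfer rate, and that the earlier parity copy covered the whole 1TB old parity disk rather than just the 80 GB of data. The helper name `hours_to_process` is made up for illustration.

```python
TB = 10**12  # decimal terabyte, as drive vendors count it
MB = 10**6   # decimal megabyte

def hours_to_process(capacity_bytes, rate_bytes_per_s):
    """Hours needed to read or write a whole disk at a constant rate."""
    return capacity_bytes / rate_bytes_per_s / 3600

# A full pass over a 3 TB parity disk at about 160 MB/s.
build_hours = hours_to_process(3 * TB, 160 * MB)

# Average rate implied by a 10-hour copy of a 1 TB disk, in MB/s.
copy_rate_mb_s = (1 * TB) / (10 * 3600) / MB

print(f"3 TB at 160 MB/s: {build_hours:.1f} h")
print(f"1 TB in 10 h: {copy_rate_mb_s:.1f} MB/s")
```

Under these assumptions a 3 TB pass at 160 MB/s works out to a little over 5 hours, in the same ballpark as the 4-5 hours reported. A 10-hour pass over a 1 TB disk would average under 30 MB/s, far below what the same hardware later sustained.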

 

In Disk Settings, switch Tunable (md_write_method) to reconstruct write. As my configuration only has a few disks, this is a complete win.

link: https://wiki.unraid.net/Tips_and_Tweaks#:~:text=Turn on Reconstruct Write,-(Highly recommended!&text=A new mode has been,the read then the write).
This page was last modified on 4 September 2016

Turn on Reconstruct Write

(Highly recommended! Often called 'Turbo Write' )

Problem: Writing to parity protected data drives has always been slow, because it requires reading the old data and parity blocks, calculating the new parity, waiting for platter rotation to bring the block location back around, then finally writing the data and parity blocks. This is known as 'read-modify-write' (RMW for short). A new mode has been added to unRAID called 'reconstruct write', where the data is immediately written, all of the other data disks are read, and parity is calculated then written. There is no wait for platter rotation! And no double I/O to each drive (first the read then the write). Rather than modifying parity as before, it's building it fresh.

Discussion: There's an upside and a downside to this change. The upside is it provides a huge performance boost to writes directly to array drives! Users often see speeds between 2 and 3 times as fast! (That's why it's sometimes referred to as 'Turbo Write'!) The downside is that ALL of the array drives must be spun up for EVERY write! So it's a trade-off between write speed and drives staying spun down.

Suggested fix: go to Settings then Disk Settings, and change Tunable (md_write_method) to reconstruct write

Note: the tunable option Auto currently just sets reconstruct write to off

Tip status: highly recommended! it's a fairly new feature, but so far, no reports of problems
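The two write methods described in the wiki excerpt can be sketched as follows. This is a minimal illustration, assuming single XOR parity; the function names and disk contents are hypothetical and this is not how the Unraid md driver is actually written.

```python
def rmw_parity(old_parity, old_data, new_data):
    """Read-modify-write: needs the old data and old parity from disk first."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

def reconstruct_parity(new_data, other_disks):
    """Reconstruct write: reads every other data disk and builds parity fresh."""
    parity = bytearray(new_data)
    for disk in other_disks:
        for i, byte in enumerate(disk):
            parity[i] ^= byte
    return bytes(parity)

# Hypothetical 4-byte blocks at the same offset on three data disks.
disk1 = bytes([1, 2, 3, 4])
disk2 = bytes([5, 6, 7, 8])
disk3 = bytes([9, 10, 11, 12])
old_parity = bytes(a ^ b ^ c for a, b, c in zip(disk1, disk2, disk3))

# Overwrite the block on disk2 and compute the new parity both ways.
new_disk2 = bytes([0xAA, 0xBB, 0xCC, 0xDD])
parity_rmw = rmw_parity(old_parity, disk2, new_disk2)
parity_reconstruct = reconstruct_parity(new_disk2, [disk1, disk3])
```

Both paths arrive at the same parity bytes. The difference is which disks have to be read: RMW touches only the target disk and parity (each one twice), while reconstruct write reads all the other data disks, which is why every drive must be spun up.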

work in progress

Edited by herringms
Link to comment
17 minutes ago, itimpi said:

The Reconstruct Write (Turbo Write) should have no effect on Parity build times.    It should only (under the right circumstances) speed up writing to array data disks.   I have no idea why it seems to have helped with parity build.

That confused me too. Parity rebuilding already uses all disks.

Link to comment
