fritzdis Posted December 8, 2020 (edited)

One of my data drives started showing read errors in the dashboard/log, although SMART doesn't look too problematic to me. The drive was one that previously seemed questionable, so I'm going to go ahead and replace it to be safe.

I have dual parity, and the replacement drives I have available are bigger than my 2nd parity drive. I'm trying to decide what steps to take to upgrade the parity drive and replace the data drive. Here's what I'm thinking:

1. Stop array
2. Unassign parity 2 (6TB)
3. Replace data drive with new 8TB drive (same size as parity 1)
4. Start array in maintenance mode to rebuild data drive
   - Since no writes should occur to other data drives or parity 1, parity 2 would remain valid if something goes wrong with the rebuild, right?
5. Stop array
6. Assign another new 8TB drive as parity 2
7. Start array normally to rebuild parity 2
   - The old 6TB parity 2 would be used to expand capacity in some manner after I verify checksums

Not sure if I would need to start & stop the array between steps 2 and 3.

The other option I can think of is to replace parity 2 first while leaving the questionable data drive in place. A small number of read errors would be covered by parity 1, so the main downside would be if another drive starts showing issues in the same places as the questionable data drive.

Any issues with my plan or alternative approaches that might be better?

Attachment: sf-unraid-diagnostics-20201207-1611.zip

Edited December 10, 2020 by fritzdis: Update title to [Solved]

trurl Posted December 8, 2020

1 hour ago, fritzdis said: "One of my data drives started showing read errors"

Which disk specifically?

trurl Posted December 8, 2020

Nevermind, I see it is disk7.

trurl Posted December 8, 2020

1 hour ago, fritzdis said: "Not sure if I would need to start & stop the array between steps 2 and 3."

Stop/Start with a disk unassigned is really only needed when you are trying to rebuild to the same disk. If you assign a different disk to the slot it will know to rebuild.

fritzdis (Author) Posted December 8, 2020

3 minutes ago, trurl said: "Nevermind, I see it is disk7."

Yep. Here's the devices list if it helps:

fritzdis (Author) Posted December 8, 2020

1 minute ago, trurl said: "Stop/Start with a disk unassigned is really only needed when you are trying to rebuild to the same disk. If you assign a different disk to the slot it will know to rebuild."

Thanks. I figured it would probably be fine, but wasn't sure if removing the 2nd parity drive while replacing a data drive would confuse things.

trurl Posted December 8, 2020

As long as you have at least one parity disk with valid parity you can rebuild a data disk, so it doesn't matter if parity2 is missing during the rebuild. Then, after the data disk is rebuilt, assigning another disk to parity2 will rebuild parity2.

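To illustrate why one valid parity disk is enough to rebuild one data disk, here is a toy sketch. It assumes the first parity disk holds a plain bytewise XOR of all data disks, which is how single parity is usually described; the second parity uses a different calculation that is not modeled here. The disk contents and names such as `xor_blocks` are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy "disks": three data disks, each holding a single 4-byte block.
data_disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# Assumption: single parity is the XOR of every data disk.
parity = xor_blocks(data_disks)

# Pretend the second disk is missing and rebuild it from the
# surviving data disks plus the parity block.
missing_index = 1
survivors = [d for i, d in enumerate(data_disks) if i != missing_index]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data_disks[missing_index])
```

The rebuild never touches a second parity block, which matches the point above that parity2 can be absent while a single data disk is rebuilt.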
fritzdis (Author) Posted December 8, 2020

Perfect. I'll add the new disks to the server and get started.

fritzdis (Author) Posted December 8, 2020

Bonus question: The replacement drives are untested (purchased used on eBay). Would you run a preclear first to test them or just replace? I figure the rebuilds will write the entire drive, and I can follow up with long SMART tests, so that will give them a decent workout. I also have checksums for the data files that I can check afterward.

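Checking stored checksums after a rebuild could look roughly like the sketch below. The thread does not say which tool or format the checksums are in, so this assumes a hypothetical manifest in `sha256sum` text format (hex digest, two spaces, relative path); `sha256_of` and `verify_manifest` are illustrative names, not part of Unraid or any plugin.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path, root):
    """Return relative paths that are missing or whose hash differs from the manifest.

    Assumed manifest format (hypothetical): '<hex digest>  <relative path>' per line.
    """
    failures = []
    for line in Path(manifest_path).read_text().splitlines():
        if not line.strip():
            continue
        expected, _, relative = line.partition("  ")
        target = Path(root) / relative
        if not target.is_file() or sha256_of(target) != expected.strip().lower():
            failures.append(relative)
    return failures
```

Calling `verify_manifest` with the manifest file and the mount point of the rebuilt disk would return an empty list when every file still matches; both paths depend on the local setup.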
itimpi Posted December 8, 2020

If you are not sure of the state of the drives, it is probably a good idea to do a preclear to test them out before trying to use them in Unraid. Although you are correct that the rebuild would write every sector, it can be awkward to try to recover if the rebuild goes wrong due to issues with the new disk(s). It is much easier to handle discovering this during a preclear.

fritzdis (Author) Posted December 8, 2020

Yeah, you're right. I was hoping to replace the data drive (disk 7) ASAP, but I'd rather make sure as best I can that the replacement is trustworthy.

I'm still running dual parity but with disk 7 emulated at the moment. I'll keep writes to the array to a minimum. Once the preclear finishes, I'll unassign parity 2 (too small for the replacement data drive) and assign the replacement to disk 7.

I just want to double-check - rebuilding disk 7 in maintenance mode will avoid all writes to the other data drives and parity 1, right? That should mean the old parity 2 remains valid until the rebuild completes.

itimpi Posted December 8, 2020

1 hour ago, fritzdis said: "I just want to double-check - rebuilding disk 7 in maintenance mode will avoid all writes to the other data drives and parity 1, right? That should mean the old parity 2 remains valid until the rebuild completes."

Yes, in Maintenance mode the array disks are not mounted so there is no way new files can be written to them.

fritzdis (Author) Posted December 10, 2020

A single preclear pass went fine on one replacement drive (WD80EZAZ), but the other (WD80EMAZ) was not recognized. I assumed it was a 3.3V issue and ordered Kapton tape to cover the pins. After applying the tape, it still wasn't recognized, so I will need to test further to figure out what's going on.

In the meantime, I replaced the 6TB parity 2 with the WD80EZAZ, started in maintenance mode, and now I'm rebuilding parity 2 (with disk 7 still being emulated). Once that's done, I can rebuild the disk 7 contents onto the 6TB drive. I believe these steps allow me to always retain one level of parity protection.

I'll verify checksums after the rebuilds are complete, and I'll probably run a long SMART test on the WD80EZAZ after that. But I expect everything to proceed without issue at this point, assuming no drive failures.

Thanks for the help.
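
The claim about retaining protection can be checked with a small bookkeeping sketch. It is a simplified model, not Unraid's own logic: it assumes each valid parity disk can stand in for one missing data disk, treats a parity disk as invalid while it is being rebuilt, and reads "one level of parity protection" as at least one valid parity disk at every step. The `ArrayState` class and step labels are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayState:
    label: str
    valid_parity: int  # parity disks currently holding valid parity
    missing_data: int  # data disks that are emulated or being rebuilt

    @property
    def recoverable(self):
        # Assumption: each valid parity disk covers one missing data disk.
        return self.missing_data <= self.valid_parity

    @property
    def spare_failures(self):
        # Further disk failures the array could absorb in this state.
        return max(self.valid_parity - self.missing_data, 0)


# The sequence described in the post above.
steps = [
    ArrayState("dual parity, disk 7 emulated", 2, 1),
    ArrayState("parity 2 rebuilding onto the WD80EZAZ", 1, 1),
    ArrayState("parity 2 rebuilt, disk 7 still emulated", 2, 1),
    ArrayState("disk 7 rebuilding onto the old 6TB drive", 2, 1),
    ArrayState("both rebuilds complete", 2, 0),
]

for step in steps:
    print(f"{step.label}: recoverable={step.recoverable}, "
          f"spare failures={step.spare_failures}")
```

Under these assumptions disk 7 stays recoverable throughout, and the only window with no spare tolerance is while parity 2 is rebuilding, which is consistent with the "assuming no drive failures" caveat.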