lishpy Posted September 24, 2023

I initiated a parity swap yesterday following this procedure: https://docs.unraid.net/legacy/FAQ/parity-swap-procedure/

I checked on the progress this morning and it was at 96% for the copy procedure, so I expected it to finish shortly thereafter and let me initiate the disk rebuild. I just checked progress in the array again, and the array operation page is now back to where I started, saying I can initiate the copy again. According to the documentation, that COPY button should be a START button to start the array and initialize the data rebuild.

What's even more confusing: checking my server notifications, at about the time I expected the copy to complete, I have a notification saying the following: "Disk 4, is being reconstructed and is available for normal operation"

Disk 4 is the old parity drive that is being turned into a data disk. I can't see any indication that disk 4 is actually being reconstructed, though. I have no ability to start the array, and the array devices page shows the new parity drive and new disk as new devices.

Did this somehow not complete successfully, and do I need to run the copy again? I don't see any indication of any failures, and the notification message this morning makes me think it completed successfully, but now I don't know what state the array is in.

I don't see anything out of the ordinary in the logs right now, and the disks are spun down. The log says the copy completed successfully, but further down it reports slot 0 as "wrong" on import.
Sep 24 12:29:29 Tower emhttpd: copy: disk4 to disk0 completed
Sep 24 12:29:55 Tower kernel: md: unRAID driver removed
Sep 24 12:29:55 Tower emhttpd: shcmd (39876): /sbin/modprobe md-mod super=/boot/config/super.dat
Sep 24 12:29:55 Tower kernel: md: unRAID driver 2.9.27 installed
Sep 24 12:29:55 Tower kernel: mdcmd (1): import 0 sdf 64 13672382412 0 WDC_WUH721414ALE6L1_Y6G3NGSC
Sep 24 12:29:55 Tower kernel: md: import disk0: (sdf) WDC_WUH721414ALE6L1_Y6G3NGSC size: 13672382412
Sep 24 12:29:55 Tower kernel: md: import_slot: 0 wrong
Sep 24 12:29:55 Tower kernel: mdcmd (2): import 1 sdd 64 7814026532 0 WDC_WD80EDAZ-11TA3A0_VGGYLR9G
Sep 24 12:29:55 Tower kernel: md: import disk1: (sdd) WDC_WD80EDAZ-11TA3A0_VGGYLR9G size: 7814026532
Sep 24 12:29:55 Tower kernel: mdcmd (3): import 2 sdc 64 7814026532 0 WDC_WD80EFAX-68LHPN0_7SGLHX6C
Sep 24 12:29:55 Tower kernel: md: import disk2: (sdc) WDC_WD80EFAX-68LHPN0_7SGLHX6C size: 7814026532
Sep 24 12:29:55 Tower kernel: mdcmd (4): import 3 sdh 64 3907018532 0 ST4000DM000-1F2168_Z304H4Z2
Sep 24 12:29:55 Tower kernel: md: import disk3: (sdh) ST4000DM000-1F2168_Z304H4Z2 size: 3907018532
Sep 24 12:30:05 Tower kernel: mdcmd (5): import 4 sdg 64 7814026532 0 WDC_WD80EFAX-68LHPN0_7SGLWD9C
Sep 24 12:30:05 Tower kernel: md: import disk4: (sdg) WDC_WD80EFAX-68LHPN0_7SGLWD9C size: 7814026532
Sep 24 12:30:05 Tower kernel: md: import_slot: 4 replaced

Also, probably unrelated: if you hover over the orange triangle on the array operation page, it says "Started, array unprotected", but on the dashboard I see all disks as offline.

Looking for next steps to help troubleshoot.

Array operation: https://imgur.com/id9jLUc
Array disks: https://imgur.com/hukGTbI
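For anyone comparing their own syslog against the excerpt above, here is a minimal sketch that pulls the per-slot import status out of lines of this shape. The regular expression and the function name `slot_statuses` are assumptions made for illustration, based only on the two `import_slot` lines in this post. It reports what the log says and does not try to interpret what "wrong" or "replaced" means for the array.

```python
import re

# Assumed pattern, derived only from the two lines seen in this thread,
# e.g. "md: import_slot: 0 wrong". Any status word is captured as-is.
SLOT_RE = re.compile(r"md: import_slot: (\d+) (\w+)")


def slot_statuses(log_text):
    """Return {slot_number: status} for every import_slot line found."""
    return {int(m.group(1)): m.group(2) for m in SLOT_RE.finditer(log_text)}


# Two lines copied from the log excerpt in this post.
sample = """\
Sep 24 12:29:55 Tower kernel: md: import_slot: 0 wrong
Sep 24 12:30:05 Tower kernel: md: import_slot: 4 replaced
"""

print(slot_statuses(sample))  # {0: 'wrong', 4: 'replaced'}
```

Because it uses `finditer` over the whole text, it also works if the log was pasted as one long line.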
dboonthego Posted September 25, 2023

Do another copy and it will probably resolve itself. I couldn't reproduce the issue for someone else. Did you power down in step 5? I don't think it matters, but wondering if you did.
lishpy Posted September 25, 2023 Author

1 hour ago, dboonthego said:
Do another copy and it will probably resolve itself. I couldn't reproduce the issue for someone else. Did you power down in step 5? I don't think it matters, but wondering if you did.

I just kicked off a new copy, so we'll see. I'll report back. I did power down at step 5 because I pulled the data drive I'm replacing and put the new parity in its drive bay.
itimpi Posted September 25, 2023

2 hours ago, lishpy said:
I did power down at step 5 because I pulled the data drive I'm replacing and put the new parity in its drive bay

That could explain your issue. The parity swap procedure is meant to run to completion without the system being shut down or rebooted. Maybe this needs to be clarified in the instructions?
dboonthego Posted September 25, 2023

11 hours ago, itimpi said:
That could explain your issue. The parity swap procedure is meant to run to completion without the system being shut down or rebooted. Maybe this needs to be clarified in the instructions?

I ran through a test yesterday and also shut down, simply because I followed the steps. The difference for me was that I already had an unassigned disk larger than parity in the system and didn't physically alter the hardware. I had no issue performing the parity swap/data rebuild.
JorgeB Posted September 25, 2023

If you shut down in the middle of the procedure, or make any other array change, you will need to start over.
dboonthego Posted September 25, 2023

1 hour ago, JorgeB said:
If you shut down in the middle of the procedure

By "procedure" you're referring to the actual parity copy or data disk rebuild and not the documented procedure, right? Shutting down (step 5) after unassigning the data disk to add the new, larger parity is fine.
lishpy Posted September 25, 2023 Author

Exactly, the only time I shut down was before I initiated the copy procedure. Plus, as the logs above show, the copy reported success but then seemed to not work correctly.
JorgeB Posted September 26, 2023

11 hours ago, dboonthego said:
By "procedure" you're referring to the actual parity copy or data disk rebuild and not the documented procedure, right?

Correct, after you start the parity copy you cannot interrupt it until the data disk is rebuilt.
lishpy Posted September 26, 2023 Author

The copy completed and this time I was given the option to start the data rebuild.

The SOP should be updated to reflect the reboot in step 5 being an issue, if that's truly the cause. Since it happens before the copy procedure is initiated, it's still not clear that's the issue. Otherwise there's a bug here, considering that two threads followed the SOP to a T, failed in the same way, and only succeeded on the redo.
dboonthego Posted September 26, 2023

2 hours ago, lishpy said:
The SOP should be updated to reflect the reboot in step 5 being an issue if that's truly the cause. Since it's before the copy procedure is initiated it's still not clear that's the issue.

I followed the same steps and did not experience the same behavior you and the other guy did. When I shut down in step 5, I simply powered back up. I didn't physically change any disks as I already had them connected. Most people don't have a warm spare ready, which is probably why it's written to shut down. Not sure what caused this, but I highly doubt it's related to step 5.
JorgeB Posted September 27, 2023

13 hours ago, lishpy said:
The SOP should be updated to reflect the reboot in step 5 being an issue if that's truly the cause.

Rebooting before starting the parity copy, like the reboot in step 5, is not a problem. After starting the parity copy, you cannot reboot or interrupt it.