Help: Array stuck in "copy" mode, after using swap method to add a larger parity drive


Go to solution Solved by trurl,

Hello,

 

I'm following the instructions in "The parity swap procedure" to add a larger new parity drive and reuse the old parity drive as a data drive. https://wiki.unraid.net/The_parity_swap_procedure#:~:text=This procedure is strictly for,building parity will immediately begin.

 

After finishing the 27 hour "copy" process, the array status shows "Stopped. Upgrading disk/swapping parity", which is expected according to the document.

 

However, it does not show the "START" button. Instead, it still presents the "COPY" button with the copy information and checkbox.

 

Screenshot below

 

unraid-copy.thumb.png.eb8685a45f566ce87d0a992f4b977160.png

 

I quickly checked the system log and found nothing out of the ordinary.

 

I'm a bit stuck now and not sure what to do next. Any help would be appreciated, thanks.

 

--

Carlos

Link to comment

I accidentally rebooted the server.

 

Now the array has reverted to the state it was in before the copy, since no data was written to any data drive or to the old parity drive.

 

I guess I will simply do a two-step upgrade: first upgrade the parity drive, then the data drive.

 

Thanks for helping anyway.

Link to comment
  • 7 months later...

Problems reading both disks 4 and 5. Both look like disk problems, not connection problems, but SMART for disk 4 is the worse of the two. Disk 4 is also unmountable.

 

I didn't notice anything in syslog to indicate parity copy ever started, but I probably just missed it.

 

Screenshot seems to be waiting for you to check the box to enable the copy button.

Link to comment
@trurl / @JorgeB So sorry, I uploaded the diagnostics of the wrong Unraid server.

Here's the correct one.

the-ark-diagnostics-20231217-0745.zip

Link to comment

Do you have Notifications configured to alert you immediately by email or other agent as soon as a problem is detected?

8 minutes ago, trurl said:

You mean your other Unraid server has all those problems I mentioned?

Don't let one unnoticed problem become multiple problems and data loss.

 

Do any disks on either server show SMART warnings on the Dashboard page? Disks 4 and 5 on the other server definitely should. I haven't examined SMART for the large number of disks on the server that is the topic of this thread.

Link to comment
55 minutes ago, trurl said:

You mean your other Unraid server has all those problems I mentioned?

 

Yes, that’s another server (Cronos). I’m working on fixing that one as well.
 

But the server that I experienced this bug on is “the Ark”. Please ignore “Cronos” for this bug thread.

 

On “the Ark”, I followed the docs for parity swap to add a larger new parity drive and to use the old parity drive to replace another failing drive.
 

https://docs.unraid.net/legacy/FAQ/parity-swap-procedure/

 

After a ~48 hour "copy" process, I watched the copy go from 98% to 100%. A couple of minutes later the array updated to show "Stopped. Upgrading disk/swapping parity", which is expected according to the document.

 

However, it does not show the "START" button. Instead, it is still presenting the "COPY" button with copy information/checkbox. (See screenshot above)

Link to comment

New disk 18, sdc, just has CRC (connection) errors. I usually just acknowledge the occasional CRC error, maybe reseat the cable, and investigate further if they increase rapidly. Many of your drives have CRC errors that you must have already acknowledged, and they haven't increased since.

 

Is that unassigned Dev1 sds? It looks like the drive that was sdt (serial ending 1413) when you booted was disconnected and then reconnected as sds, since sds now has that serial and there is no sdt connected. It was showing critical medium errors all through syslog, including during the copy. That one has pending sectors.
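
For anyone wanting to apply the same rule of thumb to their own drives, here is a rough Python sketch of the distinction described above: CRC errors (SMART attribute 199) point at the connection and matter mainly when they increase, while pending or reallocated sectors (attributes 197, 198 and 5) point at the disk itself. The sample rows only imitate the column layout of a `smartctl -A` attribute table, and the values, the `classify` helper and the `previous_crc` parameter are made up for illustration; none of this comes from the diagnostics in this thread.

```python
# Hypothetical rows laid out like a `smartctl -A` attribute table.
# The values are invented for illustration only.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       3
"""

CRC_ATTR = 199              # usually cable/connection related
MEDIA_ATTRS = (5, 197, 198)  # usually problems with the disk itself


def parse_raw_values(text):
    """Return {attribute_id: raw_value} for rows that look like attribute lines."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                values[int(fields[0])] = int(fields[9])
            except ValueError:
                continue  # skip rows whose raw value is not a plain integer
    return values


def classify(values, previous_crc=0):
    """List (kind, attribute_id, raw_value) findings worth a closer look.

    previous_crc is the CRC count that was already acknowledged; only an
    increase beyond it is reported as a connection finding.
    """
    findings = []
    for attr in MEDIA_ATTRS:
        if values.get(attr, 0) > 0:
            findings.append(("disk", attr, values[attr]))
    crc = values.get(CRC_ATTR, 0)
    if crc > previous_crc:
        findings.append(("connection", CRC_ATTR, crc))
    return findings


if __name__ == "__main__":
    for finding in classify(parse_raw_values(SAMPLE)):
        print(finding)
```

Passing the previously acknowledged CRC count as `previous_crc` mirrors the habit of ignoring CRC errors that have not increased since they were acknowledged.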

 

The other drive that was throwing critical medium errors was sdo, but it is not immediately clear which drive that was at the time, since sdo appears in syslog with different serial numbers at different times. sdo started out as disk18 with serial ending 7227, but sdo was unassigned with serial ending 2P9T when the diagnostics were taken.
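
Because sdX letters can be handed to a different drive after a disconnect, it helps to track drives by serial rather than by device name. A small Python sketch of that bookkeeping follows. The observation list is typed in by hand from the device/serial pairs mentioned above, the ordering is my assumption, and the helper names are hypothetical; it does not parse syslog.

```python
from collections import defaultdict

# Hand-entered (device, serial suffix) observations in assumed chronological
# order. These are illustrative, not parsed from the actual diagnostics.
OBSERVATIONS = [
    ("sdt", "1413"),
    ("sdo", "7227"),
    ("sds", "1413"),
    ("sdo", "2P9T"),
]


def history_by_serial(observations):
    """Map each serial to the sequence of device names it has appeared as."""
    history = defaultdict(list)
    for device, serial in observations:
        # Record a device name only when it differs from the last one seen.
        if not history[serial] or history[serial][-1] != device:
            history[serial].append(device)
    return dict(history)


def reused_device_names(observations):
    """Return device names that were held by more than one serial."""
    serials = defaultdict(list)
    for device, serial in observations:
        if serial not in serials[device]:
            serials[device].append(serial)
    return {dev: s for dev, s in serials.items() if len(s) > 1}


if __name__ == "__main__":
    print(history_by_serial(OBSERVATIONS))
    print(reused_device_names(OBSERVATIONS))
```

With these sample observations, the first helper shows the serial ending 1413 moving from sdt to sds, and the second flags sdo as a name that belonged to two different drives.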

 

Obviously the original disk18 was part of the parity swap, but it wouldn't have been read during the parity copy or used during the disk18 rebuild.

 

It doesn't look like the original disk18 is still connected, but it was when you booted.

 

Have you been doing "hotswap" during any of this?

 

Were any of these other disks I mention involved in the parity swap?

 

 

 

 

Link to comment
