[SOLVED] Help! Disabled drive during disk replacement



Hi guys, I started the disk upgrade process for one of the 4TB drives in my array, replacing it with a larger 8TB drive. I pre-cleared the new drive first, attaching it as an external USB drive, and followed the advice given by Pete in the following thread:

 

I ran a parity check first, stopped the array, unassigned the old drive (Disk 1 in my picture), powered down, removed Disk1 and installed the new drive in the same slot, powered back on, assigned the new drive to Disk1 and started the array.

Now my Disk1 is listed as "emulated" and the data rebuild is running. However, another fully working disk in the array (Disk5) is now marked as "disabled" with a message saying "unmountable file system" (see the attached picture of my array status).

 

What has gone wrong here? Did I lose data? What should I do from here? My Disk5 has an XFS file system, and it still shows as xfs on the Main page, but with the above message.

Please help! Thanks

 

 

 

Array.jpg

diagnostics-20200621-1611.zip


OK, the data rebuild is complete now, and here are the Main page summaries "before" and "after" my disk upgrade attempt for Disk1.

The "Used" column in the 2 pictures seems to show that the replacement drive may have correctly rebuilt what was on the removed Disk1 drive. (All sizes seem to match.)

Only Disk5 is now marked "Unmountable: No File System". Does this mean Disk5 is the only one I need to worry about now?

I'd appreciate any suggestions.

 

before.jpg

after.jpg

8 hours ago, johnnie.black said:

Single parity can't emulate 2 disks, disk5 dropped offline so there's no SMART, check connections and post new diags.

 

I assume you still have old disk1 intact?

Yes, I still have the old Disk1, so that data can potentially be copied onto the replacement drive later if needed. I'm worried about the data on Disk5. Connections seem OK and I can feel the disk spinning. I haven't shut down the server or done anything further yet, fearing a restart might wipe out Disk5. Is it better to try and run XFS repair on Disk5 now without rebooting?

6 minutes ago, kolla said:

I haven't shut down the server  or done anything further yet fearing re-start might wipe out Disk5?

Shutting down should be perfectly safe.

 

6 minutes ago, kolla said:

Is it better to try and run XFS repair on Disk5 now without rebooting?

No point, server has two invalid disks that can't be emulated by the single parity.

5 minutes ago, johnnie.black said:

Connections on disk5, shutdown the server, check connections even if it's just removing and re-connecting the cables, power back on and post new diags, disk is still offline on the diags posted.

OK, got it. Here's what I was thinking: if Disk5 was offline during the last boot, shouldn't I possibly still have the data preserved on that disk?

If I shut down now, re-connect the cables and power back on, and Disk5 comes back alive, wouldn't that kick off a parity write (invalid in my case), wiping out any good data on Disk5? Please correct my understanding. Thanks again.

22 hours ago, kolla said:

If Disk5 was offline during last boot shouldn't I possibly still have the data preserved in that disk?

The disk wasn't offline during the boot; it dropped offline during the rebuild. Again, since single parity can't emulate 2 disks, both disks currently have invalid data. Either disk5 died, which would make recovery more difficult, or more likely it's just a connection problem, but we need the SMART report to have a better idea. There's nothing you can do before rebooting/power cycling to see what's what.
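
To illustrate why single parity can't emulate 2 disks, here is a minimal sketch. It assumes single parity is a plain byte-wise XOR across the data disks; the three tiny "disks" and their contents are made up for the example and are not taken from the posted diagnostics.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Toy array: three data disks of 4 bytes each (hypothetical contents).
disks = {
    "disk1": b"\x11\x22\x33\x44",
    "disk2": b"\x01\x02\x03\x04",
    "disk5": b"\xaa\xbb\xcc\xdd",
}
parity = xor_blocks(list(disks.values()))

def rebuild(missing, available):
    """Rebuild one missing disk from parity plus every other data disk."""
    others = [data for name, data in available.items() if name != missing]
    return xor_blocks([parity] + others)

# One disk missing: parity plus the remaining disks recovers it exactly.
rebuilt_disk1 = rebuild("disk1", disks)
one_missing_ok = rebuilt_disk1 == disks["disk1"]

# Two disks missing (disk1 being rebuilt, disk5 dropped offline):
# parity XOR disk2 only yields disk1 XOR disk5, not either disk on its own.
combined = xor_blocks([parity, disks["disk2"]])
two_missing_ambiguous = combined == xor_blocks([disks["disk1"], disks["disk5"]])

print(one_missing_ok, two_missing_ambiguous, combined == disks["disk1"])
```

In this toy model, whatever gets written to the new disk1 after disk5 drops offline is a mix of both disks' contents, which matches the idea that the rebuilt data can no longer be trusted.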

On 6/22/2020 at 11:30 AM, johnnie.black said:

Connections on disk5, shutdown the server, check connections even if it's just removing and re-connecting the cables, power back on and post new diags, disk is still offline on the diags posted.

Thanks! I did exactly as you suggested: shut down, unplugged and re-plugged the cables for Disk5, and powered back on. Disk5 still shows as "Unmountable: No File System". A glance at the syslog seems to indicate corruption.

Full diag attached.

diagnostics-20200623-1407.zip


Disk5 looks healthy. Since disk1 suffered some corruption during the rebuild when disk5 dropped offline, the easiest way forward would be:

 

-Re-connect old disk1 (you can disconnect the new one if needed)

-I would recommend replacing cables on disk5 just to rule them out if there are any more issues with it.

-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) including old disk1, so that you have the array as it was before the disk upgrade.

-Start array to begin parity sync

-All disks should mount correctly; if they don't, post new diags (never format)

 

When the sync is done you can try the upgrade again.

10 hours ago, johnnie.black said:

Disk5 looks healthy, since disk1 suffered some corruption during the rebuild when disk5 dropped offline, easiest way forward would be:

 


I followed these steps and also replaced the cables on Disk5. All disks mounted OK with the new config and the parity sync is in progress. That should go on for another day or so. There was a warning message about a CRC error count of 8 on Disk5. Is that something to worry about?

1 hour ago, kolla said:

There was a warning message about crc error count 8 in Disk5. Is that something to worry about?

That usually points to a bad SATA cable and is likely the reason the disk dropped offline earlier, but as long as the count doesn't increase any further it's fine, and it shouldn't if you replaced the cables as suggested.
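
One way to keep an eye on that counter is to compare the raw value of SMART attribute 199 between two readings. The sketch below is a rough illustration only: it assumes the usual table layout printed by `smartctl -A`, and the sample text and the helper names `crc_error_count` and `crc_status` are made up for the example rather than taken from the posted diagnostics.

```python
# Illustrative excerpt in the usual `smartctl -A` table layout (made-up values).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       8
"""

def crc_error_count(smart_text):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None

def crc_status(previous, current):
    """Classify the counter: it never resets, so only growth matters."""
    return "increasing" if current > previous else "stable"

count = crc_error_count(SAMPLE)
print(count, crc_status(8, count))
```

A reading that stays at 8 after the cable swap would count as "stable" here; a later reading above 8 would suggest the connection problem is still present.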


Thanks. So far everything seems to have gone as expected. The parity sync completed with no errors. I replaced Disk1 and all disks mounted OK this time. The array is rebuilding now, which will take another 20 hours or so. I will post an update later.


Good news at last! The array rebuild with the replacement for Disk1 has finished successfully with no issues reported. Parity says valid with 0 errors.

At first glance it appears I have not lost any data, which is a pleasant surprise considering the situation. I'll take a closer look later to confirm there is no data loss, but first I want to give a big thank-you to johnnie.black for guiding me through to safety. 👏👏 Cheers!! 🥂
