kolla Posted June 21, 2020 (edited)
Hi guys, I started the disk upgrade process for one of the 4T drives in my array, replacing it with a larger 8T drive. I pre-cleared the new drive first, attached as an external USB drive, and followed the advice given by Pete in the following thread:
I ran a parity check first, stopped the array, unassigned the old drive (Disk 1 in my picture), powered down, removed Disk1 and installed the new drive in the same place, powered back on, assigned the new drive to Disk1 and started the array.
Now my Disk1 is listed as "emulated" and the data rebuild is running. However, another fully working disk in the array (Disk5) is now marked as "disabled" with a message saying "unmountable file system" (see the attached picture of my array status).
What has gone wrong here? Did I lose data? What should I do from here? My Disk5 has an xfs file system and it still shows as xfs on the main page, but with the above message. Please help! Thanks
diagnostics-20200621-1611.zip
kolla (Author) Posted June 22, 2020 (edited)
OK, the data rebuild is complete now, and here are the main page summaries "before" and "after" my disk upgrade attempt for Disk1. The "Used" column in the two pictures seems to show that the replacement drive may have correctly rebuilt what was on the removed Disk1 drive (all sizes seem to match). Only Disk5 is now marked "Unmountable: No File System". Does this mean Disk5 is the only one I need to worry about now? I'd appreciate any suggestions.
JorgeB Posted June 22, 2020
Single parity can't emulate 2 disks. Disk5 dropped offline so there's no SMART report; check connections and post new diags. I assume you still have old disk1 intact?
kolla (Author) Posted June 22, 2020
8 hours ago, johnnie.black said: "Single parity can't emulate 2 disks, disk5 dropped offline so there's no SMART, check connections and post new diags. I assume you still have old disk1 intact?"
Yes, I have the old Disk1, so that data can potentially be copied onto the replacement drive later if needed. I'm worried about the data on Disk5. Connections seem OK and I can feel the disk spinning. I haven't shut down the server or done anything further yet, fearing a restart might wipe out Disk5. Is it better to try and run an XFS repair on Disk5 now without rebooting?
JorgeB Posted June 22, 2020
6 minutes ago, kolla said: "I haven't shut down the server or done anything further yet fearing re-start might wipe out Disk5?"
Shutting down should be perfectly safe.
6 minutes ago, kolla said: "Is it better to try and run XFS repair on Disk5 now without rebooting?"
No point, the server has two invalid disks that can't be emulated by the single parity.
kolla (Author) Posted June 22, 2020
I see. OK, so since my Disk5 is not disconnected, what is the best thing to do now?
JorgeB Posted June 22, 2020
9 hours ago, johnnie.black said: "check connections and post new diags."
kolla (Author) Posted June 22, 2020 (edited)
1 hour ago, johnnie.black said:
Sorry, I didn't get this. What connections do you mean? 😕 Anyway, here are the new diagnostics.
diagnostics-20200622-1121.zip
JorgeB Posted June 22, 2020
9 minutes ago, kolla said: "what connections do you mean?"
The connections on disk5. Shut down the server, check the connections (even if it's just removing and re-connecting the cables), power back on and post new diags. The disk is still offline in the diags posted.
kolla (Author) Posted June 22, 2020
5 minutes ago, johnnie.black said: "Connections on disk5, shutdown the server, check connections even if it's just removing and re-connecting the cables, power back on and post new diags, disk is still offline on the diags posted."
OK, got it. Here's what I was thinking: if Disk5 was offline during the last boot, shouldn't I possibly still have the data preserved on that disk? If I shut down now, re-connect the cables and power back on, then assuming Disk5 comes back alive, wouldn't that kick off a parity write (invalid in my case) and wipe any good data on Disk5? Please correct my understanding. Thanks again.
JorgeB Posted June 23, 2020
22 hours ago, kolla said: "If Disk5 was offline during last boot shouldn't I possibly still have the data preserved in that disk?"
The disk wasn't offline during the boot, it dropped offline during the rebuild. And again, since single parity can't emulate 2 disks, both disks currently have invalid data. Now either disk5 died, making recovery more difficult, or more likely it's just a connection problem, but we need the SMART report to have a better idea. There's nothing you can do before rebooting/power cycling to see what's what.
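For readers wondering why one missing disk can be emulated but two cannot, here is a minimal sketch. It assumes single parity behaves like a plain byte-wise XOR across the data disks; it is a simplified toy model, not Unraid's actual implementation. The function names (`xor_parity`, `rebuild_missing`) and the 4-byte "disks" are made up for illustration.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """XOR the surviving blocks with parity; this equals the XOR of whatever is missing."""
    return xor_parity(surviving_blocks + [parity])


# Toy "disks": three data disks holding one 4-byte stripe each (made-up values).
disks = {
    "disk1": bytes([0x10, 0x20, 0x30, 0x40]),
    "disk2": bytes([0xFF, 0x00, 0xFF, 0x00]),
    "disk5": bytes([0x0A, 0x0B, 0x0C, 0x0D]),
}
parity = xor_parity(list(disks.values()))

# One disk missing: the survivors plus parity reproduce it exactly.
rebuilt_disk1 = rebuild_missing([disks["disk2"], disks["disk5"]], parity)
one_missing_ok = rebuilt_disk1 == disks["disk1"]

# Two disks missing: the same calculation only yields disk1 XOR disk5,
# which cannot be split back into the two original blocks.
combined = rebuild_missing([disks["disk2"]], parity)
two_missing_ok = combined == disks["disk1"] or combined == disks["disk5"]
combined_is_xor_of_both = combined == xor_parity([disks["disk1"], disks["disk5"]])

print("one disk missing, rebuilt correctly:", one_missing_ok)
print("two disks missing, either one recovered:", two_missing_ok)
```

In this model, a rebuild of disk1 that continues after disk5 drops out would be computed from incomplete inputs, which is consistent with the advice above to go back to the old disk1 rather than trust the rebuilt one.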
kolla (Author) Posted June 23, 2020
On 6/22/2020 at 11:30 AM, johnnie.black said: "Connections on disk5, shutdown the server, check connections even if it's just removing and re-connecting the cables, power back on and post new diags, disk is still offline on the diags posted."
Thanks! I did exactly as you suggested: shut down, unplugged and plugged back in the cables for Disk5, powered back on. Disk5 still shows as "Unmountable: No File System". A glance at the syslog seems to indicate corruption? Full diag attached.
diagnostics-20200623-1407.zip
JorgeB Posted June 24, 2020
Disk5 looks healthy. Since disk1 suffered some corruption during the rebuild when disk5 dropped offline, the easiest way forward would be:
-Re-connect old disk1 (you can disconnect the new one if needed)
-I would recommend replacing cables on disk5 just to rule them out if there are any more issues with it.
-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) including old disk1, so that you have the array as it was before the disk upgrade.
-Start array to begin parity sync
-All disks should mount correctly, if they don't post new diags (never format)
When the sync is done you can try the upgrade again.
kolla (Author) Posted June 24, 2020
Thank you! Will post an update later.
kolla (Author) Posted June 24, 2020
10 hours ago, johnnie.black said:
"Disk5 looks healthy, since disk1 suffered some corruption during the rebuild when disk5 dropped offline, easiest way forward would be:
-Re-connect old disk1 (you can disconnect the new one if needed)
-I would recommend replacing cables on disk5 just to rule them out if there are any more issues with it.
-Tools -> New Config -> Retain current configuration: All -> Apply
-Check all assignments and assign any missing disk(s) including old disk1, so that you have the array as it was before the disk upgrade.
-Start array to begin parity sync
-All disks should mount correctly, if they don't post new diags (never format)
When the sync is done you can try the upgrade again."
I followed these steps and also replaced the cables on Disk5. All disks mounted OK with the new config and the parity sync is in progress. That should go on for another day or so. There was a warning message about a CRC error count of 8 on Disk5. Is that something to worry about?
JorgeB Posted June 24, 2020
1 hour ago, kolla said: "There was a warning message about crc error count 8 in Disk5. Is that something to worry about?"
That is usually caused by a bad SATA cable and is likely the reason the disk dropped offline earlier. As long as the count doesn't increase any further it's fine, and it shouldn't if you replaced the cables as suggested.
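The "as long as it doesn't increase" check can be sketched as a comparison of two SMART attribute readings. This is only an illustration: it assumes the readings are saved text in the usual ten-column `smartctl -A` attribute table layout, the sample lines below are invented (only the count of 8 comes from this thread), and the helper names `crc_error_count` and `crc_errors_increased` are hypothetical.

```python
def crc_error_count(smart_attributes_text):
    """Return the raw UDMA_CRC_Error_Count value, or None if the attribute is absent."""
    for line in smart_attributes_text.splitlines():
        fields = line.split()
        # Assumed columns: ID, name, flag, value, worst, thresh, type,
        # updated, when_failed, raw_value.
        if len(fields) >= 10 and fields[1] == "UDMA_CRC_Error_Count":
            return int(fields[9])
    return None


def crc_errors_increased(before_text, after_text):
    """True if the CRC error count grew between two saved readings."""
    before = crc_error_count(before_text)
    after = crc_error_count(after_text)
    if before is None or after is None:
        raise ValueError("UDMA_CRC_Error_Count not found in one of the readings")
    return after > before


# Invented sample readings: one taken right after the cable swap, one taken later.
reading_after_cable_swap = (
    "  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       15000\n"
    "199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       8\n"
)
reading_a_week_later = (
    "  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       15168\n"
    "199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       8\n"
)

still_climbing = crc_errors_increased(reading_after_cable_swap, reading_a_week_later)
print("CRC errors still increasing:", still_climbing)
```

The raw count never resets, so a steady value of 8 after the cable swap is the expected good outcome; only a rising value would point to a remaining connection problem.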
kolla (Author) Posted June 25, 2020
By the way, when the parity build is done, do I simply unassign Disk1 from the array, shut down, and replace the physical Disk1 with the replacement disk from before, without needing to do anything else to it?
JorgeB Posted June 25, 2020
Yes, but there's no need to unassign it first. Just shut down, replace the disk, power on, assign it and start the array to rebuild.
kolla (Author) Posted June 25, 2020
Thanks. So far everything seems to have gone as expected. The parity sync completed with no errors. I replaced Disk1 and all disks have mounted OK this time. The array is rebuilding now; that's another 20 hours or so. Will post an update later.
kolla (Author) Posted June 26, 2020
Good news at last! The array rebuild with the replacement for Disk1 has finished successfully with no issues reported. Parity says valid with 0 errors. At first glance it appears I have not lost any data, which is a pleasant surprise to me considering the situation. I'll take a closer look later to confirm there is no data loss, but first I want to give a big thank you to johnnie.black for guiding me through to safety. 👏👏 Cheers!! 🥂