April 3, 2024 Hoping against hope that I'm not screwed here...

The background to the current crisis is that I swapped out an old 12TB drive for a new 18TB one. When I booted the server back up, one of the existing 18TB drives was showing as disabled, even though nothing had changed on it. This was likely caused by a cable issue, which I fixed. I then started a data rebuild on both drives (the new 18TB drive that was replacing the old 12TB one, and the existing 18TB drive that was effectively replacing itself). All was going fine and the contents of both drives were being emulated.

We then suffered a power cut in the house when the data rebuild was about 5% done. On rebooting, the two drives that were being rebuilt are now shown as 'unmountable: unsupported or no file system'. Strangely, the data rebuild has restarted (from 0%) and the GUI was showing writes to those drives (even though they are presumably not mounted). The real kicker, though, is that the array size has now shrunk by 36TB and the drives do not appear to be emulated anymore (i.e. the contents are not accessible via the network share). No errors are reported on any drives, including the two parity disks.

Am I screwed, or is there a way to recover any of this data? I still have the old 12TB drive I replaced, which may still have data on it, but obviously there is no such copy for the 18TB drive that was replacing itself.

Diagnostics zip attached. Thanks! tower-diagnostics-20240403-1958.zip
April 3, 2024 Community Expert Check the filesystem on both disks: run it without -n, and if it asks for it, use -L.
April 3, 2024 Author 7 minutes ago, JorgeB said: Check filesystem on both disks, run it without -n, and if it asks for it use -L Thanks, Jorge. So shall I stop the array (it is currently started with the data rebuild paused) and restart it in maintenance mode so I can do a filesystem check, using: xfs_repair -v /dev/mdX?
April 3, 2024 Community Expert 4 minutes ago, djhavor said: Thanks, Jorge. So shall I stop the array Yes, you need to start it in maintenance mode, as mentioned in the instructions.
April 3, 2024 Author Thanks, Jorge. That's now done on both disks (using the sdc1 and sdd1 mount points). The GUI option for xfs_repair was not available, so I used the command line. Both disks reported errors, so I reran with -L. What should I do next? Edited April 3, 2024 by djhavor
April 3, 2024 Community Expert Solution 42 minutes ago, djhavor said: (using the sdc1 and sdd1 mount points). That's not how you should do it, since that won't be fixing the emulated disks, and if the disks were not being emulated it would make parity out of sync. The GUI option should work, but if it doesn't, start the array in maintenance mode and type xfs_repair -v /dev/md13p1 and xfs_repair -v /dev/md14p1
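For anyone following along, the commands above can be sketched as a small helper that only prints what to run against the md devices, rather than running anything itself. The function names md_device and repair_plan are made up for illustration, and the /dev/mdNp1 naming is taken from the commands in this thread; it is an assumption that your Unraid release uses the same naming, so check which md devices exist on your system first.

```shell
#!/bin/sh
# Hypothetical helper: prints the xfs_repair commands for an array disk.
# It deliberately does NOT run xfs_repair; copy the printed lines yourself
# once the array is started in maintenance mode.

# Assumption: array slot N is exposed as /dev/mdNp1, as in the commands
# quoted in this thread. Other releases may name the device differently.
md_device() {
    printf '/dev/md%sp1\n' "$1"
}

repair_plan() {
    dev="$(md_device "$1")"
    # First pass: verbose repair (no -n, so changes are actually written).
    echo "xfs_repair -v $dev"
    # Second pass only if xfs_repair itself asks for -L (this zeroes the log).
    echo "xfs_repair -v -L $dev"
}

# Disks 13 and 14 are the two slots from this thread.
repair_plan 13
repair_plan 14
```

Running against the md device, rather than sdX1, is what keeps parity in sync and lets the repair apply to an emulated disk.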
April 3, 2024 Author 13 minutes ago, JorgeB said: That's not how you should do it, since that won't be fixing the emulated disks, and if the disk were not being emulated it would make parity out of sync. Ah, my mistake. Sorry about that. I will be more careful with your instructions. I've now run the above commands and, as expected, they revealed errors, which were fixed with -L.
April 3, 2024 Author Thanks, Jorge. The array is now started (not in maintenance mode). Diags attached. The array now shows the correct total size (i.e. the 'missing' 36TB is back!). tower-diagnostics-20240403-2251.zip
April 3, 2024 Author 11 minutes ago, JorgeB said: Looks OK, let the rebuild continue. Will do - thanks so much! I'll report back in a day or two when it's done.
April 17, 2024 Author Just wanted to say the rebuild went off OK and all data was present on the rebuilt disks. Thanks to Jorge for all the help! Marked as solved.