Power cut during data rebuild; possible data loss



Hoping against hope that I'm not screwed here...

 

The background to the current crisis is that I swapped out an old 12TB drive for a new 18TB one. When I booted the server back up, one of the existing 18TB drives was showing as disabled, even though nothing had changed with it. This was likely caused by a cable issue, which I fixed. I then commenced a data rebuild on both drives (the new 18TB drive replacing the old 12TB, and the existing 18TB drive effectively replacing itself).


All was going fine and the contents of both drives were being emulated. We then suffered a power cut in the house when the data rebuild was about 5% done.

 

On rebooting, the two drives that were being rebuilt are now shown as 'Unmountable: unsupported or no file system'. Strangely, the data rebuild restarted (from 0%) and the GUI was showing writes to those drives (even though they are presumably not mounted).

 

The real kicker, though, is that the array size has now shrunk by 36TB and the drives no longer appear to be emulated (i.e. their contents are not accessible via the network share).

 

No errors are reported on any drives, including the two parity disks.

 

Am I screwed, or is there a way to recover any of this data? I still have the old 12TB drive I replaced, which may still have its data, but obviously that doesn't help for the 18TB drive that was replacing itself.

 

Diagnostics zip attached.

 

Thanks!

tower-diagnostics-20240403-1958.zip

7 minutes ago, JorgeB said:

Check filesystem on both disks, run it without -n, and if it asks for it use -L


Thanks, Jorge. So shall I stop the array (it is currently started with the data rebuild paused) and restart in maintenance mode so I can do a filesystem check, using:

xfs_repair -v /dev/mdX

?
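For anyone landing on this thread later: before running the destructive repair, it is common to dry-run the check first. The function below is a minimal sketch of that general xfs_repair workflow (not official Unraid guidance); it assumes xfs_repair's documented behaviour that -n makes no modifications and exits non-zero when it finds corruption.

```shell
# Sketch: read-only check of an emulated disk before any repair.
# -n means "no modify": xfs_repair only inspects the filesystem and
# exits non-zero if it detects corruption.
dry_run_check() {
  dev="$1"
  if xfs_repair -n "$dev" >/dev/null 2>&1; then
    echo "$dev: filesystem looks clean"
  else
    echo "$dev: corruption detected - repair needed"
  fi
}
```

Only if the dry run reports corruption would the real (modifying) run follow.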


Thanks, Jorge. That's now done on both disks (using the sdc1 and sdd1 mount points). The GUI option for xfs_repair was not available, so I used the command line.

 

Both disks reported errors, so I reran with -L.

 

What should I do next?

Edited by djhavor
42 minutes ago, djhavor said:

(using the sdc1 and sdd1 mount points).

That's not how you should do it, since that won't be fixing the emulated disks, and if the disk were not being emulated it would make parity out of sync.

 

The GUI option should work, but if it doesn't, start the array in maintenance mode and type

 

xfs_repair -v /dev/md13p1

and

xfs_repair -v /dev/md14p1
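The two commands above can be wrapped in a small helper. This is only an illustrative sketch of the same sequence: the key point from this thread is running against the md (emulated) devices rather than the sdX devices, so the fix is applied to the emulated disks and parity stays in sync. The device names /dev/md13p1 and /dev/md14p1 come from the reply above.

```shell
# Sketch: repair each emulated (md) device in turn, with the array
# started in maintenance mode. xfs_repair returns non-zero on
# problems, in which case it may ask you to rerun with -L.
repair_disks() {
  for dev in "$@"; do
    echo "repairing $dev"
    xfs_repair -v "$dev" || echo "$dev: xfs_repair reported problems (it may ask for -L)"
  done
}

# repair_disks /dev/md13p1 /dev/md14p1
```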

 

 

13 minutes ago, JorgeB said:

That's not how you should do it, since that won't be fixing the emulated disks, and if the disk were not being emulated it would make parity out of sync.

 

Ah, my mistake. Sorry about that. I'll follow your instructions more carefully.

 

I've now run the above commands and, as expected, they revealed errors, which were fixed with -L.

