Interrupted parity check resulted in xfs corruption and huge data loss



It started when unRAID decided to disable one of the drives during a parity check due to overheating.
The drive entered a "disabled" state and I decided to reboot.
After the reboot everything seemed normal and my data looked good. But unRAID wanted to run the parity check once again, and during the process the array stopped due to an XFS error (I learned this from the logs).
I rebooted once again and stopped the parity check. At that point I could still see my data intact.
So I decided to enter maintenance mode and fix the filesystem.
I ran xfs_repair without the "-n" flag, and it found a huge number of errors (log attached). After starting the array I realized that half of my data was lost.
I shut the system down and started it with only the Parity drive connected (in the hope that it might still have the data). No luck, same picture.
In the meantime, I connected the main drive to a Windows machine (don't ask why, I don't know). I just connected it via a USB-SATA adapter and performed no actions except opening "Disk Management" to make sure it was detected. Windows did something to it, and now unRAID sees this (main) drive as a "New device".

 

Is there any way I can recover my data?

xfs_chk.log

Edited by Gordon01
Link to comment
2 minutes ago, Gordon01 said:

only Parity drive connected (in hope that it may still have data)

Parity has no data and can't even help provide the data for a missing disk without all the other disks.

 

3 minutes ago, Gordon01 said:

Windows did something with it

What did Windows do? Unraid filesystems can't be read on Windows and if Windows did anything to the disk Unraid probably can't read it now either. Putting an Unraid disk into another system is almost always a very bad idea, and unless the other system can work with Linux filesystems, pointless.

 

You really should have asked for advice before doing anything. From your description it seems like you have made things worse.

 

Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.

Link to comment
12 minutes ago, trurl said:

Parity has no data and can't even help provide the data for a missing disk without all the other disks.

I have only two disks: Disk 1 and Parity. If I don't have Disk 1 connected, I still have my data (in emulated mode or something like that).

 

12 minutes ago, trurl said:

What did Windows do?

No idea. I haven't asked it to do anything.

 

12 minutes ago, trurl said:

You really should have asked for advice before doing anything.

In the beginning, I only had a corrupted XFS filesystem because unRAID disabled the disk in the middle of the parity check. That didn't look like a serious problem to me.

unraid-diagnostics-20210319-1946.zip

Edited by Gordon01
Link to comment
1 hour ago, Gordon01 said:

corrupted XFS filesystem because unRAID disabled the disk in the middle of the parity check

Unraid doesn't disable a disk due to corruption, or corrupt a disk due to disabling it.

 

Unraid only disables a disk when a write to it fails. It sometimes happens that a failed read will make Unraid try to get the data from the parity calculation and try to write it back to the disk, then if that write fails it becomes disabled. But only a failed write will disable a disk.
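
The rule above can be sketched as a toy model. This is only an illustration of the description in this post, not Unraid's actual code; the `Disk` class and `read_sector` function are hypothetical names made up for the example. Note that temperature does not appear anywhere in the decision.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    """Toy model of an array disk; every name here is hypothetical."""
    read_ok: bool = True
    write_ok: bool = True
    disabled: bool = False


def read_sector(disk: Disk) -> str:
    """Return how a read is served, following the rule described above."""
    if disk.disabled:
        # A disabled disk is no longer used; its contents are emulated.
        return "emulated from parity"
    if disk.read_ok:
        return "read from disk"
    # The read failed: rebuild the data from parity and try to write it back.
    if not disk.write_ok:
        # Only a failed write disables the disk.
        disk.disabled = True
    return "reconstructed from parity"


flaky = Disk(read_ok=False, write_ok=True)
print(read_sector(flaky), flaky.disabled)

failing = Disk(read_ok=False, write_ok=False)
print(read_sector(failing), failing.disabled)
```

In this sketch a disk with failing reads but working writes stays enabled, while a disk whose write-back also fails becomes disabled.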

 

Unmountable (filesystem corruption) is a separate and independent condition from a disabled disk. A disk can be unmountable but not disabled, a disk can be disabled but the emulated disk is mountable, or a disk can be both disabled and unmountable.

 

SMART for parity looks OK.

 

SMART for disk sdb serial ending D0DV also looks OK. Was this disk1?

 

Syslog seems to indicate emulated disk1 mounted cleanly, but then you stopped the array and it wasn't started when those diagnostics were captured.

 

Start the array with only parity assigned then post new diagnostics.

 

 

Link to comment
3 hours ago, trurl said:

Unraid only disables a disk when a write to it fails. It sometimes happens that a failed read will make Unraid try to get the data from the parity calculation and try to write it back to the disk, then if that write fails it becomes disabled. But only a failed write will disable a disk.

I've seen a few times that unRAID disables a disk when its temperature hits the critical temperature threshold. After raising the temperature limit in the settings, it never happened again. I suppose it happened this time because of the sunny days and warm weather. It was really hot in my apartment.

 

3 hours ago, trurl said:

SMART for disk sdb serial ending D0DV also looks OK. Was this disk1?

Yes, this is correct.

 

3 hours ago, trurl said:

Syslog seems to indicate emulated disk1 mounted cleanly

Looks like my only option now is to somehow restore the deleted files from XFS...

 

3 hours ago, trurl said:

Start the array with only parity assigned then post new diagnostics.

Please find attached.

 

Thank you!

Alexander.

 

unraid-diagnostics-20210320-0126.zip

Link to comment
9 hours ago, Gordon01 said:

I've seen a few times that unRAID disables a disk when its temperature hits the critical temperature threshold.

That's not possible.

 

The actual disk might still be OK. You can try mounting it with UD (Unassigned Devices); note that the array needs to be stopped, or you need to change the XFS UUID first, otherwise it won't mount.

Link to comment
21 hours ago, Gordon01 said:

have only two disks: Disk 1 and Parity. If I don't have Disk 1 connected, I still have my data (in emulated mode or something like that).

Because of the way that Parity is calculated for Parity1, with a single data disk, the information stored on the Parity disk is a mirror of that data disk. 
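
A minimal sketch of why that is, assuming single parity is a plain byte-wise XOR across the data disks (the `xor_parity` helper and the sample bytes are made up for illustration):

```python
from functools import reduce


def xor_parity(disks: list[bytes]) -> bytes:
    """Byte-wise XOR across equally sized disks (single-parity sketch)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


# One data disk: the XOR of a single value is the value itself,
# so the parity disk ends up byte-for-byte identical to the data disk.
disk1 = b"unraid data"
parity = xor_parity([disk1])
print(parity == disk1)

# Two data disks: parity is no longer a copy of either disk, but a missing
# disk can still be rebuilt from parity plus the remaining disk.
disk_a = b"\x0f\x10"
disk_b = b"\xf0\x01"
parity_ab = xor_parity([disk_a, disk_b])
print(xor_parity([parity_ab, disk_b]) == disk_a)
```

The same property means that any filesystem damage written to the single data disk while parity is in sync is mirrored on the parity disk too, so the "mirror" is not a backup.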

Link to comment
  • 3 months later...
On 3/20/2021 at 10:50 AM, JorgeB said:

That's not possible.

 

Actual disk might still be OK, you can try mounting it with UD, note that the array needs to be stopped or you need to change the XFS UUID first, or it won't mount.

I think this is possible, probably due to a bug in unRAID. I've seen this behaviour a few times on this machine.

Yes, I can mount this disk, but it does not contain all the data due to filesystem corruption.

This is a very awkward situation, because I have two disks in a mirror but can't restore the data.

 

 

What tools can you recommend for restoring XFS?

Or should I just send it to a professional data recovery service?

Link to comment
1 minute ago, Gordon01 said:

I think this is possible, probably due to a bug in unRAID. I've seen this behaviour a few times on this machine.

The disk itself might generate errors if it gets very hot, but Unraid never disables a disk because it's hot. That's not possible; it only disables a disk if there are write errors.

 

2 minutes ago, Gordon01 said:

Yes, I can mount this disk but it does not contain all the data due to filesystem corruption. 

If the disk mounts, it should show the data; if there were serious filesystem corruption, it wouldn't mount. You can still run xfs_repair, though. If the disk is unassigned, the command would be:

 

xfs_repair -v /dev/sdX1

 

Replace X with the correct letter.
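
Since the first post in this thread mentions running the repair without the "-n" flag, one cautious approach is to look at a no-modify pass first. The sketch below only builds the command lines and never executes them; the `xfs_repair_command` helper is a hypothetical name, and `/dev/sdX1` is the same placeholder as above. The `-n` (no modify) and `-v` (verbose) options are standard xfs_repair flags.

```python
import shlex


def xfs_repair_command(device: str, dry_run: bool = True) -> list[str]:
    """Build an xfs_repair argv; -n means no-modify, -v means verbose."""
    args = ["xfs_repair", "-v"]
    if dry_run:
        # Report problems only; nothing is written to the filesystem.
        args.append("-n")
    args.append(device)
    return args


# Print the commands for review instead of running them.
print(shlex.join(xfs_repair_command("/dev/sdX1")))
print(shlex.join(xfs_repair_command("/dev/sdX1", dry_run=False)))
```

Reviewing the output of the `-n` pass before running the real repair gives a chance to ask for advice or image the disk first.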

Link to comment
