Over 12TB of data suddenly GONE!



Today being Sunday, I was doing weekly housekeeping on my server and noticed that 2 directories were suddenly empty! The directories /downloads and /TV/TV are empty: every file, subfolder, etc. is GONE! I have no drives in a degraded state and no errors showing on any of the drives. I'm currently running a parity check to be sure, but I highly doubt it will find anything, as one just ran on Nov 1st and everything was fine. I have only 3 user accounts on this server: the root account, a backup account and a generic account. I control the passwords to all 3; no one else knows them, and they are 16-character, randomly generated passwords, so again, not something someone could get easily.

No other data is missing or affected, just those 2 directories. The only apps on the server that access them are SABnzbd, Radarr, Sonarr and DelugeVPN. I use the binhex containers for all of the above, but they also have access to a number of other directories that are fine. I access /TV/TV from my PC as well as from 3 MiTV boxes in my house; they all use the generic account with a password. I rarely access the /downloads directory, and only from my PC.

 

I'm not sure what to make of it at this point. It's VERY odd that everything inside these 2 directories is gone while the directories themselves are still there and just fine, and the free/used space remains exactly the same as if the files were still there. I was watching TV last night until after midnight, so I KNOW at least some of it was still there at that point. Is there a log or some other journal that tracks file deletions, one I'm unaware of, that might show what happened?

 

Thanks, and sorry for the long rambling post. I'm really scratching my head as to what the heck happened, and frankly I'm slightly nervous about starting anything back up again until I figure it out!

 

**EDIT**

 

I can see that at least some, and hopefully all, of the files are actually still there if I use Krusader and look at the individual disks. I'm just not sure why the directory within the share shows as empty, or how to fix it so that the files on the individual disks show up in that directory on the share.
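For anyone wanting to check the same thing without clicking through each disk in a file manager, here is a rough sketch in Python that counts the files under a share path on every array disk. It assumes the usual Unraid layout where array disks are mounted as /mnt/disk1, /mnt/disk2 and so on; the function name `count_files_per_disk` and the example path are made up for illustration, and the script only reads, it never writes.

```python
from pathlib import Path


def count_files_per_disk(mount_root, share_path):
    """Return {disk_name: file_count} for each disk directory that holds share_path.

    A value of None means the directory exists but could not be read,
    which is itself a hint that the filesystem on that disk has a problem.
    """
    counts = {}
    # Assumption: per-disk mounts are named disk1, disk2, ... under mount_root.
    for disk in sorted(Path(mount_root).glob("disk[0-9]*")):
        target = disk / share_path
        if not target.is_dir():
            continue  # this disk holds no part of the share path
        try:
            counts[disk.name] = sum(1 for p in target.rglob("*") if p.is_file())
        except OSError:
            counts[disk.name] = None
    return counts


if __name__ == "__main__":
    # Hypothetical usage: compare the per-disk counts for the "empty" directory.
    for disk, n in count_files_per_disk("/mnt", "TV/TV").items():
        print(disk, "unreadable" if n is None else f"{n} files")
```

If one disk reports the bulk of the files (or comes back unreadable) while the user share shows nothing, that narrows the problem to that disk rather than to an actual deletion.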

 

**EDIT 2**

 

I see the log is filling up with:

 

Nov 8 17:13:48 Tower kernel: BTRFS error (device md4): parent transid verify failed on 87638016 wanted 25935 found 25928
Nov 8 17:13:48 Tower kernel: BTRFS error (device md4): parent transid verify failed on 87638016 wanted 25935 found 25928
Nov 8 17:13:48 Tower kernel: BTRFS error (device md4): parent transid verify failed on 87638016 wanted 25935 found 25928

 

The log is just being spammed with this. I assume it means something happened to device md4. Once I figure out which drive md4 is, would a btrfs scrub fix the issue, if anyone knows? I'm a bit hesitant to just start trying stuff. While I do have backups of everything, it would be a REAL pain to restore 12TB+ of data...
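Since the log is mostly the same line over and over, a small script can collapse it into a summary of which device and block are affected. The sketch below is only an illustration: the regular expression is written against the exact message format quoted above, and the function name `summarize_transid_errors` is made up. On Unraid, mdX normally corresponds to array disk X, but treat that mapping as an assumption and confirm it on the Main page.

```python
import re
from collections import Counter

# Matches the kernel message format quoted in this post.
PATTERN = re.compile(
    r"BTRFS error \(device (?P<device>\w+)\): parent transid verify failed "
    r"on (?P<block>\d+) wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def summarize_transid_errors(lines):
    """Count how often each (device, block, wanted, found) combination appears."""
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            key = (m["device"], int(m["block"]), int(m["wanted"]), int(m["found"]))
            counts[key] += 1
    return counts


# The three lines from the log excerpt above.
sample = [
    "Nov 8 17:13:48 Tower kernel: BTRFS error (device md4): parent transid "
    "verify failed on 87638016 wanted 25935 found 25928",
] * 3

for (device, block, wanted, found), n in summarize_transid_errors(sample).items():
    print(f"{device}: block {block} wanted transid {wanted}, found {found} ({n} lines)")
```

If every line collapses to a single entry, the flood of messages points at one damaged metadata block on one device rather than at widespread corruption.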

Edited by rclifton

Attached is a copy of the diagnostics. I ran a BTRFS scrub on drive 4 with the "repair corrupted blocks" option checked. It found 1 error and said it was uncorrectable. I'm still seeing:

Nov 9 00:00:42 Tower kernel: BTRFS error (device md4): parent transid verify failed on 87638016 wanted 25935 found 25928
Nov 9 00:00:42 Tower kernel: BTRFS error (device md4): parent transid verify failed on 87638016 wanted 25935 found 25928

spamming the log, and at this point I'm hoping someone else has dealt with this in the past and has some suggestions.

 

Thanks

tower-diagnostics-20201108-2357.zip

5 hours ago, JorgeB said:

This error is fatal, it means there are some missing writes, you'll need to reformat that disk, some recovery options here if needed.

I've got a spare drive. Can I just pull this drive, replace it with the spare, and then rebuild from parity? I recently moved my server into a new case and I'm wondering if something happened during that move (I very briefly attempted to use a different controller card and think that might have caused this). If I can just remove and replace the drive, I'll reformat it on another system, then add it back to the server and run preclear on it to see whether there is a real issue with the drive or whether I caused it.


I'm not sure at this point whether it would be easier to copy as much of the array as possible onto some USB drives and then just nuke it and start over, or to nuke it and reload my backup, which is on a system I literally just moved to my sister's house a few weeks ago and will probably take a week at least to download... Sometimes this is all a little too much like actual work lol...

42 minutes ago, trurl said:

Have you tried any of the recovery options mentioned?

 

I have not yet. Most of them, if I'm reading right, simply copy the data to another drive. I plan to do that later this afternoon and then run the check --repair command; if it fails, I'll nuke it all and copy everything back over. Either way it looks like I'll be copying all the data. I just don't have enough USB drives for a complete backup, so I plan to pick some up later this afternoon.
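For the copy step, one concern with a damaged filesystem is that a plain copy may stop at the first unreadable file. Below is a minimal sketch, in Python, of a copy that keeps going and records what it could not read. It is not one of the recovery options referred to above, just an illustration of the idea; the function name `copy_tree_logging_failures` and the example paths are hypothetical.

```python
import os
import shutil


def copy_tree_logging_failures(src, dst):
    """Copy every readable file from src to dst and return a list of failures.

    Each failure is a (path, error message) tuple. Empty directories are not
    recreated; only directories that contain files are created under dst.
    """
    failed = []

    def on_walk_error(exc):
        # Called by os.walk when a directory itself cannot be listed.
        failed.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(src, onerror=on_walk_error):
        rel = os.path.relpath(dirpath, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        for name in filenames:
            source = os.path.join(dirpath, name)
            try:
                os.makedirs(target_dir, exist_ok=True)
                shutil.copy2(source, os.path.join(target_dir, name))
            except OSError as exc:
                failed.append((source, str(exc)))
    return failed


# Hypothetical usage (both paths are assumptions, adjust to your system):
# for path, err in copy_tree_logging_failures("/mnt/disk4", "/mnt/disks/usb_backup"):
#     print("FAILED:", path, err)
```

The list of failures then doubles as the list of files that would need to come from the offsite backup instead.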
