Phantom data loss experienced



Hi All.

 

TL;DR: I've lost roughly 2TB of data after a disk failure and parity rebuild, even though I know parity was good only a few days before losing the disk. I want to understand what happened and see whether the data can be recovered.

 

I've been a long-time user, since version 4.3 at least. I love the product, and it's great to see it evolving into the beast it has become, with Docker support et al. However, I have myself a bit of a situation here, with phantom data loss of 1+ TB.

 

------- Background 

 

First some background. I had a largish array (14TB) in my home office that held a mixture of work and personal data. Now I have an external office and more staff, so I wanted to get my personal stuff off that Unraid and onto a new one I built for home.

 

The existing Unraid: built on a mix of old components, on a Gigabyte mobo with an Intel i3 and 4GB RAM, with an Intel PCI-E 1000tx NIC and a PCI-E Intel RS2WC080 RAID controller in IT mode, sporting 8 SATA interfaces. It has a total of 7 disks, including a 3TB parity drive. The server hasn't skipped a beat in years, and all disks were green in the old array before I started tinkering, but admittedly they were getting on in age.

 

The new Unraid: built on an HP Microserver, with a total of 4 disks, including a 5TB parity drive.

 

The data set: the data I want to migrate is around 6.3TB.

 

--------------- The Project

 

The plan was to move my private data from the work Unraid to the newly built home Unraid. Simple, right? The issue was that I actually wanted to reuse a few disks from the existing server. The total space required on the work Unraid would shrink by the amount of private data I was taking away, so there was going to be a lot of free space on the existing array, which would be a waste.

 

Now here comes the obligatory preface: I work in IT for a living, I do systems admin and I understand the concepts of parity etc., and I thought I understood enough about how Unraid works to get a bit creative here. The first problem was that, due to high-water distribution, my private data was spread all over the place rather than concentrated on one or two disks I could earmark to swap out to the new array.

 

So I figured I would, and did do, this:

 

Stage 1: Rearrange data on old array

-Check all disks are green in old array, screenshot of main screen

-Parity check old array, no errors

-Run a "Compute All" on all the existing shares, screenshot, calculate private share data for migration

-Use unBalance to distribute all work data off a specific disk, in this case, a 3TB disk 7

-Use unBalance to gather a portion of my private shares to the same 3TB disk 7, enough to fill the 3TB disk
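
As a text-based complement to the screenshots in Stage 1, here is a minimal sketch of tallying how much each share occupies on each data disk, so the numbers can be diffed later. It assumes the data disks are mounted at /mnt/disk1, /mnt/disk2 and so on, and that each top-level folder on a disk is a share; `share_sizes` is a name made up for this example and is not an Unraid tool.

```python
import glob
import os


def share_sizes(disk_root):
    """Return {share_name: total_bytes} for each top-level folder on one disk."""
    totals = {}
    for share in sorted(os.listdir(disk_root)):
        share_path = os.path.join(disk_root, share)
        if not os.path.isdir(share_path):
            continue  # loose files in the disk root are not shares
        totals[share] = 0
        for dirpath, _dirnames, filenames in os.walk(share_path):
            for name in filenames:
                try:
                    # lstat so symlinks are counted as links, not followed
                    totals[share] += os.lstat(os.path.join(dirpath, name)).st_size
                except OSError:
                    pass  # file vanished or is unreadable; skip it
    return totals


if __name__ == "__main__":
    # Assumed mount layout; adjust the pattern if the disks live elsewhere.
    for disk in sorted(glob.glob("/mnt/disk[0-9]*")):
        for share, size in share_sizes(disk).items():
            print(f"{disk}\t{share}\t{size / 1e9:.1f} GB")
```

Saving this output before and after each stage would give a per-disk, per-share record that is easier to compare than screenshots.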

 

Stage 2: Move disk 7 to new array

-Remove disk 7 from old array

-Install disk 7 in the new blank unraid array

-Started the new array and commenced the initial parity build

-Checked the data in the new array, and it was all there. All my user shares that I had gathered also automatically populated

 

Stage 3: Rebuild old array

-Spun up old server with disk 7 missing

-Check statuses of all disks. Disk 7 is obviously missing, but oddly Disk 3 also shows up missing? No biggie, as I had been in the case with my hands removing disk 7, so I could easily have disturbed the cables. I powered down, reseated all the SATA and power connectors and powered back on. Disk 3 shows up again.

-Commence a parity rebuild on the old array, which now has reduced capacity after the removal of disk 7

 

------The Problem

 

The parity rebuilt OK and everything seemed fine. Then, a few days later, I went to look for some data on the old array, a backup chain that I needed for work, and it was just gone, plain gone. Confused, I reviewed the disk health and noticed that Disk 3 was set to disabled. It had thousands of reads, 0 writes, and thousands of errors, just slightly fewer total errors than total reads. The disk isn't red, though; it's just set to disabled, with not much more info provided. Now, this is where I started becoming concerned.

 

If this disk is disabled then, as far as I understand, Unraid will/should emulate its contents from the healthy parity, and I should still be able to access the data even if the disk is missing or failed, just at reduced speed. After troubleshooting disk 3, changing the SATA interface and testing it in other ports, I replaced disk 3 and rebuilt it from parity. I checked the array after the rebuild: all disks green, and the data is still missing.
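
To make the emulation idea concrete, here is a toy sketch of single XOR parity in general. It is not Unraid's actual code; the tiny four-byte "disks" and the function names are made up for illustration. The second half is a hypothetical: it assumes one disk contributed wrong data (all zeros here) while parity was being built, and whether that is what happened in Stage 3 is not established.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(parity, surviving):
    """Recover the one missing block from parity plus all surviving blocks."""
    return xor_blocks([parity] + list(surviving))


# Three toy data disks and their parity.
disks = [b"work", b"home", b"bkup"]
parity = xor_blocks(disks)

# Losing the middle disk: parity XOR the other disks gives its contents back.
emulated = rebuild_missing(parity, [disks[0], disks[2]])
assert emulated == b"home"

# Hypothetical: the middle disk reads back as zeros while parity is rebuilt.
# Parity is then computed from the wrong data without any complaint.
parity_from_bad_read = xor_blocks([disks[0], bytes(4), disks[2]])
rebuilt = rebuild_missing(parity_from_bad_read, [disks[0], disks[2]])
assert rebuilt == bytes(4)  # the rebuild faithfully reproduces the bad data
```

The point of the second half is that parity only knows what the disks returned at the moment it was calculated, so a later rebuild from that parity can be self-consistent and still not contain the original data.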

 

It was then that I reviewed the screenshots of the old array, which I took before the project started to record all disk assignments, sizes etc., and disk 3 has gone from 2.1TB used to 123GB used. So I'm missing something like 1.8TB of data on disk 3. I know 100% that this 1.8TB was NOT gathered by unBalance in the previous stages. This disk contained a large amount of data for a single share of backup data that I did not touch in the previous steps.

 

It's not the end of the world, but I'm a little upset, as it's going to take time to rebuild that data: it was all the backup chains we use to restore servers and workstations to our customized out-of-box states. For testing purposes I plugged the old disk 3 into the array after the rebuild, and it shows as new and wants to format, which I obviously didn't do.


I'm wondering what the heck happened and where to go from here. I suspect disk 3 was f*****d during the parity rebuild, but in a way that wasn't throwing parity errors, or SMART errors for that matter, as it was showing green and the rebuild process had no error messages.

 

I figure I can attach the disk that threw the errors and mount it in any Linux environment, like an Ubuntu live CD, and see what's there? Anybody got any pointers, or data recovery tools that support XFS?
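
Before attempting any mount, one cheap and strictly read-only check is whether the partition still starts with an XFS superblock, whose first four bytes are the magic string "XFSB". A minimal sketch follows; `looks_like_xfs` is a name made up here, and /dev/sdX1 is a placeholder for whatever partition the old disk 3 turns out to be in the live environment.

```python
XFS_MAGIC = b"XFSB"  # first four bytes of an XFS primary superblock


def looks_like_xfs(path):
    """Return True if the device or image at `path` starts with the XFS magic."""
    # Opened read-only in binary mode, so nothing is written to the disk.
    with open(path, "rb") as dev:
        return dev.read(len(XFS_MAGIC)) == XFS_MAGIC


# Example call (placeholder device name, needs root to read a block device):
# print(looks_like_xfs("/dev/sdX1"))
```

If this returns True, the filesystem header survived and a read-only mount is a reasonable next step. If it returns False, the start of the partition has been overwritten or the wrong partition was chosen, and working from an image copy of the disk would be the safer route.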

Edited by paulylah