Data rebuild on disk1 results in empty disk1 and full disk2?



Hello everyone.  I'm helping a client recover from a failed hard drive using unRAID 4.7, and we've run into some very odd behavior.  I'm not sure if I've done something wrong, or if this might be a bug in unRAID.  First of all, disk1 (a 1 TB WD Green EARS drive) started having issues and reported ReiserFS errors in the syslog.  We ran reiserfsck, and then the rebuild-tree pass when the check prompted for it.  The check itself completed normally, but the rebuild tree failed.  At this point, the array looked like this:

 

[Screenshot: Main page showing disk1 as unformatted]

 

disk1 appears unformatted because it had been unmounted from the array prior to the reiserfsck process.

 

I decided that the drive should be RMA'd and that we should work on restoring parity protection to the array ASAP.  We cleared off the cache drive (a 1 TB WD EADS) and assigned it as the replacement for disk1.  Everything looked normal (blue dot on disk1, green on all others), so we proceeded with the data rebuild of the failed disk1 onto the former cache drive.  The data rebuild completed successfully.  We then ran a parity check, which also completed successfully.  The array now looks like this:

 

[Screenshot: Main page after the rebuild and parity check, with disk2 showing zero free space]

 

Notice the difference in disk2's free space between the two screenshots.  In the first, disk2 has just under 4 GB of free space.  In the second, disk2 has zero free space.  The data rebuild should not have affected the data on disk2 whatsoever.  So what happened?

 

The other major issue is that disk1, after the data rebuild, is now completely empty!  unRAID seems to have rebuilt an empty disk onto the replacement drive.  Since the disk was unmounted prior to the data rebuild, is this normal behavior?  Should I have re-mounted it before starting the rebuild?  I didn't think it mattered, but apparently it does.

 

The original disk1 is still installed in the server, just unassigned.  We used unMenu to try to mount it and see if we could recover the data off of it.  When attempting the mount, unMenu returned this error:

mount error:
/dev/sdd1 mounted on /mnt/disk/sdd1
Using command: mount -r -o noatime,nodiratime -t reiserfs /dev/sdd1 /mnt/disk/sdd1 2>&1

mount: /dev/sdd1: can't read superblock

Is there some other way to mount the disk and recover the data from it, or should we resort to reiserfs data recovery software?

 

Two syslogs attached.  One contains the data rebuild and subsequent parity check.  The second is after a reboot (so it is shorter and cleaner) and reflects the current state of the server.  Please let me know if there's any more information I can provide.

syslog_contains_data_rebuild_and_parity_check.txt

syslog_after_reboot.txt


Having never heard of another case of a "bleed-over" effect, where a data rebuild of one drive affects the contents of another drive, I would have to completely discount that possibility until someone can prove otherwise.

 

Disk 2 only had about 4 GB left, and that is too close to full to matter.  It would not take many writes for the Reiser file system to indicate no space left.  Unlike FAT and older file systems, there is no one-to-one correspondence between the indicated free space and how much you can actually store.  Modern file systems (including ReiserFS) use expanding B-tree structures, growing metadata, and tail-packing, all of which distort how much space is really required to store new data.  Adding even a very small file might trigger a need for more metadata space.  I personally think it is dangerous, or at least potentially problematic, to use the last 1% of any drive.  You would know better than I would, of course, but is it possible that an add-on was still active and tried to save data to Disk 2?
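If you want to see just how little usable room was actually left, a quick test like the one below would show it.  This is only a sketch: /mnt/disk2 is the usual unRAID mount point for that disk, and the test file name is made up.

df -h /mnt/disk2
# Try writing a small file.  On an effectively full file system this
# can fail with "No space left on device" even while df still reports
# a little free space, because the metadata needs room to grow too.
dd if=/dev/zero of=/mnt/disk2/space_test.bin bs=1M count=10
rm -f /mnt/disk2/space_test.bin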

 

The Disk 1 problem seems clearer.  You indicated that the rebuild tree failed, and the mount could not find a superblock; since the file system cannot even begin to load, the drive looks completely empty, and the new Disk 1 is just a copy of the old Disk 1.  You did not say what the original drive issues were, but whatever physical problems existed could have kept the rebuild-tree operation from completing successfully.  For example, bad media in a file system structure such as a directory could cause files to disappear.  Try the rebuild tree on the new drive; you will probably have to start with a --rebuild-sb command first, to restore the superblock.  (Of course, as you know, you would never need to replace a drive that had only Reiser file system errors and no physical problems.)
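If you try that, the sequence would look something like the following.  This is only a sketch: /dev/md1 is unRAID's md device for disk1, and if I remember right you want the md device rather than the raw disk device so that parity stays in sync as the repair writes.

reiserfsck --rebuild-sb /dev/md1
reiserfsck --rebuild-tree /dev/md1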


Here's the output of running reiserfsck --check on the replacement disk1:

 

[Screenshot: reiserfsck --check output on the replacement disk1, reporting no errors]

 

It looks to me like it found no errors, so it didn't suggest running --fix-fixable, --rebuild-tree, or --rebuild-sb.  As such, I did not run any of those commands.

 

Syslog attached.  Should I run --rebuild-sb?  If so, is this the correct syntax:

 

reiserfsck --rebuild-sb /dev/md1

 

Or is it time to start data recovery on the original disk1?

syslog-2011-12-11.txt


I'm not an expert at this; I'd rather defer to others with more experience with reiserfsck and data recovery on Reiser partitions, but I can make a few comments on the check output you posted.

 

If I'm interpreting the results of your "reiserfsck --check" run correctly, this looks like a freshly formatted drive with only 2 directories and no real data.  The check probably finished very quickly, because it only had a brand new, empty file system to examine!  This is similar to having a Windows drive that is full of files and accidentally running a Quick Format on it: you would see only about 2 folders (a root folder and a Recycled folder) and no data.  You could run chkdsk on it successfully (and quite quickly!), but you could not see any of the files whose content you KNOW is still somewhere on that drive.

The --fix-fixable option won't help you here, because there is nothing to fix in this brand new, empty file system.  I don't know for sure, but I don't think the --rebuild-sb option is useful any more either, because you now have a fresh new superblock, or the check would have failed immediately.  If you don't get better advice from others, I would try the --rebuild-tree option, which should ignore the current file system, examine the entire disk for files and folders, and rebuild a file system with as many of them as it can find (though probably with a lot of funny filenames and strange folder locations).
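One way to sanity-check that interpretation is to dump the superblock with debugreiserfs (it ships in the same reiserfsprogs package as reiserfsck) and look at the object count and the free block count; a freshly formatted 1 TB file system would show nearly all of its blocks free:

debugreiserfs /dev/md1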

 


 

As to which disk to attempt recovery on, the original disk would be better *IF* its entire surface is safely readable and writable.  If not, the replacement disk should be a nearly exact copy, containing everything the original had.
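If you do go after the original disk, one cautious approach would be to clone it to a spare drive with GNU ddrescue first and run any recovery against the copy, so that further reads of failing media cannot make things worse.  This is only a sketch: ddrescue would need to be installed, /dev/sdd is the original disk per your mount error, and the target /dev/sde and the log location are made up; triple-check both device names before running anything.

# First pass: copy everything that reads cleanly, skip the bad areas, keep a log
ddrescue -f -n /dev/sdd /dev/sde /boot/rescue.log
# Second pass: go back and retry the bad areas a few times
ddrescue -f -r3 /dev/sdd /dev/sde /boot/rescue.log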


You could use --rebuild-tree and ask it to scan the entire disk:

 

reiserfsck --rebuild-tree --scan-whole-partition /dev/md1
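One caution if you go that route: if I remember the reiserfsck man page correctly, --rebuild-tree demands a literal "Yes" before it starts, should not be interrupted once the rebuild is underway, and any files it recovers but cannot reattach to their original directories will land in lost+found with generated names.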

