ran mkfs.btrfs on my parity drive


jsbarde


So, by accident (not paying attention while reading the instructions), I ran mkfs.btrfs -f /dev/sdb1 on my parity drive.

 

What do I need to do to undo this?

 

I'm not a Linux guy, so I have no idea where to start. So far unRAID doesn't seem to care, but I suspect it's going to become a major issue soon.

 

Link to comment

Run a full correcting parity check. There will probably be a large handful of corrected errors right at the start of the check, where the new filesystem's metadata overwrote the parity information, and then it should settle down. After the check has completed, you should be back to normal. If you have any data drives that you suspect may act up, don't do anything until you inspect the SMART reports on all your drives (and/or post them here for inspection). Right now you are not protected against a failing drive, as a rebuild would be corrupted.

Link to comment

I actually do have one drive that's having some issues. Part of what I was doing today was preclearing a 3TB Red drive to replace a 2TB Green drive that was having issues.

 

The smart report for the failing drive is attached.

 

Also, to make matters worse, I'm getting the following errors in the syslog:

/usr/bin/tail -f /var/log/syslog 2>&1
Nov 16 21:37:32 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 62985 does not match to the expected one 2
Nov 16 21:37:32 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 460718117. Fsck?
Nov 16 21:37:32 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [444 446 0x0 SD]
Nov 16 21:37:32 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 36020 does not match to the expected one 1
Nov 16 21:37:32 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 468160837. Fsck?
Nov 16 21:37:32 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [352 356 0x0 SD]
Nov 16 21:37:32 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 28913 does not match to the expected one 1
Nov 16 21:37:32 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 470811105. Fsck?
Nov 16 21:37:32 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [347 350 0x0 SD]
Nov 16 21:37:32 Homestar02 shfs/user: shfs_open: open: /mnt/disk1/Video/Television/Futurama/Season 3/Futurama S03E03 - -The Cryonic Woman.mkv (30) Read-only file system
Nov 16 21:37:38 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 62985 does not match to the expected one 2
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 460718117. Fsck?
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [444 447 0x0 SD]
Nov 16 21:37:38 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 62985 does not match to the expected one 2
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 460718117. Fsck?
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [444 446 0x0 SD]
Nov 16 21:37:38 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 36020 does not match to the expected one 1
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 468160837. Fsck?
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [352 356 0x0 SD]
Nov 16 21:37:38 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 28913 does not match to the expected one 1
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 470811105. Fsck?
Nov 16 21:37:38 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [347 350 0x0 SD]
Nov 16 21:37:42 Homestar02 shfs/user: shfs_open: open: /mnt/disk1/Video/Television/Futurama/Season 3/Futurama S03E06 - -Bendless Love.mkv (30) Read-only file system
Nov 16 21:37:46 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 62985 does not match to the expected one 2
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 460718117. Fsck?
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [444 447 0x0 SD]
Nov 16 21:37:46 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 62985 does not match to the expected one 2
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 460718117. Fsck?
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [444 446 0x0 SD]
Nov 16 21:37:46 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 36020 does not match to the expected one 1
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 468160837. Fsck?
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [352 356 0x0 SD]
Nov 16 21:37:46 Homestar02 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 28913 does not match to the expected one 1
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 470811105. Fsck?
Nov 16 21:37:46 Homestar02 kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [347 350 0x0 SD]

I'll hold off on doing anything until I hear back; I have a feeling I might be in trouble. :)

WD_2tb_green_sdm.txt

Link to comment

If this were me, my first order of business would be to rsync the drive in question (the one failing SMART checks) to another spare drive somewhere.

I.e., save the data by any means, without writing to the array in any manner.

Realize that, with all those pending sectors, the drive is going to kick out a lot of errors.

That might even get the drive kicked out of the array. If that's the case, ddrescue is your friend.
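
For example (just a sketch; /mnt/disk1 is the failing disk per the syslog above, while the destination mount point is hypothetical and depends on where you attach the spare), a one-way copy that never writes to the array might look like:

rsync -av --progress /mnt/disk1/ /mnt/spare/disk1-rescue/

Reading from the failing disk and writing only to a drive outside the array leaves parity untouched, such as it is.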

 

 

After that, I might try to correct parity, either with a correcting parity check or by just re-creating it.

Then I would replace the failing drive with the newly pre-cleared drive.

 

 

You may or may not be able to correct/re-create parity; i.e., the drive may get kicked out of the array.

Even if it works, some of the data may be suspect.

Hopefully you have some form of checksums on the files to verify integrity.

Link to comment

I actually do have one drive that's having some issues. Part of what I was doing today was preclearing a 3TB Red drive to replace a 2TB Green drive that was having issues. [...] I'll hold off on doing anything until I hear back; I have a feeling I might be in trouble. :)

The problem is that you have mangled the contents of your parity drive, so at the moment you cannot rebuild a problem disk.  Since you did this outside unRAID's control, unRAID may not realise the parity is invalid.

 

My recommendation at this stage would be to do a 'New Config' and add back all the drives EXCEPT the problem drive (if you have a replacement, it could be added at this point).  This will get your array up and running, and you can now do a parity sync to get the data on those drives protected again.

 

You can now try to recover the contents of the problem drive that has reiserfs errors by running the reiserfsck utility against it.  If that does not succeed, then I am afraid it is quite likely that the contents of that drive will be lost (and hopefully you have backups you can restore the data from).  If reiserfsck manages to fix the problem, you can then proceed to copy the data off the drive.
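
As a sketch (the /dev/sdX1 device name is a placeholder; triple-check which device is the problem drive before running anything), the usual sequence is a read-only check first, and a rebuild only if the check recommends it:

reiserfsck --check /dev/sdX1
reiserfsck --rebuild-tree /dev/sdX1   # only if --check tells you to

Note that --rebuild-tree modifies the drive and can make things worse if the underlying hardware is failing, which is another reason to copy off whatever you can first.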

 

Once you have finished any data recovery, you can then (if desired) put the drive through a preclear cycle to see if the drive really is failing.

Link to comment

There are quite a number of sector errors on that drive.

  5 Reallocated_Sector_Ct   0x0033   140   140   140    Pre-fail  Always   FAILING_NOW 1148
196 Reallocated_Event_Count 0x0032   025   025   000    Old_age   Always       -       175
197 Current_Pending_Sector  0x0032   200   197   000    Old_age   Always       -       137
198 Offline_Uncorrectable   0x0030   200   198   000    Old_age   Offline      -       23

 

Recreating the array without the bad drive is a good option, as there is no way to recover from the parity corruption.

You can then try to rsync the data from the bad drive to somewhere on the array or a spare drive.

 

I had a problem like this once. I was so paranoid that I ddrescue'd the drive to a spare.

I skipped all the bad sectors, then ran ddrescue over the drive in reverse, and was able to recover all but one sector.

Then I ran reiserfsck on the newly recovered drive.

So 512 bytes somewhere are probably corrupt, but the rest of the data was retrieved.
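
For reference, the two-pass approach described above would look roughly like this with GNU ddrescue (device names are placeholders; the map/log file is what lets the second pass retry only the areas the first pass skipped):

ddrescue -f -n /dev/sdX /dev/sdY rescue.log   # first pass: grab the easy data, skip bad areas
ddrescue -f -R /dev/sdX /dev/sdY rescue.log   # second pass: approach the bad spots in reverse

Copying device-to-device with -f overwrites the destination drive, so make sure /dev/sdY really is the spare.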

 

EDIT

 

Part of what I was doing today was preclearing a 3TB Red drive to replace a 2TB Green drive that was having issues.

If you have OTHER 2TB drives in the array that match this failing drive, you might want to rebuild the array without the new 3TB drive.

Then do a parity check.

If all is good, upgrade one of the other 2TB drives with the new 3TB drive.

 

 

This will yield 1TB of spare space in your array and a now-free 2TB drive that you can attempt to use ddrescue on (if you choose to go that route). It's a bit of work.

Link to comment

I can stand to lose some data on this drive; it's just movies and TV, so I don't have to go crazy on it.

 

Let me make sure I understand the steps:

 

1. I stop the array and remove the 2TB drive that's having errors.

 

2. I restart the array and force the parity to update (how do I do that, is it just a regular parity check from the web interface?)

 

3. Once it's rebuilt, I can add in my new 3TB drive and then copy the data (what I can get) from my bad 2TB drive. (I'm assuming this is from the command line; what's the best way to do that?)

 

And then I should be good as new.

 

 

 

Link to comment

Recreating the array without the bad drive is a good option, as there is no way to recover from the parity corruption.

I think, given the specific set of circumstances at play, that statement is a little too broad. All he did was create a BTRFS filesystem; I doubt it wrote more than a few MB at most.

 

I agree with copying as much as possible off the suspect drive without messing with anything first. Then I'd physically remove the failing drive from the array and see how badly the parity-emulated drive is mangled. It may be that a simple reiserfs check and rebuild would get things back to normal, and the drive could just be rebuilt from there.

 

No point in throwing away parity without at least checking to see how badly it's mangled, especially when there is a possible failed drive.

Link to comment

Recreating the array without the bad drive is a good option, as there is no way to recover from the parity corruption.

I think, given the specific set of circumstances at play, that statement is a little too broad. All he did was create a BTRFS filesystem; I doubt it wrote more than a few MB at most.

 

I agree with copying as much as possible off the suspect drive without messing with anything first. Then I'd physically remove the failing drive from the array and see how badly the parity-emulated drive is mangled. It may be that a simple reiserfs check and rebuild would get things back to normal, and the drive could just be rebuilt from there.

 

No point in throwing away parity without at least checking to see how badly it's mangled, especially when there is a possible failed drive.

 

This is why my first suggestion was to copy as much data as possible first.

 

Emulating the drive to see how badly it is mangled is an option too.

We've seen people recover with reiserfsck when it scans the raw data blocks.

However this would be on an emulated drive.

It surely would be an interesting, albeit long, test.

 

If it were my array, I wouldn't do anything until I was able to copy off what I could to another drive.

Link to comment

I can stand to lose some data on this drive; it's just movies and TV, so I don't have to go crazy on it.

 

Let me make sure I understand the steps:

 

1. I stop the array and remove the 2TB drive that's having errors.

 

2. I restart the array and force the parity to update (how do I do that, is it just a regular parity check from the web interface?)

 

3. Once it's rebuilt, I can add in my new 3TB drive and then copy the data (what I can get) from my bad 2TB drive. (I'm assuming this is from the command line; what's the best way to do that?)

 

And then I should be good as new.

 

 

If you want to throw away the current parity and lose any possible chance of recovery (even of the mangled data), then:

Capture a record of which drive serials are assigned where from the Main screen.

Do a New Config in Utils.

 

 

Add each data drive. PARITY IS LAST; do not add the parity drive until everything is verified.

 

 

You can start the array with only the data drives assigned, and verify that everything mounts and is where you expect it.

Once you are happy, you can stop the array, assign parity and let'er rip!

 

This throws away the modified parity.

 

Other options are:

1. Remove the failing drive, emulate it, and see what you can recover.

or

2. Replace the failing drive, let the corrupt parity rebuild the failed drive onto a new drive, and then attempt an fsck to retrieve the potentially unmangled data.

 

In order to run fsck on the array, it has to be in Maintenance mode: started but not mounted.
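
With the array started in Maintenance mode, the check runs against the md device rather than the raw disk, so parity stays consistent; per the syslog above, the problem disk is md1, so for example:

reiserfsck --check /dev/md1

(Same caveat as before: only move on to --rebuild-tree if the check tells you to.)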

 

If you can mount a USB or eSATA drive and rsync off what you can, you might be able to save some data ahead of time.

Link to comment

Ok, so this is how we ended up.

 

At some point during the process, it appears I triggered a parity check, which I noticed later and cancelled. Not sure if that helped or hurt.

 

I then backed up everything I could get from the bad 2TB drive; it appears that was most of it. This took quite a while.

 

I then shut down the server and put the new precleared 3TB drive in place of the 2TB drive.

 

Then I rebuilt the 2TB drive's data onto it using parity. (This took quite a while too.)

 

I still had reiserfs errors, so I ran a check and then had to do a rebuild-tree. (That also took a decent amount of time.)

 

There are quite a few files in the lost+found folder, but they all seem to work (at least from a quick check), and it appears the drive was rebuilt quite nicely too.

 

Even after all my screw-ups unraid still did a nice job of protecting my data.

 

Thanks to everyone for all their help and suggestions!  :D

 

 

 

Link to comment
  • 1 year later...

Sorry to revive the thread, but... I was wondering about BTRFS and parity. Is there a reason why the parity drive can't or won't handle BTRFS? It sounded like a good idea until I searched and found this thread...

 

It would only need to handle bitrot on parity: if I get bitrot on one of the array drives, the parity drive would write back the correct value and handle the bitrot on the array drives, no?

Link to comment

Sorry to revive the thread, but... I was wondering about BTRFS and parity. Is there a reason why the parity drive can't or won't handle BTRFS? It sounded like a good idea until I searched and found this thread...

 

It would only need to handle bitrot on parity: if I get bitrot on one of the array drives, the parity drive would write back the correct value and handle the bitrot on the array drives, no?

 

The parity drive doesn't have a file system, so the reason it "can't handle BTRFS" is that applying a file system to the parity drive invalidates the parity.

Link to comment

Sorry to revive the thread, but... I was wondering about BTRFS and parity. Is there a reason why the parity drive can't or won't handle BTRFS? It sounded like a good idea until I searched and found this thread...

 

It would only need to handle bitrot on parity: if I get bitrot on one of the array drives, the parity drive would write back the correct value and handle the bitrot on the array drives, no?

The parity drive does not work like that!  There is NO file system on the parity drive.  It works at the sector level and is independent of the file systems on the data disks.  You should read the description of how it works in the unRAID wiki.
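
A tiny worked example of what sector-level parity means (simplified to one byte per disk; real parity covers every sector position across all disks):

  disk1:  1010 0110
  disk2:  0110 0011
  parity: 1100 0101   (disk1 XOR disk2)

If disk2 dies, its byte is recovered as disk1 XOR parity = 1010 0110 XOR 1100 0101 = 0110 0011. That is all parity can do: reconstruct a missing disk from the others. It has no filesystem and no checksums, and no way to tell which disk's bit flipped, so it cannot detect or fix bitrot on its own.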

Link to comment

Yeah, I was afraid of that... I'm probably going to put a couple of drives aside for my "data" and "backup" and set them to BTRFS, and leave the rest (Movies/TV shows) as XFS, where bitrot would probably have no negative effect, maybe a pause or a click noise? I'm not too familiar with how unRAID handles bitrot in a streaming file...

 

Link to comment

At this point you should use XFS for everything.  As far as I know, nobody here has detected a single instance of bitrot.  If you are concerned about it, use the Dynamix File Integrity plugin to detect it, and restore from backups if you find it. Once unRAID 6.2 is available, there may be more options.

Link to comment
