OK, this looks bad. What do I do next?


Karyudo


That seems to have worked just like you said it would. Thanks! Now at 21.4 GB complete; estimated 20 hours and 44 minutes to rebuild.

 

Just to confirm: if it says it's rebuilding (and the percentage complete is rising), then I don't have to do anything about formatting, correct? It's recreating the disk bit by bit, so the formatting will happen implicitly as part of the rebuild process?

Link to comment

You're going to try btrfs restore first; it can take longer, but it's non-destructive. Disk 6 is almost empty, so you'll use it as the destination. With the array started, drop to the CLI and create a folder:

 

mkdir /mnt/disk6/restored

 

Then:

 

btrfs restore -v /dev/sdf1 /mnt/disk6/restored

 

If it works, you'll see the list of files being restored. It will take some time, as all the data from disk 5 (about 2 TB) is copied to disk 6. If it doesn't work, post all the output.
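
 

If the restore does run to completion, a quick sanity check on the result would be something like this (just a sketch; the paths follow from the commands above):

 

du -sh /mnt/disk6/restored                  # total size recovered, should come out around 2 TB
find /mnt/disk6/restored -type f | wc -l    # rough count of files restored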

 

Link to comment

The four words of your guidance that I don't understand: "drop to the CLI." I took a look at every tab in the GUI, and I don't see "CLI" or "console" or "command line" or anything of that sort.

 

Actually, that's not accurate: I understand all the words; I just don't know how or where to find the command line interface. I did a Google search, but all I got was old references to SSH and Telnet and stuff that's obviously irrelevant.

 

As an aside, one of the reasons I chose unRAID instead of something like Amahi was so that I didn't have to type a bunch of Linux commands to get stuff done. This is, of course, a special and specific case where I'm happy to poke around under the hood a bit. But you're going to have to dumb it way down, I'm afraid.

Link to comment

Same sort of result:

root@Shinagawa:~# btrfs restore -u 1 -v /dev/sdf1 /mnt/disk6/restored
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
No valid Btrfs found on /dev/sdf1
Could not open root, trying backup super
ERROR: superblock checksum mismatch
No valid Btrfs found on /dev/sdf1
Could not open root, trying backup super
root@Shinagawa:~# btrfs restore -u 2 -v /dev/sdf1 /mnt/disk6/restored
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
No valid Btrfs found on /dev/sdf1
Could not open root, trying backup super

Link to comment

root@Shinagawa:~# btrfs-show-super /dev/sdf1
superblock: bytenr=65536, device=/dev/sdf1
---------------------------------------------------------
csum                    0x7ad5cbd3 [DON'T MATCH]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    d46e6de7-79a4-4361-8f9c-e9cafe14eb1b
label
generation              2367
root                    2269025857536
sys_array_size          97
chunk_root_generation   2425
root_level              1
chunk_root              180224
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             8001563168768
bytes_used              2171806806016
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x161
                        ( MIXED_BACKREF |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
csum_type               0
csum_size               4
cache_generation        2367

 

root@Shinagawa:~# btrfs-show-super /dev/sdf1 -i 1
superblock: bytenr=67108864, device=/dev/sdf1
---------------------------------------------------------
csum                    0xdab4e31d [DON'T MATCH]
bytenr                  67108864
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    d46e6de7-79a4-4361-8f9c-e9cafe14eb1b
label
generation              2367
root                    2269025857536
sys_array_size          97
chunk_root_generation   2425
root_level              1
chunk_root              180224
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             8001563168768
bytes_used              2171806806016
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x161
                        ( MIXED_BACKREF |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
csum_type               0
csum_size               4
cache_generation        2367
uuid_tree_generation    2367
dev_item.uuid           449d6b3d-4445-4f28-a251-15c26d3cec92
dev_item.fsid           d46e6de7-79a4-4361-8f9c-e9cafe14eb1b [match]
dev_item.type           0
dev_item.total_bytes    8001563168768
dev_item.bytes_used     2208171032576
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0

 

root@Shinagawa:~# btrfs-show-super /dev/sdf1 -i 2
superblock: bytenr=274877906944, device=/dev/sdf1
---------------------------------------------------------
csum                    0x2733b52c [DON'T MATCH]
bytenr                  274877906944
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    d46e6de7-79a4-4361-8f9c-e9cafe14eb1b
label
generation              2367
root                    2269025857536
sys_array_size          97
chunk_root_generation   2425
root_level              1
chunk_root              180224
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             8001563168768
bytes_used              2171806806016
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x161
                        ( MIXED_BACKREF |
                          BIG_METADATA |
                          EXTENDED_IREF |
                          SKINNY_METADATA )
csum_type               0
csum_size               4
cache_generation        2367
uuid_tree_generation    2367
dev_item.uuid           449d6b3d-4445-4f28-a251-15c26d3cec92
dev_item.fsid           d46e6de7-79a4-4361-8f9c-e9cafe14eb1b [match]
dev_item.type           0
dev_item.total_bytes    8001563168768
dev_item.bytes_used     2208171032576
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0

 

Link to comment

Nope:

 

root@Shinagawa:~# btrfs restore -i -v /dev/sdf1 /mnt/disk6/restored
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
No valid Btrfs found on /dev/sdf1
Could not open root, trying backup super
ERROR: superblock checksum mismatch
No valid Btrfs found on /dev/sdf1
Could not open root, trying backup super
ERROR: superblock checksum mismatch
No valid Btrfs found on /dev/sdf1
Could not open root, trying backup super

Link to comment
root@Shinagawa:~# mkdir /x
root@Shinagawa:~# mount -o recovery,ro /dev/sdf1 /x
mount: wrong fs type, bad option, bad superblock on /dev/sdf1,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
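
 

For reference, the kernel-log check the error message points at would be something like this (the grep filter is just a convenience, not required):

 

dmesg | tail -n 20
dmesg | grep -i btrfs | tail -n 20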

Link to comment

Both of these options are destructive, so it's best to unassign that disk and start the array first. That way you can rebuild it again if one option doesn't work and you want to try the other, or a different option.

 

So after doing the above, you can try:

 

btrfs rescue chunk-recover /dev/sdf1

 

This will take a long time, as the whole disk is scanned.

 

If that doesn't work, and as a last resort:

 

btrfs check --repair /dev/sdf1

 

You can also wait a while to see if someone else has any more suggestions. The trouble with BTRFS is that the recovery tools are not very good, and no one here has much experience with them. BTW, was there a reason why you chose BTRFS for your data disks? Most here recommend XFS for the array.
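
 

A rough sketch of running the two commands in that order, with the output captured so you can post it here if either fails (the log paths on the flash drive are just examples):

 

btrfs rescue chunk-recover /dev/sdf1 2>&1 | tee /boot/chunk-recover.log
btrfs check --repair /dev/sdf1 2>&1 | tee /boot/check-repair.log    # only if chunk-recover fails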

 

Link to comment

It's sort of incredible to me that a power anomaly (not sure if it was a surge or a cut) so brief that my desktop was completely unaffected has screwed up my server so badly!

 

I'll unassign and get started shortly, after I eat some breakfast (I'm in Vancouver, PST time zone).

 

I chose BTRFS because I'd read (I forget where; probably not unRAID sources, it sounds like) that it's where file systems are headed, with more redundancy and robustness than anything before it.

https://www.youtube.com/watch?v=rX7wtNOkuHo

 

Thanks for your attention to this problem! I do hope we're able to get it sorted soon....

Link to comment

Well, that didn't take any time at all:

root@Shinagawa:~# btrfs rescue chunk-recover /dev/sdf1
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
read super block error
recover prepare error
Chunk tree recovery failed

Link to comment

Neither did this:

root@Shinagawa:~# btrfs check --repair /dev/sdf1
enabling repair mode
ERROR: superblock checksum mismatch
ERROR: superblock checksum mismatch
No valid Btrfs found on /dev/sdf1
Couldn't open file system

 

I am getting seriously disillusioned with the so-called redundancy that unRAID apparently does not actually have, even though it says it does. I had TWO parity drives, and still this is a huge fail?!?!? From a power problem that didn't affect the other three PCs in the house?!?! (And what the hell did the "successful" parity check and repair do for 14 or 15 hours?!?)

 

Not impressed. AT ALL. This is the third supposedly-robust server I've had (original WHS, WHS 2011 with Drive Bender, and now unRAID). All have failed, with data loss; unRAID has promised the most and failed the hardest.

 

(I am, however, impressed with the forum community, and your help especially, johnnie.black!)

Link to comment

unRAID protects you from a disk failure (or two, with dual parity), not from file system corruption; since parity is updated along with the corruption, any rebuild will have the same problem.

 

You can try mounting the old disk 5 with the Unassigned Devices plugin; if it mounts (or the BTRFS restore works), you should be able to recover most of the data.

 

When this is over, I recommend converting your data disks to XFS.
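
 

If you want to check it from the CLI as well as the plugin, a read-only mount attempt would look roughly like this. The device name is a placeholder: the old disk 5 will show up under a different name than the rebuilt disk, so check lsblk first.

 

lsblk -o NAME,SIZE,MODEL                # find which device the old disk 5 is
mkdir -p /mnt/olddisk5
mount -o ro /dev/sdX1 /mnt/olddisk5     # replace sdX1 with the right partition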

Link to comment

I'll try mounting the original Disk 5 with the UD plugin. If it mounts (and it disappoints me to think that it probably won't), then how do I recover the data? Does that take PuTTY again, or is there a plugin?

 

How does one go about converting to XFS? I imagine that takes an extra drive, and 14+ hours per 8TB drive? Mount a new disk, format it XFS, and then...?

 

I have to admit that while I'm hugely appreciative of your time and effort on my behalf, I am really, REALLY angry and disappointed at the moment. I thought I had a nice, new, near-perfect, robust system set up, and I'd be able to enjoy the holidays catching up on TV and movies with my wife. Instead, it turns out that when push comes to shove, my new system is just as sh!tty as any sh!tty thing I've ever had before. Somehow, most other people seem to be able to run years on half-assed setups without issue; in my case, my longest run without problems with unRAID or the new hardware it's running on has been about seven weeks. Now, I understand that's not all unRAID's fault. Stuff happens—I get it. But at the moment, since stuff seems to happen to my servers at a frequency far outside of the norm, I've got a major case of the "WHY ME?"s.

Link to comment

I'll try mounting the original Disk 5 with the UD plugin. If it mounts (and it disappoints me to think that it probably won't), then how do I recover the data? Does that take PuTTY again, or is there a plugin?

 

If the disk mounts (before or after repairing), you can access it like any other share from your desktop.

 

 

How does one go about converting to XFS? I imagine that takes an extra drive, and 14+ hours per 8TB drive? Mount a new disk, format it XFS, and then...?

 

See the thread below for help converting from reiserfs to XFS; the procedure is the same.

 

http://lime-technology.com/forum/index.php?topic=37490.0
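
 

Very roughly, the procedure boils down to: format an empty disk as XFS, copy the data from one of the BTRFS disks onto it, then reuse the freed disk for the next conversion. The copy step is typically an rsync along these lines (the disk numbers here are just an example):

 

rsync -avPX /mnt/disk5/ /mnt/disk6/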

 

 

Link to comment

Some success, I think: the old Disk 5 (-P64M) has mounted!

 

Do I now start the array without anything assigned in the now-empty Disk 5 slot? (I guess I probably do, in order to see any shares...?)

 

Is there any reason to not copy files from old Disk 5 to the (now unprotected) array?

 

Since the Disk 5 rebuild didn't work, I guess that physical disk (-LCHC) can be formatted and added to the array? Is there a way to skip preclearing and/or clearing (since you mentioned every single sector of the 8 TB was written to as part of the rebuild)?

 

I also imagine this means that Parity (currently no disk assigned) and Parity 2 (-GBMN) are useless from here on out, and the next step after copying files would be to run a parity check/update?

Link to comment
