--rebuild-tree not working?


Hi Guys,

 

My saga continues, however, I wanted to start a new thread since the last one was all over the place. After swapping PCI slots, things seem to have gotten better. 6 of the 7 unmountable drives mounted. The 7th is md11, which is still red balled. I ran a disk check on it and it told me that the previous --rebuild-tree didn't complete. I tried running it again. It ran for about 6 hours, but still doesn't seem to be complete.

 

Here's the full rebuild tree output:

 

https://pastebin.com/mkQTUarr

 

Weirder still, once that process completed, or didn't, disk10 popped up with 127 read errors while the server was in Maintenance mode...

 

So confused...

Link to comment

If you're starting a new thread, you have to assume we don't know anything about your situation (and I don't), and provide your diagnostics (Need help? Read me first!), especially when you are going to mention unmountable drives, red balls, and using the --rebuild-tree option.

 

If --rebuild-tree didn't finish, then there must be a hardware issue, and you have to solve that first; there's no point in retrying. The bottom of the report indicates a bad block, so that has to be fixed first.

Link to comment

Thanks very much for clarifying and my apologies for the confusion with the new thread. That said, I've attached my Diagnostics zip. I'm unfamiliar with bad blocks. How exactly should I proceed? Do I simply run disk check again with the -B parameter?

 

I believe I already resolved the hardware issues unrelated to the drive itself -- pinpointed to a bad PCI slot for the expansion card. I moved the card to a new slot and that fixed all my unmountable drives, except this one, which is red balled.

tower-diagnostics-20160628-1310-bad-block.zip

Link to comment

You're running reiserfsck on an emulated disk. unRAID reads all the other disks plus parity to create its content, so if disk10 or any other disk has errors, reiserfsck will fail. You could also still be having issues with the controller.
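For anyone following along, here is a rough sketch of why errors on another disk break an emulated one. It assumes single XOR parity and uses made-up one-byte "disks" (`disk1`, `disk2`, `disk3` and the helper `xor_all` are hypothetical names for illustration only); a real array does this sector by sector across whole drives.

```shell
#!/bin/bash
# XOR together every value passed as an argument (single-parity arithmetic).
xor_all() {
    local result=0 value
    for value in "$@"; do
        result=$(( result ^ $value ))
    done
    echo "$result"
}

# Hypothetical one-byte "disks"; not real data from this thread.
disk1=0x0F
disk2=0xF0
disk3=0x3C   # stands in for the disabled (emulated) disk

# Parity is the XOR of all data disks.
parity=$(xor_all "$disk1" "$disk2" "$disk3")

# Emulation: rebuild disk3 from parity plus the surviving disks.
emulated=$(xor_all "$parity" "$disk1" "$disk2")

# Same rebuild when disk2 hands back a wrong byte (simulated bad read).
corrupted=$(xor_all "$parity" "$disk1" 0xF1)

printf 'parity=%d emulated=%d with_bad_read=%d\n' "$parity" "$emulated" "$corrupted"
```

The emulated value only matches the original disk3 byte when every other disk reads back correctly, which is why a filesystem check on the emulated disk can't be trusted while another disk is throwing read errors.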

 

Edit to add:

Just looked at your diags: Disk10 has pending sectors, so it needs to be replaced. The problem is that you already have a disabled disk.
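If you want to check the pending sector count yourself rather than digging through the diagnostics zip, something like the sketch below should work. It assumes `smartctl` (from smartmontools) is available and that the drive reports the usual `Current_Pending_Sector` attribute in the standard ten-column table; `pending_sectors` is a hypothetical helper name and `/dev/sdX` is a placeholder for the real device.

```shell
#!/bin/bash
# Read `smartctl -A` output on stdin and print the raw value of the
# Current_Pending_Sector attribute (10th column of the attribute table).
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}

# Usage on a real system (sdX is a placeholder, not a device from this thread):
#   smartctl -A /dev/sdX | pending_sectors
```

Anything other than 0 means the drive has sectors it could not read and is waiting to remap. Some drives format the table differently, so treat an empty result as "check the full SMART report by hand".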

Link to comment

So after fixing the controller -- to the best of my knowledge -- I've run check disk on all drives. With the exception of Disk11, all have passed with No Corruptions Found... Where should I go from here?

 

Edit to add:

I understand I'm in a tight spot, I'm just unclear on how to proceed. If Disk11 is unrecoverable, so be it. If there's something I should do to further test the controller, let me know. If I should be doing something about Disk11 and those bad blocks, let me know. I'm just unclear on how to proceed in all regards... I believe my parity is shot to hell at this point...

Link to comment

You can try to mount and fix the actual disk11: mount it with the Unassigned Devices plugin and, if needed, run reiserfsck directly on that disk, since its SMART report looks OK.

 

If that's successful, you can move all the data you can from disk10 to other disks, if you have the space. After that you can do a new config without disk10 (or with a new one in its place).

Link to comment

Hmmmm So here's what I tried:

 

1. Stop Array

2. Unmount Disk11

3. Start Array

4. Mount Disk11 via Unassigned Devices plugin. It shows up as Hitachi...

5. Try to read disk. I get a Windows Security pop up asking for Network Password to connect to TOWER. I try root / blank but it doesn't accept those credentials...

- I'm able to read from all other mounted drives

 

Link to comment

How do I go about running reiserfsck directly on disk11? I've got data transferring from disk10 to the rest of the array now.

Link to comment

Data transfer complete. Luckily, I have most of the data for this disk stored elsewhere. Now I just need an explanation of how to run reiserfsck on the unmounted Disk11.

Link to comment

You have to use the Unassigned Devices plugin, or temporarily assign it to your cache slot (make sure the filesystem is set to ReiserFS), then:

 

reiserfsck --check /dev/sdX1

 

X = the drive letter assigned to the disk
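To avoid typos in the device name, a small helper like the one below can build the read-only check command from the drive letter. This is just a sketch: `reiserfsck_check_cmd` is a hypothetical name, it assumes the filesystem lives on the first partition as in the command above, and it only prints the command instead of running it.

```shell
#!/bin/bash
# Build the read-only check command for a given drive letter
# (e.g. "s" -> /dev/sds1). Prints the command; does not run reiserfsck.
reiserfsck_check_cmd() {
    local letters=$1
    case $letters in
        ''|*[!a-z]*)
            echo "expected lowercase drive letter(s), got '$letters'" >&2
            return 1
            ;;
    esac
    echo "reiserfsck --check /dev/sd${letters}1"
}

# Example with a made-up letter; substitute the one your system shows.
reiserfsck_check_cmd s
```

The letter is whatever the system currently assigns to the disk (the Unassigned Devices page shows it, and `ls -l /dev/disk/by-id/` should too). It can change between boots, so confirm it before each run.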

Link to comment

OK, it's running now. Isn't this going to give me the same error, eventually, about the bad blocks?

Link to comment

It shouldn't. The previous errors were from disk10; now it's only reading disk11.

 

Sorry, so close but still a bit confused...

 

I emptied the contents off of Drive10. That's still assigned at the moment. Drive11 is red balled.

 

How do I go about removing both drive 10 and 11 from the array? It gives me errors about too many wrong and/or missing disks. I understand the logic of what we're doing, just uncertain about the right way to do it...

 

Do I remove Drive 11, and then create a New Config... and then stop the array, remove Drive 10, and create a new config again?

Link to comment

If reiserfsck recovers disk11, and only after that, you have to do a new config (Tools > New config) and reassign all disks (take a screenshot before), including the hopefully recovered disk11 and leaving out disk10 (or using a new one in its place). A parity sync will begin when you first start the array.

 

Link to comment

So we're hoping that by simply removing the bad data from disk 10, the check on disk 11 will complete?

 


Link to comment

So interestingly, there were no errors found when I did a disk check on disk11 via /dev/sds1

 

What do I do now? It still says unmountable in the GUI. Should I run rebuild tree?

 

root@Tower:~# reiserfsck --check /dev/sds1
reiserfsck 3.6.24

Will read-only check consistency of the filesystem on /dev/sds1
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Tue Jun 28 15:53:41 2016
###########
Replaying journal: Done.
Reiserfs journal '/dev/sds1' in blocks [18..8211]: 0 transactions replayed
Checking internal tree..  finished
Comparing bitmaps..finished
Checking Semantic tree:
finished
No corruptions found
There are on the filesystem:
        Leaves 481945
        Internal nodes 2980
        Directories 114175
        Other files 180943
        Data block pointers 462166750 (0 of them are zero)
        Safe links 0
###########
reiserfsck finished at Tue Jun 28 16:16:39 2016

 

 

Link to comment

Strange, it should also mount in UD, but never mind, now you can do a new config:

 

- Take a screenshot of the current array
- Tools > New config
- Reassign all disks including disk11 (leave out old disk10), and double check the parity disk is in the parity slot
- Start the array to begin the parity sync

Link to comment

Wait, what? The former disk11 is assigned as my cache disk right now. I can't assign it to disk11. What am I missing? Shouldn't I copy all the data off it -- while assigned as the cache drive, first -- and then essentially just assign it as a new disk, once I shrink the array to remove disk 10 and the drive formerly known as disk11?

Link to comment
