--rebuild-tree not working?

newoski · June 27, 2016

Hi Guys,

My saga continues, however, I wanted to start a new thread since the last one was all over the place. After swapping PCI slots, things seem to have gotten better. 6 of the 7 unmountable drives mounted. The 7th is md11, which is still red balled. I ran a disk check on it and it told me that the previous --rebuild-tree didn't complete. I tried running it again. It ran for about 6 hours, but still doesn't seem to be complete.

Here's the full rebuild tree output:

https://pastebin.com/mkQTUarr

Weirder still, once that process completed, or didn't, disk10 popped up with 127 read errors while the server was in Maintenance mode...

So confused...

trurl · June 28, 2016

Not sure starting a new thread is helpful, but you should have at least included a link to any other threads you have posted that might be relevant.

RobJ · June 28, 2016

If you're starting a new thread, you have to assume we don't know anything about your situation (and I don't), and provide your diagnostics (Need help? Read me first!), especially when you are going to mention unmountable drives, red balls, and using the --rebuild-tree option.

If --rebuild-tree didn't finish, then there must be a hardware issue, and you have to solve that first, no point in retrying. At the bottom of the report, it indicates a bad block, so that has to be fixed first.

newoski · June 28, 2016

Thanks very much for clarifying and my apologies for the confusion with the new thread. That said, I've attached my Diagnostics zip. I'm unfamiliar with bad blocks. How exactly should I proceed? Do I simply run disk check again with the -B parameter?

I believe I already resolved the hardware issues unrelated to the drive itself -- pinpointed to a bad PCI slot for the expansion card. I moved the card to a new slot and that fixed all my unmountable drives, except this one, which is red balled.

tower-diagnostics-20160628-1310-bad-block.zip

JorgeB · June 28, 2016

You're using reiserfsck on an emulated disk. unRAID reads all the other disks plus parity to create its content, so if disk10 or any other disk has errors, reiserfsck will fail. You could still be having issues with the controller.
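
To illustrate why an error on another disk breaks a check of the emulated one, here is a minimal sketch assuming simple single XOR parity. The disk names, block contents, and the `emulate` helper are hypothetical; this is a toy model of the idea, not unRAID's actual implementation.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical array: three data disks holding one 4-byte block each.
disks = {
    "disk9": b"\x10\x20\x30\x40",
    "disk10": b"\x0a\x0b\x0c\x0d",
    "disk11": b"\xde\xad\xbe\xef",
}
# Parity is the XOR of every data disk.
parity = xor_blocks(list(disks.values()))

def emulate(missing, disks, parity, read_errors=()):
    """Rebuild the block of `missing` from parity plus every other disk."""
    others = []
    for name, block in disks.items():
        if name == missing:
            continue  # the disabled disk is never read
        if name in read_errors:
            # One unreadable source is enough to make the rebuilt data unavailable.
            raise IOError(f"read error on {name}; cannot emulate {missing}")
        others.append(block)
    return xor_blocks(others + [parity])

# With every other disk healthy, the emulated content matches the real disk11.
assert emulate("disk11", disks, parity) == disks["disk11"]
```

Calling `emulate("disk11", disks, parity, read_errors=("disk10",))` raises instead of returning data, which mirrors how a filesystem check on an emulated disk can fail because of a different, physically failing disk.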

Edit to add:

Just looked at your diags and Disk10 has pending sectors, so it needs to be replaced. The problem is that you already have a disabled disk.
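
For anyone wanting to check the same attribute themselves, the sketch below pulls the raw Current_Pending_Sector value out of a SMART attribute table as printed by `smartctl -A`. The sample text is illustrative only and is not taken from the attached diagnostics; the `pending_sectors` helper is a hypothetical name.

```python
def pending_sectors(smartctl_output):
    """Return the raw Current_Pending_Sector value, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; the raw value is the last one.
        if len(fields) >= 10 and fields[1] == "Current_Pending_Sector":
            return int(fields[9])
    return None

# Illustrative sample in the usual attribute-table layout (values are made up).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       8
"""

print(pending_sectors(SAMPLE))
```

A non-zero result means the drive has sectors it could not read and is waiting to remap, which is the condition described above for Disk10.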

newoski · June 28, 2016

So after fixing the controller -- to the best of my knowledge -- I've run check disk on all drives. With the exception of Disk11, all have passed with No Corruptions Found... Where should I go from here?

Edit to add:

I understand I'm in a tight spot, I'm just unclear on how to proceed. If Disk11 is unrecoverable, so be it. If there's something I should do to further test the controller, let me know. If I should be doing something about Disk11 and those bad blocks, let me know. I'm just unclear on how to proceed in all regards... I believe my parity is shot to hell at this point...

JorgeB · June 28, 2016

You can try to mount and fix the actual disk11: mount it with the Unassigned Devices plugin and, if needed, run reiserfsck directly on that disk, since its SMART report looks OK.

If that's successful, you can move as much data as you can from disk10 to other disks, if you have the space. After that you can do a new config without disk10 (or with a new disk in its place).

newoski · June 28, 2016

Hmmmm So here's what I tried:

1. Stop Array

2. Unmount Disk11

3. Start Array

4. Mount Disk11 via Unassigned Devices plugin. It shows up as Hitachi...

5. Try to read disk. I get a Windows Security pop up asking for Network Password to connect to TOWER. I try root / blank but it doesn't accept those credentials...

- I'm able to read from all other mounted drives

newoski · June 28, 2016

How do I go about running reiserfsck directly on disk11? I've got data transferring from disk10 to the rest of the array now.

newoski · June 28, 2016

Data transfer complete. Luckily, I have most of the data for this disk stored elsewhere. Now I just need an explanation of how to run reiserfsck on the unmounted Disk11.

JorgeB · June 28, 2016

You have to use the unassigned devices plugin, or temporarily assign it to your cache slot (make sure filesystem is set to reiser), then:

reiserfsck --check /dev/sdX1

X=assigned letter

newoski · June 28, 2016

OK, it's running now. Isn't this going to give me the same error, eventually, about the bad blocks?

JorgeB · June 28, 2016

It shouldn't, the previous errors were from disk10, now it's only reading disk11.

newoski · June 28, 2016

Sorry, so close but still a bit confused...

I emptied the contents off of Drive10. That's still assigned at the moment. Drive11 is red balled.

How do I go about removing both drive 10 and 11 from the array? It gives me errors about too many wrong and/or missing disks. I understand the logic of what we're doing, just uncertain about the right way to do it...

Do I remove Drive 11, and then create a New Config... and then stop the array, remove Drive 10, and create a new config again?

JorgeB · June 28, 2016

If reiserfsck recovers disk11, and only after that, you have to do a new config (Tools > New config) and reassign all disks (take a screenshot first), including the hopefully recovered disk11 and leaving out disk10 (or using a new disk in its place). A parity sync will begin when you first start the array.

newoski · June 28, 2016

So we're hoping that by simply removing the bad data from disk 10, the check on disk 11 will complete?

JorgeB · June 28, 2016

You're using reiserfsck on an emulated disk. unRAID reads all the other disks plus parity to create its content, so if disk10 or any other disk has errors, reiserfsck will fail...

You are now checking the actual disk, not the emulated one like before.

newoski · June 28, 2016

So interestingly, no errors were found when I ran a disk check on disk11 via /dev/sds1.

What do I do now? It still says unmountable in the GUI. Should I run rebuild tree?

root@Tower:~# reiserfsck --check /dev/sds1
reiserfsck 3.6.24
Will read-only check consistency of the filesystem on /dev/sds1
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Tue Jun 28 15:53:41 2016
###########
Replaying journal: Done.
Reiserfs journal '/dev/sds1' in blocks [18..8211]: 0 transactions replayed
Checking internal tree.. finished
Comparing bitmaps..finished
Checking Semantic tree:
finished
No corruptions found
There are on the filesystem:
        Leaves 481945
        Internal nodes 2980
        Directories 114175
        Other files 180943
        Data block pointers 462166750 (0 of them are zero)
        Safe links 0
###########
reiserfsck finished at Tue Jun 28 16:16:39 2016

JorgeB · June 28, 2016

Does it mount?

newoski · June 28, 2016

No. Should I run rebuild tree?

EDIT: It gives me the same problem when I mount it via Unassigned Devices. Trying to access the drive in Windows prompts me for a username/password, but root etc. doesn't work.

JorgeB · June 28, 2016

Where is the disk, cache or unassigned devices plugin?

newoski · June 28, 2016

Unassigned Devices

newoski · June 28, 2016

CACHE WORKED! Woohooo. Hoping this is progress??

What do I do now? Copy all the data off the drive and then shrink the array??

JorgeB · June 28, 2016

Strange, it should also mount in UD, but never mind, now you can do a new config:

-take a screenshot of current array

-tools > new config

-reassign all disks including disk11 (leave out old disk10), double check parity disk is in the parity slot

-start array to begin parity sync

newoski · June 28, 2016

Wait, what? The former disk11 is assigned as my cache disk right now. I can't assign it to disk11. What am I missing? Shouldn't I copy all the data off it -- while assigned as the cache drive, first -- and then essentially just assign it as a new disk, once I shrink the array to remove disk 10 and the drive formerly known as disk11?
