Failed drive report during replacement of another drive


OK, I got to the console and saw what's in the attached picture1; I believe it came up after one of the initial steps. So I PuTTY'd in, was able to log in that way, and got the output shown in picture2.

 

Edit: I should add that the console shown in the top picture didn't respond to anything I did; I could not get a login prompt, etc.

 

picture1.JPG

picture2.JPG

Edited by talmania
Clarification
Link to comment

The superblock is damaged, which is not normal. It could be the result of the problems with the original disk9, or of changes made to the array after the disk13 upgrade attempt; any change, such as dockers writing to the array, will cause problems.

 

You can still rebuild the superblock and see how it goes, follow the instructions from here:

 

https://wiki.lime-technology.com/Check_Disk_Filesystems#Rebuilding_the_superblock

Link to comment

OK, thanks johnnie. I'm following the wiki and crossing my fingers. I didn't change the array status at all; I hope that was correct. Current status:

 

root@Deed:~# reiserfsck --rebuild-sb /dev/md9
reiserfsck 3.6.24
Will check superblock and rebuild it if needed
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
Did you use resizer(y/n)[n]: n
rebuild-sb: wrong block count occured (854657433), fixed (488378624)
rebuild-sb: wrong bitmap number occured (26083), fixed (14905)
rebuild-sb: wrong free block count occured (791198636), zeroed
Reiserfs super block in block 16 on 0x909 of format 3.6 with standard journal
Count of blocks on the device: 488378624
Number of bitmaps: 14905
Blocksize: 4096
Free blocks (count of blocks - used [journal, bitmaps, data, reserved] blocks): 0
Root block: 130410427
Filesystem is clean
Tree height: 5
Hash function used to sort names: "r5"
Objectid map size 8, max 972
Journal parameters:
        Device [0x0]
        Magic [0x63a76705]
        Size 8193 blocks (including 1 for journal header) (first block 18)
        Max transaction length 1024 blocks
        Max batch size 900 blocks
        Max commit age 30
Blocks reserved by journal: 0
Fs state field: 0x1:
         some corruptions exist.
sb_version: 2
inode generation number: 2232
UUID: a150194f-d4db-4f12-8220-a6300f0b8386
LABEL:
Set flags in SB:
        ATTRIBUTES CLEAN
Mount count: 205
Maximum mount count: 30
Last fsck run: Fri Nov  5 23:24:38 2010
Check interval in days: 180
Is this ok ? (y/n)[n]: y
The fs may still be unconsistent. Run reiserfsck --check.
root@Deed:~# reiserfsck --check
Usage: reiserfsck [mode] [options]  device
Modes:
  --check                       consistency checking (default)
  --fix-fixable                 fix corruptions which can be fixed without
                                --rebuild-tree
  --rebuild-sb                  super block checking and rebuilding if needed
                                (may require --rebuild-tree afterwards)
  --rebuild-tree                force fsck to rebuild filesystem from scratch
                                (takes a long time)
  --clean-attributes            clean garbage in reserved fields in StatDatas
Options:
  -j | --journal device         specify journal if relocated
  -B | --badblocks file         file with list of all bad blocks on the fs
  -l | --logfile file           make fsck to complain to specifed file
  -n | --nolog                  make fsck to not complain
  -z | --adjust-size            fix file sizes to real size
  -q | --quiet                  no speed info
  -y | --yes                    no confirmations
  -f | --force          force checking even if the file system is marked clean
  -V                            prints version and exits
  -a and -p                     some light-weight auto checks for bootup
  -r                    ignored
Expert options:
  --no-journal-available        do not open nor replay journal
  -S | --scan-whole-partition   build tree of all blocks of the device
root@Deed:~# reiserfsck --check /dev/md9
reiserfsck 3.6.24
Will read-only check consistency of the filesystem on /dev/md9
Will put log info to 'stdout'
Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Tue Aug  1 14:18:50 2017
###########
Replaying journal: Done.
Reiserfs journal '/dev/md9' in blocks [18..8211]: 0 transactions replayed
EDIT:  It's continuing to run--worried that it hung up there on 0 transactions replayed.
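
For what it's worth, the numbers rebuild-sb printed can be sanity-checked with a little arithmetic. This is only a sketch: it assumes ReiserFS tracks one bit per block, so that one 4096-byte bitmap block covers 4096 * 8 = 32768 blocks. The variable and function names are made up for illustration; the block counts are copied from the output above.

```shell
#!/bin/sh
# Values copied from the rebuild-sb output above.
block_size=4096
fixed_blocks=488378624
wrong_blocks=854657433

# Assumption: one bitmap block holds one bit per filesystem block,
# so it covers block_size * 8 blocks.
blocks_per_bitmap=$((block_size * 8))

# Ceiling division: bitmap blocks needed to cover $1 filesystem blocks.
bitmaps_for() {
    echo $(( ($1 + blocks_per_bitmap - 1) / blocks_per_bitmap ))
}

fixed_bitmaps=$(bitmaps_for "$fixed_blocks")
wrong_bitmaps=$(bitmaps_for "$wrong_blocks")
size_bytes=$((fixed_blocks * block_size))

echo "bitmaps for fixed block count: $fixed_bitmaps"
echo "bitmaps for wrong block count: $wrong_bitmaps"
echo "device size in bytes:          $size_bytes"
```

Under that assumption the fixed block count works out to 14905 bitmaps and the wrong one to 26083, the same pair rebuild-sb reported, and the fixed block count times the block size is 2000398843904 bytes, i.e. a 2 TB device.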
 
Edited by talmania
More info
Link to comment

reiserfsck will probably still find errors, but they should be fixable, and since the replacement disk is new you won't overwrite anything. You might as well rebuild first and then finish fixing the filesystem, especially if it needs --rebuild-tree, since that would take much longer on the emulated disk.

 

Keep the old disk9 intact for now; some data (or most of it, with some luck) should still be salvageable if needed.

Link to comment

Evening update:  came back home from the office and found the following in the summary of the log:


 

Quote

 

Replaying journal: Done.
Reiserfs journal '/dev/md9' in blocks [18..8211]: 0 transactions replayed

Zero bit found in on-disk bitmap after the last valid bit.
Checking internal tree..  /  1 (of  19)/  1 (of  93)/  1 (of  87)block 8211: The number of items (6) is incorrect, should be (1)
 the problem in the internal node occured (8211), whol/ 36 (of  93)/142 (of 170)block 195496755: The level of the node (40014) is not correct, (1) expected
 the problem in the internal node occured/  9 (of  19)/130 (of 130)/114 (of 115)bad_stat_data: The objectid (841) is marked free, but used by an object [834 841 0x0 SD (0)]                             finished
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
2 found corruptions can be fixed only when running with --rebuild-tree
###########
reiserfsck finished at Tue Aug  1 16:08:26 2017
###########
root@Deed:~#
root@Deed:~#

 

 

Then I unmounted the array, assigned the new disk9 and brought the array online and now I see the following:

 

 

picture3.JPG

Link to comment
17 hours ago, johnnie.black said:

 

Wait until the rebuild finishes, then run reiserfsck --rebuild-tree; that will take several hours, but it should fix it.

OK, I ran it and it completed, then I had to stop the array and restart it. It allowed disk9 to be mounted and I can now see the disk share, but nothing is there except "lost and found". I can't browse the share from Windows but can in the GUI. There are tons of folders in there, and files as well, of roughly the correct sizes I presume. I assume I have to open them up to find out what they are? They are named with simple numeric sequences--see the attached picture.

 

Attached is the output of the --rebuild-tree.

 

 

 


 

picture4.JPG

rebuild-tree.txt

Link to comment
2 minutes ago, talmania said:

 

Nope...sounds like I need to be!

Yep. I'd advise researching a little, but in a nutshell, don't copy between /mnt/diskX locations and /mnt/user locations or vice-versa. Use either disk only or user only locations in copy operations, don't mix them.

 

I only piped up because you mentioned accessing disk9 directly. If you want to copy using disk9 as a source, make sure your destination is also disk9 but another folder, or another diskX, not a user share.

 

It doesn't affect all operations, but until you understand why it happens and what exactly causes your files to disappear if you do it wrong, just don't do it.
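
A minimal sketch of that rule as a pre-flight check, in case it helps. It is purely textual: it only looks at whether a path starts with /mnt/diskN or /mnt/user, and the function names (path_kind, safe_pair) are hypothetical helpers, not anything that ships with Unraid.

```shell
#!/bin/sh
# Classify a path as "disk" (/mnt/diskN...), "user" (/mnt/user...), or "other".
path_kind() {
    case "$1" in
        /mnt/user|/mnt/user/*) echo user ;;
        /mnt/disk[0-9]*)       echo disk ;;
        *)                     echo other ;;
    esac
}

# Succeed (return 0) only if SRC and DST do not mix disk and user locations.
safe_pair() {
    src_kind=$(path_kind "$1")
    dst_kind=$(path_kind "$2")
    if [ "$src_kind" = disk ] && [ "$dst_kind" = user ]; then
        return 1
    fi
    if [ "$src_kind" = user ] && [ "$dst_kind" = disk ]; then
        return 1
    fi
    return 0
}
```

For example, `safe_pair /mnt/disk9/lost+found /mnt/disk9/sorted` succeeds, while `safe_pair /mnt/disk9/lost+found /mnt/user/sharename` fails, so a copy could be guarded with `safe_pair "$src" "$dst" && cp -a "$src" "$dst"`.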

Link to comment
Just now, jonathanm said:

Yep. I'd advise researching a little, but in a nutshell, don't copy between /mnt/diskX locations and /mnt/user locations or vice-versa. Use either disk only or user only locations in copy operations, don't mix them.

 

I only piped up because you mentioned accessing disk9 directly. If you want to copy using disk9 as a source, make sure your destination is also disk9 but another folder, or another diskX, not a user share.

 

It doesn't affect all operations, but until you understand why it happens and what exactly causes your files to disappear if you do it wrong, just don't do it.

 

Actually, Windows gave me a permissions error, and when I started poking around I realized I was under disk9/lost+found and not \\tower\lost+found. I'm assuming all I have to do is move the files from \\tower\lost+found to their respective \\tower\sharename and I'll be set, no? Or is a more complex move needed? That would be user share to user share, if I'm not mistaken...

Link to comment
