Dying Hard drive??


i had a disk_dsbl error a few weeks ago (syslog below, 140307), tried all the suggested things and got it back up and running without issues, or so i thought.

 

(came back home last night and found sab not running and newznab+ not updating.

after some looking thru the syslog and such, noticed mysql crashed, damaging the NN db. nevermind those though lol. )

 

what i found in the syslog while looking thru it was a bunch of errors again regarding my drive 4 which was the one that disabled itself a few weeks ago.

 

so again i've read thru the forum and got to the reiserfsck instructions. followed those also and got a "Bad root block 0. (--rebuild-tree did not complete)" in the end

 

and now i'm scared to keep pushing buttons and figure it would be better to ask the professionals for help :)

 

i've got all my logs, from the initial DISK_DSBL until now which i've attached, the date is in front, for sequencing reasons. drive 4 is the .....7475 serial number drive if browsing the syslog

 

my guess is it's time for a new drive, or an RMA if possible..

140307-logs-DISK_DSBL.zip

140318-logs-drive-errors.zip

Link to comment

and now i'm scared to keep pushing buttons and figure it would be better to ask the professionals for help :)

OMG! I wish my boss was like that...everytime he gets frustrated and can't figure something out, he just starts randomly clicking his mouse on random buttons on the screen..."I'm Sure One of These Will Do What I Want!"  click - -  click - - - click

Link to comment

Lol, oh I know that all too well from work. Except it's not just a computer, but entire milling machine. Gotta love computer challenged coworkers lol

 


Link to comment

The SMART report looks good. The previous errors indicate a bad or loose SATA cable or SATA connection, port, or adaptor. See Check Disk Filesystems in my sig.

 

i've tried the check disk filesystem page your sig links to, and i've attached a copy of the screen in the second file.

 

short story ... this is what i get

something tells me that "bad root block 0"cant be good?

 

root@unRaid:~# reiserfsck --check /dev/sdb
reiserfsck 3.6.24

Will read-only check consistency of the filesystem on /dev/sdb
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Wed Mar 19 03:09:08 2014
###########
Replaying journal: No transactions found
Zero bit found in on-disk bitmap after the last valid bit.
Checking internal tree..

Bad root block 0. (--rebuild-tree did not complete)

Aborted (core dumped)

 

Link to comment

To do a reiserfsck check you should ideally have the array started in maintenance mode and then check the /dev/md? device where ? corresponds to the disk number in the array.  Doing it that way allows any parity to be maintained.

 

If you MUST check using the raw device names (which invalidates any parity) then you should be checking partition 1 (e.g. /dev/sdb1) and not the disk as a whole as you seem to have tried.
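As a rough sketch of the advice above: the array disk number maps directly onto the /dev/md device name, so disk 4 is checked as /dev/md4. The helper names `fsck_target` and `check_command` below are made up for illustration, and the script only prints the command for review instead of touching any disk. It assumes the array is already started in maintenance mode.

```shell
#!/bin/sh
# Hypothetical helpers (not part of unRAID): build the reiserfsck command for
# an array disk number. Nothing is run against a disk; the command is only
# printed so it can be checked before typing it in.

fsck_target() {
    disk_number="$1"
    case "$disk_number" in
        # Reject empty or non-numeric input such as "sdb".
        ''|*[!0-9]*) echo "usage: fsck_target <array disk number>" >&2; return 1 ;;
    esac
    # /dev/mdN is the device for array disk N; checking it (rather than the
    # raw /dev/sdX device) allows parity to be maintained.
    echo "/dev/md${disk_number}"
}

check_command() {
    target=$(fsck_target "$1") || return 1
    echo "reiserfsck --check ${target}"
}

check_command 4
```

For disk 4 this prints the read-only check command against /dev/md4. If the raw device really has to be used, the target would be partition 1 (e.g. /dev/sdb1), not the whole disk.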

Link to comment


it is in maintenance mode, i just didnt understand the md? portion clearly.

tnx for the help.

check finished and said to run rebuild-tree.

we'll see what happens when i get back from work, running for the last hour or so, 2TB drive, should be done when i get back.

Link to comment

Looks like it completed with success...

 

there were 2 files in the lost+found, but nothing worth keeping.

so i deleted the directory, stopped array and restarted array in regular mode. lets see what happens....

 

tnx for the help.

 

i'll post back if anything changes

 

root@unRaid:~# reiserfsck --rebuild-tree /dev/md4
reiserfsck 3.6.24

*************************************************************
** Do not  run  the  program  with  --rebuild-tree  unless **
** something is broken and MAKE A BACKUP  before using it. **
** If you have bad sectors on a drive  it is usually a bad **
** idea to continue using it. Then you probably should get **
** a working hard drive, copy the file system from the bad **
** drive  to the good one -- dd_rescue is  a good tool for **
** that -- and only then run this program.                 **
*************************************************************

Will rebuild the filesystem (/dev/md4) tree
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
Replaying journal: Done.
Reiserfs journal '/dev/md4' in blocks [18..8211]: 0 transactions replayed
###########
reiserfsck --rebuild-tree started at Wed Mar 19 11:42:49 2014
###########

Pass 0:
####### Pass 0 #######
Loading on-disk bitmap .. ok, 292744791 blocks marked used
Skipping 19389 blocks (super block, journal, bitmaps) 292725402 blocks will be read
0%...                                                         left 0, 19470 /sec
5768 directory entries were hashed with "r5" hash.
        "r5" hash is selected
Flushing..finished
        Read blocks (but not data blocks) 292725402
                Leaves among those 290105
                Objectids found 5811

Pass 1 (will try to insert 290105 leaves):
####### Pass 1 #######
Looking for allocable blocks .. finished
0%....20%....40%....60%....80%....100%                          left 0, 55 /sec
Flushing..finished
        290105 leaves read
                290069 inserted
                36 not inserted
        non-unique pointers in indirect items (zeroed) 777
####### Pass 2 #######

Pass 2:
0%....20%....40%....60%....80%....100%                           left 0, 4 /sec
Flushing..finished
        Leaves inserted item by item 36
Pass 3 (semantic):
####### Pass 3 #########
... s/TV/Thundercats/Season 2/Thundercats s02e48 return to thundera pt III .avivpf-10680: The file [2314 2358] has the wrong block count in the StatData (359096) - corrected to (157616)
... 03)/Season 03/Battlestar Galactica (2003) - S03E18 - The Son Also Rises.avivpf-10680: The file [4872 4886] has the wrong block count in the StatData (712616) - corrected to (278704)
...  (2003)/Season 03/Battlestar Galactica (2003) - S03E19 - Crossroads (1).avivpf-10680: The file [4872 4874] has the wrong block count in the StatData (711640) - corrected to (656872)
/Videos/TV/Battlestar Galactica (2003)/Season 03rebuild_semantic_pass: The entry [4872 4879] ("Battlestar Galactica (2003) - S03E13 - Taking a Break from All Your Worries.avi") in directory [4805 4989] points to nowhere - is removed
/Videos/TV/Battlestar Galactica (2003)/Season 03rebuild_semantic_pass: The entry [4872 4880] ("Battlestar Galactica (2003) - S03E15 - A Day in the Life.avi") in directory [4805 4989] points to nowhere - is removed
/Videos/TV/Battlestar Galactica (2003)/Season 03vpf-10650: The directory [4805 4989] has the wrong size in the StatData (3336) - corrected to (3160)
/Season 04/Videos/TV/Entourage/Season 07/Entourage - S07E01 - Stunted.mkvvpf-10680: The file [61
/Videos/TV/Family Guy/Season 05/Family Guy - S05E03 - Hell Comes to Quahog.avivpf-10680: The file [5356 5389] has the wrong block count in the StatData (117368) - corrected
/Videos/Movies/Jay and Silent Bob Strike Back (2001).avivpf-10680: The file [6124 1708] has the wrong block count in the StatData
/Step up 3 (2010).avivpf-10680: The file [6124 11945] has the wrong block count in the StatData (2867632) - corrected to (2188248)
/James Bond Movies/From Russia With Love 1963.avivpf-10680: The file [410 420] has the wrong block count in
/Videos/Movies/Letters From Iwo Jima (2006).mkvvpf-10680: The file [6124 17379] has th
/Videos/Hip Hop Stuff/Rap Videos/Xzibit/Xzibit - Hey Now (Mean Muggin).mpgvpf-10680: The file [3806 3808] has the wrong block count in the StatData (109896) - corrected to
/Videos/Hip Hop Stuff/Rap Videos/Xzibitrebuild_semantic_pass: The entry [3806 3809] ("Xzibit feat. Dr Dre & Snoop Dogg - X.avi") in directory [195 3806] points to nowhere - is removed
rebuild_semantic_pass: The entry [3806 3810] ("Xzibit feat. Snoop Dogg - X.mpg") in directory [195 3806] points to nowhere - is removed
vpf-10650: The directory [195 3806] has the wrong size in the StatData (280) - corrected to (176)
/Keak Da Sneak - Super Hyphe.mpegvpf-10680: The file [195 3811] has the wrong block count in the StatData (110248) - corrected to (7448)
rebuild_semantic_pass: The entry [195 3812] ("Juvenile") in directory [4 195] points to nowhere - is removed
rebuild_semantic_pass: The entry [195 3815] ("SWV feat. Puff Daddy - Someone.mpg") in directory [4 195] points to nowhere - is removed
/Videos/Hip Hop Stuff/Rap Videosvpf-10680: The directory [4 195] has the wrong block count in the StatData (22) - corrected to (21)
vpf-10650: The directory [4 195] has the wrong size in the StatData (10808) - corrected to (10728)
/Music vide
Flushing..finished
        Files found: 5148
        Directories found: 616
        Broken (of files/symlinks/others): 9
        Names pointing to nowhere (removed): 6
Pass 3a (looking for lost dir/files):
####### Pass 3a (lost+found pass) #########
Looking for lost directories:
Looking for lost files:4 /sec
Flushing..finished93, 15 /sec
        Objects without names 2
        Files linked to /lost+found 2
Pass 4 - finished done 290018, 34 /sec
        Deleted unreachable items 641
Flushing..finished
Syncing..finished
###########
reiserfsck finished at Wed Mar 19 19:43:26 2014
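
A long run like the one above is easier to review if only the summary counters are pulled out. This is a minimal sketch that assumes the output was saved to a file first; the function name `fsck_summary` and the file name in the usage note are made up. The patterns are taken from the counter lines in the log above.

```shell
#!/bin/sh
# Hypothetical helper: read a saved reiserfsck log on stdin and print only the
# summary counter lines, with leading whitespace stripped.
fsck_summary() {
    grep -E 'Files found:|Directories found:|Broken \(of|Names pointing to nowhere|Files linked to /lost\+found|Deleted unreachable items' \
        | sed 's/^[[:space:]]*//'
}
```

Usage, assuming the output had been captured to a file called rebuild.log: `fsck_summary < rebuild.log`.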

Link to comment
  • 1 month later...

it lasted for a month, and yesterday i got a disabled disk again.

ran reiserfsck again and got a new hard drive anyways.

installed the new drive and started to preclear it while rebuilding the old drive, after the reiserfsck --rebuild-tree completed

all was going well and then dumbass me somehow did a hard reset by accident on the tower.

after few tries and wiping the usb stick i managed to get it back up, but none of the drives were assigned and i managed not to save a picture of the slots, except probably on the server itself. so i dont remember where which drive went.

 

i have backups of the usb going back a few years.

but going back to the last one, i was still not getting it to boot properly. so i've left it for the night.

 

is there a quick way to get the drives to lock in their original position? or will going to an older backup for the drive work fine?

 

i never did start the array, since i didnt know where the drives belong. so i dont think i damaged anything.

 

but a suggestion would help save some time...

Link to comment

Obviously, you're in a vulnerable situation, and going slowly and carefully your array should be able to be restored. Others have been in this position, and have been walked through the process.  Above all, don't just guess or panic! :)

 

1. Do you have a Cache drive?  For most of us, its the smallest drive in our array.

 

2. After that, in theory (if the array was stable BEFORE the error), when you start the array in maintenance mode, you'll see all the drives as available to be assigned EXCEPT ONE that will appear unformatted. The unformatted drive is *probably* the parity drive, and all others (except the cache drive) can be assumed to be data.  Probably the path forward is to assign cache, all the data drives (which are formatted) and the Parity drive (which is unformatted).

 

BUT, the above is theory...I've never personally been in your situation...so you should wait for additional opinions. If what you're seeing on the web screen is NOT as described, or if (as it sounds) your array was NOT stable and parity may or may not have been valid when the array failed, you definitely need to wait for more help than I can provide.

Link to comment

I do have a cache drive and you're right it is the smallest with 500GB. The parity drive is the biggest 4TB. So those two I know for sure. The other 5 are all less than 4TB. I can't recall anymore which spots they were in however and I just left it at that and shut her down.

And does it really matter? Do I have to reassign the drives to their exact old positions or would it be OK if old drive 1 ends up being drive 3 now?

 


Link to comment

The cache and parity drives must be assigned correctly. The data drives can go in any order.

 

Awesome. Those two are no problem. It's the others I was all worried about, for no reason apparently.

 

Tnx guys.

I'll play with it when I get back home.

 


Link to comment
