Failing Disk


Hi all,

 

I had trouble with one of my hard drives: it was mounted read-only, so I followed the guide in the wiki and checked the file system by running

 

reiserfsck --check /dev/md1 

 

Which told me to run

 

reiserfsck --rebuild-tree /dev/md1

 

Then I got the following response

 

Loading on-disk bitmap .. ok, 487857249 blocks marked used
Skipping 23115 blocks (super block, journal, bitmaps) 487834134 blocks will be read
0%                                                   left 478809868, 24389 /sec
The problem has occurred looks like a hardware problem. If you have
bad blocks, we advise you to get a new hard drive, because once you
get one bad block  that the disk  drive internals  cannot hide from
your sight,the chances of getting more are generally said to become
much higher  (precise statistics are unknown to us), and  this disk
drive is probably not expensive enough  for you to you to risk your
time and  data on it.  If you don't want to follow that follow that
advice then  if you have just a few bad blocks,  try writing to the
bad blocks  and see if the drive remaps  the bad blocks (that means
it takes a block  it has  in reserve  and allocates  it for use for
of that block number).  If it cannot remap the block,  use badblock
option (-B) with  reiserfs utils to handle this block correctly.

bread: Cannot read the block (9114830): (Input/output error).

Aborted
root@Peanut:~# mount /dev/md1 /mnt/disk1
mount: /dev/md1: can't read superblock

 

Any ideas on how to rescue the data on the disk?

Link to comment

At the moment the disk is coming up as unformatted. Will the SMART command still work?

I don't care about the disk; I just want to recover any lost data.

The drive is coming up as unformatted now and I have restarted the server, but the data is not being replicated by parity.

 

Couple of questions -

 

1) Is there a way to verify that parity is able to replicate the data now that the drive is coming up as unformatted, or should parity be simulating the data in some way? Or do I need to unassign the disk from the array so it can be simulated by parity?

2) The server was rebooted after these tests, the Samba shares became active overnight, and data was moved from the cache drive to the array by the mover script. I cannot afford a new disk for another week, but I have sabnzbd downloading and kids adding files to the server. If I keep adding new data to the array, will the disk still be able to be rebuilt from parity when I can get a new one? Or is it best to shut the array down and stop adding new data until I can get the new drive?

 

 

Cheers...

Link to comment

Thanks for the quick response.

 

I do not have a spare disk and cannot get one for another week.

 

How can I rebuild it in place, and is there any risk of losing all the data completely?

 

Or is it better to just remove it from the array, let parity simulate the drive, copy the needed data off the array and then add a new disk later when I can get one?

 

Link to comment

The file system is corrupted. These corruptions are reflected in parity. It will still be corrupted after the disk is rebuilt. The safest course is to save as much data off of the disk as possible before proceeding.

To rebuild in place:

 

1) Stop the array and un-assign the disk.

2) Start the array.

3) Stop the array.

4) Assign the disk.

5) Start the array.

6) Click Rebuild.

 

After this procedure completes the pending sectors should be zero.
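
One way to check that afterwards is to read the pending-sector count out of the drive's SMART report. This is only a minimal sketch: it assumes smartmontools is installed and that the drive reports the usual ATA attribute table, where the raw value is the tenth column. The helper name `pending_sectors` and the device name in the usage comment are made up for illustration.

```shell
# Print the raw Current_Pending_Sector count from `smartctl -A` output on stdin.
# Assumes the standard ATA attribute table layout (raw value in column 10).
pending_sectors() {
    awk '$2 == "Current_Pending_Sector" { print $10 }'
}

# Usage (the device name is an assumption; substitute your own):
#   smartctl -A /dev/sdX | pending_sectors
```

A value of 0 would mean the drive no longer has sectors waiting to be re-allocated.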

Link to comment

Okay doesn't sound good.

 

The disk is showing as "Unformatted" in the array, so I cannot browse or access it.

Is there any way to extract the data from it before doing the above, even with the errors? Is it possible to remove the drive from the array and let parity simulate it?

I don't want all the data, just some of it. The drive is 2TB and I probably only want 20 GB worth of stuff from it.

Link to comment

"Unformatted" simply indicates that unRAID was unable to mount the disk's first partition (connected through the /dev/md1 device) as a reiserfs file system. That is probably because of the corruption you need to repair.

 

You probably do not need to be told this, but DO NOT FORMAT THE DISK!!! or you will clobber the data that is there and make recovery even more difficult.

 

Joe L.

Link to comment

Okay thanks for the help it makes sense!!

 

In dgaschk's instructions it is mentioned that I should save as much data off the disk as possible before following the instructions provided.

 


 

In the instructions above I am told to copy as much data off the disk as possible, but I cannot, as the disk is showing as "Unformatted" and I can't access it.

So do I follow dgaschk's instructions, or is there any way to save data off the drive before doing this?

And if I don't copy the data off first as suggested, is this procedure going to restore the drive's data when it is rebuilt?

 

Thanks!

 

 

 

Cheers!

Link to comment

Can't give specific advice unless you post a syslog showing details of why the disk will not mount.

 

I see earlier that the mount failed because it could not read the superblock. (Not a really good sign, but first I need to know whether it is a /dev/md error, or whether the disk partition is pointing to the correct sector.)

Do you remember if you formatted the disk for a starting sector of 64 or 63?

(mbr-4k-aligned, or mbr-unaligned?)

 

 

Joe L.

Link to comment

Thanks....

 

Unfortunately I cannot remember how it was formatted. I used preclear. I do have another drive, the parity drive, that is exactly the same as the one I am having issues with. Is there a way to check how that one is formatted, or does parity not have a file system as such?

 

This is a summary of what happened -

 

1) Disk was mounted as "Read Only"

2) As per instructions on the wiki tried to check the disk for errors

 

reiserfsck --check /dev/md1

 

3) Suggested to run rebuild tree option

 

reiserfsck --rebuild-tree /dev/md1

 

4) Rebuild tree option failed with the first mentioned errors at the top of the post

5) Server was rebooted

6) Disk is showing as "Unformatted", obviously because the "rebuild tree" run failed

I have looked at the syslog since the reboot and it's pretty much empty, but when I try to remount the disk using the wiki instructions I get this.

 

Using username "root".
Linux 2.6.32.9-unRAID.
root@...:~# mount /dev/md1 /mnt/disk1
mount: mount point /mnt/disk1 does not exist
root@....:~#
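
That particular message is about the directory rather than the disk: /mnt/disk1 is only a mount point, and it has to exist before mount can use it. Below is a hedged sketch of retrying by hand, mounting read-only as a precaution. The helper name `try_mount_ro` is hypothetical and not part of unRAID, and with the file system in its current state the mount will most likely still fail, just with a more informative error.

```shell
# Create the mount point if it is missing, then attempt a read-only reiserfs mount.
# try_mount_ro is an illustrative helper, not an unRAID command.
try_mount_ro() {
    dev="$1"
    dir="$2"
    mkdir -p "$dir" || return 1
    mount -t reiserfs -o ro "$dev" "$dir"
}

# Usage:
#   try_mount_ro /dev/md1 /mnt/disk1
```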

 

So any ideas on how to proceed?

 

Cheers!

 

Link to comment

Thanks for getting back to me. I ran the command as suggested.

 

This is the output I get -

 


root@XXXXX:~#dd if=/dev/sdc count=195 | od -c -A d | sed 30q
195+0 records in
195+0 records out
99840 bytes (100 kB) copied, 8.37546 s, 11.9 kB/s
0000000  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000448  \0  \0 203  \0  \0  \0   ?  \0  \0  \0   q 210 340 350  \0  \0
0000464  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0000496  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0   U 252
0000512  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0097792 016 021 034 035 016 021 034 035  \0  \0  \0  \0 022  \0  \0  \0
0097808  \0  \0  \0  \0  \0      \0  \0  \0 004  \0  \0 241 333   8   q
0097824 204 003  \0  \0 036  \0  \0  \0  \0  \0  \0  \0  \0 020 314 003
0097840   R 002 002  \0   R   e   I   s   E   r   2   F   s  \0 002  \0
0097856 003  \0  \0  \0 377 377   9   : 002  \0  \0  \0   6 325  \a  \0
0097872 001  \0  \0  \0 323 355  \a 024 336 274   G 232 231   z 307   Q
0097888   R   X 263 240  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0097904  \0  \0  \0  \0 005  \0 036  \0 232 025 274   P  \0   N 355  \0
0097920  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0097984  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0
0098000 376 031  \0  \0  \0 032  \0  \0 005 032  \0  \0 006 032  \0  \0
0098016 317 033  \0  \0 321 033  \0  \0 322 033  \0  \0 323 033  \0  \0
0098032 325 033  \0  \0 354 033  \0  \0 355 033  \0  \0 361 033  \0  \0
0098048 362 033  \0  \0 371 033  \0  \0 372 033  \0  \0 373 033  \0  \0
0098064 374 033  \0  \0 375 033  \0  \0 376 033  \0  \0 024 034  \0  \0
0098080 027 034  \0  \0     034  \0  \0   ! 034  \0  \0   ' 034  \0  \0
0098096   ( 034  \0  \0   ) 034  \0  \0   * 034  \0  \0   = 034  \0  \0
0098112   ? 034  \0  \0   E 034  \0  \0   F 034  \0  \0   T 034  \0  \0
0098128   U 034  \0  \0   \ 034  \0  \0   ] 034  \0  \0   y 034  \0  \0
0098144   z 034  \0  \0 215 034  \0  \0 216 034  \0  \0 263 034  \0  \0
0098160 264 034  \0  \0 362 034  \0  \0 364 034  \0  \0 032 035  \0  \0

Link to comment


yes, the file system starts on sector 63... 

(mbr-unaligned as the interface is telling you)
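
The same answer can be pulled out of the partition table directly instead of being read off the od dump. This is a rough sketch under several assumptions: a classic MBR layout (start LBA of the first partition in bytes 454-457, little-endian), 512-byte sectors, and the reiserfs superblock sitting 64 KiB into the partition. Both function names are made up for illustration.

```shell
# Print the start sector (LBA) of the first MBR partition on a device or image.
part1_start_lba() {
    dd if="$1" bs=1 skip=454 count=4 2>/dev/null |
        od -A n -t u1 |
        awk '{ print $1 + $2 * 256 + $3 * 65536 + $4 * 16777216 }'
}

# Byte offset from the start of the disk to the reiserfs superblock,
# given the partition's start sector (assumes 512-byte sectors).
reiserfs_sb_offset() {
    echo $(( $1 * 512 + 65536 ))
}

# Usage:
#   part1_start_lba /dev/sdc
#   reiserfs_sb_offset 63
```

For a start sector of 63 the second function gives 97792, which is the offset in the dump where the non-zero superblock data begins (the ReIsEr2Fs magic string shows up a few lines further down). A start sector of 64 would have put it at 98304 instead.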

 

I'd try rebooting, and see if it mounts.

Link to comment

Tried rebooting and the disk comes up as unformatted  :-\

Syslog attached; these are the filtered errors -

 

Dec 11 04:53:05 Peanut kernel: ata3.00: HPA detected: current 488395055, native 488397168 (Errors)
Dec 11 04:53:06 Peanut emhttp: disk1 mount error: 32 (Errors)
Dec 11 04:53:06 Peanut kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 0. Fsck? (Errors)
Dec 11 04:53:06 Peanut kernel: REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1 2 0x0 SD] (Errors)
Dec 11 04:53:28 Peanut sshd[3085]: error: Could not get shadow information for root (Errors)

 

And when you try and manually mount the disk you get -

 

mount: mount point /mnt/disk1 does not exist

 

Any ideas?

syslog-2012-12-11.txt

Link to comment

Okay tried that and got this -

 

###########
reiserfsck --check started at Tue Dec 11 09:11:31 2012
###########
Replaying journal: Done.
Reiserfs journal '/dev/md1' in blocks [18..8211]: 0 transactions replayed
Checking internal tree..

Bad root block 0. (--rebuild-tree did not complete)

Aborted

 

Should I try the --rebuild-tree option again, or is it stuffed  :-\?

Link to comment


looks to me like you need to run

reiserfsck --rebuild-tree /dev/md1

and let it complete.

Link to comment

Thanks for the continued help.

 

I tried that and it failed again see the attached screen shot.....

 

I am guessing the hard drive is not happy?

 

Whilst working on this issue is it safe to have the mover moving stuff to the array?

 

I am worried that, while I have been working on this issue over the last few days, new files being added to the server could have a negative effect on the chances of rebuilding the drive from parity.

 

Is it also possible to plug this into windows and try to rescue the data that way?

 

11-12-2012_11-38-58_AM.png

Link to comment


No way to know how the drive will be read by the Windows driver. The file system is corrupted.

What seems to be happening is that the file system check is failing when it attempts to read an unreadable sector on the disk. You might be able to get around that error by powering down and un-plugging the drive. If you un-plug it and then reboot, the drive's file-system contents will be simulated by parity in combination with all the other drives. It will still be corrupt, but reiserfsck will (should) be able to read the block it currently cannot, and should be able to complete. It will repair the file system. When you then re-connect the drive, unRAID should attempt to re-construct the corrected data onto the drive, allowing the SMART firmware on the drive to re-allocate the unreadable sector.

While the disk is un-plugged, you can see what you can do to read it under Windows using a reiserfs driver.

 

Joe L.

Link to comment

Ok thanks again to summarise....

 

1) Power down server

2) Disconnect drive power

3) Power the server back on with the drive disconnected

4) Check if drive is being simulated by parity

5) If drive is being simulated by parity run the reiserfsck with rebuild tree and hopefully it completes

6) Shut down server, reconnect drive and power back up
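
Steps 4 and 5 could be sketched roughly as below: with the drive unplugged and the array started, first try reading the block that reiserfsck choked on earlier (9114830) through /dev/md1, and only run the rebuild if that read works. The helper name `can_read_block` is made up for illustration, and the 4096-byte block size is an assumption (the usual reiserfs default).

```shell
# Return success if one file-system block can be read from the device.
# The block size of 4096 bytes is an assumption (the usual reiserfs default).
can_read_block() {
    dd if="$1" of=/dev/null bs=4096 skip="$2" count=1 2>/dev/null
}

# Usage, with the drive unplugged and the array started:
#   can_read_block /dev/md1 9114830 && reiserfsck --rebuild-tree /dev/md1
```

If the read still fails with an I/O error, the block is not being reconstructed and the rebuild would presumably abort in the same place.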

 

Whilst all this is going on, plug the disk into a Windows machine and see if Yareg or something else can read the disk.

I'll give this a shot shortly, and thanks again for following this through!

 

 

Link to comment
