REISERFS error: what to do?



I just saw this on the IPMI-View screen after I finished backing up data to another server:

 

REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1523 0x0 SD]

REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1553 0x0 SD]

REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1576 1577 0x0 SD]

 

This repeats with different numbers for about 9 page-downs.
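With that many lines it can help to count how many distinct objects are actually named. Here is a minimal sketch that pulls the bracketed key out of each vs-13070 line. Reading the first two numbers as a directory id and an object id is an assumption based on the usual reiserfs key layout, and `affected_keys` is just a name made up for this example.

```python
import re

# Matches the bracketed key at the end of a vs-13070 line.
# Assumption: the four fields are [dir_id object_id offset type].
KEY_RE = re.compile(r"vs-13070.*\[(\d+) (\d+) (0x[0-9a-fA-F]+) (\w+)\]")

def affected_keys(log_text):
    """Return the sorted, de-duplicated (dir_id, object_id) pairs found in the log."""
    keys = set()
    for line in log_text.splitlines():
        match = KEY_RE.search(line)
        if match:
            keys.add((int(match.group(1)), int(match.group(2))))
    return sorted(keys)

# The three lines quoted above, used as sample input.
sample = """\
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1523 0x0 SD]
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1553 0x0 SD]
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1576 1577 0x0 SD]
"""

print(affected_keys(sample))  # [(1522, 1523), (1522, 1553), (1576, 1577)]
```

Feeding it the full syslog text instead of `sample` would give the total number of distinct objects, which is a better measure than the number of screens the errors fill.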

 

After searching and then reading the "Check Disk Filesystems" page, I ran the "reiserfsck --check /dev/md1" command and it gave me this:

 

###########
reiserfsck --check started at Tue Sep  2 22:53:39 2014
###########
Replaying journal: Done.
Reiserfs journal '/dev/md1' in blocks [18..8211]: 0 transactions replayed

The problem has occurred looks like a hardware problem. If you have
bad blocks, we advise you to get a new hard drive, because once you
get one bad block  that the disk  drive internals  cannot hide from
your sight,the chances of getting more are generally said to become
much higher  (precise statistics are unknown to us), and  this disk
drive is probably not expensive enough  for you to you to risk your
time and  data on it.  If you don't want to follow that follow that
advice then  if you have just a few bad blocks,  try writing to the
bad blocks  and see if the drive remaps  the bad blocks (that means
it takes a block  it has  in reserve  and allocates  it for use for
of that block number).  If it cannot remap the block,  use badblock
option (-B) with  reiserfs utils to handle this block correctly.

bread: Cannot read the block (268206080): (Input/output error).

Aborted (core dumped)
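For a rough idea of where the unreadable block sits, the block number from the `bread` line can be turned into a byte offset. This is only a sketch: it assumes a 4096-byte filesystem block size, which is the usual reiserfs default but is not confirmed by the output above, and the offset is relative to the start of /dev/md1 rather than the raw disk.

```python
BLOCK_SIZE = 4096          # assumption: typical reiserfs block size, not taken from the output
FAILED_BLOCK = 268206080   # block number reported by "bread: Cannot read the block"

def block_to_offset(block, block_size=BLOCK_SIZE):
    """Return (byte offset, offset in GiB) of a filesystem block from the start of the device."""
    byte_offset = block * block_size
    return byte_offset, byte_offset / 2**30

byte_offset, gib = block_to_offset(FAILED_BLOCK)
print(f"byte offset: {byte_offset}")                  # byte offset: 1098572103680
print(f"about {gib:.3f} GiB into the filesystem")     # about 1023.125 GiB into the filesystem
```

If the real block size is different, the offset scales accordingly.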

 

What I get from this is that the drive behind /dev/md1 (disk1) has bad blocks, but the dashboard on the unRAID webpage gives no indication of an issue with the drive, and neither did an extended SMART test.

There was also no indication of any problems when I first ran a preclear 6 months or so ago, and the drive has not been under heavy usage. So should I trust this reiserfsck report, or the SMART test and the dashboard, which show no redball or other indication of a fault?

 

Any advice on what I should do next? Should I RMA the drive, preclear it again, or something else?

 

 


The first thing to do would be to power-cycle the server in case the drive needs that to recover. Having done that, it would be worth repeating the reiserfsck.

 

If the problem repeats, then it is worth getting a SMART report on the drive to see if that indicates a problem. A full syslog would also not go amiss, as it might suggest something to one of the expert users.
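When reading such a report for a bad-blocks question, the raw values of the reallocated and pending sector attributes are usually the first thing to look at. A minimal sketch, assuming the standard ten-column attribute table that `smartctl -A` prints; the sample lines and their values are invented for illustration and are not from the drive in this thread, and `nonzero_watched` is a made-up helper name.

```python
# Attribute IDs commonly associated with bad or remapped sectors.
WATCHED = {5, 197, 198}

def nonzero_watched(report_text):
    """Return {attribute_name: raw_value} for watched attributes with a non-zero raw value."""
    flagged = {}
    for line in report_text.splitlines():
        parts = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        attr_id, name, raw = int(parts[0]), parts[1], parts[9]
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged[name] = int(raw)
    return flagged

# Illustrative sample only: these values are invented, not real output.
sample_report = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

print(nonzero_watched(sample_report))  # {'Current_Pending_Sector': 3}
```

An empty result only means these particular counters are zero; it does not prove the disk is healthy.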


If that is the case, then there may be a genuine error on the disk.

 

A clean SMART report cannot be relied on as an indication of a disk's health; it is only a bad report that tells you something, namely that the disk has almost certainly failed.

 

unRAID redballs a disk when it fails to write to it.  In this case it looks as if the disk has been read-locked so that you never get as far as writing to the disk.

 

Do you have the data on the problem disk backed up anywhere? Normally the first step is to get as much off it as you can before attempting any sort of data recovery. Also, do you have a spare drive of a suitable size to use as a replacement? When attempting data recovery it is always a good idea, if possible, to leave the suspect drive intact in case the recovery fails.


I have a spare 2TB drive of the same WD Green series, pre-cleared and ready in case of this happening.

 

The files that are 'read locked' should be on one of the backup FireWire drives, which I will have to check when I get back home.

From the report it doesn't look like too many files are affected by these bad blocks, so if a few files are missing from the backup drives it shouldn't be too much of a hit.

 

I'm assuming this bad-block problem is an issue that the drive won't recover from? So the next action for me would be to have it RMA'd to WD, right? Or is this something that isn't covered by RMAs?

 

Also, would mounting/unmounting the drive during the starting and stopping of the array damage it further or not?


"I'm assuming this bad-block problem is an issue that the drive won't recover from? So the next action for me would be to have it RMA'd to WD, right? Or is this something that isn't covered by RMAs?"

If it really IS faulty then the reason does not affect RMA as long as it is still under warranty.

 

It might be worth putting it through the WD diagnostic tools and/or repeating a pre-clear cycle to see what happens before initiating the RMA process.

 

"Also, would mounting/unmounting the drive during the starting and stopping of the array damage it further or not?"

It should not, if this is purely a bad-blocks issue. However, if the drive is actually failing at the mechanical level, then I would say all bets are off.


Thanks for the advice and help, itimpi. I will download the WD diagnostic tool (which from their instruction page looks like a SMART test tool) and test the drive, then give it a 2-cycle preclear.

 

I'll report back the details when that's done :)

 

**UPDATE**

 

I ran the WD Diagnostic tool, which in all honesty wasn't very helpful. Picture below:

 

 

[Image: KzD52AV.jpg]

 

 

It's going to be RMA'd; I don't trust putting it back into the server.

At the moment I'm running a 2-cycle preclear on the drive just to see what it will say, for future reference. I will update again when that's done.

 

**UPDATE 2**

 

Any ideas on what the following results mean would be helpful. I've also run the WD Diagnostic tool again on the drive, and it now says there are no errors  :o

 

[Image: 9UoVLdu.jpg]

 

 
