Lt-Chewie Posted September 2, 2014

Just saw this on the IPMI-View screen after I finished backing up data to another server:

REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1523 0x0 SD]
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1553 0x0 SD]
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1576 1577 0x0 SD]

This repeats with different numbers for about 9 page-downs.

After searching and then reading the "Check Disk Filesystems" page, I ran the "reiserfsck --check /dev/md1" command and it gave me this:

###########
reiserfsck --check started at Tue Sep 2 22:53:39 2014
###########
Replaying journal: Done.
Reiserfs journal '/dev/md1' in blocks [18..8211]: 0 transactions replayed

The problem has occurred looks like a hardware problem. If you have bad blocks, we advise you to get a new hard drive, because once you get one bad block that the disk drive internals cannot hide from your sight,the chances of getting more are generally said to become much higher (precise statistics are unknown to us), and this disk drive is probably not expensive enough for you to you to risk your time and data on it. If you don't want to follow that follow that advice then if you have just a few bad blocks, try writing to the bad blocks and see if the drive remaps the bad blocks (that means it takes a block it has in reserve and allocates it for use for of that block number). If it cannot remap the block, use badblock option (-B) with reiserfs utils to handle this block correctly.

bread: Cannot read the block (268206080): (Input/output error).

Aborted (core dumped)

What I get from this is that the drive md1 (disk1) has bad blocks, but when I check the dashboard on the unRAID webpage it gives me no indication that there is an issue with the drive, and neither did an extended SMART test when I ran one on it. There was also no indication of any problems when I first ran a preclear 6 months or so ago, nor has the drive been under heavy usage. So am I to trust this REISERFS check report, or the SMART test and the dashboard, which show no redball or other indication of a fault?

Any advice as to what I should do next: RMA the drive, preclear it again, any other suggestions?
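For anyone wanting to see roughly where on the device that unreadable block sits, here is a minimal arithmetic sketch. It assumes a 4096-byte filesystem block (the usual ReiserFS default, not confirmed for this disk) and 512-byte logical sectors; the helper name `locate_block` is made up for illustration.

```python
# Assumption: 4096-byte filesystem blocks (usual ReiserFS default; not confirmed for this disk).
FS_BLOCK_SIZE = 4096
# Assumption: 512-byte logical sectors.
SECTOR_SIZE = 512


def locate_block(block_number, block_size=FS_BLOCK_SIZE, sector_size=SECTOR_SIZE):
    """Convert a filesystem block number into (byte offset, first sector) on the device."""
    byte_offset = block_number * block_size
    first_sector = byte_offset // sector_size
    return byte_offset, first_sector


# Block number taken from the "bread: Cannot read the block" message.
offset, sector = locate_block(268206080)
print(f"byte offset : {offset}")
print(f"first sector: {sector} (relative to the start of /dev/md1)")
print(f"position    : {offset / 2**30:.3f} GiB into the device")
```

The sector number is relative to the start of the filesystem device, so it would need the partition's starting offset added before being compared with sector numbers reported against the raw disk.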
itimpi Posted September 2, 2014

The first thing to do would be to power-cycle the server in case the drive needs that to recover. Having done that, it would be worth repeating the reiserfsck. If the problem repeats, then it is worth getting a SMART report on the drive to see if that indicates a problem. A full syslog would also not go amiss; it is possible that it might suggest something to one of the expert users.
Lt-Chewie Posted September 2, 2014

Power-cycle done, same REISERFS error reappears. SMART report indicates no issues. unRAID dashboard shows no redball or any other indication of problems. Syslog provided below:

syslog.txt
itimpi Posted September 3, 2014

If that is the case, then there may be a genuine error on the disk. A clean SMART report cannot be relied on as proof of a disk's health; only a bad report is conclusive, in that it indicates the disk has almost certainly failed. unRAID redballs a disk when it fails to write to it. In this case it looks as if the disk has been read-locked, so you never get as far as writing to the disk.

Do you have the data on the problem disk backed up anywhere? The normal first step is to get as much off the disk as you can before attempting any sort of data recovery. Also, do you have a spare drive of a size suitable to be used as a replacement? When attempting data recovery it is always a good idea, if possible, to leave the suspect drive intact in case the recovery fails.
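As a rough illustration of what "getting a SMART report" usually boils down to for a bad-block question, the sketch below pulls the sector-related raw counts out of attribute-table text in the layout that `smartctl -A` prints. The function name and the sample rows are made up for illustration and are NOT from the drive in this thread; and, as noted above, all-zero counts would still not prove the disk is healthy.

```python
# Attributes commonly associated with remapped or unreadable sectors.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def sector_attributes(smartctl_text):
    """Extract raw values of the watched attributes from `smartctl -A` style text."""
    found = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the tenth column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            found[fields[1]] = int(fields[9])
    return found


# Made-up sample rows for illustration only; NOT taken from the drive in this thread.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

attrs = sector_attributes(sample)
suspicious = {name: raw for name, raw in attrs.items() if raw > 0}
print(attrs)
print("non-zero:", suspicious)
```

A non-zero pending or reallocated count would back up the reiserfsck result; zeros simply leave the question open.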
Lt-Chewie Posted September 3, 2014

I have a spare 2TB drive of the same WD Green series pre-cleared and ready in case of this happening. The files that are 'read locked' should be on one of the back-up FireWire drives, something I will have to check when I get back home. From the report it doesn't look like too many files are affected by these bad blocks, so if I am missing a few files on the back-up drives it shouldn't be too much of a hit.

I'm assuming this bad-block problem is an issue that the drive won't recover from? So the next action for me would be to have it RMA'd to WD, right? Or is this something that isn't covered by RMAs? Also, would mounting/unmounting the drive during the starting and stopping of the array damage it further or not?
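To put a number on "not too many files", the repeated kernel messages can be tallied from a saved syslog. This is a minimal sketch: the function name is made up, and it assumes the first two numbers in each bracketed key are the directory id and object id, which is how ReiserFS keys are normally printed. The sample text is just the three lines quoted in the first post.

```python
import re

# Matches one vs-13070 message and captures the device plus the key, e.g. "[1522 1523 0x0 SD]".
KEY_RE = re.compile(
    r"REISERFS error \(device (?P<dev>\w+)\): vs-13070.*?"
    r"\[(?P<dir_id>\d+) (?P<obj_id>\d+) 0x[0-9a-fA-F]+ SD\]"
)


def affected_objects(log_text):
    """Return {device: set of (dir_id, object_id)} for every stat-data read failure."""
    result = {}
    for m in KEY_RE.finditer(log_text):
        key = (int(m["dir_id"]), int(m["obj_id"]))
        result.setdefault(m["dev"], set()).add(key)
    return result


# The three messages quoted at the top of the thread.
sample = """\
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1523 0x0 SD]
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1522 1553 0x0 SD]
REISERFS error (device md1): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1576 1577 0x0 SD]
"""

objs = affected_objects(sample)
for dev, keys in objs.items():
    print(dev, len(keys), "distinct objects:", sorted(keys))
```

Because the keys are collected into a set, a message that repeats for the same object is only counted once, so the total is the number of distinct objects rather than the number of log lines.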
itimpi Posted September 3, 2014

"I'm assuming this bad-block problem is an issue that the drive won't recover from? So the next action for me would be to have it RMA'd to WD, right? Or is this something that isn't covered by RMAs?"

If it really IS faulty, then the reason does not affect the RMA as long as the drive is still under warranty. It might be worth putting it through the WD diagnostic tools and/or repeating a pre-clear cycle to see what happens before initiating the RMA process.

"Also, would mounting/unmounting the drive during the starting and stopping of the array damage it further or not?"

It should not, if this is purely a bad-blocks issue. However, if the drive is actually failing at the mechanical level, then I would say all bets are off.
Lt-Chewie Posted September 3, 2014

Thanks for the advice and help, itimpi. I will download the WD diagnostic tool (which from their instruction page looks like a SMART test tool) and test the drive, then give it a 2-cycle preclear. I'll report back the details when that's done.

**UPDATE** Ran the WD Diagnostic tool, which in all honesty wasn't very helpful. Picture below.

It's going to be RMA'd; I don't trust putting it back into the server. At the moment I'm running a 2-cycle preclear on the drive just to see what it will say, for future reference. Will update again when that's done.

**UPDATE 2** Any ideas about what the following results mean would be helpful. I've also run the WD Diagnostic tool again on the drive, and it now says there are no errors.