August 12, 2013
Hi,

I noticed my syslog is filled with errors like these:

Aug 12 18:31:32 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 16385 does not match to the expected one 2
Aug 12 18:31:32 Tower kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 304065133. Fsck?
Aug 12 18:31:37 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 16385 does not match to the expected one 2
Aug 12 18:31:37 Tower kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 304065133. Fsck?
Aug 12 18:31:37 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 16385 does not match to the expected one 2
Aug 12 18:31:37 Tower kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 304065133. Fsck?
Aug 12 18:31:40 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 16385 does not match to the expected one 2
Aug 12 18:31:40 Tower kernel: REISERFS error (device md1): vs-5150 search_by_key: invalid format found in block 304065133. Fsck?

I checked the wiki and, following the instructions at http://lime-technology.com/wiki/index.php/Check_Disk_Filesystems, I ran the command reiserfsck --check /dev/md1. This gave the following output:

...
the problem in the internal node occured (575137531), whole subtree is skipped
/ 13 (of 162\/123 (of 170-block 575349567: The level of the node (9132) is not correct, (1) expected
the problem in the internal node occured (575349567), whole subtree is skipped
/ 36 (of 162|/ 2 (of 170-block 127096823: The number of items (1) is incorrect, should be (0)
the problem in the internal node occured (127096823), whole subtree is skipped
/ 16 (of 23-/119 (of 165-block 304065133: The level of the node (16385) is not correct, (2) expected
the problem in the internal node occured (304065133), whole subtree is skipped
finished
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
317 found corruptions can be fixed only when running with --rebuild-tree
###########
reiserfsck finished at Mon Aug 12 19:13:51 2013

Should I just go ahead and run reiserfsck --rebuild-tree /dev/md1?

The drive has been put into a read-only state, but has not been red-balled by the system.

Some basic info:
11 data drives
Motherboard: Asus M4A78LT-M
AOC-SASLP-MV8
4 GB RAM
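For anyone who would rather tally these errors than scroll through the syslog, here is a minimal Python sketch. It assumes the kernel messages look exactly like the "vs-5150" lines quoted above; the function name and the regular expression are illustrative, not part of unRAID or reiserfsprogs.

```python
import re
from collections import Counter

# Pattern modelled on the "vs-5150 ... invalid format found in block N" lines
# quoted in the post. Other kernel versions may word this differently (assumption).
ERROR_RE = re.compile(
    r"REISERFS error \(device (?P<device>\w+)\): "
    r".*invalid format found in block (?P<block>\d+)"
)

# Two warning/error pairs copied from the syslog excerpt in the post.
SAMPLE = [
    "Aug 12 18:31:32 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: "
    "node level 16385 does not match to the expected one 2",
    "Aug 12 18:31:32 Tower kernel: REISERFS error (device md1): vs-5150 "
    "search_by_key: invalid format found in block 304065133. Fsck?",
    "Aug 12 18:31:37 Tower kernel: REISERFS warning: reiserfs-5090 is_tree_node: "
    "node level 16385 does not match to the expected one 2",
    "Aug 12 18:31:37 Tower kernel: REISERFS error (device md1): vs-5150 "
    "search_by_key: invalid format found in block 304065133. Fsck?",
]


def summarize_reiserfs_errors(lines):
    """Count 'invalid format' errors per (device, block) pair."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[(match["device"], int(match["block"]))] += 1
    return counts


if __name__ == "__main__":
    for (device, block), n in summarize_reiserfs_errors(SAMPLE).items():
        print(f"{device}: block {block} reported {n} time(s)")
```

A single block showing up again and again, as here, suggests the kernel keeps tripping over the same damaged tree node rather than finding new damage each time.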
August 13, 2013 (Author)
Correct me if I'm wrong, but shouldn't the disk be flagged red, given that it is in a read-only state?
August 13, 2013
"Correct me if I'm wrong, but shouldn't the disk be flagged as red seeing the read-only state?"

I think that only happens if unRAID detects a write error. I do not think a file system error that was not preceded by a write error causes the red status to appear.
August 13, 2013
That's correct: UnRAID only disables a disk if there's a write error.

My understanding is that if a read error is detected, UnRAID corrects it via the parity process (it reads all the other disks to compute the correct value) and then writes the correct value back to the disk that had the error. If THAT results in a write error, the disk is disabled (red-balled). In many cases, though, the write succeeds, because the disk can either rewrite the data in place or reallocate the failed sector to a good spare.
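To make the "read all other disks to compute the correct value" step concrete, here is a minimal Python sketch. It assumes single parity computed as a byte-wise XOR across the data disks, which is my understanding of how this kind of parity protection works; the function names and the tiny three-byte "disks" are purely illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity_block):
    """Recompute the one unreadable block from the surviving blocks plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Made-up contents of the same sector on three data disks.
DISKS = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
PARITY = xor_blocks(DISKS)

if __name__ == "__main__":
    # Pretend disk 1 returned a read error: rebuild it from disks 0 and 2 plus parity.
    recovered = reconstruct([DISKS[0], DISKS[2]], PARITY)
    print(recovered == DISKS[1])
```

The recovered bytes are what would then be written back to the failing disk, and it is that write which decides whether the disk gets disabled.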
August 13, 2013 (Author)
That all makes sense, except that the drive has been put into a read-only state. I cannot modify any files or write to the disk via the disk share. I see files on my cache drive which have not been moved because the drive is not writeable.

If UnRAID tries to write to the disk, e.g. to correct a read error, it will fail because of the read-only status. I'm wondering why the disk has been put into read-only status. Is it some kind of protection measure from the Reiser filesystem?
August 13, 2013
This can happen if, for example, you have a power failure or unclean shutdown that leaves an inconsistency in the file system. To protect your data from further corruption, the drive is put into a read-only state until a reiserfsck is done and any fixes that can be made to the file system are made.

Basically, it's the result of a file system error being detected, not a data read/write or bad sector error (i.e. a hard drive error), which is why the drive isn't red-balled.

Note: This doesn't necessarily mean a mechanical fault in the hard drive isn't the root cause. It just means that the fault isn't showing up as data read or write errors.

After you do the rebuild tree, the drive should come out of read-only (I can't remember whether stopping and starting the array is enough or whether a reboot is needed).

If it happens again and there haven't been any unclean shutdowns, check the seating of the cables or replace the cable. As a last resort, replace the hard drive and run a preclear on it to see if that turns something up...
August 13, 2013
"If UnRAID is trying to write to the disk to e.g. correct a read error, it will fail because of the read-only status. I'm wondering why the disk has been put into read-only status. Is it some kind of protection measure from the Reiser filesystem?"

That is standard Linux behaviour when the reiserfs driver reports file system errors while mounting the drive. Whatever the error was, it apparently was not severe enough to stop the drive being mounted. So I would think that if you put the array into maintenance mode and ran reiserfsck against it (using the appropriate /dev/md?? device) with the --fix-fixable option, there is a good chance the issue will be resolved.
August 13, 2013
"... if you put the array into maintenance mode and ran a reiserfsck check against it (using the appropriate /dev/md?? device) with the --fix-fixable option there is a good chance the issue will be resolved."

Orbi said in his original post that the check recommended --rebuild-tree (see the output in his first post), so that's the option he should use. It's normally best to do what the check recommends.

I've had this happen to some of my drives after a power failure. Following the instructions provided by reiserfsck fixed the file system issues, and my drives were no longer flagged as read-only.
August 13, 2013 (Author)
Thanks for all the replies.

I decided to reboot the server before running reiserfsck --rebuild-tree /dev/md1. The webgui greeted me with a red ball. Under these conditions the reiserfsck utility does not even want to check the disk.

Tower login: root
Linux 3.4.26-unRAID.
root@Tower:~# reiserfsck --check /dev/md1
reiserfsck 3.6.21 (2009 www.namesys.com)

*************************************************************
** If you are using the latest reiserfsprogs and it fails **
** please email bug reports to [email protected], **
** providing as much information as possible -- your **
** hardware, kernel, patches, settings, all reiserfsck **
** messages (including version), the reiserfsck logfile, **
** check the syslog file for any related information. **
** If you would like advice on using this program, support **
** is available for $25 at www.namesys.com/support.html. **
*************************************************************

Will read-only check consistency of the filesystem on /dev/md1
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Tue Aug 13 17:28:12 2013
###########
Partition /dev/md1 is mounted with write permissions, cannot check it
root@Tower:~#

In line with the wiki instructions, I wanted to start the array in maintenance mode. Before I can do that, the array must first be stopped. After I click Stop, the array seems to be caught in an endless loop of unmounting (syslog attached), which leaves the webgui unresponsive. What should I do?

syslog.zip
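Since reiserfsck refuses to check a partition that is mounted with write permissions, it can help to look at how the device is mounted before running it. Below is a minimal Python sketch that reads text in the /proc/mounts format (device, mount point, file system type, options, ...). The function name, the sample line and the /mnt/disk1 mount point are assumptions made for illustration.

```python
def mount_state(mounts_text, device):
    """Return 'rw', 'ro', or None if `device` is not listed as mounted.

    Only the first matching entry is considered.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[0] != device:
            continue
        options = fields[3].split(",")
        return "ro" if "ro" in options else "rw"
    return None


# Hypothetical /proc/mounts content, not taken from the server in this thread.
SAMPLE_MOUNTS = """\
proc /proc proc rw 0 0
/dev/md1 /mnt/disk1 reiserfs rw,noatime,nodiratime 0 0
"""

if __name__ == "__main__":
    print(mount_state(SAMPLE_MOUNTS, "/dev/md1"))
    # On a live Linux system you could read the real file instead:
    # with open("/proc/mounts") as f:
    #     print(mount_state(f.read(), "/dev/md1"))
```

If this reports rw, the message "mounted with write permissions, cannot check it" is to be expected, and the array needs to be stopped or started in maintenance mode first, as the wiki describes.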
August 14, 2013 (Author)
Update: Oddly, reiserfsck found no corruption. It looks like my drive is good to go. I ran a SMART test as well, and that didn't show anything unusual.

So before the reboot reiserfsck found 317 corruptions, and after the reboot none. I don't understand what's happening here. Is there anything else worth checking before I rebuild the drive?
August 18, 2013
I had the same issue on md5 and md12 a while back. I couldn't find anything about it, so I tried to copy off what I could. Then I rebuilt the array without those two drives, but kept them and tried not to overwrite them. Now I have built a basic array with those two drives and a clear parity in my stower02, using another USB stick with the same data, and named it stower02old.

I have followed this thread closely, but not much seems to be happening, so I am having another go today with --fix-fixable. I will leave it for most of the day and try again later.

Steve

Unraid - the next best thing
backup, movies/tv/data

OS: unRAID OS version Pro 5.0-rc16c
Case: unknown brand, 13 Bay Duplicator
Motherboard: GA-pd55a-ud5 LGA1156
CPU: Intel i5 3.33GHz
Memory: 2 x 2 GB RAM
Expansion cards: 1 x SAS2LP-MV8, 2 x ??
Power supply: Thermaltake TR2-700
4in3 caddies: 5 Cooler Master
Hard disks: 19x 2tb (10x sam, 4x wd, 4x sea, 1x hit)

OS: unRAID OS version Pro 5.0-rc16c
Case: Thermaltake 9 Bay M9
Motherboard: Asus LGA1156
CPU: Intel i5 3.33GHz
Memory: 2 x 2 GB RAM
Expansion cards: 1 x SAS2LP-MV8, 2 x ??
Power supply: Thermaltake TR2-700
4in3 caddies: 5 Cooler Master
Hard disks: 14x (7x sam hd154 1.5tb, 3x sam hd204 2tb, 2x 2tb wd, 1x sea 2tb, 1x hit 2tb)

stower02old_syslog.txt
August 19, 2013 (Author)
I was looking into RMA'ing the drive, but before that Seagate asks you to run their analysis tool, SeaTools. I ran a full sector scan and it found nothing. After the scan, I noticed that the Runtime Bad Block value had increased from 1 to 3; it was 1 before I started this thread. To be sure, I replaced the SATA cable as well.

It has now been two days since I rebuilt the array, so I will see how the drive behaves.

This is the second time this hard drive has red-balled. The previous time, it took one month for the Runtime Bad Block value to increase from 0 to 1, which resulted in a write failure. Seagate's RMA policy allows the drive to be returned once the Runtime Bad Block value reaches 200.
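If you want to keep an eye on that Runtime Bad Block counter over time, a small script can pull the raw value out of a SMART attribute table. This is a minimal Python sketch that assumes the usual ten-column attribute layout printed by smartctl -A. The sample table below is made up for illustration (only the raw value 3 echoes the post), and the function name is hypothetical.

```python
def smart_raw_value(smartctl_output, attribute_name):
    """Return the integer raw value of a SMART attribute, or None if absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows: ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE,
        # UPDATED, WHEN_FAILED, RAW_VALUE. The header row is skipped
        # because its first field ("ID#") is not a plain number.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] == attribute_name:
            try:
                return int(fields[9])
            except ValueError:
                return None  # raw value is not a plain integer
    return None


# Illustrative table only, not output from the drive discussed in this thread.
SAMPLE_SMART = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
183 Runtime_Bad_Block       0x0032   097   097   000    Old_age   Always       -       3
"""

if __name__ == "__main__":
    print(smart_raw_value(SAMPLE_SMART, "Runtime_Bad_Block"))
```

Logging this value once a day would show whether the count creeps up between red-ball events or only jumps when a write fails.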
August 19, 2013
I found that even after changing those drives to new or different ones, the reiserfsck errors still showed up at the same location, md5 or md12.

Did you have any luck getting rid of the reiserfsck errors?
August 19, 2013 (Author)
The drive is at location md1, so no similarities here. After the reboot, the reiserfs errors were gone, but the drive redballed. It's odd.
August 19, 2013
I beg to differ: md1 -24. Once I discovered it, I rebuilt a new disk into that location, and the error appeared after the rebuild of the drive had finished.

I am trying to rebuild that tree now, and my error disks are no longer 5 and 12; they are 1 and 2.
August 20, 2013 (Author)
The drive is still green, touch wood. I'll run a reiserfsck check tonight just to be sure.
August 20, 2013
Nothing appears to be happening under --rebuild-tree. I get a yes, then nothing much happens and there is no indication of progress.
September 6, 2013 (Author)
Update: changing the data cable had no effect; the drive kept red-balling. So I switched to another SATA port on my motherboard, and the drive has stayed green since August 19th. It seems I have a flaky or defective SATA port.
This topic is now archived and is closed to further replies.