September 4, 2013

What kind of error checking would catch the following error condition? I get a daily email stating "unRaid status OK", but it's not OK.

This backup server (tower3) is started up once per day and sent files by rsync to keep it up to date with the main server:

mkdir /mnt/t3disk7
mount -t nfs tower3:/mnt/disk7/ /mnt/t3disk7
rsync -av --stats --progress /mnt/disk7/ /mnt/t3disk7/ >> /boot/logs/cronlogs/t3disk7.log

Over the last few weeks I have noticed that tower3 disk7 hasn't synced completely. Never having seen this before, I suspected that I had forgotten to turn on NFS for that disk. I couldn't find anything wrong except for the following log entries:

Sep 4 06:56:02 Tower3 mountd[1647]: authenticated mount request from 192.168.1.106:902 for /mnt/disk7 (/mnt/disk7)
Sep 4 06:59:19 Tower3 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one 1 (Minor Issues)
Sep 4 06:59:19 Tower3 kernel: REISERFS error (device md7): vs-5150 search_by_key: invalid format found in block 106135554. Fsck? (Errors)
Sep 4 06:59:19 Tower3 kernel: REISERFS (device md7): Remounting filesystem read-only (Drive related)
Sep 4 06:59:19 Tower3 kernel: REISERFS error (device md7): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [22345 22982 0x0 SD] (Errors)
Sep 4 06:59:19 Tower3 kernel: REISERFS warning: reiserfs-5090 is_tree_node: node level 0 does not match to the expected one 1 (Minor Issues)
Sep 4 06:59:19 Tower3 kernel: REISERFS error (device md7): vs-5150 search_by_key: invalid format found in block 106135554. Fsck? (Errors)
Sep 4 06:59:19 Tower3 kernel: REISERFS error (device md7): vs-13070 reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [22345 22983 0x0 SD] (Errors)

This took me to http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems and I did the following disk check, then a --rebuild-tree:

root@Tower3:~# reiserfsck --check /dev/md7
reiserfsck 3.6.21 (2009 www.namesys.com)

*************************************************************
** If you are using the latest reiserfsprogs and it fails **
** please email bug reports to [email protected], **
** providing as much information as possible -- your **
** hardware, kernel, patches, settings, all reiserfsck **
** messages (including version), the reiserfsck logfile, **
** check the syslog file for any related information. **
** If you would like advice on using this program, support **
** is available for $25 at www.namesys.com/support.html. **
*************************************************************

Will read-only check consistency of the filesystem on /dev/md7
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
########### reiserfsck --check started at Wed Sep 4 07:12:52 2013 ###########
Replaying journal:
Trans replayed: mountid 93, transid 2846, desc 3895, len 1, commit 3897, next trans offset 3880
Trans replayed: mountid 93, transid 2847, desc 3898, len 1, commit 3900, next trans offset 3883
Trans replayed: mountid 93, transid 2848, desc 3901, len 2, commit 3904, next trans offset 3887
Trans replayed: mountid 93, transid 2849, desc 3905, len 3, commit 3909, next trans offset 3892
Replaying journal: Done.
Reiserfs journal '/dev/md7' in blocks [18..8211]: 4 transactions replayed
Checking internal tree.. \/ 6 (of 8|/132 (of 152// 22 (of 170\
block 106135553: The level of the node (0) is not correct, (1) expected
the problem in the internal node occurefinished
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
1 found corruptions can be fixed only when running with --rebuild-tree
########### reiserfsck finished at Wed Sep 4 07:31:18 2013 ###########

root@Tower3:~# reiserfsck --rebuild-tree /dev/md7
reiserfsck 3.6.21 (2009 www.namesys.com)

*************************************************************
** Do not run the program with --rebuild-tree unless **
** something is broken and MAKE A BACKUP before using it. **
** If you have bad sectors on a drive it is usually a bad **
** idea to continue using it. Then you probably should get **
** a working hard drive, copy the file system from the bad **
** drive to the good one -- dd_rescue is a good tool for **
** that -- and only then run this program. **
** If you are using the latest reiserfsprogs and it fails **
** please email bug reports to [email protected], **
** providing as much information as possible -- your **
** hardware, kernel, patches, settings, all reiserfsck **
** messages (including version), the reiserfsck logfile, **
** check the syslog file for any related information. **
** If you would like advice on using this program, support **
** is available for $25 at www.namesys.com/support.html. **
*************************************************************

Will rebuild the filesystem (/dev/md7) tree
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
Replaying journal: Done.
Reiserfs journal '/dev/md7' in blocks [18..8211]: 0 transactions replayed
########### reiserfsck --rebuild-tree started at Wed Sep 4 08:10:31 2013 ###########

Pass 0:
####### Pass 0 #######
Loading on-disk bitmap .. ok, 175724361 blocks marked used
init_source_bitmap: Bitmap 3239 (of 32768 bits) is wrong - mark all blocks [106135552 - 106168320] as used
Skipping 30567 blocks (super block, journal, bitmaps) 175726562 blocks will be read
0%....20%. left 0, 46533320 directory entries were hashed with "r5" hash.
"r5" hash is selected
Flushing..finished
Read blocks (but not data blocks) 175726562
Leaves among those 178331
Objectids found 33322

Pass 1 (will try to insert 178331 leaves):
####### Pass 1 #######
Looking for allocable blocks .. finished
0%....20%....40%....60%....80%....100% left 0, 75 /sec
Flushing..finished
178331 leaves read
178302 inserted
29 not inserted

####### Pass 2 #######
Pass 2:
0%....20%....40%....60%....80%....100% left 0, 29 /sec
Flushing..finished
Leaves inserted item by item 29

Pass 3 (semantic):
####### Pass 3 #########
/Pix2013/2013-08 Aug/[2013-08-15] PB Hartley, Jennifer/IMG_9394.CR2vpf-10680: The file [22345 22981] has the wrong block count in the StatData (52784) - corrected to (43184)
rebuild_semantic_pass: The entry [22345 22982] ("IMG_9394.xmp") in directory [18289 22345] points to nowhere - is removed
rebuild_semantic_pass: The entry [22345 22983] ("IMG_9395.CR2") in directory [18289 22345] points to nowhere - is removed
/Pix2013/2013-08 Aug/[2013-08-15] PB Hartley, Jennifervpf-10650: The directory [18289 22345] has the wrong size in the StatDat/[2013-08-01] PB Nguyen, Keli/pro
Flushing..finished
Files found: 32918
Directories found: 402
Names pointing to nowhere (removed): 2

Pass 3a (looking for lost dir/files):
####### Pass 3a (lost+found pass) #########
Looking for lost directories:
Flushing..finished10, 56 /sec
Pass 4 - finished done 178227, 44 /sec
Deleted unreachable items 6
Flushing..finished
Syncing..finished
########### reiserfsck finished at Wed Sep 4 11:12:44 2013 ###########
root@Tower3:~#

How likely is this to happen in the future? I want to have a system that is 100% self-diagnosing and can be managed from afar.
Perhaps I am asking for too much... (there has been no data loss, as this drive is on the backup, not the main system)
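
For what it's worth, here is a rough sketch of the kind of checks that might have flagged this condition. It is only an illustration, not something unRAID ships: the function names (rsync_status_ok, syslog_has_fs_errors, mount_is_readonly) are made up for this example, and the grep patterns are simply the message texts from the syslog excerpt above. The idea is to test rsync's exit status on the sending side, and on the backup server to look for ReiserFS errors in the syslog and for a disk that has been remounted read-only.

```shell
#!/bin/sh
# Hypothetical helpers; none of these names come from unRAID itself.

# Succeed only if the given rsync exit status is 0 (rsync returns
# non-zero, for example 23, when some files could not be transferred).
rsync_status_ok() {
    if [ "$1" -eq 0 ]; then
        return 0
    fi
    echo "rsync exited with status $1" >&2
    return 1
}

# Succeed (and print the matching lines) if the given syslog file contains
# a ReiserFS error or a forced read-only remount. The patterns are taken
# from the log excerpt in this post and are an assumption about what to match.
syslog_has_fs_errors() {
    grep -E 'REISERFS error|Remounting filesystem read-only' "$1"
}

# Succeed if the mount point $1 is listed with the "ro" option in the
# mounts table $2 (defaults to /proc/mounts).
mount_is_readonly() {
    awk -v mp="$1" '
        $2 == mp && $4 ~ /(^|,)ro(,|$)/ { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Example use on the main server, right after the existing rsync line:
#   rsync -av --stats --progress /mnt/disk7/ /mnt/t3disk7/ >> /boot/logs/cronlogs/t3disk7.log
#   rsync_status_ok $? || echo "disk7 backup did not complete"
#
# Example use on tower3 (the syslog path is an assumption):
#   syslog_has_fs_errors /var/log/syslog && echo "filesystem errors logged"
#   mount_is_readonly /mnt/disk7 && echo "disk7 is mounted read-only"
```

Each function only reports through its exit status (plus a line of output), so it can be wired into whatever produces the daily status email. Once the filesystem on tower3 was remounted read-only, rsync's writes over NFS should have failed, so the exit-status check on the main server is probably the cheapest place to start.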