April 14, 2011
I wanted to poll the forum before I do something stupid here. I just finished running the file system check and got errors suggesting a tree rebuild, which, according to the docs, is a last-resort option. The problem started when my server began freezing up, seemingly whenever the mover kicked into action. My guess is that the freezes happened while the server was copying data to one particular drive, which is why I ran the check against that drive. Anyway, I would greatly appreciate it if someone would take a look at my syslog and let me know what I should do.
Attachment: syslog-2011-04-14.zip
April 14, 2011
Did you try a "--fix-fixable" first? It did suggest that near the end of the syslog, but before advising of the need for the rebuild-tree:
2 found corruptions can be fixed when running with --fix-fixable
I'd do that first, then the "--rebuild-tree" if it still says it is needed.
Joe L.
April 14, 2011 (Author)
OK, the test is still running, but this already popped up. Should I stop this process and kick off the tree rebuild, or let it finish first?

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --fix-fixable started at Thu Apr 14 13:19:49 2011
###########
Replaying journal: Done.
Reiserfs journal '/dev/md3' in blocks [18..8211]: 0 transactions replayed
Checking internal tree..
bad_stat_data: The objectid (22953) is shared by at least two files. Can be fixed with --rebuild-tree only.
April 14, 2011
Let it finish first.
April 14, 2011 (Author)
Well, the --fix-fixable process is done. Here are the results:

Will check consistency of the filesystem on /dev/md3
and will fix what can be fixed without --rebuild-tree
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --fix-fixable started at Thu Apr 14 13:19:49 2011
###########
Replaying journal: Done.
Reiserfs journal '/dev/md3' in blocks [18..8211]: 0 transactions replayed
Checking internal tree..
bad_stat_data: The objectid (22953) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22952) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22947) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22945) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22951) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22944) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22946) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22943) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22948) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22949) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22950) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22954) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22955) is shared by at least two files. Can be fixed with --rebuild-tree only.
bad_stat_data: The objectid (22956) is shared by at least two files. Can be fixed with --rebuild-tree only.
finished
Comparing bitmaps..vpf-10630: The on-disk and the correct bitmaps differs. Will be fixed later.
Checking Semantic tree:
finished
No corruptions found
There are on the filesystem:
        Leaves 282781
        Internal nodes 1788
        Directories 2415
        Other files 22390
        Data block pointers 284257445 (0 of them are zero)
        Safe links 0
###########
reiserfsck finished at Thu Apr 14 15:31:53 2011
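For anyone who saves a log like this to a file, here is a rough Python sketch for pulling out the objectids that reiserfsck flagged as shared, so it is easy to see at a glance whether --fix-fixable left anything behind. The function names are made up for illustration, and the pattern only assumes the message wording shown in the output above; other reiserfsck versions may word it differently.

```python
import re

# Message wording taken from the log above, with all whitespace removed so
# that lines the terminal wrapped mid-word ("is sh ared") still match.
_SHARED = re.compile(
    r"bad_stat_data:Theobjectid\((\d+)\)issharedbyatleasttwofiles"
)


def shared_objectids(log_text):
    """Return the sorted, de-duplicated objectids reported as shared."""
    squeezed = re.sub(r"\s+", "", log_text)
    return sorted({int(n) for n in _SHARED.findall(squeezed)})


def still_needs_rebuild_tree(log_text):
    """True if the log contains at least one 'shared objectid' report.

    Hypothetical helper: it only looks for this one message type and does
    not replace reading the rest of the reiserfsck output.
    """
    return bool(shared_objectids(log_text))


if __name__ == "__main__":
    sample = (
        "bad_stat_data: The objectid (22953) is shared by at least two files. "
        "Can be fixed with --rebuild-tree only.\n"
        "bad_stat_data: The objectid (22952) is sh\n"
        "ared by at least two files. Can be fixed with --rebuild-tree only.\n"
    )
    print(shared_objectids(sample))
    print(still_needs_rebuild_tree(sample))
```

Squeezing out whitespace before matching is a deliberate shortcut: it tolerates terminal line wrapping at the cost of a less readable pattern.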
April 14, 2011 (Author)
Now is time for the last resort. Run it with rebuild-tree.
That's what is running now. Any reason this may have happened? This is one of the 2 TB EARS drives. I have been using unRAID for years now, with many, many drive changes and upgrades. This is the first time I have seen this type of problem.
April 15, 2011
A crash could cause it. Sometimes file systems just go wrong. An unreadable sector could do it too.
April 15, 2011 (Author)
Well, it seems all is well. The rebuild-tree finished up last night and repaired a few things. Nothing was inside the lost+found folder, so I am guessing that everything was recovered. I am going to keep a watchful eye on my server now. Is there anything else I should do regarding this drive, or should I just hope for the best?
April 17, 2011 (Author)
SMART report for the drive:

smartctl -a -d ata /dev/sda
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZA3341503
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sun Apr 17 11:49:57 2011 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (35460) seconds.
Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported. General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   172   171   021    Pre-fail  Always       -       6400
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       281
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1883
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       171
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       156
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       6739
194 Temperature_Celsius     0x0022   117   080   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1809         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
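For keeping an eye on the drive over time, a small Python sketch like the one below could read the attribute table from a saved smartctl report and list any nonzero raw values among the sector and cabling counters. The choice of attributes in WATCHED is my own assumption about which ones are worth watching, not something smartctl defines, and the function names are hypothetical. In the report above, all five of them have a raw value of 0.

```python
import re

# One attribute row: ID, name, flag, VALUE, WORST, THRESH, type, updated,
# when_failed, raw value (only the leading integer of the raw field is kept).
ATTR_LINE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# Assumption: attributes chosen for illustration, adjust to taste.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def parse_raw_values(report):
    """Map attribute name -> raw value for every attribute row found."""
    raw = {}
    for line in report.splitlines():
        match = ATTR_LINE.match(line)
        if match:
            raw[match.group(2)] = int(match.group(6))
    return raw


def nonzero_watched(report):
    """Return the watched attributes whose raw value is not zero."""
    raw = parse_raw_values(report)
    return {name: raw[name] for name in WATCHED if raw.get(name, 0) != 0}


if __name__ == "__main__":
    sample = (
        "  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0\n"
        "193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       6739\n"
        "197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0\n"
    )
    print(nonzero_watched(sample))
```

Saving a report after each parity check and comparing the results would show whether any of these counters start to climb.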