Reiserfs error (invalid format)

December 18, 2012

Hi Guys,

I'm having serious problems with my unRaid server. A drive came up red, and I somehow made things a lot worse while trying to fix it. Now all the drives show green, but I'm getting errors when I start the array.

On boot I'm seeing messages like:

sas: Enter sas_scsi_recover_host busy: 0 failed: 0
sdb: sdb1
sas: ata5: end_device-0:0: dev error handler
sas: ata6: end_device-0:1: dev error handler
sas: ata7: end_device-0:2: dev error handler
ata7.00: ATA-8: WDC WD20EARS-00MVWB0, 51.0AB51, max UDMA/133
ata7.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata7.00: configured for UDMA/133
sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0
sd 0:0:0:0: [sdb] Attached SCSI disk
scsi 0:0:2:0: Direct-Access ATA WDC WD20EARS-00M 51.0 PQ: 0 ANSI: 5
sd 0:0:2:0: Attached scsi generic sg3 type 0
sd 0:0:2:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
sd 0:0:2:0: [sdd] Write Protect is off
sd 0:0:2:0: [sdd] Mode Sense: 00 3a 00 00
sd 0:0:2:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdc: sdc1
sas: Enter sas_scsi_recover_host busy: 0 failed: 0
sdd: sdd1
sas: ata5: end_device-0:0: dev error handler
sas: ata6: end_device-0:1: dev error handler
sas: ata7: end_device-0:2: dev error handler
sas: ata8: end_device-0:3: dev error handler
ata8.00: ATA-8: WDC WD20EARS-00MVWB0, 51.0AB51, max UDMA/133
ata8.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata8.00: configured for UDMA/133
sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0

When I start the array I'm seeing things like:

REISERFS warning: reiserfs-5090 is_tree_node: node level 14796 does not match to the expected one 3

REISERFS error (device md4): vs-5150 search_by_key: invalid format found in block 449079810. Fsck?

REISERFS (device md4): Remounting filesystem read-only

On the hardware side, I've tried replacing just about everything: the SAS cables, the backplane, the SAS controller, and so on.

I followed the instructions on the wiki page http://lime-technology.com/wiki/index.php?title=Check_Disk_Filesystems, and every data drive (as instructed, I didn't check the parity drive) gives me the same --rebuild-sb message.

root@Tower2:~# reiserfsck --check /dev/sdg
reiserfsck 3.6.21 (2009 www.namesys.com)

*************************************************************
** If you are using the latest reiserfsprogs and  it fails **
** please  email bug reports to [email protected], **
** providing  as  much  information  as  possible --  your **
** hardware,  kernel,  patches,  settings,  all reiserfsck **
** messages  (including version),  the reiserfsck logfile, **
** check  the  syslog file  for  any  related information. **
** If you would like advice on using this program, support **
** is available  for $25 at  www.namesys.com/support.html. **
*************************************************************

Will read-only check consistency of the filesystem on /dev/sdg
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
reiserfs_open: the reiserfs superblock cannot be found on /dev/sdg.
Failed to open the filesystem.

If the partition table has not been changed, and the partition is
valid and it really contains a reiserfs partition, then the
superblock is corrupted and you need to run this utility with
--rebuild-sb.

Do I need to run --rebuild-sb? If so, what options should I use?

I'm running the latest unraid 5.0-rc8a. The hardware is from a "recommended" build:

ASRock 880GM-LE AM3 AMD 880G Micro ATX AMD Motherboard

AMD Athlon II X2 250 Regor 3.0GHz Socket AM3 65W Dual-Core Desktop Processor ADX250OCGMBOX

Antec Nine Hundred Two V3 Black Steel ATX Mid Tower Gaming Case

CORSAIR Enthusiast Series TX650 V2 650W ATX12V v2.31/ EPS12V v2.92 80 PLUS BRONZE Certified Active PFC High Performance Power Supply

Kingston ValueRAM 2GB 240-Pin DDR3 SDRAM DDR3 1333 (PC3 10600) Desktop Memory Model KVR1333D3N9/2G

SUPERMICRO AOC-SASLP-MV8 PCI-Express x4 Low Profile SAS RAID Controller


Here is the dmesg:

http://pastebin.com/TB6bA9r8


December 18, 2012

DO NOT RUN reiserfsck on the /dev/sdg device. The raw device DOES NOT HAVE THE FILE SYSTEM; IT IS ON THE FIRST PARTITION (/dev/sdg1; note the trailing 1 in the name, which denotes the first partition). You will cause massive corruption if you run it on the raw device and force it to treat the start of the disk as the file-system superblock.

Furthermore, you should access and fix the file system through the corresponding /dev/mdX device whenever possible. If you do not, parity will be incorrect after you fix the file-system errors and will be useless until it is brought back into sync.

Run reiserfsck on the /dev/md4 device. It will keep parity in sync as the fixes are applied.

reiserfsck --check /dev/md4

Joe L.
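A small guard can make the distinction above harder to get wrong. The sketch below is only an illustration, not part of unRAID or reiserfsprogs: classify_device is a hypothetical helper that looks at the device name alone, and it assumes the usual /dev/sdX, /dev/sdX1 and /dev/mdN naming seen in this thread.

```shell
#!/bin/sh
# Hypothetical helper: classify a device path by its name only.
# It does not open the device or inspect the partition table.
classify_device() {
    case "$1" in
        /dev/md[0-9]*)      echo "md" ;;          # array device; fixes here keep parity in sync
        /dev/sd[a-z]*[0-9]) echo "partition" ;;   # e.g. /dev/sdg1; holds the file system, bypasses parity
        /dev/sd[a-z]*)      echo "whole-disk" ;;  # e.g. /dev/sdg; no file system starts here
        *)                  echo "unknown" ;;
    esac
}

# Print the classification for the three names discussed above.
for dev in /dev/sdg /dev/sdg1 /dev/md4; do
    echo "$dev: $(classify_device "$dev")"
done
```

A wrapper script could refuse to call reiserfsck unless the result is "md", which matches the advice above to go through /dev/md4.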


December 18, 2012

Author

Thanks for the help. I ran --check on /dev/md4 and it's telling me to run --rebuild-tree:

root@Tower2:~# reiserfsck --check /dev/md4
reiserfsck 3.6.21 (2009 www.namesys.com)

*************************************************************
** If you are using the latest reiserfsprogs and  it fails **
** please  email bug reports to [email protected], **
** providing  as  much  information  as  possible --  your **
** hardware,  kernel,  patches,  settings,  all reiserfsck **
** messages  (including version),  the reiserfsck logfile, **
** check  the  syslog file  for  any  related information. **
** If you would like advice on using this program, support **
** is available  for $25 at  www.namesys.com/support.html. **
*************************************************************

Will read-only check consistency of the filesystem on /dev/md4
Will put log info to 'stdout'

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes
###########
reiserfsck --check started at Wed Dec 19 04:50:43 2012
###########
Replaying journal: Trans replayed: mountid 36, transid 141377, desc 2869, len 1, commit 2871, next trans offset 2854
Trans replayed: mountid 36, transid 141378, desc 2872, len 1, commit 2874, next trans offset 2857
Trans replayed: mountid 36, transid 141379, desc 2875, len 1, commit 2877, next trans offset 2860
Trans replayed: mountid 36, transid 141380, desc 2878, len 5, commit 2884, next trans offset 2867
Replaying journal: Done.                                                        
Reiserfs journal '/dev/md4' in blocks [18..8211]: 4 transactions replayed
Checking internal tree.. \/  6 (of  19\/106 (of 170\/126 (of 170/block 325565163: The level of the node (21764) is not correct, (1) expected
the problem in the internal node occured (325565163), whole subtree is skipped
/107 (of 170-block 325610751: The level of the node (57557) is not correct, (2) expected
the problem in the internal node occured (325610751), whole subtree is skipped
/  7 (of  19\block 336632440: The level of the node (14251) is not correct, (3) expected
the problem in the internal node occured (336632440), whole subtree is skipped
finished     
Comparing bitmaps..vpf-10640: The on-disk and the correct bitmaps differs.
Bad nodes were found, Semantic pass skipped
3 found corruptions can be fixed only when running with --rebuild-tree
###########
reiserfsck finished at Wed Dec 19 05:20:32 2012
###########

Should I run --rebuild-tree?
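When the output is long, the option reiserfsck asks for is easy to miss. As a rough sketch, a saved log can be scanned for it. The helper name suggested_option is hypothetical, and it only looks for the two option names that appear in the output quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read reiserfsck output on stdin and report which
# repair option the tool itself mentioned. It only matches the two option
# names seen in this thread and makes no decision on its own.
suggested_option() {
    log=$(cat)
    case "$log" in
        *--rebuild-sb*)   echo "--rebuild-sb" ;;
        *--rebuild-tree*) echo "--rebuild-tree" ;;
        *)                echo "none" ;;
    esac
}

# Line copied from the --check output above.
echo '3 found corruptions can be fixed only when running with --rebuild-tree' | suggested_option
```

This only reports what the tool printed; whether to run the option is still a judgment call for the person at the console.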


December 19, 2012

Yes.


December 19, 2012

Author

I ran the reiserfsck with --rebuild-tree. The GUI now shows drive 4 with a red ball and 2 errors.

I tried to run smartctl on the drive, but it just gives me errors:

root@Tower2:/mnt# smartctl -l error /dev/sdc

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

Short INQUIRY response, skip product id

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Is it safe to assume it's actually the drive? I found I was able to run smartctl against the other drives in the array. Is it safe to unassign the drive and start the array?


December 19, 2012

Author

I wanted to add the new dmesg log:

http://pastebin.com/VzGHQite

Lots of bad looking stuff at the bottom.


December 19, 2012

First, power cycle the system and see whether disk 4 will provide a SMART report. Un-assign the drive and start the array. You may need to run reiserfsck --check /dev/md4 at this point. If disk 4 does not respond to smartctl, it will need to be replaced.
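Whether the drive answers smartctl can also be read from smartctl's exit status, which the smartctl man page documents as a bitmask. The sketch below decodes the lower five bits; describe_smartctl_status is a hypothetical helper name, and the bit meanings should be checked against the man page for the installed version.

```shell
#!/bin/sh
# Hypothetical helper: print a line for each of the lower five bits set in a
# smartctl exit status. Bit meanings follow the smartctl man page.
describe_smartctl_status() {
    status=$1
    if [ $((status & 1)) -ne 0 ]; then echo "command line did not parse"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "device open failed"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "a SMART or ATA command failed"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "SMART status check returned DISK FAILING"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "prefail attributes at or below threshold"; fi
    if [ "$status" -eq 0 ]; then echo "no problems reported"; fi
}

# On the server this would follow a real call, for example:
#   smartctl -H /dev/sdc >/dev/null 2>&1; describe_smartctl_status $?
# Here a literal status of 24 (bits 3 and 4 set) is decoded instead.
describe_smartctl_status 24
```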


December 19, 2012

Author

I rebooted the machine and power cycled it. I was able to run the SMART report, and it shows FAILING_NOW on the reallocated sector count (Reallocated_Sector_Ct).

root@Tower2:~# smartctl -l error /dev/sdc

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

=== START OF READ SMART DATA SECTION ===

SMART Error Log Version: 1

No Errors Logged

root@Tower2:~# smartctl -i /dev/sdc

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

=== START OF INFORMATION SECTION ===

Model Family: Western Digital Caviar Green (Adv. Format) family

Device Model: WDC WD20EARS-00MVWB0

Serial Number: WD-WCAZA4954073

Firmware Version: 51.0AB51

User Capacity: 2,000,398,934,016 bytes

Device is: In smartctl database [for details use: -P show]

ATA Version is: 8

ATA Standard is: Exact ATA specification draft version not indicated

Local Time is: Thu Dec 20 01:11:24 2012 CET

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

root@Tower2:~# smartctl -a /dev/sdc

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

=== START OF INFORMATION SECTION ===

Model Family: Western Digital Caviar Green (Adv. Format) family

Device Model: WDC WD20EARS-00MVWB0

Serial Number: WD-WCAZA4954073

Firmware Version: 51.0AB51

User Capacity: 2,000,398,934,016 bytes

Device is: In smartctl database [for details use: -P show]

ATA Version is: 8

ATA Standard is: Exact ATA specification draft version not indicated

Local Time is: Thu Dec 20 01:11:56 2012 CET

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: FAILED!

Drive failure expected in less than 24 hours. SAVE ALL DATA.

See vendor-specific Attribute list for failed Attributes.

General SMART Values:

Offline data collection status: (0x84) Offline data collection activity

was suspended by an interrupting command from host.

Auto Offline Data Collection: Enabled.

Self-test execution status: ( 0) The previous self-test routine completed

without error or no self-test has ever

been run.

Total time to complete Offline

data collection: (38400) seconds.

Offline data collection

capabilities: (0x7b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities: (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability: (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: ( 2) minutes.

Extended self-test routine

recommended polling time: ( 255) minutes.

Conveyance self-test routine

recommended polling time: ( 5) minutes.

SCT capabilities: (0x3035) SCT Status supported.

SCT Feature Control supported.

SCT Data Table supported.

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   083   083   051    Pre-fail  Always       -       32871
  3 Spin_Up_Time            0x0027   253   253   021    Pre-fail  Always       -       991
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       37
  5 Reallocated_Sector_Ct   0x0033   135   135   140    Pre-fail  Always   FAILING_NOW 1238
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       8924
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       35
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       21
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       7616
194 Temperature_Celsius     0x0022   124   116   000    Old_age   Always       -       26
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       918
197 Current_Pending_Sector  0x0032   200   196   000    Old_age   Always       -       274
198 Offline_Uncorrectable   0x0030   200   199   000    Old_age   Offline      -       30
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   001   001   000    Old_age   Offline      -       114281

SMART Error Log Version: 1

No Errors Logged

SMART Self-test log structure revision number 1

Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error

# 1 Short offline Completed without error 00% 8878 -

SMART Selective self-test log data structure revision number 1

SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS

1 0 0 Not_testing

2 0 0 Not_testing

3 0 0 Not_testing

4 0 0 Not_testing

5 0 0 Not_testing

Selective self-test flags (0x0):

After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
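In the attribute table above, smartctl marks an attribute FAILING_NOW when its normalized VALUE is at or below THRESH, which is the case for Reallocated_Sector_Ct (135 against a threshold of 140). A minimal sketch of that comparison follows; failing_attributes is a hypothetical helper, and the column order is assumed to match the smartctl 5.40 output shown here.

```shell
#!/bin/sh
# Hypothetical helper: read smartctl attribute lines on stdin and print the
# name of every attribute whose VALUE (column 4) is at or below a non-zero
# THRESH (column 6). Assumed columns:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && $6 + 0 > 0 && $4 + 0 <= $6 + 0 { print $2 }'
}

# Three lines copied from the report above. On the server the input would
# come from something like: smartctl -A /dev/sdc | failing_attributes
printf '%s\n' \
    '5 Reallocated_Sector_Ct 0x0033 135 135 140 Pre-fail Always FAILING_NOW 1238' \
    '196 Reallocated_Event_Count 0x0032 001 001 000 Old_age Always - 918' \
    '197 Current_Pending_Sector 0x0032 200 196 000 Old_age Always - 274' \
    | failing_attributes
```

Attributes with a threshold of 000 are skipped here on the assumption that a zero threshold can never be crossed; their raw values (for example 274 pending sectors) still deserve a look by eye.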

Should I unassign the drive, start the array in maintenance mode, and then run reiserfsck on the drive?


December 19, 2012

The disk must be replaced ASAP.
