
Woke up to 2 drives reporting errors at the same time. What to do next?


lennygman


root@Tower:/dev# xfs_repair -v /dev/md4
Phase 1 - find and verify superblock...
        - block cache size set to 1503368 entries
Phase 2 - using internal log
        - zero log...
* ERROR: mismatched uuid in log
*            SB : fbf0b82a-a72f-4e97-b8c6-bbd3be245bf9
*            log: 3481cfee-2864-430f-aa40-01ade1573116
zero_log: head block 503295 tail block 503291
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
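The message above is xfs_repair asking for the XFS log to be replayed by mounting the filesystem first. A minimal sketch of that sequence, assuming the array is started in maintenance mode and /dev/md4 is disk4's parity-protected device (the mount point is just an illustrative temporary directory):

mkdir -p /mnt/xfs_test        # temporary mount point (illustrative name)
mount /dev/md4 /mnt/xfs_test  # a successful mount replays the log
umount /mnt/xfs_test          # unmount again before repairing
xfs_repair -v /dev/md4        # re-run the repair with the log replayed

If the mount itself fails, xfs_repair -L /dev/md4 zeroes the log and attempts the repair anyway, at the cost of possibly losing the metadata changes the log contained, as the tool itself warns above.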
 

 

But I think I screwed up.. I misread the instructions. I thought I needed to substitute the Linux-based drive ID, which in my case is "sdf" for disk4. (I only recently installed Ver 6.0.)
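For reference: on Unraid, filesystem repairs on an array disk should target the parity-protected md device (/dev/md4 for disk4), not the raw sdX device such as /dev/sdf, otherwise parity is not kept in sync. A quick sketch of how to list both sets of device names from the console (assuming a stock Unraid shell):

lsblk -o NAME,SIZE,MODEL   # raw sdX devices and their partitions
ls -l /dev/md*             # the md devices Unraid creates for each array slot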


There were a bunch of these messages about "Metadata corruption detected":

 

Metadata corruption detected at xfs_dir3_block block 0x135438/0x1000
libxfs_writebufr: write verifer failed on xfs_dir3_block bno 0x135438/0x1000
Metadata corruption detected at xfs_dir3_block block 0x658/0x1000
libxfs_writebufr: write verifer failed on xfs_dir3_block bno 0x658/0x1000
Metadata corruption detected at xfs_dir3_block block 0x650/0x1000
libxfs_writebufr: write verifer failed on xfs_dir3_block bno 0x650/0x1000
Metadata corruption detected at xfs_dir3_block block 0x58/0x1000
libxfs_writebufr: write verifer failed on xfs_dir3_block bno 0x58/0x1000
cache_purge: shake on cache 0x6b5080 left 3 nodes!?

        XFS_REPAIR Summary    Sun Nov  5 17:11:37 2017

Phase           Start        End     Duration
Phase 1:        11/05 17:10:16  11/05 17:10:16
Phase 2:        11/05 17:10:16  11/05 17:10:38  22 seconds
Phase 3:        11/05 17:10:38  11/05 17:10:45  7 seconds
Phase 4:        11/05 17:10:45  11/05 17:10:45
Phase 5:        11/05 17:10:45  11/05 17:10:45
Phase 6:        11/05 17:10:45  11/05 17:10:45
Phase 7:        11/05 17:10:45  11/05 17:10:45

Total run time: 29 seconds
done
root@Tower:/dev/disk#
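One hedged way to double-check whether the corruption was actually cleared, while the array is still in maintenance mode, is xfs_repair's no-modify mode:

xfs_repair -n /dev/md4   # -n = check only, report any remaining problems without writing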
 

The drive still shows with a red X / disabled / contents emulated.


Don't think so.

 

When I start the array in regular mode I still see this:

 

Unmountable disk present:

Disk 4 • WDC_WD20EADS-00S2B0_WD-WCAVY5770247 (sdf)

Format will create a file system in all Unmountable disks, discarding all data currently on those disks.
Yes I want to do this

 

 


There is a lot of metadata corruption. Start the array one more time, and if the disk is still unmountable it's likely unfixable; in that case your best bet is to do a new config:

 

- Tools -> New Config -> Retain current configuration: All -> Apply
- If needed, assign any missing disk(s)
- Check "parity is already valid" before starting the array
- Start the array and check that all disks mount; if yes, run a correcting parity check (a quick console check for the mounts is sketched below)
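As a rough illustration of the last step, the mounts can be confirmed from the console after starting the array (disk numbers depend on your layout):

df -h /mnt/disk*   # every assigned data disk should show up mounted
ls /mnt/disk4      # spot-check that disk4's contents are visible again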

 

 


Sorry.. I want to confirm this and double-check what I am doing.

 

There is a message:

This is a utility to reset the array disk configuration so that all disks appear as "New" disks, as if it were a fresh new server.

This is useful when you have added or removed multiple drives and wish to rebuild parity based on the new configuration.

Use the 'Retain current configuration' selection to populate the desired disk slots after the array has been reset. By default no disk slots are populated.

DO NOT USE THIS UTILITY THINKING IT WILL REBUILD A FAILED DRIVE - it will have the opposite effect of making it impossible to rebuild an existing failed drive - you have been warned!

 

So if disk4 is corrupted, is this the right process to rebuild parity and restore the data on disk4? Would formatting disk 4 and rebuilding parity be the same thing?

 

 

4 minutes ago, lennygman said:

Would formatting disk 4 and rebuilding parity be the same thing?

 

No. Formatting is never part of a rebuild; it will delete all data on disk4 and update parity to reflect that.

 

New Config also isn't usually part of a rebuild, but if the filesystem on the emulated disk can't be repaired, rebuilding disk4 as it is now would just give you the same unmountable result. So, assuming disk4 itself is OK and it was disabled because of a cable or similar problem, a New Config will restore disk4 to how it was before it was disabled. It's also currently your only option to get that data back.

 

 

5 minutes ago, lennygman said:

I ran a SMART self-test and it does show completed with errors.. so I guess failing SMART?

 

There are no errors in the SMART report, at least not yet on the one you posted, but the disk could still be failing, or it could be a bad cable or whatever the initial problem was. Either way, since you currently can't rebuild it, the best option is to copy all important data to another disk, and only after that try to confirm where the problem is.
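A minimal sketch of that advice from the console, assuming /mnt/disk5 is another array disk with enough free space (the destination folder name is purely illustrative):

rsync -avh --progress /mnt/disk4/ /mnt/disk5/disk4-backup/   # copy everything off the emulated disk, preserving attributes
smartctl -a /dev/sdf                                         # review the full SMART attributes and self-test log
smartctl -t long /dev/sdf                                    # queue an extended self-test (can take many hours)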


