
[SOLVED] Unmountable file system


spall


Hi all,

 

Migrated my second server over to new hardware. I had one disk that was bad on the old hardware which I replaced when I migrated.

 

When I brought the array online, disk 5 is showing as unmountable. I started the array in maintenance mode and did a check on the filesystem:

Phase 1 - find and verify superblock...
        - block cache size set to 751712 entries
Phase 2 - using internal log
        - zero log...
totally zeroed log
zero_log: head block 0 tail block 0
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 3
        - agno = 1
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:8366) is ahead of log (0:0).
Format log to cycle 4.
xfs_repair: libxfs_device_zero write failed: Input/output error

 

I'm not sure what (if anything) to do at this point. Sadly, I don't have a backup of the data on that disk.

 

Any help appreciated.

data-diagnostics-20180619-2000.zip

Link to comment

Disk 2 actually has no data on it. It was an empty disk that failed. Is there a way to go about this that would allow me to salvage disk 5?

 

That aside, are you suggesting that I rebuild disk 2 and then replace and rebuild disk 5?

Link to comment
7 minutes ago, spall said:

Disk 2 actually has no data on it. It was an empty disk that failed. Is there a way to go about this that would allow me to salvage disk 5?

 

That aside, are you suggesting that I rebuild disk 2 and then replace and rebuild disk 5?

 

Empty or not doesn't matter. Any data disk that has been added to the array contributes to parity. Only a disk that is empty in the sense of 100% zeroed contributes nothing to parity, but a zeroed disk has no file system or partition table, so as soon as unRAID formats the drive it can't be all-zero anymore.
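To illustrate the point about parity, here is a minimal sketch of single XOR parity in Python. The disk contents are made-up four-byte values, and the helper names `xor_parity` and `rebuild_missing` are hypothetical, chosen only for this example; it is a toy model of the idea, not how unRAID is actually implemented.

```python
from functools import reduce


def xor_parity(disks):
    """Return the byte-wise XOR of equal-length disk images."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*disks))


def rebuild_missing(surviving_disks, parity):
    """Rebuild one missing disk by XOR-ing every surviving disk with parity."""
    return xor_parity(list(surviving_disks) + [parity])


# Made-up contents, for illustration only.
disk1 = bytes([0x12, 0x34, 0x56, 0x78])
disk2_zeroed = bytes(4)                             # truly all-zero disk
disk2_formatted = bytes([0xEB, 0x00, 0x00, 0x01])   # "empty" but formatted
disk5 = bytes([0xDE, 0xAD, 0xBE, 0xEF])

# An all-zero disk leaves parity unchanged.
zero_disk_is_neutral = (
    xor_parity([disk1, disk2_zeroed, disk5]) == xor_parity([disk1, disk5])
)

# A formatted disk changes parity, so it is needed to rebuild disk5.
parity = xor_parity([disk1, disk2_formatted, disk5])
rebuilt_ok = rebuild_missing([disk1, disk2_formatted], parity)
rebuilt_without_disk2 = rebuild_missing([disk1, disk2_zeroed], parity)

print(zero_disk_is_neutral)            # True
print(rebuilt_ok == disk5)             # True
print(rebuilt_without_disk2 == disk5)  # False
```

In this toy model, rebuilding disk 5 only gives the right bytes when every other disk, including the formatted "empty" disk 2, is read correctly.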

Link to comment

Just pointing out the simple answers, since they are the ones that always come back to bite me (I overthink this stuff way too much when it happens to me)... make sure that your system is set to AHCI in the BIOS, and try swapping the power and data cables with those of another drive known to be working before you assume a bad drive... Especially when you now seem to have two drives going bad at the same time, it is very unlikely that the drives themselves are actually bad... Not impossible... Just unlikely...

Link to comment
2 hours ago, Warrentheo said:

Just pointing out the simple answers, since they are the ones that always come back to bite me (I overthink this stuff way too much when it happens to me)... make sure that your system is set to AHCI in the BIOS, and try swapping the power and data cables with those of another drive known to be working before you assume a bad drive... Especially when you now seem to have two drives going bad at the same time, it is very unlikely that the drives themselves are actually bad... Not impossible... Just unlikely...

 

Hey Warrentheo,

 

Thanks.. yeah.. I hear you. It is set to AHCI. I actually checked that first.

 

I'll change out the SATA cables and take a look. The drive is being powered via a 5 bay SuperMicro cage that is getting power from two molex coming off the PSU. I can swap which bay it is in, but otherwise if it's power it would probably be an issue with the backplane then.

 

 

Link to comment

So I rebuilt drive 2 and then rebuilt drive 5. After repairing the file system on drive 5, I have data again.

 

This drive contained a ton of movie files.. so I have no idea how to tell what (if anything) is corrupt. But at least some set of the data is accessible again.

 

 

Link to comment
1 hour ago, spall said:

This drive contained a ton of movie files.. so I have no idea how to tell what (if anything) is corrupt.

If there were read errors during disk2's rebuild, there will likely be some corrupt file(s) on disk5. You'd need to already have checksums to check them, but if there were only a few errors they should be mostly unnoticeable in video files: a couple of glitches or so during playback.

Link to comment

There were about 20 or so read errors during both rebuilds from disk5.  A handful of files being corrupt won't be the end of the world. At least most of the data was recovered :)

 

Thanks for the help!

 

Regarding checksums.. that would be useful. You have any tips on the best way to get started with that for files on my servers?

 

Thanks again.

Link to comment
34 minutes ago, spall said:

Regarding checksums.. that would be useful.

 

Keeping checksums is really very, very useful for files that are seldom or never updated.

 

Also very useful for files that are often updated - but then it is only practical if using a file system that computes new checksums inside the file system on every file write like btrfs does.

 

In the end, it's good to be able to do a scrub and have a program report back that all file data is available and correct, or list specifically which files and/or file system blocks contain incorrect content.
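For files that seldom change, a rough sketch of the idea in Python could look like the following. It records a SHA-256 checksum per file and later reports anything missing or changed. The function names (`sha256_of`, `build_manifest`, `verify_manifest`) are hypothetical, and this is only a starting point under those assumptions, not a replacement for a file system with built-in checksums or an existing integrity tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root, manifest):
    """Return (relative_path, problem) pairs for missing or changed files."""
    root = Path(root)
    problems = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file():
            problems.append((relative_path, "missing"))
        elif sha256_of(target) != expected:
            problems.append((relative_path, "mismatch"))
    return problems
```

The manifest is a plain dict, so it can be saved with `json.dump` somewhere other than the disk it describes. After a rebuild, running `verify_manifest` against the same directory lists exactly which files no longer match. Files that were legitimately edited since the manifest was built will also show up as mismatches, which is why this approach suits rarely updated data such as a movie collection.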

Link to comment

Archived

This topic is now archived and is closed to further replies.
