(SOLVED) Can't mount disks

grphx · July 1, 2017

I'm unable to mount my 4 disks (3+1). Mounting seems to take a while and then the web GUI stops responding. I can still SSH into the server, but some functions (such as reboot) won't respond either.

I've attached my diagnostics zip file. If you find out what's causing the disks to fail to mount, I'd appreciate it if you could tell me where in the diagnostics you looked. I'm all about learning.

tower-diagnostics-20170701-1528.zip

EDIT: Disk1 had filesystem corruption. I ran a repair on /dev/md1 and now I'm back up and running!

JorgeB · July 1, 2017

Probably an XFS disk with filesystem corruption. Grab the diagnostics from the console/SSH after starting the array by typing diagnostics, or start the array in maintenance mode and check the filesystem on all XFS disks:

https://wiki.lime-technology.com/Check_Disk_Filesystems#Drives_formatted_with_XFS

grphx · July 1, 2017

18 minutes ago, johnnie.black said:

Probably an XFS disk with filesystem corruption. Grab the diagnostics from the console/SSH after starting the array by typing diagnostics, or start the array in maintenance mode and check the filesystem on all XFS disks:

https://wiki.lime-technology.com/Check_Disk_Filesystems#Drives_formatted_with_XFS

Is there a way to run a check on all disks with one command or do I need to run it against each one separately?

JorgeB · July 1, 2017

One by one, xfs data disks only.
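
Since each disk has to be checked on its own, a small loop can at least save the retyping. This is only a minimal sketch, under a few assumptions that are not confirmed in this thread: the array is started in maintenance mode, the data disks show up as /dev/md1 through /dev/md3 (reading the "3+1" layout as three data disks plus parity), and all of them are XFS. The helper name check_disks is made up for illustration. In "print" mode it only echoes the commands so they can be reviewed first; in "run" mode it calls xfs_repair with -n, which checks without modifying anything.

```shell
#!/bin/sh
# Hypothetical helper: check each listed data disk, one after another.
# Assumption: array is in maintenance mode and disk N is /dev/mdN.
# Usage: check_disks print|run <disk numbers...>
check_disks() {
    mode="$1"
    shift
    for n in "$@"; do
        if [ "$mode" = "run" ]; then
            echo "=== disk${n} (/dev/md${n}) ==="
            # -n = no-modify mode: report problems, change nothing
            xfs_repair -n "/dev/md${n}"
        else
            # Dry run: only show the command that would be executed
            echo "xfs_repair -n /dev/md${n}"
        fi
    done
}

# Review the commands first; switch "print" to "run" to actually check.
check_disks print 1 2 3
```

Adjust the disk numbers to match the XFS data disks in your own array; the parity disk has no filesystem and is not checked.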

grphx · July 1, 2017

Quote


Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
sb_fdblocks 105557487, counted 106538526
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 3
        - agno = 0
        - agno = 2
        - agno = 1
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
Maximum metadata LSN (28:1117792) is ahead of log (22:3417256).
Would format log to cycle 31.
No modify flag set, skipping filesystem flush and exiting.

I ran a check on a random disk. Unsure if this is the problematic drive or not. What would it say if it had corruption?

JorgeB · July 1, 2017

Or post the diagnostics taken after starting the array, and the problem disk should be visible.

grphx · July 1, 2017

I'm guessing disk1 is the problem but please take a look yourself.

tower-diagnostics-20170701-1624.zip

JorgeB · July 1, 2017

Yes, check disk1 (md1)

grphx · July 1, 2017

Quote


Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
Metadata corruption detected at xfs_agf block 0x15d508ec9/0x200
flfirst 118 in agf 3 too large (max = 118)
agf 118 freelist blocks bad, skipping freelist scan
agi unlinked bucket 6 is 14870406 in ag 3 (inode=6457321350)
agi unlinked bucket 8 is 14806472 in ag 3 (inode=6457257416)
agi unlinked bucket 19 is 98076627 in ag 3 (inode=6540527571)
agi unlinked bucket 60 is 98046204 in ag 3 (inode=6540497148)
agi unlinked bucket 63 is 14884799 in ag 3 (inode=6457335743)
sb_ifree 54216, counted 50595
sb_fdblocks 77913191, counted 78974531
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 6457257416, would move to lost+found
disconnected inode 6457321350, would move to lost+found
disconnected inode 6457335743, would move to lost+found
disconnected inode 6540497148, would move to lost+found
disconnected inode 6540527571, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 6457257416 nlinks from 0 to 1
would have reset inode 6457321350 nlinks from 0 to 1
would have reset inode 6457335743 nlinks from 0 to 1
would have reset inode 6540497148 nlinks from 0 to 1
would have reset inode 6540527571 nlinks from 0 to 1
No modify flag set, skipping filesystem flush and exiting.

Oh yeah, this looks scarier than the other disk. Should I run xfs_repair /dev/md1 as is, or should I add any flags to the command?
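
For anyone comparing the two reports above: the disk1 output contains several kinds of lines that the first disk's output does not. A rough way to pick them out of a saved report is a grep over those message types. This is a heuristic sketch only; the pattern list is taken from the disk1 report in this thread and is not a complete list of xfs_repair messages, and count_flags and the temp file are hypothetical names used for the example.

```shell
#!/bin/sh
# Hypothetical helper: count lines in a saved "xfs_repair -n" report that
# match the message types seen in the disk1 output above.
# "disconnected inode [0-9]" deliberately skips the Phase 6 header line
# ("moving disconnected inodes to lost+found"), which every report contains.
count_flags() {
    grep -c -E 'Metadata corruption detected|agi unlinked bucket|disconnected inode [0-9]|would have reset inode' "$1"
}

# Small sample built from lines quoted in this thread.
report="$(mktemp)"
cat > "$report" <<'EOF'
Phase 6 - check inode connectivity...
        - moving disconnected inodes to lost+found ...
Metadata corruption detected at xfs_agf block 0x15d508ec9/0x200
agi unlinked bucket 6 is 14870406 in ag 3 (inode=6457321350)
sb_ifree 54216, counted 50595
disconnected inode 6457257416, would move to lost+found
would have reset inode 6457257416 nlinks from 0 to 1
EOF

count_flags "$report"   # prints 4 for this sample
```

A count of zero does not prove the filesystem is clean. The xfs_repair man page also documents that a -n run exits with status 1 when corruption is detected and 0 otherwise, which is a more reliable signal than matching text.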

JorgeB · July 1, 2017

You can use -v, and if it asks for it, -L.

grphx · July 1, 2017

Quote

xfs_repair /dev/md1 -v
Phase 1 - find and verify superblock...
        - block cache size set to 1469720 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2984339 tail block 2918920
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

I'm assuming this is "asking for -L" since I can't mount the disks.

JorgeB · July 1, 2017

Yes, it's normal in these cases and usually there's no data loss.
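
The decision made in the last few posts can be sketched as a small check on the saved output of the first repair attempt: only if xfs_repair itself asks for -L does the next command include it. This is an illustrative sketch, not an official procedure. The helper name next_step and the log file are hypothetical, and the device /dev/md1 is simply the one from this thread. The script only prints a suggestion and never runs a repair itself.

```shell
#!/bin/sh
# Hypothetical helper: look at the saved output of "xfs_repair -v <device>"
# and suggest the follow-up command. Nothing is executed against the disk.
next_step() {
    device="$1"
    log="$2"
    # This phrase is part of the error text quoted earlier in the thread.
    if grep -q -e 'the -L option to destroy the log' "$log"; then
        # -L zeroes the log; per the error text, try mounting first if possible.
        echo "xfs_repair -v -L $device"
    else
        echo "no -L request found in $log"
    fi
}

# Sample log built from the error message quoted above.
log="$(mktemp)"
cat > "$log" <<'EOF'
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed. Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair. If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
EOF

next_step /dev/md1 "$log"   # prints: xfs_repair -v -L /dev/md1
```

As the error text itself warns, destroying the log can lose the most recent metadata changes, so -L is only worth using when the disk cannot be mounted, as was the case here.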

grphx · July 1, 2017

Success! The repair completed and my system is back up and running! Thanks a lot!
