unmountable XFS data disk


ShaneH


OK, this does not look fun. The parity sync is complete with 0 errors, but the 8 TB disk now has an unmountable file system.

 

Here we go:

--------------------------

Event: Unraid Disk 2 message
Subject: Notice [TOWER] - Disk 2 returned to normal operation
Description: ST8000AS0002-1NA17Z_Z840E3TF (sdd)

--------------------------

Event: Unraid Parity sync / Data rebuild
Subject: Notice [TOWER] - Parity sync / Data rebuild finished (0 errors)
Description: Duration: 14 hours, 40 minutes, 10 seconds. Average speed: 189.4 MB/s

--------------------------

 

 

unRaid  2019-01-22 17:35:50.png

Link to comment

Looks good. You will have to repair the filesystem on disk2 as expected.

 

Click on Disk 2 to get to its page. You should see a section called Check Filesystem Status. The button will be disabled, with a note telling you the array must be in Maintenance mode.

 

Stop the array, start it in Maintenance mode and go back to that page and click the button. Post your results.
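
For anyone who prefers the terminal to the GUI button, the check described above corresponds roughly to an `xfs_repair` call with `-n` (report only, change nothing). The sketch below only builds the command line and does not run it. The device path `/dev/mdN` for array disk N is an assumption about how Unraid 6.x exposes array disks and is not stated in this thread; the helper name `xfs_repair_command` is made up for illustration.

```python
import shlex
from typing import List


def xfs_repair_command(disk_number: int, dry_run: bool = True,
                       zero_log: bool = False) -> List[str]:
    """Build (but do not execute) an xfs_repair command for an array disk.

    ASSUMPTION: array disk N is reachable as /dev/mdN while the array is
    started in Maintenance mode. Verify this on your own system first.
    """
    if disk_number < 1:
        raise ValueError("disk_number must be 1 or greater")
    if dry_run and zero_log:
        # Restriction chosen for this sketch: never mix a read-only check
        # with a request to destroy the log.
        raise ValueError("choose either dry_run or zero_log, not both")

    command = ["xfs_repair"]
    if dry_run:
        command.append("-n")   # no-modify mode: report problems only
    if zero_log:
        command.append("-L")   # zero the log; a last resort
    command.append(f"/dev/md{disk_number}")  # assumed device path
    return command


# Read-only check of disk 2, printed as a shell-ready string.
print(shlex.join(xfs_repair_command(2)))
```

Keeping `dry_run=True` as the default mirrors the GUI, which starts with `-n` in the options box.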

Link to comment

I started the array in Maintenance mode.
I clicked the Check Filesystem button for disk2.

 

Results are:
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 1530287853, counted 1532948218
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
would have corrected directory 99 size from 95 to 89
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
would have corrected directory 99 size from 95 to 89
        - agno = 1
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
Metadata corruption detected at 0x44f20d, inode 0x63 data fork
couldn't map inode 99, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 2147483744, would move to lost+found
disconnected dir inode 8657675550, would move to lost+found
disconnected dir inode 10737418336, would move to lost+found
disconnected dir inode 10737856698, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x44f20d, inode 0x63 data fork
couldn't map inode 99, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.
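
A dry-run report like the one above can be condensed into the few lines that matter: what `xfs_repair` says it would change, which inodes it would move to lost+found, and where it reported metadata corruption. The following is a minimal sketch that matches only the message shapes seen in this output; other `xfs_repair` messages are not handled, and the function name `summarize_dry_run` is hypothetical.

```python
import re

# A few lines copied from the dry-run output above.
SAMPLE = """\
would have corrected directory 99 size from 95 to 89
Metadata corruption detected at 0x44f20d, inode 0x63 data fork
couldn't map inode 99, err = 117
disconnected dir inode 2147483744, would move to lost+found
disconnected dir inode 8657675550, would move to lost+found
"""

ORPHAN = re.compile(r"disconnected (?:dir )?inode (\d+), would move to lost\+found")


def summarize_dry_run(text: str) -> dict:
    """Group the interesting lines of an `xfs_repair -n` report."""
    summary = {"pending_fixes": [], "orphans": [], "corruption": []}
    for raw in text.splitlines():
        line = raw.strip()
        match = ORPHAN.match(line)
        if match:
            summary["orphans"].append(int(match.group(1)))
        elif line.startswith("would have "):
            summary["pending_fixes"].append(line)
        elif line.startswith("Metadata corruption detected"):
            summary["corruption"].append(line)
    return summary


report = summarize_dry_run(SAMPLE)
print(report["orphans"])
print(len(report["pending_fixes"]), "pending fix(es)")
```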

 

Link to comment

I seem to remember some issues with the xfs repair procedure in prior versions of Unraid. Since you are on an older version (6.6.3), maybe @johnnie.black or @trurl has a better memory of which versions were affected. I would hold tight where you are right now until somebody confirms it's safe to continue the repair on that specific Unraid version, or perhaps it would be better to upgrade Unraid first.

Link to comment

This bug report, "[6.6.6] XFSPROGS 4.16.X VERSION OF XFS_REPAIR HAS BUG IN PHASE6.C", is for 6.6.6, but the bug might have been present in earlier versions as well. The report says it was solved in 6.7.0-rc1.

 

The linked threads in that report seem similar to this xfs_repair result we just got here. 

 

So it looks like we have come full circle on this. To recap:

 

OP started this thread after he had removed the disk and repaired it in another system, thus invalidating parity. Then, through a misunderstanding, partially my fault, he rebuilt the disk, returning it to its original state. Now he has attempted the repair in Unraid and encountered what may be a bug in his version of Unraid that prevents the repair from completing.

 

I just reviewed this thread and the rest of his post history and found no indication that he repaired the disk on another system purposely so as to avoid this bug. But it looks like that might have been a valid approach after all. Of course then a parity sync would have been needed and that was mentioned early in this thread.

 

I guess at this point the way forward is to upgrade Unraid. Either that or go back and do it all again the "wrong way" on another system like he did before and resync parity.

 

I hesitate to make any firm recommendations at this point without other opinions. I'm just going to tag @johnnie.black again and see if he has other ideas.

Link to comment

OK. I was afraid maybe this part in phase 6

13 hours ago, ShaneH said:

Phase 6 - check inode connectivity...
        - traversing filesystem ...
Metadata corruption detected at 0x44f20d, inode 0x63 data fork
couldn't map inode 99, err = 117

was related to the bug
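
As a side note on reading those two lines: `inode 0x63` and `inode 99` are the same inode written in hexadecimal and decimal, and on Linux error number 117 is `EUCLEAN` ("Structure needs cleaning"), which XFS uses to signal on-disk corruption. A small sketch to confirm this, assuming it is run on a Linux system (the error text differs or is missing on other platforms):

```python
import os

# The same inode, as printed in hex and in decimal by xfs_repair.
inode_from_hex = int("0x63", 16)
print(inode_from_hex)  # 99

# Look up the text for err = 117. The wording depends on the platform;
# on Linux this is the EUCLEAN message.
print(os.strerror(117))
```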

 

@ShaneH proceed

5 hours ago, johnnie.black said:

First thing would be to run xfs_repair without -n.

Link to comment

OK, I went to Disk 2 and cleared the "-n" from the "options" box.

I then ran the check.

Output:

 

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_repair. If you are unable to mount the filesystem, then use the -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.

Link to comment
That is not at all unusual! You can run with the -L option. In the vast majority of cases there is no data loss at all, and even if there is, it is only likely to affect the last file written.
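
The sequence followed in this thread (check with `-n`, rerun without `-n`, fall back to `-L` only when the log cannot be replayed) can be written down as a simple decision helper. This is a sketch keyed to the exact messages quoted in this thread, not a complete interpreter of `xfs_repair` output; the function name `next_step` and the returned strings are made up for illustration.

```python
def next_step(output: str) -> str:
    """Suggest what to do after an xfs_repair run, based on its output."""
    if "-n option was used" in output or "No modify flag set" in output:
        # Dry run only: nothing has been changed on disk yet.
        return "rerun without -n"
    if "needs to be replayed" in output:
        # xfs_repair refuses to continue while the log is dirty.
        return "mount to replay the log; if mounting fails, rerun with -L"
    if output.rstrip().endswith("done"):
        return "repair finished; start the array normally and look for lost+found"
    return "unrecognized output; post it before going further"


print(next_step("No modify flag set, skipping filesystem flush and exiting."))
print(next_step("ERROR: The filesystem has valuable metadata changes in a "
                "log which needs to be replayed."))
```

In this thread the disk was already unmountable, so mounting to replay the log was not an option and `-L` was the remaining path.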

Link to comment

What is next?

 

Output:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
sb_fdblocks 1530287853, counted 1532948218
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
corrected directory 99 size, was 95, now 89
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 3
        - agno = 2
        - agno = 0
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:1995504) is ahead of log (1:2).
Format log to cycle 4.
done
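
One line in this output that is easy to skip is `sb_fdblocks 1530287853, counted 1532948218`: the free-block count stored in the superblock did not match what `xfs_repair` counted while scanning. A rough sketch of how large that discrepancy is follows. The 4096-byte block size is an assumption (it is the usual XFS default and is not shown anywhere in this thread).

```python
SB_FDBLOCKS = 1530287853   # free data blocks recorded in the superblock
COUNTED = 1532948218       # free data blocks counted by xfs_repair

BLOCK_SIZE = 4096          # ASSUMPTION: default XFS block size in bytes

difference_blocks = COUNTED - SB_FDBLOCKS
difference_gib = difference_blocks * BLOCK_SIZE / 2**30

print(difference_blocks)   # 2660365
print(f"about {difference_gib:.1f} GiB under the assumed block size")
```

A stale free-space counter like this is bookkeeping that the repair corrects; on its own it does not indicate lost files.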

 

Link to comment
Just now, ShaneH said:

I went to the "Main" page, then clicked on the folder icon (on the right) for disk2. I also checked the Unraid terminal.

I do not see a lost+found folder.

That is probably a good sign! The lost+found folder is only created if the repair process found files or folders whose names it could not correctly identify.
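
The same check can be done from a script. The sketch below lists whatever sits in a disk's `lost+found` folder and returns an empty list when the folder does not exist. On the real server the mount point would presumably be `/mnt/disk2` (an assumption about the Unraid layout); the demo uses a temporary directory instead so it can run anywhere, and the function name is hypothetical.

```python
import tempfile
from pathlib import Path


def lost_and_found_entries(mount_point: str) -> list:
    """Return the names found in <mount_point>/lost+found, or [] if absent."""
    folder = Path(mount_point) / "lost+found"
    if not folder.is_dir():
        return []
    return sorted(entry.name for entry in folder.iterdir())


# Demo against a throwaway directory standing in for the disk mount point.
with tempfile.TemporaryDirectory() as fake_mount:
    before = lost_and_found_entries(fake_mount)            # no folder yet
    (Path(fake_mount) / "lost+found").mkdir()
    (Path(fake_mount) / "lost+found" / "2147483744").touch()
    after = lost_and_found_entries(fake_mount)             # one orphan

print(before)
print(after)
```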

Link to comment

Hello. All the apps are gone from the "Dashboard". Dockers and VMs are gone. I can rebuild them.

I am going to do a file compare to look for missing files. At a quick glance, it is looking good.

I need to do a more thorough check. (I was not really using the array until we were finished.)
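
For the file compare, one simple approach is to list every file under a backup copy and report the ones that have no counterpart at the same relative path on the array. This is a minimal sketch that compares names only, not contents; the function names are hypothetical and the demo uses temporary directories in place of real share and backup paths.

```python
import os
import tempfile


def relative_files(root: str) -> set:
    """Collect every file under root as a path relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def missing_from_array(backup_root: str, array_root: str) -> list:
    """Files present in the backup but absent from the array share."""
    return sorted(relative_files(backup_root) - relative_files(array_root))


# Demo with two throwaway directories standing in for backup and array.
with tempfile.TemporaryDirectory() as backup, tempfile.TemporaryDirectory() as array:
    for name in ("a.jpg", "b.jpg"):
        open(os.path.join(backup, name), "w").close()
    open(os.path.join(array, "a.jpg"), "w").close()
    missing = missing_from_array(backup, array)

print(missing)  # ['b.jpg']
```

A name-only comparison will not catch files that exist but were damaged; comparing sizes or checksums would be the next step for that.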

 

Link to comment
6 minutes ago, ShaneH said:

All the apps are gone from "Dashboard". Dockers and VMs are gone.

Looked at your diagnostics again. Unfortunately, with that older version of Unraid, I can't tell from the diagnostics exactly which disk(s) your system share was on. It is set to cache-prefer, but if you set these up before adding the cache, then they probably never got moved to cache, and maybe they didn't survive the repair of disk2.

 

You can reinstall your dockers exactly as they were before using the Previous Apps feature on the Apps page.

Link to comment

Hello. The video files seem to have returned since the rebuild. I copied all pictures from a backup on top of the Unraid pictures share. Very few pictures were actually copied to the array. I am going to look at the documents folder, but that will be a slow process.
Things are looking very good with the array. I am going to upgrade my backup process.

 

Thank you very much for all of the help and your time.

Link to comment
