Disk Errors

May 15, 2025

Hello, hoping someone can help me out here.

I recently got a notification that Disk 1 in my array had become unmountable for some reason. I shut the server down and made sure the cables were all plugged in properly, but after restarting, Disk 1 is still unmountable, and now Disk 2 is showing read errors. I'm not sure how to proceed. I've attached diagnostics.

Thanks in advance for any help.

Chris

tower-diagnostics-20250515-1316.zip

May 15, 2025

Community Expert

Check/replace cables for disk2 and post new diags after array start.

May 15, 2025

Author

2 minutes ago, JorgeB said:

Check/replace cables for disk2 and post new diags after array start.

Thanks Jorge, will do!

May 15, 2025

Author

I changed out the cable for disk 2 and got the same result. Updated diags attached.

tower-diagnostics-20250515-1435.zip

May 15, 2025

Community Expert

Check filesystem on disk1 and post the output

May 16, 2025

Author

Here are the results of the filesystem check on disk1:


Check Filesystem Status

xfs_repair status:

    Phase 1 - find and verify superblock...
    Phase 2 - using internal log
            - zero log...
    ALERT: The filesystem has valuable metadata changes in a log which is being
    ignored because the -n option was used.  Expect spurious inconsistencies
    which may be resolved by first mounting the filesystem to replay the log.
            - scan filesystem freespace and inode maps...
            - found root inode chunk
    Phase 3 - for each AG...
            - scan (but don't clear) agi unlinked lists...
            - process known inodes and perform inode discovery...
            - agno = 0
            - agno = 1
            - agno = 2
            - agno = 3
            - agno = 4
            - agno = 5
            - agno = 6
            - agno = 7
            - agno = 8
            - agno = 9
            - agno = 10
            - agno = 11
            - agno = 12
            - agno = 13
            - agno = 14
            - agno = 15
            - process newly discovered inodes...
    Phase 4 - check for duplicate blocks...
            - setting up duplicate extent list...
            - check for inodes claiming duplicate blocks...
            - agno = 2
            - agno = 3
            - agno = 6
            - agno = 4
            - agno = 5
            - agno = 7
            - agno = 1
            - agno = 0
            - agno = 8
            - agno = 9
            - agno = 10
            - agno = 11
            - agno = 12
            - agno = 13
            - agno = 14
            - agno = 15
    No modify flag set, skipping phase 5
    Phase 6 - check inode connectivity...
            - traversing filesystem ...
            - traversal finished ...
            - moving disconnected inodes to lost+found ...
    Phase 7 - verify link counts...
    Maximum metadata LSN (1:586725) is ahead of log (1:583619).
    Would format log to cycle 4.
    No modify flag set, skipping filesystem flush and exiting.

Thanks in advance
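
For readers following along, the last lines of the check report that the newest metadata log sequence number (LSN) is ahead of the log itself. xfs_repair prints each LSN as cycle:block, so the two values can be compared as pairs. The following is only a small sketch of that comparison; the function name `parse_lsn_warning` is made up for illustration and is not part of xfs_repair or Unraid.

```python
import re

# Matches the warning line exactly as it appears in the output above.
LSN_LINE = re.compile(
    r"Maximum metadata LSN \((\d+):(\d+)\) is ahead of log \((\d+):(\d+)\)"
)


def parse_lsn_warning(output: str):
    """Return ((meta_cycle, meta_block), (log_cycle, log_block)), or None
    if the warning line is not present in the given text."""
    match = LSN_LINE.search(output)
    if match is None:
        return None
    meta_cycle, meta_block, log_cycle, log_block = (int(g) for g in match.groups())
    return (meta_cycle, meta_block), (log_cycle, log_block)


line = "Maximum metadata LSN (1:586725) is ahead of log (1:583619)."
meta, log = parse_lsn_warning(line)

# Tuples compare cycle first, then block, which mirrors the cycle:block ordering.
print(meta > log)         # True
print(meta[1] - log[1])   # 3106 blocks ahead within the same cycle
```

The same helper returns None for output that does not contain the warning, so it can be run over a whole pasted report.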

May 16, 2025

Community Expert

You should see a Fix and/or Log button at the bottom of the results display. Use that to correct the issue, then restart the array in normal mode; the drive should then be fine.

May 16, 2025

Author

Thanks for your assistance! Disk1 is up and running now. Disk2, however, is still showing the red X beside it. I have attached a new diagnostics file.

tower-diagnostics-20250516-1116.zip

May 16, 2025

Community Expert

Start the array in normal mode, not maintenance mode, and post new diagnostics.

May 16, 2025

Author

Sorry, I didn't realize that maintenance mode would be a problem for Diagnostics. Now the unmountable notification is back on Disk1 as well...

tower-diagnostics-20250516-1141.zip

May 16, 2025

Community Expert

9 minutes ago, sam65 said:

didn't realize that maintenance mode would be a problem for Diagnostics

Maintenance mode doesn't mount any disks, so no way to know if they are mountable or not.
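
As a rough illustration of that point, one way to tell whether a disk is actually mounted is to look for its mount point in /proc/mounts. The sketch below assumes the array disk would be mounted at /mnt/disk1, and the sample mount lines are made up for the example; neither is taken from the diagnostics in this thread.

```python
def is_mounted(mount_point: str, mounts_text: str) -> bool:
    """Return True if mount_point appears as a mount target in
    /proc/mounts-style text (device, mount point, fs type, options, ...)."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False


# Hypothetical sample; on a live system, read the real file instead:
#     with open("/proc/mounts") as f:
#         mounts_text = f.read()
SAMPLE_MOUNTS = """\
/dev/md1p1 /mnt/disk1 xfs rw,noatime 0 0
tmpfs /run tmpfs rw 0 0
"""

print(is_mounted("/mnt/disk1", SAMPLE_MOUNTS))
print(is_mounted("/mnt/disk2", SAMPLE_MOUNTS))
```

With the array started in maintenance mode, a check like this would find no entry for the disk, which says nothing about whether the filesystem is mountable.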

8 minutes ago, sam65 said:

Now the unmountable notification is back on Disk1

Are you sure disk1 was mountable after you repaired it? Did you ever start the array in normal mode after the repair and examine disk1 contents?

20 hours ago, trurl said:

Check filesystem on disk1 and post the output

May 16, 2025

Author

Now that you ask, no, I started it in maintenance mode until you corrected me... I should have realized....

May 16, 2025

Community Expert

OK, let's try again.

21 hours ago, trurl said:

Check filesystem on disk1 and post the output

May 17, 2025

Author

Here are the results:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 5
        - agno = 4
        - agno = 6
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
Maximum metadata LSN (1:586725) is ahead of log (1:583619).
Would format log to cycle 4.
No modify flag set, skipping filesystem flush and exiting.

File system corruption detected

May 18, 2025

Community Expert

Click the Fix button

May 18, 2025

Author

OK, I clicked the Fix button, and this is the result. I'm not sure whether I should mount it (start the array in normal mode instead of maintenance mode) or click the ZERO LOG button.

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If the filesystem is a snapshot of a mounted
filesystem, you may need to give mount the nouuid option. If you are unable
to mount the filesystem, then use the -L option to destroy the log and
attempt a repair.  Note that destroying the log may cause corruption --
please attempt a mount of the filesystem before doing this.

Dirty log detected
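
The three log-related messages seen in this thread (the -n alert, the replay error above, and the -L alert) each point to a different situation. The sketch below sorts pasted xfs_repair output into those cases by matching the message text quoted in this thread; the function name and the returned labels are invented for illustration and are not anything xfs_repair or Unraid provides.

```python
def classify_xfs_repair_output(output: str) -> str:
    """Classify pasted xfs_repair output by its log-related message.

    The labels returned here are hypothetical names for this example.
    """
    # The messages wrap across lines, so collapse all whitespace first.
    text = " ".join(output.split())
    if "log which needs to be replayed" in text:
        # Repair refused to run: mount to replay the log, or use -L if
        # the filesystem cannot be mounted.
        return "dirty-log-blocked"
    if "ignored because the -n option was used" in text:
        # Read-only check: the dirty log was skipped, nothing was changed.
        return "dry-run-dirty-log"
    if "destroyed because the -L option was used" in text:
        # The log was zeroed and the repair went ahead.
        return "log-zeroed"
    return "no-log-message"


refused = """ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair."""

print(classify_xfs_repair_output(refused))
```

Collapsing whitespace before matching matters here because the error text is split across lines right after "needs to".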

May 19, 2025

Community Expert

Click the Fix Log button

May 19, 2025

Author

Thanks again! Here are the results after clicking ZERO LOG:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 3
        - agno = 7
        - agno = 1
        - agno = 2
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - agno = 15
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:586733) is ahead of log (1:2).
Format log to cycle 4.
done

File system corruption fixed

Is it safe to assume that I should now restart the array in normal mode and attempt to browse the contents of the disk?

May 19, 2025

Community Expert

13 minutes ago, sam65 said:

Is it safe to assume that I should now restart the array in normal mode and attempt to browse the contents of the disk?

Yes

May 19, 2025

Author

Great! Disk1 is back up and running, and it looks like everything is there.

Disk2, however, is still showing the orange X beside it. When I hover my cursor over it, it says the device is disabled and the contents are emulated.

May 19, 2025

Community Expert

10 minutes ago, sam65 said:

Disk2, however, is still showing the orange X beside it. When I hover my cursor over it, it says the device is disabled and the contents are emulated.

A different action (rebuild) is required to clear the 'disabled' state. Do the contents of the emulated drive look correct?

May 19, 2025

Author

Yes, as far as I can tell, everything looks good, though there are a few movie folders that are empty, which has me wondering.

May 19, 2025

Community Expert

46 minutes ago, sam65 said:

few movie folders that are empty

Do you have a lost+found share now?

May 19, 2025

Author

No lost+found share, but on the Shares page, all of my shares have an orange triangle in front of them that says "some or all files unprotected".

May 20, 2025

Community Expert

Any share with files on a pool that isn't redundant (such as single cache) has some files unprotected.

On 5/19/2025 at 8:43 AM, itimpi said:
A different action (rebuild) is required to clear the 'disabled' state.

Did you do the rebuild?

If you have a disabled disk and single parity then no array files are protected.
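
The arithmetic behind that statement can be sketched as follows, assuming the usual rule that each parity disk covers one failed or disabled data disk. The function name is made up for this example and does not correspond to anything in Unraid.

```python
def remaining_fault_tolerance(parity_disks: int, disabled_disks: int) -> int:
    """How many more disk failures the array could absorb without data loss,
    assuming each parity disk covers exactly one failed or disabled disk."""
    if parity_disks < 0 or disabled_disks < 0:
        raise ValueError("disk counts must be non-negative")
    return max(parity_disks - disabled_disks, 0)


# Single parity with one disabled (emulated) disk: nothing left in reserve.
print(remaining_fault_tolerance(parity_disks=1, disabled_disks=1))  # 0

# Single parity with all disks healthy: one failure can be tolerated.
print(remaining_fault_tolerance(parity_disks=1, disabled_disks=0))  # 1
```

A result of 0 corresponds to the "unprotected" state described above: the emulated disk is already consuming the only parity, so rebuilding the disk is what restores protection.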
