Disk suddenly errored


Solved by JorgeB


Hello,

 

I had a power failure this weekend; my UPS took over and performed a graceful shutdown. Today I found that the drive was mounted read-only with no disk access. After I shut down the server, checked all the connections, and started it back up, I was greeted with this:

 

[Screenshot attached]

 

Now I don't know what happened to the disk. It had no SMART errors and was working fine. If I "reformat" the drive, will the array rebuild itself and put the data back?

 

At the same time, a parity check started yesterday (the parity check runs on the 1st of each month). Could that lead to problems? (I cancelled the parity check.) I installed a new drive about 2 weeks ago, so I know the parity should be good.

 

Thank you

 

Edit: I've started the array in maintenance mode and ran xfs_repair -n, which I found under the disk 6 options. This is the output:

 


Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
ir_freecount/free mismatch, inode chunk 11/346247680, freecount 2 nfree 1
finobt ir_freecount/free mismatch, inode chunk 11/346247680, freecount 2 nfree 1
agi unlinked bucket 40 is 346247720 in ag 11 (inode=23968567848)
agi unlinked bucket 8 is 306951432 in ag 9 (inode=19634304264)
agi unlinked bucket 41 is 306923689 in ag 9 (inode=19634276521)
agi unlinked bucket 30 is 102889438 in ag 5 (inode=10840307678)
agi unlinked bucket 46 is 102889454 in ag 5 (inode=10840307694)
agi unlinked bucket 61 is 102889469 in ag 5 (inode=10840307709)
agi unlinked bucket 17 is 13602577 in ag 4 (inode=8603537169)
agi unlinked bucket 19 is 13602579 in ag 4 (inode=8603537171)
agi unlinked bucket 20 is 13602580 in ag 4 (inode=8603537172)
agi unlinked bucket 47 is 1071805807 in ag 7 (inode=16104191343)
agi unlinked bucket 23 is 13602583 in ag 4 (inode=8603537175)
agi unlinked bucket 10 is 361219338 in ag 8 (inode=17541088522)
agi unlinked bucket 11 is 361219339 in ag 8 (inode=17541088523)
sb_ifree 2411, counted 2418
sb_fdblocks 1978355686, counted 2005596972
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 5
        - agno = 9
        - agno = 14
        - agno = 6
        - agno = 7
        - agno = 12
        - agno = 3
        - agno = 8
        - agno = 1
        - agno = 10
        - agno = 11
        - agno = 13
        - agno = 4
        - agno = 2
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 8603537169, would move to lost+found
disconnected inode 8603537171, would move to lost+found
disconnected inode 8603537172, would move to lost+found
disconnected inode 8603537175, would move to lost+found
disconnected inode 10840307678, would move to lost+found
disconnected inode 10840307694, would move to lost+found
disconnected inode 10840307709, would move to lost+found
disconnected inode 16104191343, would move to lost+found
disconnected inode 17541088522, would move to lost+found
disconnected inode 17541088523, would move to lost+found
disconnected inode 19634276521, would move to lost+found
disconnected inode 19634304264, would move to lost+found
disconnected inode 23968567848, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 8603537169 nlinks from 0 to 1
would have reset inode 23968567848 nlinks from 0 to 1
would have reset inode 8603537171 nlinks from 0 to 1
would have reset inode 8603537172 nlinks from 0 to 1
would have reset inode 8603537175 nlinks from 0 to 1
would have reset inode 17541088522 nlinks from 0 to 1
would have reset inode 17541088523 nlinks from 0 to 1
would have reset inode 19634276521 nlinks from 0 to 1
would have reset inode 16104191343 nlinks from 0 to 1
would have reset inode 19634304264 nlinks from 0 to 1
would have reset inode 10840307678 nlinks from 0 to 1
would have reset inode 10840307694 nlinks from 0 to 1
would have reset inode 10840307709 nlinks from 0 to 1
No modify flag set, skipping filesystem flush and exiting.
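
A side note on reading this output: the inode numbers on the "agi unlinked bucket" lines are the same ones that show up later as "disconnected inode" entries. Below is a minimal sketch of the arithmetic linking them. The 31-bit shift is an assumption inferred only from the numbers in this log (it depends on the filesystem's geometry), and the helper names are hypothetical.

```shell
# Combine an allocation group (AG) number and an AG-relative inode number
# into the absolute inode number printed in parentheses by xfs_repair.
# ASSUMPTION: the default shift of 31 bits matches this log only; other
# filesystems can use a different width.
abs_inode() {
    local ag="$1" agino="$2" shift_bits="${3:-31}"
    echo $(( (ag << shift_bits) | agino ))
}

# The AGI keeps 64 unlinked buckets; the bucket index is the AG-relative
# inode number modulo 64.
unlinked_bucket() {
    echo $(( $1 % 64 ))
}

# "agi unlinked bucket 40 is 346247720 in ag 11 (inode=23968567848)"
abs_inode 11 346247720       # prints 23968567848
unlinked_bucket 346247720    # prints 40
```

So every bucket entry in Phase 2 corresponds to one of the inodes that Phase 6 says it would move to lost+found.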

 

I tried a manual mount to a temp folder and was greeted with these errors:

[Screenshot attached]

Edited by Nodiaque

I tried again without -n:

 


Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
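
The order of operations that this message asks for can be sketched roughly as follows. This is a dry-run sketch, not a tested recovery procedure: /dev/mdX and /mnt/temp are placeholders, and the `run` and `repair_sequence` helpers are hypothetical names. Commands are only printed unless RUN=1 is set.

```shell
# Print each command instead of executing it, unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# ASSUMPTION: $1 is the device backing the affected disk and $2 is an
# empty directory to use as a temporary mount point.
repair_sequence() {
    local dev="$1" mnt="$2"
    # 1. Mounting replays the XFS log, which is what xfs_repair asks for first.
    if run mount -t xfs "$dev" "$mnt"; then
        # 2. Unmount again, then repair with the log already replayed.
        run umount "$mnt"
        run xfs_repair "$dev"
    else
        # 3. Only if the mount fails: destroy the log with -L, accepting that
        #    the most recent metadata changes may be lost.
        run xfs_repair -L "$dev"
    fi
}

# Placeholder arguments; in dry-run mode this only prints the commands.
repair_sequence /dev/mdX /mnt/temp
```

The point of the structure is that -L sits in the fallback branch only, matching the warning in the output above that destroying the log may cause corruption.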

 

Tried mounting:

[Screenshot attached]


OK, this is the output with -L. Do I mount manually, or start the array in normal mode (not maintenance mode)?

 

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
agi unlinked bucket 30 is 102889438 in ag 5 (inode=10840307678)
agi unlinked bucket 46 is 102889454 in ag 5 (inode=10840307694)
agi unlinked bucket 61 is 102889469 in ag 5 (inode=10840307709)
agi unlinked bucket 47 is 1071805807 in ag 7 (inode=16104191343)
agi unlinked bucket 17 is 13602577 in ag 4 (inode=8603537169)
agi unlinked bucket 19 is 13602579 in ag 4 (inode=8603537171)
agi unlinked bucket 20 is 13602580 in ag 4 (inode=8603537172)
agi unlinked bucket 23 is 13602583 in ag 4 (inode=8603537175)
ir_freecount/free mismatch, inode chunk 11/346247680, freecount 2 nfree 1
finobt ir_freecount/free mismatch, inode chunk 11/346247680, freecount 2 nfree 1
agi unlinked bucket 40 is 346247720 in ag 11 (inode=23968567848)
agi unlinked bucket 8 is 306951432 in ag 9 (inode=19634304264)
agi unlinked bucket 41 is 306923689 in ag 9 (inode=19634276521)
agi unlinked bucket 10 is 361219338 in ag 8 (inode=17541088522)
agi unlinked bucket 11 is 361219339 in ag 8 (inode=17541088523)
sb_ifree 2411, counted 2418
sb_fdblocks 1978355686, counted 2005596978
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 14
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 8
        - agno = 4
        - agno = 7
        - agno = 1
        - agno = 9
        - agno = 5
        - agno = 6
        - agno = 10
        - agno = 11
        - agno = 12
        - agno = 13
        - agno = 3
        - agno = 14
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 8603537169, moving to lost+found
disconnected inode 8603537171, moving to lost+found
disconnected inode 8603537172, moving to lost+found
disconnected inode 8603537175, moving to lost+found
disconnected inode 10840307678, moving to lost+found
disconnected inode 10840307694, moving to lost+found
disconnected inode 10840307709, moving to lost+found
disconnected inode 16104191343, moving to lost+found
disconnected inode 17541088522, moving to lost+found
disconnected inode 17541088523, moving to lost+found
disconnected inode 19634276521, moving to lost+found
disconnected inode 19634304264, moving to lost+found
disconnected inode 23968567848, moving to lost+found
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:3681890) is ahead of log (1:2).
Format log to cycle 4.
done

 


Start the array in normal mode and it should mount.

You will probably find you now have a lost+found folder on the drive, which is where the repair process puts any files/folders for which it cannot find the directory entry giving the correct name. Sorting this out is a manual process (although you can at least use the Linux ‘file’ command to find the content type of each file), and you have to decide if it is worth the effort.
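
As a rough sketch of that ‘file’ pass: the loop below prints each lost+found entry next to its detected content type. The default path /mnt/disk6/lost+found is an assumption based on the disk number mentioned earlier in the thread, and `identify_lost_found` is a hypothetical helper name; pass the real path as the first argument.

```shell
# List every entry in a lost+found directory with the content type that
# the `file` command detects for it.
# ASSUMPTION: the default path is a guess; pass the actual directory.
identify_lost_found() {
    local dir="${1:-/mnt/disk6/lost+found}"
    local entry
    # xfs_repair names recovered entries after their inode numbers, so the
    # detected type is often the only hint about what each one was.
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue   # skip the unmatched glob if dir is empty
        # -b prints the type without repeating the file name
        printf '%s\t%s\n' "$(basename "$entry")" "$(file -b "$entry")"
    done
}

identify_lost_found "$@"
```

The output is one tab-separated line per entry, which makes it easy to sort or grep by type before deciding which files are worth renaming and moving back.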


Ah nice. About parity and rebuild, can something be done with that? I guess I should run a parity check after?

 

Edit: It worked! Only about 4 files with content in them. One seems to be a Docker config file, another seems to be Linux boot files (weird, since those are on the USB). I think they're backup files, since everything is now running without any errors.

Edited by Nodiaque
