[SOLVED] Power Outage - unmountable Disk



My street had a power outage yesterday, caused by construction work.

After the power came back on, I checked my server for problems, but it seemed fine. It started a parity check and everything appeared to be in order.

Later I realized that Disk 3 (a fairly old 3TB drive) had racked up over 1000 errors and the parity check was running extremely slowly. I decided to stop the array. Upon restarting the array, Disk 3 showed as "Unmountable: No file system". I panicked and rebooted the system, but of course nothing changed.

How do I proceed now? I have already ordered a new drive, since Disk 3 has probably reached end of life, but what is the best way to make sure parity stays intact?

Unraid Version: 6.8.3

CPU: Xeon X3470

Motherboard: Supermicro X8SIL

server-diagnostics-20200702-1031.zip

Edited by CaphalorAlb
hardware info added
Phase 1 - find and verify superblock...
        - block cache size set to 744256 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2604786 tail block 2604527
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_ifree 840, counted 839
sb_fdblocks 245193167, counted 242165844
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 8386192 claims free block 1048397
data fork in ino 8386197 claims free block 1048396
imap claims a free inode 143527865 is in use, would correct imap and clear inode
imap claims a free inode 143527866 is in use, would correct imap and clear inode
        - agno = 1
data fork in ino 2287098465 claims free block 287134209
data fork in ino 2287098465 claims free block 287134210
data fork in ino 2287098466 claims free block 291074427
data fork in ino 2287098466 claims free block 291074428
        - agno = 2
data fork in ino 4299884673 claims free block 537485596
imap claims in-use inode 4299884673 is free, correcting imap
data fork in ino 4299884674 claims free block 537466410
data fork in ino 4299884674 claims free block 537466411
imap claims in-use inode 4299884674 is free, correcting imap
data fork in ino 4299884677 claims free block 537485597
imap claims in-use inode 4299884677 is free, correcting imap
data fork in ino 4299884679 claims free block 537485599
data fork in ino 4299884679 claims free block 537485600
imap claims in-use inode 4299884679 is free, correcting imap
data fork in ino 4301990338 claims free block 537896661
imap claims a free inode 4301990361 is in use, would correct imap and clear inode
imap claims a free inode 4539558894 is in use, would correct imap and clear inode
        - agno = 3
imap claims a free inode 6518499649 is in use, would correct imap and clear inode
imap claims a free inode 6542784048 is in use, would correct imap and clear inode
data fork in ino 6566668407 claims free block 817848027
imap claims in-use inode 6566668407 is free, correcting imap
data fork in ino 6566668412 claims free block 826790832
imap claims in-use inode 6566668412 is free, correcting imap
data fork in ino 6625116225 claims free block 828139536
data fork in ino 6625116225 claims free block 828139537
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
entry "unraid-check.cron" in shortform directory 8387256 references free inode 8386206
would have junked entry "unraid-check.cron" in directory inode 8387256
entry "preclear.disk.plg" at block 0 offset 1928 in directory inode 4301990341 references free inode 4539558894
	would clear inode number in entry at offset 1928...
entry "unassigned.devices.plg" at block 0 offset 2096 in directory inode 4301990341 references free inode 4301990361
	would clear inode number in entry at offset 2096...
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
entry "unraid-check.cron" in shortform directory inode 8387256 points to free inode 8386206
would junk entry
        - agno = 1
        - agno = 2
entry "preclear.disk.plg" in directory inode 4301990341 points to free inode 4539558894, would junk entry
entry "unassigned.devices.plg" in directory inode 4301990341 points to free inode 4301990361, would junk entry
bad hash table for directory inode 4301990341 (no data entry): would rebuild
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 143527835, would move to lost+found
disconnected inode 4299884673, would move to lost+found
disconnected inode 4299884674, would move to lost+found
disconnected inode 4299884677, would move to lost+found
disconnected inode 4299884679, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 4301990340 nlinks from 9 to 8
Maximum metadata LSN (1:2605357) is ahead of log (1:2604786).
Would format log to cycle 4.
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Thu Jul  2 11:30:49 2020

Phase		Start		End		Duration
Phase 1:	07/02 11:30:47	07/02 11:30:47
Phase 2:	07/02 11:30:47	07/02 11:30:48	1 second
Phase 3:	07/02 11:30:48	07/02 11:30:48
Phase 4:	07/02 11:30:48	07/02 11:30:49	1 second
Phase 5:	Skipped
Phase 6:	07/02 11:30:49	07/02 11:30:49
Phase 7:	07/02 11:30:49	07/02 11:30:49

Total run time: 2 seconds

 

According to the wiki: "If however issues were found, the display of results will indicate the recommended action to take. Typically, that will involve repeating the command with a specific option, clearly stated, which you will type into the options box (including any hyphens, usually 2 leading hyphens)."

Excuse my ineptness, but I have no clue which flags to set for the repair. Should I just run it again with only -v?
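
For anyone reading a dry run like the one above, here is a minimal Python sketch of how the output can be read. It is not part of Unraid or xfsprogs, and the function names are made up for illustration. It assumes that pending fixes in `-n` mode are phrased with the word "would", and that the ALERT text about "valuable metadata changes in a log" marks a dirty log. The suggestions it returns reflect the flags xfs_repair itself documents (`-n`, `-v`, `-L`), not an official Unraid procedure.

```python
import re

# Text xfs_repair prints when the journal still holds unreplayed changes.
DIRTY_LOG_MARKER = "valuable metadata changes in a log"


def summarize_dry_run(output: str) -> dict:
    """Summarize the text printed by a no-modify (-n) xfs_repair run."""
    lines = output.splitlines()
    dirty_log = any(DIRTY_LOG_MARKER in line for line in lines)
    # Assumption: in -n mode every pending fix is worded as "would ...".
    pending = [ln for ln in lines if re.search(r"\bwould\b", ln, re.IGNORECASE)]
    return {
        "dirty_log": dirty_log,
        "pending_fixes": len(pending),
        "needs_repair": dirty_log or bool(pending),
    }


def suggest_next_step(summary: dict) -> str:
    """Return a hedged hint about which option to try next."""
    if not summary["needs_repair"]:
        return "no repair needed"
    if summary["dirty_log"]:
        # -L destroys the log, so it is a last resort when mounting fails.
        return ("mount the disk to replay the log if possible; "
                "if it cannot be mounted, rerun with -L")
    return "rerun without -n (for example with only -v)"
```

Fed the dry run above, this reports a dirty log plus a list of pending fixes. Since the disk was already unmountable, replaying the log by mounting was not an option, which is why `-L` ended up being used.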

 

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
sb_ifree 840, counted 839
sb_fdblocks 245193167, counted 242165844
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 8386192 claims free block 1048397
data fork in ino 8386197 claims free block 1048396
imap claims a free inode 143527865 is in use, correcting imap and clearing inode
cleared inode 143527865
imap claims a free inode 143527866 is in use, correcting imap and clearing inode
cleared inode 143527866
        - agno = 1
data fork in ino 2287098465 claims free block 287134209
data fork in ino 2287098465 claims free block 287134210
data fork in ino 2287098466 claims free block 291074427
data fork in ino 2287098466 claims free block 291074428
        - agno = 2
data fork in ino 4299884673 claims free block 537485596
correcting imap
data fork in ino 4299884674 claims free block 537466410
data fork in ino 4299884674 claims free block 537466411
correcting imap
data fork in ino 4299884677 claims free block 537485597
correcting imap
data fork in ino 4299884679 claims free block 537485599
data fork in ino 4299884679 claims free block 537485600
correcting imap
data fork in ino 4301990338 claims free block 537896661
imap claims a free inode 4301990361 is in use, correcting imap and clearing inode
cleared inode 4301990361
imap claims a free inode 4539558894 is in use, correcting imap and clearing inode
cleared inode 4539558894
        - agno = 3
imap claims a free inode 6518499649 is in use, correcting imap and clearing inode
cleared inode 6518499649
imap claims a free inode 6542784048 is in use, correcting imap and clearing inode
cleared inode 6542784048
data fork in ino 6566668407 claims free block 817848027
correcting imap
data fork in ino 6566668412 claims free block 826790832
correcting imap
data fork in ino 6625116225 claims free block 828139536
data fork in ino 6625116225 claims free block 828139537
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
entry "unraid-check.cron" in shortform directory 8387256 references free inode 8386206
junking entry "unraid-check.cron" in directory inode 8387256
entry "preclear.disk.plg" at block 0 offset 1928 in directory inode 4301990341 references free inode 4539558894
	clearing inode number in entry at offset 1928...
entry "unassigned.devices.plg" at block 0 offset 2096 in directory inode 4301990341 references free inode 4301990361
	clearing inode number in entry at offset 2096...
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad hash table for directory inode 4301990341 (no data entry): rebuilding
rebuilding directory inode 4301990341
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 143527835, moving to lost+found
disconnected inode 4299884673, moving to lost+found
disconnected inode 4299884674, moving to lost+found
disconnected inode 4299884677, moving to lost+found
disconnected inode 4299884679, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 8386206 nlinks from 2 to 3
resetting inode 4301990340 nlinks from 9 to 8
Maximum metadata LSN (1:2607525) is ahead of log (1:2).
Format log to cycle 4.
done

Thanks! It seems to have run successfully, and I was able to start the array without any problems.

I'm going to attempt a parity check now and will probably replace the drive once the spare arrives.

Thank you for the help!
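
The repair output above moved several disconnected inodes to lost+found. xfs_repair names those entries after their inode numbers, so the paths to inspect can be listed straight from the output. A small sketch follows. The helper name is hypothetical, and the `/mnt/disk3` mount point is an assumption based on Unraid's usual per-disk paths.

```python
import re

# Matches lines such as "disconnected dir inode 143527835, moving to lost+found".
LOST_RE = re.compile(r"disconnected (dir )?inode (\d+), moving to lost\+found")


def lost_and_found_entries(repair_output: str, mount_point: str = "/mnt/disk3"):
    """List (kind, path) pairs for inodes that xfs_repair moved to lost+found.

    mount_point is an assumed default; adjust it to the disk that was repaired.
    """
    entries = []
    for line in repair_output.splitlines():
        match = LOST_RE.search(line)
        if match:
            kind = "dir" if match.group(1) else "non-dir"
            entries.append((kind, f"{mount_point}/lost+found/{match.group(2)}"))
    return entries
```

For the run above this gives one directory and four other entries. Their original names are gone, so the contents have to be identified by hand.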

  • JorgeB changed the title to [SOLVED] Power Outage - unmountable Disk

New diagnostics are attached.

It's essentially stuck in a loop doing this:

Jul 2 12:40:00 Server kernel: ata4: hard resetting link
Jul 2 12:40:06 Server kernel: ata4: link is slow to respond, please be patient (ready=0)
Jul 2 12:40:06 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jul 2 12:40:06 Server kernel: ata4.00: configured for UDMA/33
Jul 2 12:40:06 Server kernel: ata4: EH complete
Jul 2 12:40:06 Server kernel: ata4.00: exception Emask 0x10 SAct 0x40c0002 SErr 0x4890000 action 0xe frozen
Jul 2 12:40:06 Server kernel: ata4.00: irq_stat 0x08400040, interface fatal error, connection status changed
Jul 2 12:40:06 Server kernel: ata4: SError: { PHYRdyChg 10B8B LinkSeq DevExch }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:08:d0:71:19/05:00:00:00:00/40 tag 1 ncq dma 688128 in
Jul 2 12:40:06 Server kernel: res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:90:10:77:19/05:00:00:00:00/40 tag 18 ncq dma 688128 in
Jul 2 12:40:06 Server kernel: res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:98:50:7c:19/05:00:00:00:00/40 tag 19 ncq dma 688128 in
Jul 2 12:40:06 Server kernel: res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:d0:d0:11:19/05:00:00:00:00/40 tag 26 ncq dma 688128 in
Jul 2 12:40:06 Server kernel: res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4: hard resetting link
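
To see how bad a loop like this is, the syslog lines can be tallied per ATA port. The sketch below is a hypothetical helper that assumes the line layout shown in the excerpt above. It counts hard link resets and failed commands and records the last negotiated link speed. It only summarizes the log. It cannot tell whether the drive, the cable, or the port is at fault.

```python
import re
from collections import Counter

# "kernel: ata4: msg" or "kernel: ata4.00: msg" -> port "ata4", message "msg".
PORT_RE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")
LINK_RE = re.compile(r"SATA link up ([\d.]+) Gbps")


def summarize_ata_log(log: str) -> dict:
    """Count link resets and failed commands per ATA port in a syslog excerpt."""
    resets = Counter()
    failed = Counter()
    link_speed = {}
    for line in log.splitlines():
        match = PORT_RE.search(line)
        if not match:
            continue  # e.g. the "res ..." continuation lines carry no port
        port, message = match.groups()
        if message.startswith("hard resetting link"):
            resets[port] += 1
        elif message.startswith("failed command:"):
            failed[port] += 1
        else:
            speed = LINK_RE.search(message)
            if speed:
                link_speed[port] = float(speed.group(1))  # last speed seen wins
    return {
        "resets": dict(resets),
        "failed_commands": dict(failed),
        "link_speed_gbps": link_speed,
    }
```

On the excerpt above this reports two hard resets and four failed READ FPDMA QUEUED commands on ata4 within six seconds, with the link negotiated at 1.5 Gbps.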

 

server-diagnostics-20200702-1234.zip


That appears to be the case.

fail.PNG

And holy shit, reading up on the ST3000DM001, it borders on a miracle that it lasted as long as it did! I've had this drive for what has to be over 10 years now, as it was my first external hard drive and 'backup' solution. Thankfully I have since moved critical data to cloud services and keep multiple copies, so even if I never recover this, it's just movies and TV.
