CaphalorAlb Posted July 2, 2020

My street had a construction-related power outage yesterday. After power came back on I of course checked my server for problems, but it seemed fine. It started a parity check and everything was in order. Later I realized that my Disk 3 (a fairly old 3 TB drive) had racked up over 1000 errors and the parity check was running extremely slowly. I decided to stop the array. Upon restarting the array, Disk 3 showed as "Unmountable: No file system". I panicked and rebooted the system, but of course nothing changed. How do I proceed now? I already ordered a new drive, since Disk 3 has probably reached EOL, but what is the best way to make sure parity stays intact?

Unraid version: 6.8.3
CPU: Xeon X3470
Motherboard: Supermicro X8SIL

server-diagnostics-20200702-1031.zip
Squid Posted July 2, 2020

Check the file system on disk 3.
CaphalorAlb Posted July 2, 2020

Thank you for the quick answer. Stopping the array seems to be stuck at "Array Stopping • Retry unmounting disk share(s)..."

Edit: figured it out, log below.
CaphalorAlb Posted July 2, 2020

Phase 1 - find and verify superblock...
        - block cache size set to 744256 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 2604786 tail block 2604527
ALERT: The filesystem has valuable metadata changes in a log which is being ignored because the -n option was used. Expect spurious inconsistencies which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_ifree 840, counted 839
sb_fdblocks 245193167, counted 242165844
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 8386192 claims free block 1048397
data fork in ino 8386197 claims free block 1048396
imap claims a free inode 143527865 is in use, would correct imap and clear inode
imap claims a free inode 143527866 is in use, would correct imap and clear inode
        - agno = 1
data fork in ino 2287098465 claims free block 287134209
data fork in ino 2287098465 claims free block 287134210
data fork in ino 2287098466 claims free block 291074427
data fork in ino 2287098466 claims free block 291074428
        - agno = 2
data fork in ino 4299884673 claims free block 537485596
imap claims in-use inode 4299884673 is free, correcting imap
data fork in ino 4299884674 claims free block 537466410
data fork in ino 4299884674 claims free block 537466411
imap claims in-use inode 4299884674 is free, correcting imap
data fork in ino 4299884677 claims free block 537485597
imap claims in-use inode 4299884677 is free, correcting imap
data fork in ino 4299884679 claims free block 537485599
data fork in ino 4299884679 claims free block 537485600
imap claims in-use inode 4299884679 is free, correcting imap
data fork in ino 4301990338 claims free block 537896661
imap claims a free inode 4301990361 is in use, would correct imap and clear inode
imap claims a free inode 4539558894 is in use, would correct imap and clear inode
        - agno = 3
imap claims a free inode 6518499649 is in use, would correct imap and clear inode
imap claims a free inode 6542784048 is in use, would correct imap and clear inode
data fork in ino 6566668407 claims free block 817848027
imap claims in-use inode 6566668407 is free, correcting imap
data fork in ino 6566668412 claims free block 826790832
imap claims in-use inode 6566668412 is free, correcting imap
data fork in ino 6625116225 claims free block 828139536
data fork in ino 6625116225 claims free block 828139537
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
entry "unraid-check.cron" in shortform directory 8387256 references free inode 8386206
would have junked entry "unraid-check.cron" in directory inode 8387256
entry "preclear.disk.plg" at block 0 offset 1928 in directory inode 4301990341 references free inode 4539558894
        would clear inode number in entry at offset 1928...
entry "unassigned.devices.plg" at block 0 offset 2096 in directory inode 4301990341 references free inode 4301990361
        would clear inode number in entry at offset 2096...
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
entry "unraid-check.cron" in shortform directory inode 8387256 points to free inode 8386206, would junk entry
        - agno = 1
        - agno = 2
entry "preclear.disk.plg" in directory inode 4301990341 points to free inode 4539558894, would junk entry
entry "unassigned.devices.plg" in directory inode 4301990341 points to free inode 4301990361, would junk entry
bad hash table for directory inode 4301990341 (no data entry): would rebuild
        - agno = 3
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 143527835, would move to lost+found
disconnected inode 4299884673, would move to lost+found
disconnected inode 4299884674, would move to lost+found
disconnected inode 4299884677, would move to lost+found
disconnected inode 4299884679, would move to lost+found
Phase 7 - verify link counts...
would have reset inode 4301990340 nlinks from 9 to 8
Maximum metadata LSN (1:2605357) is ahead of log (1:2604786).
Would format log to cycle 4.
No modify flag set, skipping filesystem flush and exiting.

XFS_REPAIR Summary    Thu Jul  2 11:30:49 2020

Phase        Start            End              Duration
Phase 1:     07/02 11:30:47   07/02 11:30:47
Phase 2:     07/02 11:30:47   07/02 11:30:48   1 second
Phase 3:     07/02 11:30:48   07/02 11:30:48
Phase 4:     07/02 11:30:48   07/02 11:30:49   1 second
Phase 5:     Skipped
Phase 6:     07/02 11:30:49   07/02 11:30:49
Phase 7:     07/02 11:30:49   07/02 11:30:49

Total run time: 2 seconds

According to the wiki: "If however issues were found, the display of results will indicate the recommended action to take. Typically, that will involve repeating the command with a specific option, clearly stated, which you will type into the options box (including any hyphens, usually 2 leading hyphens)."

Excuse my ineptness, but I have no clue which flags to set for the repair. Should I just run it again, only with -v?
JorgeB Posted July 2, 2020

Run it again without -n or nothing will be done; if it asks for it, use -L.
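For anyone finding this thread later, the sequence JorgeB describes can be sketched as a few console commands run with the array started in Maintenance mode. Disk number 3 is assumed here (on Unraid, data disk N is repaired via the array device /dev/mdN); the script only prints the commands rather than running them, so nothing is touched by accident.

```shell
#!/bin/sh
# Sketch of the repair sequence, assuming data disk 3.
DISK=3
DEV="/dev/md${DISK}"

echo "xfs_repair -n ${DEV}"   # dry run: report problems, change nothing
echo "xfs_repair ${DEV}"      # the actual repair: same command without -n
echo "xfs_repair -L ${DEV}"   # only if xfs_repair refuses because of a dirty log
```

Note that -L zeroes the metadata log, which can discard the last few in-flight metadata updates, so it is a last resort when the log cannot be replayed.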
CaphalorAlb Posted July 2, 2020

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
sb_ifree 840, counted 839
sb_fdblocks 245193167, counted 242165844
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 8386192 claims free block 1048397
data fork in ino 8386197 claims free block 1048396
imap claims a free inode 143527865 is in use, correcting imap and clearing inode
cleared inode 143527865
imap claims a free inode 143527866 is in use, correcting imap and clearing inode
cleared inode 143527866
        - agno = 1
data fork in ino 2287098465 claims free block 287134209
data fork in ino 2287098465 claims free block 287134210
data fork in ino 2287098466 claims free block 291074427
data fork in ino 2287098466 claims free block 291074428
        - agno = 2
data fork in ino 4299884673 claims free block 537485596
correcting imap
data fork in ino 4299884674 claims free block 537466410
data fork in ino 4299884674 claims free block 537466411
correcting imap
data fork in ino 4299884677 claims free block 537485597
correcting imap
data fork in ino 4299884679 claims free block 537485599
data fork in ino 4299884679 claims free block 537485600
correcting imap
data fork in ino 4301990338 claims free block 537896661
imap claims a free inode 4301990361 is in use, correcting imap and clearing inode
cleared inode 4301990361
imap claims a free inode 4539558894 is in use, correcting imap and clearing inode
cleared inode 4539558894
        - agno = 3
imap claims a free inode 6518499649 is in use, correcting imap and clearing inode
cleared inode 6518499649
imap claims a free inode 6542784048 is in use, correcting imap and clearing inode
cleared inode 6542784048
data fork in ino 6566668407 claims free block 817848027
correcting imap
data fork in ino 6566668412 claims free block 826790832
correcting imap
data fork in ino 6625116225 claims free block 828139536
data fork in ino 6625116225 claims free block 828139537
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
entry "unraid-check.cron" in shortform directory 8387256 references free inode 8386206
junking entry "unraid-check.cron" in directory inode 8387256
entry "preclear.disk.plg" at block 0 offset 1928 in directory inode 4301990341 references free inode 4539558894
        clearing inode number in entry at offset 1928...
entry "unassigned.devices.plg" at block 0 offset 2096 in directory inode 4301990341 references free inode 4301990361
        clearing inode number in entry at offset 2096...
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
bad hash table for directory inode 4301990341 (no data entry): rebuilding
rebuilding directory inode 4301990341
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 143527835, moving to lost+found
disconnected inode 4299884673, moving to lost+found
disconnected inode 4299884674, moving to lost+found
disconnected inode 4299884677, moving to lost+found
disconnected inode 4299884679, moving to lost+found
Phase 7 - verify and correct link counts...
resetting inode 8386206 nlinks from 2 to 3
resetting inode 4301990340 nlinks from 9 to 8
Maximum metadata LSN (1:2607525) is ahead of log (1:2).
Format log to cycle 4.
done

Thanks! The repair seems to have run successfully and I was able to start the array without any problems. Going to attempt a parity check now and probably replace the drive once the spare arrives. Thank you for the help!
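One small follow-up worth doing after a repair like this: xfs_repair moved several disconnected inodes into lost+found at the top of the disk, so it is worth checking whether anything recognisable landed there. A minimal sketch, assuming /mnt/disk3 as the mount point for data disk 3:

```shell
#!/bin/sh
# Look for orphaned files the repair recovered; /mnt/disk3 is the
# assumed mount point for data disk 3 on a standard Unraid setup.
DIR=/mnt/disk3/lost+found
if [ -d "$DIR" ]; then
    ls -la "$DIR"    # inspect and move anything you recognise
else
    echo "no lost+found on this disk"
fi
```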
CaphalorAlb Posted July 2, 2020

The parity check is still extremely slow. Is that an indicator of further problems?

Total size: 8 TB
Elapsed time: 7 minutes
Current position: 1.03 GB (0.0 %)
Estimated speed: 2.2 MB/sec
Estimated finish: 42 days, 22 hours, 42 minutes
Sync errors corrected: 0
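As a sanity check on that ETA: 8 TB at 2.2 MB/s really does work out to roughly 42 days, so the estimate is consistent with the reported speed rather than a display glitch. The arithmetic:

```shell
#!/bin/sh
# 8 TB divided by 2.2 MB/s, converted to days (awk for the float math).
DAYS=$(awk 'BEGIN { printf "%.1f", (8e12 / 2.2e6) / 86400 }')
echo "${DAYS} days"   # prints "42.1 days"
```

A healthy parity check on drives of this vintage typically runs at tens to low hundreds of MB/s, so 2.2 MB/s is orders of magnitude off.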
Squid Posted July 2, 2020

Post a new set of diagnostics while this is happening.
CaphalorAlb Posted July 2, 2020

New diagnostics attached. It's essentially stuck in a loop doing this:

Jul 2 12:40:00 Server kernel: ata4: hard resetting link
Jul 2 12:40:06 Server kernel: ata4: link is slow to respond, please be patient (ready=0)
Jul 2 12:40:06 Server kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jul 2 12:40:06 Server kernel: ata4.00: configured for UDMA/33
Jul 2 12:40:06 Server kernel: ata4: EH complete
Jul 2 12:40:06 Server kernel: ata4.00: exception Emask 0x10 SAct 0x40c0002 SErr 0x4890000 action 0xe frozen
Jul 2 12:40:06 Server kernel: ata4.00: irq_stat 0x08400040, interface fatal error, connection status changed
Jul 2 12:40:06 Server kernel: ata4: SError: { PHYRdyChg 10B8B LinkSeq DevExch }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:08:d0:71:19/05:00:00:00:00/40 tag 1 ncq dma 688128 in
Jul 2 12:40:06 Server kernel:          res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:90:10:77:19/05:00:00:00:00/40 tag 18 ncq dma 688128 in
Jul 2 12:40:06 Server kernel:          res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:98:50:7c:19/05:00:00:00:00/40 tag 19 ncq dma 688128 in
Jul 2 12:40:06 Server kernel:          res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4.00: failed command: READ FPDMA QUEUED
Jul 2 12:40:06 Server kernel: ata4.00: cmd 60/40:d0:d0:11:19/05:00:00:00:00/40 tag 26 ncq dma 688128 in
Jul 2 12:40:06 Server kernel:          res 40/00:90:10:77:19/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jul 2 12:40:06 Server kernel: ata4.00: status: { DRDY }
Jul 2 12:40:06 Server kernel: ata4: hard resetting link

server-diagnostics-20200702-1234.zip
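A repeating reset loop like this is easy to quantify: counting the "hard resetting link" lines in the syslog shows how often the controller is giving up on the drive, and a steadily climbing count during the parity check points at the disk, its cable, or the port. A minimal sketch, run here against a few sample lines standing in for /var/log/syslog:

```shell
#!/bin/sh
# Count ata4 link resets. The sample file stands in for /var/log/syslog;
# on a live Unraid system you would grep the real log instead.
SAMPLE=/tmp/ata4-sample.log
cat > "$SAMPLE" <<'EOF'
Jul 2 12:40:00 Server kernel: ata4: hard resetting link
Jul 2 12:40:06 Server kernel: ata4: EH complete
Jul 2 12:40:06 Server kernel: ata4: hard resetting link
Jul 2 12:40:14 Server kernel: ata4: hard resetting link
EOF
COUNT=$(grep -c 'ata4: hard resetting link' "$SAMPLE")
echo "ata4 was hard reset ${COUNT} times"
```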
JorgeB Posted July 2, 2020

Disk 3 appears to be failing, and most likely is, since it's the infamous ST3000DM001; you can run an extended SMART test to confirm.
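The extended SMART test JorgeB mentions can be started from the Unraid GUI (disk settings, Self-Test) or from the console with smartctl. In this sketch /dev/sdX is a placeholder for the actual device, and the commands are only echoed rather than executed:

```shell
#!/bin/sh
# Sketch of the extended SMART self-test; /dev/sdX is a placeholder,
# substitute the failing disk's real device node.
DEV=/dev/sdX
START="smartctl -t long ${DEV}"   # kicks off the extended (long) self-test
CHECK="smartctl -a ${DEV}"        # later, shows the self-test log and attributes
echo "$START"
echo "$CHECK"
```

The test runs on the drive itself and can take several hours on a 3 TB disk; the result appears in the self-test log section of the smartctl -a output.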
CaphalorAlb Posted July 2, 2020

That appears to be the case. And holy shit, reading up on the ST3000DM001, it borders on a miracle it lasted as long as it did! I've had this drive for what has to be over 10 years now, as it was my first external hard drive and 'backup' solution. Thankfully I have since moved critical data to cloud services and keep multiple copies, so even if I never recover this, it's just movies and TV.