ShaneH Posted January 22, 2019

OK, this does not look fun. Parity sync is complete with 0 errors, but the 8 TB disk now has an unmountable file system. Here we go:

--------------------------
Event: Unraid Disk 2 message
Subject: Notice [TOWER] - Disk 2 returned to normal operation
Description: ST8000AS0002-1NA17Z_Z840E3TF (sdd)
--------------------------
Event: Unraid Parity sync / Data rebuild
Subject: Notice [TOWER] - Parity sync / Data rebuild finished (0 errors)
Description: Duration: 14 hours, 40 minutes, 10 seconds. Average speed: 189.4 MB/s
--------------------------
trurl Posted January 22, 2019

Looks good. You will have to repair the filesystem on disk2, as expected. Click on Disk 2 to get to its page; you should see a section called Check Filesystem Status. The button will be disabled, with a note telling you the array must be in Maintenance Mode. Stop the array, start it in Maintenance Mode, go back to that page, and click the button. Post your results.
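For anyone following along from the command line: the GUI's Check button runs roughly the following. This is a sketch, not taken from this thread; /dev/md2 as disk2's parity-protected device is an assumption you should verify on your own system before touching anything.

```shell
# Read-only XFS check of disk2 (roughly what the GUI "Check" button
# does). The array must be started in Maintenance Mode first so the
# filesystem is not mounted. /dev/md2 for disk2 is an assumption --
# confirm the device name on your system before running anything.
DEV=/dev/md2
if [ -b "$DEV" ]; then
    xfs_repair -n "$DEV"    # -n: report problems only, modify nothing
else
    echo "block device $DEV not found"
fi
```

Using the md device (rather than the raw sdX device) matters on Unraid, because writes through md keep parity in sync.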
ShaneH Posted January 22, 2019

I started the array in Maintenance mode and clicked the check filesystem button for disk2. Results are:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is
being ignored because the -n option was used.  Expect spurious
inconsistencies which may be resolved by first mounting the filesystem
to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 1530287853, counted 1532948218
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
would have corrected directory 99 size from 95 to 89
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
would have corrected directory 99 size from 95 to 89
        - agno = 1
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
Metadata corruption detected at 0x44f20d, inode 0x63 data fork
couldn't map inode 99, err = 117
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected dir inode 2147483744, would move to lost+found
disconnected dir inode 8657675550, would move to lost+found
disconnected dir inode 10737418336, would move to lost+found
disconnected dir inode 10737856698, would move to lost+found
Phase 7 - verify link counts...
Metadata corruption detected at 0x44f20d, inode 0x63 data fork
couldn't map inode 99, err = 117, can't compare link counts
No modify flag set, skipping filesystem flush and exiting.
JonathanM Posted January 23, 2019

I seem to remember some issues with the XFS repair procedure in prior versions of Unraid. Since you are on an older version (6.6.3), maybe @johnnie.black or @trurl has a better memory of which versions were affected. I would hold tight right where you are until somebody confirms it's safe to continue the repair on that specific Unraid version; perhaps it would be better to upgrade Unraid first.
trurl Posted January 23, 2019

This bug report - [6.6.6] XFSPROGS 4.16.X VERSION OF XFS_REPAIR HAS BUG IN PHASE6.C - is for 6.6.6 but might apply to earlier versions as well. The report says it was solved in 6.7.0-rc1, and the threads linked from it look similar to the xfs_repair result we just got here.

So it looks like we have come full circle. To recap: the OP started this thread after he had removed the disk and repaired it in another system, thus invalidating parity. Then, through a misunderstanding (partially my fault), he rebuilt the disk, returning it to its original state. Now he has attempted the repair in Unraid and encountered what may be a bug in his version of Unraid that prevents the repair from completing.

I just reviewed this thread and the rest of his post history and found no indication that he repaired the disk on another system purposely to avoid this bug, but it looks like that might have been a valid approach after all. Of course a parity sync would then have been needed, as was mentioned early in this thread.

I guess at this point the way forward is to upgrade Unraid. Either that, or go back and do it all again the "wrong way" on another system like he did before and resync parity. I hesitate to make any firm recommendations without other opinions, so I'm just going to tag @johnnie.black again and see if he has other ideas.
JorgeB Posted January 23, 2019

8 hours ago, ShaneH said:
    No modify flag set, skipping filesystem flush and exiting.

First thing would be to run xfs_repair without -n.
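In the webGUI that just means clearing "-n" from the options box on disk2's page before clicking Check. A command-line sketch of the same step, with the usual caveat that /dev/md2 for disk2 is an assumption to verify first:

```shell
# Run xfs_repair in write mode (no -n) so it can actually apply the
# fixes the dry run reported. Equivalent to removing "-n" from the
# options box in the Unraid GUI. /dev/md2 for disk2 is an assumption.
DEV=/dev/md2
if [ -b "$DEV" ]; then
    xfs_repair "$DEV"
else
    echo "block device $DEV not found"
fi
```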
trurl Posted January 23, 2019

OK. I was afraid this part of phase 6 was related to the bug:

13 hours ago, ShaneH said:
    Phase 6 - check inode connectivity...
            - traversing filesystem ...
    Metadata corruption detected at 0x44f20d, inode 0x63 data fork
    couldn't map inode 99, err = 117

@ShaneH, proceed as advised:

5 hours ago, johnnie.black said:
    First thing would be to run xfs_repair without -n.
JorgeB Posted January 23, 2019

16 minutes ago, trurl said:
    was related to the bug

Doesn't look like it is. If xfs_repair without -n fails, the OP can try again after upgrading xfsprogs or Unraid.
ShaneH Posted January 23, 2019

OK, I went to Disk 2, cleared the "-n" from the options box after Check, and then ran Check. Output:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which
needs to be replayed.  Mount the filesystem to replay the log, and
unmount it before re-running xfs_repair.  If you are unable to mount
the filesystem, then use the -L option to destroy the log and attempt
a repair.  Note that destroying the log may cause corruption -- please
attempt a mount of the filesystem before doing this.
itimpi Posted January 23, 2019

3 minutes ago, ShaneH said:
    ERROR: The filesystem has valuable metadata changes in a log which
    needs to be replayed.

That is not at all unusual! You can run with the -L option. In the vast majority of cases there is no data loss at all, and even if there is, it is only likely to affect the last file written.
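The advice above amounts to a simple decision: try to mount so the journal is replayed, and fall back to -L only if the mount fails. A hedged sketch, where the device (/dev/md2) and scratch mount point (/mnt/test) are assumptions, not paths from this thread:

```shell
# Preferred path: a successful mount replays the XFS journal, and a
# clean unmount then lets xfs_repair run without complaint. Only if
# the mount fails do we zero the log with -L, accepting the small risk
# of losing the last writes. /dev/md2 and /mnt/test are assumptions.
DEV=/dev/md2
MNT=/mnt/test
if [ -b "$DEV" ]; then
    mkdir -p "$MNT"
    if mount -t xfs "$DEV" "$MNT"; then
        umount "$MNT"          # clean unmount = journal replayed
        xfs_repair "$DEV"
    else
        xfs_repair -L "$DEV"   # last resort: destroy the log, repair
    fi
else
    echo "block device $DEV not found"
fi
```

In the Unraid GUI, the last-resort branch corresponds to simply putting -L in the options box.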
ShaneH Posted January 24, 2019

Hello @trurl, are we going to proceed with using the -L option? I don't want to discount itimpi, but you have been a strong voice in this topic. Thanks.
trurl Posted January 24, 2019

1 hour ago, ShaneH said:
    Are we going to proceed with using the -L option?

Yes
ShaneH Posted January 24, 2019

What is next? Output:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is
being destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
sb_fdblocks 1530287853, counted 1532948218
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
corrected directory 99 size, was 95, now 89
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 3
        - agno = 2
        - agno = 0
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:1995504) is ahead of log (1:2).
Format log to cycle 4.
done
trurl Posted January 24, 2019

Looks like it completed. Start the array in normal mode and see if the drive is mountable now.
ShaneH Posted January 25, 2019

The array is up, Disk 2 is mounted, and things are looking way better. Do we have any more disk integrity checks to perform? What are the next steps?
trurl Posted January 25, 2019

Check the lost+found folder.
ShaneH Posted January 25, 2019

I went to the Main page, then clicked on the folder icon (on the right) for disk2. I also checked the Unraid terminal. I do not see a lost+found folder.
trurl Posted January 25, 2019

Nothing else to be done then. Have you got all your files back?
itimpi Posted January 25, 2019

Just now, ShaneH said:
    I do not see a lost+found folder.

That is probably a good sign! The lost+found folder is only created if the repair process found files whose names it could not correctly identify.
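A quick way to confirm this from the terminal; /mnt/disk2 as disk2's mount point is the usual Unraid convention, but treat it as an assumption:

```shell
# xfs_repair moves any inode it could not re-link under its original
# name into lost+found at the root of the filesystem. If the folder
# is absent, nothing was orphaned. /mnt/disk2 is assumed to be the
# mount point for disk2.
D=/mnt/disk2/lost+found
if [ -d "$D" ]; then
    find "$D" | head -n 20     # sample the orphaned entries
else
    echo "no lost+found -- repair re-linked everything by name"
fi
```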
ShaneH Posted January 25, 2019

Hello. All the apps are gone from the Dashboard; Dockers and VMs are gone. I can rebuild. I am going to do a file compare to look for missing files. It is looking good at a quick glance, but I need to do a better check. (I was not really using the array until we were finished.)
trurl Posted January 25, 2019

6 minutes ago, ShaneH said:
    All the apps are gone from the Dashboard. Dockers and VMs are gone.

I looked at your diagnostics again. Unfortunately, with that older version of Unraid I can't tell from the diagnostics exactly which disk(s) your system share was on. It is set to cache-prefer, but if you set these up before adding cache then they probably never got moved to cache, and maybe they didn't survive the repair of disk2. You can reinstall your dockers exactly as they were before using the Previous Apps feature on the Apps page.
ShaneH Posted January 28, 2019

Hello. The video files seem to have returned since the rebuild. I copied all pictures from a backup on top of the unRaid pictures share, and very few pictures were actually copied to the array. I am going to look at the documents folder, but that will be a slow process. Things are looking very good with the array, and I am going to upgrade my backup process. Thank you very much for all of the help and your time.
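For the slower documents check, a dry-run checksum compare lists differences without re-copying anything. The paths below are placeholders, not the OP's actual shares:

```shell
# List files that differ (by checksum) or are missing on the array,
# without copying anything: -r recurse, -c compare checksums instead
# of size/mtime, -n dry run, -i itemize what would change.
# Both paths are placeholders -- substitute your real backup and share.
SRC=/path/to/backup/documents/
DST=/mnt/user/documents/
if [ -d "$SRC" ] && [ -d "$DST" ]; then
    rsync -rcni "$SRC" "$DST"
else
    echo "set SRC and DST to real paths first"
fi
```

The -c flag makes rsync read every file on both sides, so this is slow, but it catches silent differences that a size/date comparison would miss.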