Yonder Posted February 3, 2022

I'm running a 9-disk Unraid array with dual disk parity configured. Earlier this evening I noticed some slow network file access, and when I went to check the status of the array I saw that its monthly parity check had been running for a full day and a half but had only reached 3% complete. Something was obviously wrong, so I manually stopped the parity check where it was (as I said, it had made minimal progress but had recorded no errors), and then I rebooted the server.

Everything proceeded fine, except that the array didn't come back online as expected. Instead it was stopped, with one of the seven data disks listed as "Unmountable: not mounted". There were no SMART errors detected on that drive, but obviously the file system wasn't happy. I started the array in maintenance mode and ran the "Check Filesystem Status" operation. It seemed to detect a few inode issues, so I re-ran it telling it to fix the errors, but even after completion I still can't get that disk to mount properly when I try to start the array. I also ran a SMART short self-test on that disk with no errors detected.

Any recommendations as to what I can do next? I've had disks fail outright or refuse to spin up, replaced them with new disks, and had my array back up and running quite rapidly. Because the disk isn't mountable, Unraid is giving me the option to format the disk, but of course gives me the following warning:

A format is NEVER part of a data recovery or disk rebuild process and if done in such circumstances will normally lead to loss of all data on the disks being formatted.

This isn't a new disk but one with existing data on it, so it wouldn't seem that a format is an acceptable operation in this case, as I can't be sure that I would have the ability to rebuild that missing filesystem from parity data afterward.
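[Editor's note: for anyone hitting the same symptom, the non-destructive checks described above roughly correspond to the commands below. This is a sketch only — the device names are examples (on Unraid, /dev/sdX is the raw disk and /dev/mdN is the parity-protected device for Disk N; your letters and numbers will differ), and repairs should always target the md device so parity stays in sync.]

```shell
# 1. SMART health check and a short self-test on the raw disk
#    (/dev/sdX is a placeholder -- substitute your actual device)
smartctl -H /dev/sdX
smartctl -t short /dev/sdX
# ...wait a couple of minutes for the self-test, then review results:
smartctl -a /dev/sdX

# 2. With the array started in maintenance mode, a read-only
#    filesystem check: -n means "no modify", so it only reports
#    problems without writing anything to the disk
xfs_repair -n /dev/md3
```

Running the read-only pass first lets you see what a repair would do before committing to it.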
It is also giving me the option to re-run the parity check, so I'm doing that right now while still started in maintenance mode. I can't tell yet whether it will make it further than the 3% where it seemed to be stuck last night, but we'll see. In the meantime, I'd love any useful suggestions. I've attached my diagnostics file in case it's helpful. Thanks.

diagnostics-20220203-0021.zip
JorgeB Posted February 3, 2022

Start the array in maintenance mode and post the output of:

xfs_repair -v /dev/md3
Yonder Posted February 3, 2022 (Author)

Sure... here it is:

root@servername:~# xfs_repair -v /dev/md3
Phase 1 - find and verify superblock...
        - block cache size set to 1490008 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 0 tail block 0
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 2
        - agno = 1
        - agno = 3
        - agno = 5
        - agno = 4
Phase 5 - rebuild AG headers and trees...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...

        XFS_REPAIR Summary    Thu Feb  3 01:37:11 2022

Phase           Start           End             Duration
Phase 1:        02/03 01:36:55  02/03 01:36:55
Phase 2:        02/03 01:36:55  02/03 01:36:57  2 seconds
Phase 3:        02/03 01:36:57  02/03 01:37:02  5 seconds
Phase 4:        02/03 01:37:02  02/03 01:37:02
Phase 5:        02/03 01:37:02  02/03 01:37:06  4 seconds
Phase 6:        02/03 01:37:06  02/03 01:37:10  4 seconds
Phase 7:        02/03 01:37:10  02/03 01:37:10

Total run time: 15 seconds
done
JorgeB Posted February 3, 2022

That looks fine. Now post new diagnostics after starting the array in normal mode.
Yonder Posted February 3, 2022 (Author)

Will do. I'm just letting it complete a parity check. I'll post additional diagnostics as soon as that's complete.
Yonder Posted February 4, 2022 (Author)

The parity check completed, and Unraid seems to have repaired Disk 3 to the point that the file system now mounts properly. I ran diagnostics to post here anyway, just to close the loop, but I think my array is now running normally and the problems are gone.

diagnostics-20220204-0919.zip