ramathiam Posted December 12, 2022
Hello all,
I recently noticed that one of the disks in my array was marked as disabled and emulated. Before moving forward, I ran a SMART test, which came back with no errors found. I searched the forums for how to recover from this and found the following steps:
- stop array
- unassign disk
- start array
- stop array
- re-assign disk
- start array to begin rebuild
The results weren't as expected, as I wasn't asked to rebuild. Once I re-added the disk and started the array, the system said that the drive was unmountable and would have to be formatted. It also warned that parity would be updated. At this point, I stopped and started digging deeper into the documentation. I came across the section that said I should check the file system in maintenance mode. I ran the check using the -n switch, with the following results:
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 4
        - agno = 5
        - agno = 2
        - agno = 1
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
I'm not sure what to do at this point or what my options are. Should I rerun the test without the -n switch and let it make changes? Thank you for any help.
- Tom
JorgeB Posted December 12, 2022
Run xfs_repair again without -n, otherwise nothing will be done. If it asks for -L, use it.
ramathiam Posted December 12, 2022
Thank you for the suggestion. I reran the repair without the -n and received the following output:
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 5
        - agno = 3
        - agno = 2
        - agno = 1
        - agno = 4
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
I don't see any errors called out or a request for the -L switch. I stopped the array and started it normally, but I still get the unmountable notice. I'm assuming this stems from where I originally stopped the array, unassigned the disk, and restarted it. Can I get the system to recognize the disk again or have parity rebuild it? Thanks
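For anyone comparing the two runs above, here is a minimal Python sketch of one way to tell from a saved transcript whether xfs_repair actually modified anything. It relies only on the "No modify flag set, skipping ..." lines that appear in the -n output quoted in this thread. The function names `skipped_steps` and `was_dry_run` are hypothetical helpers written for this illustration; they are not part of xfs_repair or Unraid.

```python
import re

# Matches the two "skipping" notices seen in the -n transcript in this thread.
# Assumption: other xfs_repair versions may word these notices differently.
_SKIP = re.compile(r"No modify flag set, skipping (phase \d+|filesystem flush)")


def skipped_steps(transcript: str) -> list[str]:
    """Return the steps xfs_repair reported skipping because -n was set."""
    return _SKIP.findall(transcript)


def was_dry_run(transcript: str) -> bool:
    """True if the transcript contains at least one 'skipping' notice."""
    return bool(skipped_steps(transcript))


if __name__ == "__main__":
    # Fragments copied from the two transcripts posted above.
    with_n = (
        "No modify flag set, skipping phase 5 "
        "Phase 6 - check inode connectivity... "
        "No modify flag set, skipping filesystem flush and exiting."
    )
    without_n = "Phase 5 - rebuild AG headers and trees... - reset superblock..."
    print(skipped_steps(with_n))
    print(was_dry_run(without_n))
```

A transcript with no skipped steps only shows that the repair was allowed to write; as the rest of the thread shows, the disk can still be unmountable for reasons unrelated to the filesystem.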
JorgeB Posted December 12, 2022
Is this the output from checking the emulated disk or the actual disk? Also, please post the diagnostics.
ramathiam Posted December 12, 2022
I would assume the emulated disk. The main array screen says that the disk is disabled, contents emulated. I've attached the diagnostics download. Thanks!
tower-diagnostics-20221212-1226.zip
JorgeB Posted December 12, 2022
And you used the GUI to run xfs_repair, correct? The log is full of spam; please reboot and post new diags after starting the array in normal mode.
ramathiam Posted December 12, 2022
Correct, I used the GUI. I shut down the array and the machine, restarted it, and started the array normally. New diagnostics attached. I hope they're cleaner. Thanks!!
tower-diagnostics-20221212-1258.zip
JorgeB Posted December 12, 2022
It's an "unsupported partition layout" error; this is not a filesystem problem. Unassign disk2, start the array, and post new diags.
ramathiam Posted December 12, 2022
Attached. Thanks!!
tower-diagnostics-20221212-1316.zip
JorgeB Posted December 12, 2022
The emulated disk is now mounting. The disk2 you were using is generating a lot of ATA errors, but SMART looks OK. Try replacing the cables to see if the errors go away; if they don't, try using a different disk as the replacement.
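To check whether a cable swap actually makes the ATA errors go away, one option is to count them per port in the syslog before and after. The Python sketch below is only an illustration under stated assumptions: it assumes the kernel's libata messages are prefixed with the port name (such as `ata5:` or `ata5.00:`), and the list of "trouble" keywords is a guess at common wording, since the actual log lines are not shown in this thread. `count_ata_errors` is a hypothetical helper, not an Unraid tool, and it does not map a port back to a specific disk.

```python
import re
from collections import Counter

# Assumption: libata messages carry a port prefix like "ata5:" or "ata5.00:".
_ATA_PREFIX = re.compile(r"\b(ata\d+)(?:\.\d+)?: (.*)")

# Assumption: substrings that commonly appear in ATA error messages.
# Adjust this tuple to match what your own syslog actually contains.
_TROUBLE = ("exception", "failed command", "error", "hard resetting link")


def count_ata_errors(syslog_text: str) -> Counter:
    """Count suspicious ATA lines per port (e.g. 'ata5') in syslog text."""
    counts = Counter()
    for line in syslog_text.splitlines():
        match = _ATA_PREFIX.search(line)
        if match and any(word in match.group(2) for word in _TROUBLE):
            counts[match.group(1)] += 1
    return counts
```

Running it on the syslog text from diagnostics taken before and after the cable change, and comparing the counts for the same port, gives a rough indication of whether the cable was the cause.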
ramathiam Posted December 12, 2022
Will do. Thanks!! When stopping and starting the array, I heard what sounded like periodic chirping from one of the disks. My assumption is that it was disk2. I will do some physical investigation and probably go ahead and pick up a replacement disk. If I do replace it, parity should rebuild the disk, correct? Thanks so much for all of your help!!