RoyP Posted February 22
A few days ago, disk 6 failed in my array. I had a spare on hand, replaced the drive, and let the system begin a rebuild. The rebuild was running at a snail's pace, between 500 and 800 KB/s. It took over 2 days just to get to ~600G written, with an estimated finish over 70 days away.
While I was looking for solutions, I noticed that 6.12.8 had been released. I was on 6.12.6, and figured I might as well cancel the rebuild and perform the upgrade in the hope that it might speed things along (maybe a bad idea in hindsight?). The upgrade completed without any errors and I was eventually prompted to reboot the server. I enabled syslog logging to the USB drive before rebooting because I anticipated needing to upload diagnostics for the slow rebuild issue.
After the reboot, the server came back online with the array stopped. I expected this because the replacement drive had not finished rebuilding. I verified the correct drive was assigned to the failed slot and started the array. When the array started, the data rebuild started up and I noticed that one of my other disks was now showing "Unmountable: Unsupported or no file system".
Is there any hope of getting that disk back online and mounted? I'm a little nervous to begin troubleshooting that on my own, although I did stop the array and restart it to see if that might work. Also, the data rebuild of the other disk is still crawling along at ~700 KB/s with an estimated finish of 64 days.
Diagnostic file is attached. I'm hoping this is recoverable and I can get the failed drive rebuilt. At the moment, I still have VMs and Docker services stopped.
Thanks, in advance, for any assistance you can provide. If you have questions, I'll do my best to have answers. Dashboard images below...
gumbo-diagnostics-20240222-1029.zip
trurl Posted February 22
Do you still have the original disk6? Maybe nothing is wrong with it. Bad connections are more common than bad disks, and connections disturbed while replacing disks are a very common reason for users to post about rebuild problems. You have a bad connection on disk5 (if not others), which may be why it is unmountable, and is definitely causing problems rebuilding disk6.
Shut down, check all connections, all disks, both ends, power and SATA, including splitters. Then reboot and post new diagnostics with the array started.
RoyP (Author) Posted February 22
Thanks for the reply. I do still have the original disk 6. I will re-install it and see how that goes. I've seen all the issues folks have with cables and power, so I am cognizant of checking my cables and connections before closing up the case every time I have it down for maintenance. I will re-verify though and post back here with a new diagnostic file after swapping disk 6 and starting the array. Be back soon...
JonathanM Posted February 22
45 minutes ago, RoyP said: "I will re-install it and see how that goes."
Just don't assign it to slot 6 yet; see if it's mountable in Unassigned Devices.
RoyP (Author) Posted February 22
Disk 6 is back to the original. It was throwing UDMA CRC errors (2), which is why I replaced it to begin with. It is now rebuilding and doing so at a much faster pace: 50 MB/s, with an estimated finish in 23 hours. I'm still getting the "Unmountable: Unsupported or no file system" status on disk 5 though. New diagnostics file attached. Thanks!
gumbo-syslog-20240222-1606.zip
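As a rough sanity check on the rebuild estimates quoted in this thread, the arithmetic is simply remaining data divided by throughput. The sketch below is only illustrative: the 4 TB disk size is a hypothetical figure (the thread never states the disk size), decimal units are assumed, and `rebuild_eta_seconds` is a made-up helper name, not anything Unraid provides.

```python
def rebuild_eta_seconds(remaining_bytes: float, bytes_per_second: float) -> float:
    """Return the time needed to write remaining_bytes at a steady rate."""
    if bytes_per_second <= 0:
        raise ValueError("throughput must be positive")
    return remaining_bytes / bytes_per_second


# ASSUMPTION: a hypothetical 4 TB disk (decimal units); not stated in the thread.
ASSUMED_DISK_BYTES = 4e12

# Throughput figures reported in the posts above.
slow_eta = rebuild_eta_seconds(ASSUMED_DISK_BYTES, 700e3)  # ~700 KB/s
fast_eta = rebuild_eta_seconds(ASSUMED_DISK_BYTES, 50e6)   # 50 MB/s

print(f"at 700 KB/s: {slow_eta / 86400:.1f} days")   # 66.1 days
print(f"at 50 MB/s:  {fast_eta / 3600:.1f} hours")   # 22.2 hours
```

Under that assumed size, the two results land in the same range as the reported estimates (64 days and 23 hours), which suggests the ETA was tracking throughput as expected and the throughput itself was the problem.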
RoyP (Author) Posted February 22
Oops, I had already mounted it and it started rebuilding before I saw your note about Unassigned Devices. Have I messed up?
trurl Posted February 22
We were hoping to keep the original disk with its contents just as they were, in case of problems rebuilding to the other disk.
49 minutes ago, RoyP said: "UDMA CRC errors"
These are connection problems, not disk problems. You should post diagnostics, not syslog. But the syslog seems to indicate you still have:
2 hours ago, trurl said: "bad connection on disk5 (if not others), which may be why it is unmountable, and is definitely causing problems rebuilding disk6. Shutdown, check all connections, all disks, both ends, power and SATA, including splitters. Then reboot and post new diagnostics with the array started."
RoyP (Author) Posted February 22 (edited)
Bah, I grabbed the wrong file... here is the right one: gumbo-diagnostics-20240222-1615.zip
What should I do at this point? Wait for the rebuild to complete, or stop it and replace cables? I have another set of cables I can use. Also, the rebuild process seems to be reading from disk 5. Is that bad? You can see the reads in the last image I posted.
trurl Posted February 22
That looks OK so far. Let the disk6 rebuild complete, then we will worry about disk5.
RoyP (Author) Posted February 23 (edited)
Thanks for the help so far. Disk 6 (the original one) has finished rebuilding, but it doesn't appear there is any data on it. The size only shows 27.9GB used. I did try stopping and restarting the array to see if it would update, but no go. New diagnostic file attached. I also thought about stopping the array and trying to mount disk 6 with Unassigned Devices just to check for data that way, but I didn't want to take that chance without checking in. Suggestions for next steps?
gumbo-diagnostics-20240223-1735.zip
trurl Posted February 24
On 2/22/2024 at 5:23 PM, RoyP said: "Oops, I had already mounted and it started rebuilding before I saw your note about unassigned devices. Have I messed up?"
21 minutes ago, RoyP said: "thought about stopping the array and trying to mount disk 6 with unassigned devices just to check for data that way"
It won't show anything different than what you have rebuilt. Did you format anything during all this?
trurl Posted February 24
23 minutes ago, RoyP said: "doesn't appear there is any data on it"
There wasn't any on it in your first screenshot.
RoyP (Author) Posted February 24
There was just over 2TB on it before I swapped it out. I don't happen to have a screenshot of it before I did that, though. From what I remember, that number dropped after I put in the replacement disk, started getting the "unmountable" error on disk 5, and the rebuild started. Any ideas on what I should try next? Should I start swapping cables and try to get disk 5 back to a mountable state? Maybe if I can get disk 5 back online I can get disk 6 to rebuild properly. I'm starting to think this may not come back. :-(
trurl Posted February 24
17 minutes ago, trurl said: "Did you format anything during all this?"
You aren't going to get disk6 data back except possibly with some third-party recovery software such as UFS Explorer.
Check filesystem on disk5. Do it from the webUI to make sure it uses the correct command. Post the output.
RoyP (Author) Posted February 24
25 minutes ago, trurl said: "It won't show anything different than what you have rebuilt. Did you format anything during all this?"
I missed that. No, not that I recall, or at least, not intentionally.
4 minutes ago, trurl said: "You aren't going to get disk6 data back except possibly with some third party recovery software such as UFS Explorer. Check filesystem on disk5. Do it from the webUI to make sure it uses the correct command. Post the output."
Should I do the check in the current state, or try swapping cables first?
RoyP (Author) Posted February 24
I stopped the array, started it in maintenance mode, and did the check. I haven't messed with cables yet. Here is the output screenshot (I can post the actual text if you like, but the formatting was wonky):
trurl Posted February 24
10 minutes ago, RoyP said: "Should I do the check in the current state, or try swapping cables first?"
Since you successfully completed the disk6 rebuild, it all seems to be working well. Probably better if you don't do anything else to the hardware now.
trurl Posted February 24
Just now, RoyP said: "did the check"
Do it again without -n. If it asks for it, use -L. Post the output.
RoyP (Author) Posted February 24
Had to use -L
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 562952124 claims free block 70369282
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (4:18631) is ahead of log (1:2).
Format log to cycle 7.
done
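For anyone skimming long repair output like the above, a small script can pull out the lines that matter. This is a minimal sketch, not part of Unraid or xfsprogs: `summarize_xfs_repair` is a hypothetical helper, it only recognizes the message wording that appears in this post, and `SAMPLE` is an excerpt of that output rather than the full log.

```python
import re


def summarize_xfs_repair(output: str) -> dict:
    """Extract a few key facts from pasted xfs_repair output (wording as in this thread)."""
    # Phase numbers that were reported, in ascending order.
    phases = sorted({int(n) for n in re.findall(r"Phase (\d+) - ", output)})
    # Inodes whose data fork claimed a block the free-space maps listed as free.
    claims = [
        (int(ino), int(block))
        for ino, block in re.findall(
            r"data fork in ino (\d+) claims free block (\d+)", output
        )
    ]
    return {
        "phases_seen": phases,
        # True when the ALERT about destroying the log (-L) is present.
        "log_zeroed": "-L option was used" in output,
        "free_block_claims": claims,
        # xfs_repair ends a completed run with the word "done".
        "finished": output.rstrip().endswith("done"),
    }


# Excerpt copied from the output posted above.
SAMPLE = """Phase 1 - find and verify superblock...
Phase 2 - using internal log
ALERT: The filesystem has valuable metadata changes in a log which is being destroyed because the -L option was used.
Phase 3 - for each AG...
data fork in ino 562952124 claims free block 70369282
Phase 7 - verify and correct link counts...
done
"""

summary = summarize_xfs_repair(SAMPLE)
print(summary)
```

The regular expressions search the whole text rather than individual lines, so the helper should behave the same whether the output is pasted with its original line breaks or flattened into a single line.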
trurl Posted February 24
Start the array in normal (not maintenance) mode and post new diagnostics.
RoyP (Author) Posted February 24
Looks like disk 5 mounted successfully! I can see the data on it. New diagnostics attached.
gumbo-diagnostics-20240223-1855.zip
trurl Posted February 24
That looks good except for disk6, which you must have formatted.
RoyP (Author) Posted February 24
Thanks for the help getting disk 5 back online!
10 minutes ago, trurl said: "must have formatted"
I don't know how, but I guess it's possible. I'm willing to give this UFS Explorer a shot. How would this process work with Unraid? From what I'm seeing, I guess I install the software on my Windows box and then connect the drive to attempt recovery of the data. If I'm able to recover the data, how do I then get that disk back into the Unraid array and sync the data back to parity without wiping something else out?
trurl Posted February 24
New Config will let you assign any disks however you want and rebuild parity. Since there is no data on that disk, you don't necessarily have to get it back in the array; you just need some free space to copy any recovered files to.
RoyP (Author) Posted February 24
Just to be sure my thought process is correct... I need to remove disk 6 from the array and replace it with another drive (in order to protect any data that might possibly be recoverable on the one in there now). While the replacement disk 6 is rebuilding in the array, I'll attempt recovery of the data on the original disk. If I am able to recover anything, I can then mount the recovered data as an unassigned device and just copy the data back to the newly rebuilt disk 6 once it is done with parity sync. Does that sound about right?
You mentioned UFS Explorer in a previous post. Does that seem to be my best option to potentially recover data at this point?
I appreciate all of the help you've provided. I'm not too hopeful, but fingers crossed that I can get something back from the original drive.