
Unmountable disk after upgrade and reboot



A few days ago, disk 6 failed in my array. I had a spare on hand, replaced the drive, and let the system begin a rebuild. The rebuild was running at a snail's pace, between 500 and 800 KB/s. It took over 2 days just to get to ~600G written, with an estimated finish over 70 days away.

While I was looking for solutions, I noticed that 6.12.8 had been released (I was on 6.12.6) and figured I might as well cancel the rebuild and perform the upgrade in the hope that it might speed things along (maybe a bad idea in hindsight?). The upgrade completed without any errors and I was eventually prompted to reboot the server. I enabled syslog logging to the USB drive before rebooting because I anticipated needing to upload diagnostics for the slow rebuild issue.

After the reboot, the server came back online with the array stopped. I expected this because the replacement drive had not finished rebuilding. I verified the correct drive was assigned to the failed slot and started the array. When the array started, the data rebuild started up and I noticed that one of my other disks was now showing "Unmountable: Unsupported or no file system".

Is there any hope of getting that disk back online and mounted? I'm a little nervous to begin troubleshooting that on my own, although I did stop the array and restart it to see if that might work. Also, the data rebuild of the other disk is still crawling along at ~700 KB/sec with an estimated finish in 64 days. Diagnostic file is attached. I'm hoping this is recoverable and I can get the failed drive rebuilt. At the moment, I still have VMs and Docker services stopped.

Thanks, in advance, for any assistance you can provide. If you have questions, I'll do my best to have answers. Dashboard images below...

 

image.thumb.png.3ba05563768985266aab55d7347d4df8.png

 

image.thumb.png.5d6a98968859897890f13612020f0571.png

gumbo-diagnostics-20240222-1029.zip
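
As a rough sanity check on the figures in the post above, the sketch below relates a sustained rebuild rate to the amount of data still to be written. It assumes a constant rate and 1 KB = 1000 bytes; the function names and the 100 MB/s comparison rate are illustrative assumptions, not anything reported by Unraid.

```python
SECONDS_PER_DAY = 86_400


def bytes_remaining(rate_kb_per_s, eta_days):
    """Bytes implied by a constant rate held for eta_days (assumes 1 KB = 1000 bytes)."""
    return rate_kb_per_s * 1000 * SECONDS_PER_DAY * eta_days


def days_to_finish(remaining_bytes, rate_kb_per_s):
    """Days needed to write remaining_bytes at a constant rate."""
    return remaining_bytes / (rate_kb_per_s * 1000 * SECONDS_PER_DAY)


if __name__ == "__main__":
    # Figures from the post: ~700 KB/s with 64 days left.
    remaining = bytes_remaining(700, 64)
    print(f"Implied data remaining: {remaining / 1e12:.2f} TB")
    # Hypothetical healthy rate of 100 MB/s (100_000 KB/s), for comparison only.
    print(f"Days at 100 MB/s: {days_to_finish(remaining, 100_000):.2f}")
```

Under those assumptions, ~700 KB/s for 64 days corresponds to about 3.87 TB still to write, which the hypothetical 100 MB/s rate would cover in under half a day. A gap that large suggests something other than the disks' normal throughput is limiting the rebuild.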

Link to comment

Do you still have the original disk6? Maybe nothing wrong with it. Bad connections are more common than bad disks.

 

Connections disturbed when replacing disks is a very common reason for users to post about their rebuilding problems.

 

And you have a bad connection on disk5 (if not others), which may be why it is unmountable, and is definitely causing problems rebuilding disk6.

 

Shut down, check all connections, all disks, both ends, power and SATA, including splitters.

 

Then reboot and post new diagnostics with the array started.

Link to comment

Thanks for the reply. I do still have the original disk 6. I will re-install it and see how that goes. I've seen all the issues folks have with cables and power, so I make a point of checking my cables and connections before closing up the case every time I have it down for maintenance. I will re-verify, though, and post back here with a new diagnostic file after swapping disk 6 and starting the array. Be back soon...

Link to comment

We were hoping to keep the original disk with its contents just as they were, in case of problems rebuilding to the other disk.

49 minutes ago, RoyP said:

UDMA CRC errors

These are connection problems, not disk problems.
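
One way to keep an eye on this is to read the UDMA CRC error count from the drive's SMART attributes. The sketch below parses text in the attribute-table layout that `smartctl -A` prints for ATA drives and returns the raw value of attribute 199. The function name and the sample table are made up for illustration; they are not taken from the diagnostics in this thread.

```python
from typing import Optional


def udma_crc_errors(smartctl_output: str) -> Optional[int]:
    """Return the raw value of SMART attribute 199, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; the raw value is the last one.
        if len(fields) >= 10 and fields[0] == "199":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


# Illustrative sample only (hypothetical values).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4321
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       57
"""

if __name__ == "__main__":
    print(udma_crc_errors(SAMPLE))
```

The raw count is cumulative, so a single reading says little on its own. A value that keeps climbing between checks is the sign that a cable or connector is still causing trouble.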

 

You should post diagnostics, not syslog. But syslog seems to indicate you still have 

2 hours ago, trurl said:

bad connection on disk5 (if not others), which may be why it is unmountable, and is definitely causing problems rebuilding disk6.

 

Shut down, check all connections, all disks, both ends, power and SATA, including splitters.

 

Then reboot and post new diagnostics with the array started.

 

Link to comment

Bah, I grabbed the wrong file... here is the right one.

gumbo-diagnostics-20240222-1615.zip

What should I do at this point?  Wait for the rebuild to complete, or stop it and replace cables?  I have another set of cables I can use.

 

Also, the rebuild process seems to be reading from disk 5. Is that bad? You can see the reads in the last image I posted.

Edited by RoyP
Link to comment

Thanks for the help so far.

Disk 6 (the original one) has finished rebuilding, but it doesn't appear there is any data on it.  The size only shows 27.9GB used.  I did try stopping and restarting the array to see if it would update, but no go.  New diagnostic file attached.  I also thought about stopping the array and trying to mount disk 6 with unassigned devices just to check for data that way, but I didn't want to take that chance without checking in.  Suggestions for next steps?

gumbo-diagnostics-20240223-1735.zip

image.thumb.png.06bf1a12fcbcb7b9bc350bc1f2e7c2bc.png

Edited by RoyP
Link to comment
On 2/22/2024 at 5:23 PM, RoyP said:

Oops, I had already mounted and it started rebuilding before I saw your note about unassigned devices.  Have I messed up?

 

21 minutes ago, RoyP said:

thought about stopping the array and trying to mount disk 6 with unassigned devices just to check for data that way

It won't show anything different than what you have rebuilt.

 

Did you format anything during all this?

 

Link to comment

There was just over 2TB on it before I swapped it out.  I don't happen to have a screenshot of it before I did that, though.  From what I remember, that number dropped after I put in the replacement disk and started getting the "unmountable" error on disk 5 and the rebuild started.  Any ideas on what I should try next?  Should I start swapping cables and try to get disk 5 back to a mountable state?  Maybe if I can get disk 5 back online I can get disk 6 to rebuild properly.  I'm starting to think this may not come back. :-( 

Link to comment
25 minutes ago, trurl said:

 

It won't show anything different than what you have rebuilt.

 

Did you format anything during all this?

 

I missed that.  No, not that I recall, or at least, not intentionally.

 

4 minutes ago, trurl said:

You aren't going to get disk6 data back except possibly with some third party recovery software such as UFS Explorer.

 

Check filesystem on disk5. Do it from the webUI to make sure it uses the correct command. Post the output.

Should I do the check in the current state, or try swapping cables first?

Link to comment

Had to use -L

 

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
clearing needsrepair flag and regenerating metadata
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
data fork in ino 562952124 claims free block 70369282
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - agno = 8
        - agno = 9
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (4:18631) is ahead of log (1:2).
Format log to cycle 7.
done
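
For long repair logs like the one above, a small script can pull out the lines that deserve a closer look. This is a minimal sketch: the keywords it searches for ("ALERT:" and "claims free block") are simply the ones that stand out in this particular output and are not an exhaustive list of xfs_repair messages. The function name is hypothetical.

```python
import re

PHASE_RE = re.compile(r"^Phase (\d+) - ")
KEYWORDS = ("ALERT:", "claims free block")  # assumption: chosen from the log above


def summarize_repair_log(text):
    """Return (phase numbers seen, lines containing any of KEYWORDS)."""
    phases, notes = [], []
    for raw in text.splitlines():
        line = raw.strip()
        match = PHASE_RE.match(line)
        if match:
            phases.append(int(match.group(1)))
        elif any(keyword in line for keyword in KEYWORDS):
            notes.append(line)
    return phases, notes


# Excerpt of the output posted above.
SAMPLE = """\
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
Phase 3 - for each AG...
        - agno = 0
data fork in ino 562952124 claims free block 70369282
Phase 6 - check inode connectivity...
        - moving disconnected inodes to lost+found ...
"""

if __name__ == "__main__":
    phases, notes = summarize_repair_log(SAMPLE)
    print("Phases seen:", phases)
    for note in notes:
        print("Check:", note)
```

On the excerpt, this flags the log-zeroing alert and the one inode that claimed a free block, the two lines in the output that are not routine progress messages.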

 

Link to comment

Thanks for the help getting disk 5 back online!

 

10 minutes ago, trurl said:

must have formatted

I don't know how, but I guess it's possible. I'm willing to give UFS Explorer a shot. How would this process work with Unraid? From what I'm seeing, I guess I install the software on my Windows box and then connect the drive to attempt recovery of the data. If I'm able to recover the data, how do I then get that disk back into the Unraid array and sync the data back to parity without wiping something else out?

 

 

Link to comment

Just to be sure my thought process is correct... I need to remove disk 6 from the array and replace it with another drive (in order to protect any data that might possibly be recoverable on the one in there now). While the replacement disk 6 is rebuilding in the array, I'll attempt recovery of the data on the original disk. If I am able to recover anything, I can then mount the recovered data as an unassigned device and just copy the data back to the newly rebuilt disk 6 once the rebuild is done. I'm not too hopeful of getting data back, but fingers crossed. Does that sound about right? You mentioned UFS Explorer in a previous post. Does that seem to be my best option to potentially recover data at this point?

 

I appreciate all of the help you've provided.  Fingers crossed that I can get anything back from the original drive.

Link to comment
