[SOLVED] Drive shows as 'Unmountable: not mounted' after rebuilding onto new drive


Hello,

 

I've been using UnRAID for a few years now. I've had a few bumps along the way, but by using the forums I have usually been able to help myself.

 

I have read through a couple of other posts with similar/same problems, but they are a couple of years old and I want to make sure I proceed correctly.

 

A month or so ago I replaced my 10TB parity drive with a 12TB one, since those are getting better price-wise. I have mostly 8TB drives plus the one 12TB; I precleared the old 10TB and am using it as a data drive now. I also bought a second 12TB and precleared it as well to have as a spare.

 

A few days ago I was getting a lot of read errors and drive 6 dropped out. Drive 5 also had some read errors, but UnRAID did not kick it out. I know this can be caused by a bad/loose power or data cable, so I stopped the array and shut down the system. I reseated all the drive connections and powered the system back up, but drive 6 was still marked as not usable. To be safe, I decided to replace drive 6 with the precleared 12TB spare, and later preclear the original drive 6 to see if it really had problems. I shut down the system, installed the 12TB spare, powered up, and selected the new drive for slot 6. I started the array and the rebuild took place. It stopped a couple of times because of the parity check pause plugin, but it resumed and finished. I thought it was OK, but drive 6 showed as 'Unmountable: not mounted' after the parity operation completed. It may have been listed that way during the rebuild, but I am not sure at this point. I still have the original drive 6 in the system, and it shows up right now as 'Dev 2' in unassigned devices, as shown in the screenshot.

 

I have attached a diagnostic and a screenshot. I'd like some advice on how to proceed, as I am not sure of the REAL state of the array.

 

Thank you for your time and help.

 

Matt

 

 

2021-03-18 -Tower_Main.png

tower-diagnostics-20210317-2210.zip


Ok, I ran the disk check without the -n and then added the -L as instructed. Did not take long but generated thousands of entries like this:

 

entry ".." in directory inode 8879439719 points to non-existent inode 4394111961
bad hash table for directory inode 8879439719 (no data entry): rebuilding
rebuilding directory inode 8879439719

 

After doing the disk check, I re-started the array and the drive shows up correctly, but only has 1.31TB of the data that used to be just over 7TB.

 

So I'm not sure if I trust the data on the device. Would it be any benefit to put in a new drive (I have two more 12TB drives un-used but NOT yet pre-cleared) and rebuild onto it?

 

Just as an FYI, the drive now in UnRAID's slot 6 is on a new cable, not the same cable as the original drive 6 that was reporting read errors and started this whole process.

 

I may also want to change one or more of the SAS cables, but I don't want to change so much that it impedes troubleshooting.

 

Thanks again for help with this.

 

Matt

9 hours ago, MatrixMJK said:

Would it be any benefit to put in a new drive (I have two more 12TB drives un-used but NOT yet pre-cleared) and rebuild onto it?

That won't help: parity is always updated in real time, so a rebuild would just reproduce the disk as it is now, after the repair.

 

Your best bet is the old disk. It looks healthy, so you should be able to mount it with UD and copy the data to the new disk/array. Note that you need to change the XFS UUID first to be able to mount both at the same time; that can be done in the UD settings.
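
For anyone following along, here is a rough way to spot the UUID clash before trying to mount. This is only a sketch: it filters `blkid`-style output for filesystem UUIDs that show up on more than one device. `dup_uuids` is a made-up helper name, not something UD or Unraid provides.

```shell
# Print every filesystem UUID that appears on more than one device.
# Reads `blkid`-style lines on stdin (hypothetical helper, for illustration).
dup_uuids() {
  # The leading space keeps PARTUUID="..." and UUID_SUB="..." from matching.
  grep -o ' UUID="[^"]*"' | sort | uniq -d | sed 's/^ UUID="//; s/"$//'
}

# Usage (not run here):
#   blkid | dup_uuids
```

If this prints a UUID, the old disk and the rebuilt disk still share it, and the second XFS mount will be refused until one of them gets a new UUID.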


Thank you for that suggestion. I did try mounting it but got the UUID error. 

 

I have now tried to change the UUID, but I'm still getting a superblock error. Below is the log from before the UUID change and after. It looks like it can't change the UUID until the log is replayed or the superblock is repaired.

 

Mar 19 08:57:53 Tower unassigned.devices: Adding disk '/dev/sdd1'...
Mar 19 08:57:53 Tower unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime '/dev/sdd1' '/mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W'
Mar 19 08:57:53 Tower kernel: XFS (sdd1): Filesystem has duplicate UUID 5f37ccbd-b83f-40d0-be94-6aa9b2c0c81f - can't mount
Mar 19 08:57:53 Tower unassigned.devices: Mount of '/dev/sdd1' failed. Error message: mount: /mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W: wrong fs type, bad option, bad superblock on /dev/sdd1, missing codepage or helper program, or other error.


Mar 19 08:59:57 Tower unassigned.devices: Changing disk '/dev/sdd' UUID. Result: ERROR: The filesystem has valuable metadata changes in a log which needs to be replayed. Mount the filesystem to replay the log, and unmount it before re-running xfs_admin. If you are unable to mount the filesystem, then use the xfs_repair -L option to destroy the log and attempt a repair. Note that destroying the log may cause corruption -- please attempt a mount of the filesystem before doing this.
Mar 19 09:00:05 Tower unassigned.devices: Adding disk '/dev/sdd1'...
Mar 19 09:00:05 Tower unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime '/dev/sdd1' '/mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W'
Mar 19 09:00:05 Tower kernel: XFS (sdd1): Filesystem has duplicate UUID 5f37ccbd-b83f-40d0-be94-6aa9b2c0c81f - can't mount
Mar 19 09:00:05 Tower unassigned.devices: Mount of '/dev/sdd1' failed. Error message: mount: /mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W: wrong fs type, bad option, bad superblock on /dev/sdd1, missing codepage or helper program, or other error.

 

So I ran from the terminal:

 

root@Tower:~# xfs_repair -L /dev/sdd1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
sb_fdblocks 221012495, counted 222965907
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 5
        - agno = 4
        - agno = 2
        - agno = 3
        - agno = 1
        - agno = 7
        - agno = 6
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (1:1780937) is ahead of log (1:2).
Format log to cycle 4.
done
root@Tower:~# 

 

But I'm still getting this in the disk log after running xfs_repair and then trying to change the UUID in 'Settings - Unassigned Devices', and the drive still does not mount:

 

Mar 19 09:10:44 Tower unassigned.devices: Error: shell_exec(/usr/sbin/xfs_admin -U generate /dev/sdd1) took longer than 1s!
Mar 19 09:10:44 Tower unassigned.devices: Changing disk '/dev/sdd' UUID. Result: command timed out
Mar 19 09:10:51 Tower unassigned.devices: Adding disk '/dev/sdd1'...
Mar 19 09:10:51 Tower unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime '/dev/sdd1' '/mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W'
Mar 19 09:10:51 Tower kernel: XFS (sdd1): Filesystem has duplicate UUID 5f37ccbd-b83f-40d0-be94-6aa9b2c0c81f - can't mount
Mar 19 09:10:51 Tower unassigned.devices: Mount of '/dev/sdd1' failed. Error message: mount: /mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W: wrong fs type, bad option, bad superblock on /dev/sdd1, missing codepage or helper program, or other error.

 

Thanks!

 

Matt


Thank you again for the command (I don't have a lot of experience with XFS tools)

 

root@Tower:~# xfs_admin -U generate /dev/sdd1
totally zeroed log
Clearing log and setting UUID
writing all SBs
new UUID = df031299-61a0-458e-98b9-9bf4e1cd2f1d
root@Tower:~#

 

So that took a minute or so but worked, but now the disk log shows this when trying to mount:

 

Mar 19 09:16:22 Tower unassigned.devices: Error: shell_exec(/usr/sbin/xfs_admin -U generate /dev/sdd1) took longer than 1s!
Mar 19 09:16:22 Tower unassigned.devices: Changing disk '/dev/sdd' UUID. Result: command timed out
Mar 19 09:31:54 Tower unassigned.devices: Adding disk '/dev/sdd1'...
Mar 19 09:31:54 Tower unassigned.devices: Mount drive command: /sbin/mount -t xfs -o rw,noatime,nodiratime '/dev/sdd1' '/mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W'
Mar 19 09:31:54 Tower kernel: XFS (sdd1): Mounting V5 Filesystem
Mar 19 09:31:54 Tower kernel: XFS (sdd1): Corruption warning: Metadata has LSN (1:1780937) ahead of current LSN (1:2). Please unmount and run xfs_repair (>= v4.3) to resolve.
Mar 19 09:31:54 Tower kernel: XFS (sdd1): log mount/recovery failed: error -22
Mar 19 09:31:54 Tower kernel: XFS (sdd1): log mount failed
Mar 19 09:31:54 Tower unassigned.devices: Mount of '/dev/sdd1' failed. Error message: mount: /mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W: wrong fs type, bad option, bad superblock on /dev/sdd1, missing codepage or helper program, or other error.

 I'm not sure what it is asking me to run with xfs_repair. I'll do some research too.

 

Thank you very much for your time and effort, it is appreciated!

 

Matt


Sweet that worked and I can mount it! The "(>= v4.3)" threw me, I thought it was asking to specify a version of the tool.
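
For reference, the "(>= v4.3)" in that kernel message is a minimum xfs_repair version, not an option to pass. Below is a minimal sketch of checking it, assuming `xfs_repair -V` prints a line ending in the version number and that GNU `sort -V` is available. `version_ge` is a hypothetical helper, and `/dev/sdX1` is a placeholder.

```shell
# True when version $1 is at least version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Usage (not run here; /dev/sdX1 is a placeholder for the unmounted partition):
#   ver=$(xfs_repair -V | awk '{print $NF}')
#   version_ge "$ver" 4.3 && xfs_repair /dev/sdX1
```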

 

I can move the data back now I hope!

 

By the way what is a safe way to move those files back into place? Should I use 'Krusader' on the UnRAID box to the specific drive or just to the shares from a client machine? I think I have read that you should not copy directly to a disk share, e.g. 'Drive 6'.

 

Just so I don't run into this again, did I do something wrong or miss a step when replacing the drive that had issues? I have replaced a couple drives in the past but never had a problem doing it. 

 

You guys need a tip jar (or do you have one?)

 

Thanks again for the assistance!

 

Matt

1 hour ago, MatrixMJK said:

By the way what is a safe way to move those files back into place?

Yes.

 

1 hour ago, MatrixMJK said:

Should I use 'Krusader' on the UnRAID box to the specific drive or just to the shares from a client machine?

If you're using Windows 10 you can use Windows Explorer to move/copy from the UD device to the array; the transfer will still be done locally and won't use the network.
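
If you would rather do the copy from the server's own terminal, something along these lines should work. It is a sketch, not the only way: `copy_and_verify` is a made-up helper that copies a tree with `cp -a` and then compares source and destination with `diff -r`. The source path is the UD mount point from the log above; `/mnt/disk6` as the destination is an assumption about where the rebuilt disk is mounted.

```shell
# Copy the contents of a directory tree, then confirm the copy matches.
# Returns non-zero if the copy fails or any file differs.
copy_and_verify() {
  src=$1
  dst=$2
  mkdir -p "$dst" || return 1
  cp -a "$src"/. "$dst"/ || return 1      # "/." also picks up hidden files
  diff -r "$src" "$dst" >/dev/null        # exit status 0 only if identical
}

# Usage (not run here; the destination path is an assumption):
#   copy_and_verify /mnt/disks/ST8000DM004-2CX188_ZCT0LQ5W /mnt/disk6 \
#     && echo "copy verified"
```

Copying disk to disk like this keeps user shares out of the picture, and the `diff -r` pass re-reads both sides, so it will take about as long as the copy itself.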

 

1 hour ago, MatrixMJK said:

Just so I don't run into this again, did I do something wrong or miss a step when replacing the drive that had issues?

The diags posted don't show anything out of the ordinary, but you mentioned that, before the time they cover, another disk had read errors while that one was disabled. This could very easily have caused issues with the emulated disk: since you only have one parity, all the other disks need to be 100% for the emulated disk to be OK.
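
To illustrate why the other disks have to read cleanly: single parity is an XOR across all the data disks, so an emulated disk is computed from parity plus every other disk. Here is a toy sketch with one byte per disk. The byte values are made up, and `rebuild_byte` is just an illustrative helper.

```shell
# Rebuild a missing disk's byte from the parity byte and the surviving bytes.
rebuild_byte() {
  acc=$1          # start from the parity byte
  shift
  for b in "$@"; do
    acc=$(( acc ^ b ))
  done
  echo "$acc"
}

d5=60; d6=167; d7=16               # made-up data bytes on three disks
parity=$(( d5 ^ d6 ^ d7 ))         # what the parity disk would hold

rebuild_byte "$parity" "$d5" "$d7"   # prints 167, the original d6
rebuild_byte "$parity" 61 "$d7"      # d5 misread by one bit: prints 166
```

A single bad read on disk 5 turns into a wrong byte on the emulated disk 6, and with one parity there is nothing left over to detect or correct it.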


So I just ran 'xfs_repair -nv' from the GUI on disk5 (the other one that showed read errors) and the output looks good to me:

 

Phase 1 - find and verify superblock...
        - block cache size set to 3018840 entries
Phase 2 - using internal log
        - zero log...
zero_log: head block 1283169 tail block 1283169
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 7
        - agno = 2
        - agno = 4
        - agno = 6
        - agno = 5
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - agno = 4
        - agno = 5
        - agno = 6
        - agno = 7
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

        XFS_REPAIR Summary    Fri Mar 19 15:07:04 2021

Phase		Start		End		Duration
Phase 1:	03/19 15:06:46	03/19 15:06:46
Phase 2:	03/19 15:06:46	03/19 15:06:46
Phase 3:	03/19 15:06:46	03/19 15:06:56	10 seconds
Phase 4:	03/19 15:06:56	03/19 15:06:56
Phase 5:	Skipped
Phase 6:	03/19 15:06:56	03/19 15:07:04	8 seconds
Phase 7:	03/19 15:07:04	03/19 15:07:04

Total run time: 18 seconds
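
As a side note, the same read-only check can be scripted from the terminal instead of reading the output by eye. According to the xfs_repair man page, with `-n` it exits with status 1 if it detected corruption and 0 otherwise. A minimal sketch follows; `check_xfs` is a hypothetical helper, and `/dev/md5` as the device for disk5 is an assumption.

```shell
# Report whether a read-only xfs_repair pass found problems on a device.
# Any non-zero exit (corruption, or the device could not be opened) is flagged.
check_xfs() {
  if xfs_repair -n "$1" >/dev/null 2>&1; then
    echo "clean: $1"
  else
    echo "needs attention: $1"
  fi
}

# Usage (not run here; array in maintenance mode, device name is an assumption):
#   check_xfs /dev/md5
```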

 

Anything else I can/should check for silent corruption?

 

 
