(SOLVED) [6.9.2] One disk unmountable in my array (XFS) after setting New Config


joemama

Recommended Posts

Hi all,

 

I had a failing disk (Disk 20) that I migrated the data off of and spun down, with the goal of removing it from the array before it tanked. So I did all that, physically removed it, and hit New Config (with Preserve Assignments). After that succeeded I also rebooted to update my Nvidia driver.

 

After it came back up it didn't have any of the disk assignments (all showed unassigned), so I re-added them all in the correct order manually. *This may be where I made a mistake. I added all the disks in the correct order, but when I added my parity drive I got a warning saying the disk would be overwritten when I started the array. In my head I was fine with that, because I had no parity errors before this, so I assumed my array was intact even after spinning that drive down and removing it. So I was fine with it doing a fresh parity build and overwriting the parity disk.

 

Enter the problem.

 

One disk in my array (Disk 4) shows unmountable. When I clicked the disk, the File System Type showed "Any", so I set that to XFS like the rest, but no good. Then while looking through the forums I saw an xfs_repair command that I tried, but still no luck.
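For anyone finding this later, a repair along those lines is run from the console with the array started in maintenance mode. A minimal sketch, assuming the stock 6.9 device naming where Disk 4 maps to /dev/md4 (using the md device rather than the raw sd device keeps parity updated during the repair):

    # Read-only check first: reports problems without changing anything
    xfs_repair -n /dev/md4

    # Actual repair, verbose output (array must be in maintenance mode)
    xfs_repair -v /dev/md4

    # Only if xfs_repair refuses to run because of a dirty log, and you
    # accept possibly losing the most recent metadata changes:
    # xfs_repair -L /dev/md4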

 

I'm hoping someone can help me get it re-added to the array. I have attached diagnostics, since I noticed in previous posts that they help you see where things are at. The disk in question is Disk 4 - ST12000VN0008-2PH103_ZLW03STN (sdi).

 

Thanks

xfs_repair_log.txt

Edited by joemama
Link to comment
10 minutes ago, joemama said:

re-added them all in the correct order manually. *This may be an area I made a mistake?

If you accidentally assigned a data disk to a parity slot then it would be overwritten. Do you have anything that would confirm your previous disk assignments, such as diagnostics, a syslog, a status email, or a screenshot from before?

 

On mobile now so can't look at Diagnostics yet. 

Link to comment

I assigned the existing parity disk with parity data on it to the parity slot. I just meant I *may* have screwed up because it said it was going to overwrite that disk after I made all of my assignments and started the array.

 

If it helps in any way that the parity disk might still have valid data: as soon as I started the array and saw the disk was unmountable, I immediately stopped it and have only had it in maintenance mode since. So maybe it didn't get overwritten yet?

 

I am attaching a previous diagnostics file from Dec 2020 as well, for reference.

 

Edited by joemama
Link to comment

I'm trying to hang tight for someone more knowledgeable than me to offer some insight, so I don't screw something up by going too far. But as I'm poking around with the array still in maintenance mode, I see an option called Sync that looks like it would fire off a rebuild. I'm curious whether that would help or hurt my situation. My parity also shows INVALID now, but I don't know if that just means invalid against the newly assigned array data, and that it still has the old data intact. I can confirm that before this the parity was valid with zero errors.

 

I'll attach an image of the button I'm referencing...

Main.png

Edited by joemama
Link to comment
34 minutes ago, trurl said:

Why don't you have anything assigned as disk1?

I wanted the Unraid disk layout to match the physical layout. Since my first physical slot is actually my parity disk, I started the array at disk 2. So it's just cosmetic.

 

So it doesn't sound like the parity disk will help me with the issue on Disk 4 at all. And I'm fine with parity being overwritten once the array is intact. I just need a recommendation on what to try to get Disk 4 back into the array.

Link to comment
20 minutes ago, joemama said:

So since my first disk is actually my parity disk

Better to think of parity as disk0; that is how Unraid thinks of it.

 

Your old diagnostics show you had the correct assignments for parity and disk4. And the filesystem repair you ran earlier certainly looked as if there was a filesystem on the unmountable disk; if it had actually been a parity disk, the repair wouldn't have gotten as far as it did.

 

You might try upgrading to Unraid 6.10-rc2, since it has a newer version of xfsprogs, and see if that can repair the filesystem on disk4.
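If you want to see what you currently have before deciding, a quick check from the console (assuming shell access):

    xfs_repair -V    # prints the installed xfsprogs version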

 

Let's see if @JorgeB has anything.

Link to comment

Good news!

 

So I was reading more articles on the xfs_repair command, and somewhere it said to reboot afterwards because the corrections might not actually take effect in Unraid until you do (or something to that effect). So with the array still in maintenance mode I rebooted the server. When it came back up I logged in and it said the array was stopped with autostart disabled (or something like that), but it didn't show the disk as unmountable this time. So I went ahead and started the array normally and it came up fine. I can browse the contents and the status appears to be good!

I'll attach a final diagnostics file I just ran with everything actually stood up and mounted, for posterity. And I read somewhere that if xfs_repair finds files it can't reconnect, it moves them to a directory called lost+found, but I don't even see that directory, so I believe it recovered everything without corruption! It's recalculating parity now.
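A quick way to double-check that from the console, assuming Disk 4 is mounted at the usual /mnt/disk4 path:

    ls -ld /mnt/disk4/lost+found   # "No such file or directory" means nothing was orphaned
    # If the directory does exist, list what was recovered:
    # ls -la /mnt/disk4/lost+found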

 

Thanks for the help! I definitely appreciate it since this was a pretty tense start to the day 

 

Edited by joemama
Link to comment
46 minutes ago, trurl said:

Better to think of parity as disk0, that is how Unraid thinks of it.

Going back to your comment, would it be best practice to rearrange my disks? Or does it not matter that much? I'll attach a mockup of how it's currently laid out, and if you have any tips or insight I'd love to hear them. This is my first run with the platform, and so far I'm loving it, but there may be room for improvement.

server_mockup.png

Link to comment

Some comments regarding your setup.

 

Labels.

The physical location of the drives isn't very important; the biggest issue here is heat. I'd arrange the drives in a way that keeps the temps most consistent, not necessarily the lowest possible, but the most consistent over time.

Instead of focusing on the disk numbers, you should put the last few digits of each drive's serial number somewhere you can see without disturbing the drive. That way, when you need to replace a drive, you know exactly which physical drive is involved.
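One way to pull the serial-to-device mapping for labeling, as a minimal example from the console (lsblk and smartctl should both be available on a stock install):

    lsblk -d -o NAME,SIZE,MODEL,SERIAL   # one line per physical drive, with its serial number
    # or, for a single drive:
    # smartctl -i /dev/sdi | grep -i serial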

 

Using old drives in Unraid.

Keep in mind that the parity recovery mechanism in Unraid requires not only the parity drives but ALL the other data drives. So you are trusting all your data to the LEAST reliable drive in the array. Do not use questionable drives in the parity array. It sucks to lose 12TB of data when one of your brand new drives decides to die unexpectedly and one of your old 4TB drives also dies while trying to rebuild the first failed drive. Don't use any drives in your parity array that you don't trust completely. As a corollary, only use as many drives as you need to hold your data. Empty drives in the array still have to be read end to end, completely and accurately, bit for bit, to rebuild any failed drive.

Link to comment
21 minutes ago, JonathanM said:

As a corollary, only use as many drives as you need to hold your data. Empty drives in the array are still required to be read end to end completely accurately bit for bit to rebuild any failed drive.

I always say, each additional disk is an additional point of failure.

Link to comment
1 hour ago, JorgeB said:

xfs_repair looks successful. Was the array started in normal mode after the repair? If yes and it still didn't mount, please post diags.

Looks like the screenshots you posted earlier were from before the repair, even though you posted them after the repair results. So you didn't know the repair had already succeeded because you're still new to this, and I didn't know because the older screenshot indicated it hadn't.

Link to comment
6 hours ago, trurl said:

Did you start the array in regular mode after the repair?

This question I asked earlier would have cleared everything up, but it wasn't answered or acted on.

 

edit:

Sorry, I guess it was answered

  

5 hours ago, joemama said:

only had it in maintenance mode since

 

Link to comment
3 hours ago, JonathanM said:

Instead of focusing on the disk numbers, you should put the last few digits of each drive's serial number somewhere you can see without disturbing the drive. [...] Don't use any drives in your parity array that you don't trust completely.

 

 

2 hours ago, trurl said:

I always say, each additional disk is an additional point of failure.

 

I appreciate both of these points, and I hadn't thought about the extra points of failure from coasting on lackluster disks. I'll keep that in mind going forward and bump up my timetable to phase out the small disks in favor of better, larger ones.

Link to comment
2 hours ago, trurl said:

This question I asked earlier would have cleared everything up, but it wasn't answered or acted on. [...] Sorry, I guess it was answered.

Yep. Just in case someone else stumbles across this post with a similar issue: I believe the xfs_repair did the trick, but even after it ran, the disk still showed unmountable. And I was timid to act on a reboot or change, because in the world of data recovery (or support in general) your best bet is don't touch anything until you know what you're doing, lest you cross a line of no return that no one can help you out of. But after the reboot, the unmountable flag was cleared and I could start it up like normal.

 

Thanks again for all the help!

Link to comment
  • joemama changed the title to (SOLVED) [6.9.2] One disk unmountable in my array (XFS) after setting New Config
6 hours ago, joemama said:

But after the reboot, the unmountable flag was cleared and I could start it up like normal.

That is what I would expect to have happened if you had simply stopped the array and then restarted it in normal mode, so that Unraid makes another attempt to mount the drive. The reboot should not be required.

Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.