
"Parity device is disabled"



Hi everyone,

I'm desperately looking for some help getting my data and array back to a working state. I've started a new topic because my original one was started months ago and I think it's being missed by the wizards in here. I'm linking the original post because it has the full details and history of what's been tried and tested.

 

 

Essentially, I'm now at a stage where Disk #5 (2Tb) and Disk #6 (12Tb) are "unformatted". This makes sense for the 2Tb given it's new, whereas the 12Tb is the drive that was originally causing trouble at the beginning of the above post. But the real clanger at the moment is that the second parity disk (12Tb) is in a disabled state.

 

I dare not make any other changes or attempts because I feel like I'm close to irreparable damage.

 

I welcome any good news and guidance you can all bring.

 

 

lucindraid-diagnostics-20240117-0141.zip

Link to comment

Thanks for the reply; here are the diagnostics with the array running in normal mode.

 

Late-night thinking got me to realise that, before I started working on the NAS this week, the situation was already:

 

  • Parity 1 (12Tb) - OK
  • Parity 2 (12Tb) - OK
  • Disk 1 (2Tb) - OK
  • Disk 2 (2Tb) - OK
  • Disk 3 (2Tb) - OK
  • Disk 4 (2Tb) - OK
  • Disk 5 (2Tb) - Removed (failed months ago and was removed while waiting for warranty replacement)
  • Disk 6 (12Tb) - In some kind of broken state - see original post in other thread.

 

 

This week, in one move, I re-added Disk 5 using the brand-new disk and checked all cables, especially re-seating Disk 6's cables. I now believe I shouldn't have re-added Disk 5 until the array was stable with Disk 6 first. BUT... until this recent work, I still apparently had two stable parity disks and one broken data disk. Even IF Parity #2 is faulty, this situation should still be within my fault tolerance, correct? Therefore, should I re-remove Disk 5 and allow the array to try and rebuild Disk #6 using Parity #1?

 

Edit: FYI - The SMART Extended Self-Test completed without error on Parity Disk #2.

 

I'll await your response(s), and thank you again for taking the time.

 

 

lucindraid-diagnostics-20240117-1011.zip

Edited by siege801
Link to comment

According to your diagnostics, parity2 and disk6 are disabled, and disks 5 and 6 are unmountable.

 

First thing is to see if we can repair filesystems on the unmountable disks. Since disk6 is also disabled, we will be working with the emulated disk6. Repairing its filesystem before rebuilding will allow you to rebuild a mountable disk.

 

Check filesystem on disk5 and disk6. Be sure to use the webUI and not the command line. Capture the output and post it.
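
(The reason for using the webUI is that it runs the check against the array's md device, which keeps parity in sync; running it by hand against the raw /dev/sdX1 device would bypass parity. Purely for reference, the webUI check is roughly equivalent to a read-only xfs_repair, as in the sketch below. The md device names are an assumption and may be /dev/md5p1 and /dev/md6p1 on newer Unraid releases.)

# read-only filesystem check; -n means no-modify, so nothing is written
xfs_repair -n /dev/md5
xfs_repair -n /dev/md6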

Link to comment

Thanks for continuing to help. I really do appreciate it.

 

I've attached the output from Disk #6 as a text file. The check on Disk #5 is still running, but has so far only produced:

 

Phase 1 - find and verify superblock...
couldn't verify primary superblock - not enough secondary superblocks with matching geometry !!!

attempting to find secondary superblock...

<snip - a long line of dots>

and beneath this it says the check is still Running.

 

Neither output looks promising to me, which is sad. 

 

Just a reminder, Disk #5 has only just been added. I assume that is why it's in a broken state, as it likely hasn't been cleanly pulled into the array.

unRAID file system check - disk 6.txt

Link to comment
9 hours ago, siege801 said:

Just a reminder, Disk #5 has only just been added.

If disk5 hasn't been formatted after adding it to the array, then Unmountable is correct for that disk until you format it. But DON'T format anything until we get disk6 working somehow; you don't want to format disk6.

 

Emulated disk6 does seem to need a lot of filesystem repair.

 

To avoid confusion, I am now going to refer to the actual physical disk that was assigned as disk 6 by the last characters of its serial number, B5SX.

 

Since B5SX is not currently being used, Unassign it and see if it is mountable as an Unassigned Device. 

 

If so, it might be better to New Config B5SX back into the array as disk6 and rebuild parity instead. You need to rebuild parity2 anyway.

 

If B5SX is not mountable as an Unassigned Device, we can try check filesystem on it and see if it looks better than emulated disk6.

Link to comment

I've stopped the array and removed B5SX from it.
I then mounted it as an Unassigned Device, and it appears to mount without a problem (I can browse the directory structure).

 

Looking at your last post, it seems you want me to "New Config B5SX back into the array as disk6 and rebuild parity instead."

 

Could I trouble you for a breakdown of this process? 

 

Also, before I let myself get too excited that the disk is mountable: this is the disk that was showing as emulated back in September. What are the chances the mounted data is actually up to date and useful? (But alas, I'll try not to get ahead of where we're at.)

Edited by siege801
Link to comment
13 hours ago, siege801 said:

This is the disk that was showing as emulated back in September. What are the chances the mounted data is actually up to date and useful? (But alas, I'll try not to get ahead of where we're at.)

Anything written to the emulated disk will not be on the physical disk.

 

Leave that original disk out of the array for now. We still have some options we can pursue.

 

Let's go ahead with repairing the filesystem of the emulated disk, and see what results that gives us.

 

Check filesystem on emulated disk6. Be sure to use the webUI and not the command line. This time, remove -n (no modify), so it will actually make the changes it wants to make for the repair. If it asks for it, use -L (Unraid has already determined it is unmountable).

 

Capture the output and post it.
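
(For reference only, and assuming the same md device naming as above, the repair without -n corresponds roughly to the sketch below; still run it from the webUI rather than typing it yourself.)

# without -n, xfs_repair is allowed to modify metadata to complete the repair
xfs_repair /dev/md6
# only if it refuses to run because of a dirty log, re-run with -L to zero the log
xfs_repair -L /dev/md6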

Link to comment

Ok, I've done that. It looks like maybe it wants me to run that again?

 

End of output:

Metadata corruption detected at 0x453030, xfs_bmbt block 0x10081ce0/0x1000
libxfs_bwrite: write verifier failed on xfs_bmbt bno 0x10081ce0/0x8
xfs_repair: Releasing dirty buffer to free list!
cache_purge: shake on cache 0x50c6f0 left 5 nodes!?
xfs_repair: Refusing to write a corrupt buffer to the data device!
xfs_repair: Lost a write to the data device!

fatal error -- File system metadata writeout failed, err=117.  Re-run xfs_repair.

 

Full output:

unRAID file system check without -n - disk 6.txt

Edited by siege801
Link to comment

Emulated disk6 mounts and has more than 6TB of data. Repair created a 'lost+found' share on that disk. Check that to see what repair couldn't figure out.

 

I recommend installing Dynamix File Manager plugin. It will let you work with files directly on the server.

Link to comment

This is real progress! Thank you so much again for the help so far.

 

I'm very comfortable on the command line. I've just been working through the output of:

du -ahx --max-depth=1 /mnt/disk6/lost+found/ | sort -k1 -rh | less

 

So far I've determined:

  • 3.3Tb in both /mnt/user/lost+found and /mnt/disk6/lost+found - presumably the same data, but this is to be confirmed.
  • Approximately 9,800 subdirectories within /mnt/disk6/lost+found
  • Approximately 7,500 of these have a directory size > 0
  • 1.9Tb of this is virtual machine images that I already have backed up elsewhere.

 

Notably, I have backups of the irreplaceable data, but there is further data that is not economically feasible to back up. With that said, the more of it I don't have to re-acquire through alternate means, the better.

 

I can spend the day working through the contents of the other sizeable subdirectories of lost+found and come back to you once I've retrieved what is feasibly useful.

 

Questions:

  1. Is it safe to leave the array running? I've stopped the Docker service.
  2. Also, is it safe to move content from /mnt/disk6/lost+found into the correct location under /mnt/user/?

 

In case I haven't mentioned, my sincere gratitude for your guidance so far!

Edited by siege801
Link to comment
3 hours ago, siege801 said:

I'm very comfortable on the command line.

Don't get too comfortable with that. For example

 

3 hours ago, siege801 said:

3.3Tb in both /mnt/user/lost+found, and /mnt/disk6/lost+found - presumably the same data

This is correct, but

3 hours ago, siege801 said:

safe to move content from /mnt/disk6/lost+found into the correct location under /mnt/user/

Absolutely not. Never mix disks and user shares when moving or copying. Linux doesn't know that the source and destination may be paths to the same file, and so will try to overwrite what it is trying to read.
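
If you do move things from the command line, keep both ends of the copy on disk paths (or both on user share paths). A minimal sketch, with made-up folder and share names:

# safe: disk path to disk path (you choose the destination disk explicitly)
rsync -a /mnt/disk6/lost+found/12345/ /mnt/disk1/someshare/recovered/
# risky: mixing a disk path and a user share path in the same command
# rsync -a /mnt/disk6/lost+found/12345/ /mnt/user/someshare/recovered/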

 

 

Link to comment

Since parity2 and disk6 are both disabled, you have no redundancy currently.

 

Since emulated disk6 is mountable, it should be safe to format disk5. On the array operation page, it will show you which disks it thinks need formatting. You can format if disk5 is the only one listed.

 

And you can rebuild only parity2, or rebuild parity2 and also rebuild disk6 to a spare disk at the same time.

 

Original disk6 B5SX can be read as an Unassigned Device to help recover some of that lost+found.
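
As a sketch, once B5SX is mounted by Unassigned Devices you could pull folders from it onto an array disk with something like the line below (the mount point and share names are assumptions; adjust them to whatever UD actually shows):

# copy from the UD mount point of B5SX onto a specific array disk
rsync -av /mnt/disks/B5SX/someshare/ /mnt/disk1/someshare/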

 

Link to comment
20 minutes ago, trurl said:

Never mix disks and user shares

/mnt/disk6/someshare  is just the portion of someshare that is on disk6. All top level folders on pool(s) and array are combined into the user shares.

 

If you create a user share, top level folders named for the share are created on array and pools as needed in accordance with the settings for the share.

 

Conversely, any top level folder on array or pools is automatically part of a user share named for the folder. If you don't make settings for a user share it has defaults. This is how lost+found became a user share.

 

Since all of lost+found is on disk6, all of it was originally on disk6 as part of whatever folder corresponds to the user share it was in.

 

You can work directly with the folders on disk6, or work directly with the user shares including lost+found.
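
As a concrete illustration (the share name here is just an example):

# top-level folders on any array disk or pool...
ls /mnt/disk6/
#   lost+found  someshare
# ...appear merged into the user shares of the same names
ls /mnt/user/
#   lost+found  someshare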

Link to comment
  • 1 month later...

Hi @trurl,

 

Again, I want to repeat how thankful I am for your help. I fully intend on dropping a donation on your link.

 

I've gone ahead and recovered what I need/want from the lost+found. Would you be able to give a little more guidance on how I now get the two disks back into the array?

Link to comment

Thanks @trurl.

 

From what I've read, I believe I can only do one operation at a time.

 

From the documentation:

 

NOTE:

You cannot add a parity disk(s) and data disk(s) at the same time in a single operation. This needs to be split into two separate steps, one to add parity and the other to add additional data space.

 

Just to clarify, the intended outcome is to have 2x 12Tb parity and the rest as data.

 

Thanks again!

lucindraid-diagnostics-20240229-1716.zip

Link to comment
7 hours ago, siege801 said:

You cannot add a parity disk(s) and data disk(s) at the same time in a single operation. This needs to be split into two separate steps, one to add parity and the other to add additional data space.

My understanding is that you are not adding any disks; you are rebuilding existing disks, possibly to replacements.

Link to comment

Oh right, I had thought re-adding counted as "adding", but I see the distinction.

 

The 2Tb disk is a "new" disk but it's replacing one that failed a while ago.

 

The 12Tb parity disk, I believe, is disabled and just needs to have parity rebuilt onto it.

 

And the 12Tb that is currently mounted through UD is going back to the array as it was before.

Link to comment
