Swapping disk in cache pool gone wrong?

So, in this specific setup, we had a space issue with cache.

  • Cache pool was 2 x M.2 500GB each, btrfs in RAID1 configuration.

  • We bought 2 x M.2 1TB.

  • I checked btrfs balance and it was declaring raid1 as expected and fully balanced.

  • In "MAIN" tab it showed "Cache" and "Cache 2".

  • Clicking "Cache" shows me btrfs balance status, clicking "Cache 2" doesn't show me such details.

  • I followed the OFFICIAL (in documentation) instructions for replacing a cache disk with a larger one. They differ a bit from the array disk replacement instructions. I followed the cache disk replacement instructions to the letter: https://docs.unraid.net/unraid-os/using-unraid-to/manage-storage/cache-pools/#replace-a-disk-in-a-pool

  • So I stopped server, DID NOT unassign the disk I was going to replace (this is where instructions differ from array disk), replaced one of the M.2 disks and booted server.

  • Server shows missing for one of the cache pool disks (actually the "Cache" one, not the "Cache 2" one - this was because I chose a random M.2 from the two motherboard slots).

  • I assigned the new 1TB to "Cache" and started the array.

  • According to the instructions, this should immediately "show" that the balance is broken and supposedly start balancing onto the new 1TB by itself (i.e. rebuild the raid1). It did not!

  • So I thought, maybe I should start "full balance" myself. So I did. It took almost 1.5 hours.

  • When it finished, I thought I would be able to swap the SECOND 500GB for the second 1TB and run a new balance afterwards.

...and here is where things go boom. Although the 1TB SHOWS as assigned to the pool (see screen grab), the actual balance status shows a device as missing (it still expects the old one?), and the balance I ran before (and waited 1.5 hours for) did nothing?

image.png

image.png

...and now I cannot proceed.

  • I cannot replace the second 500GB with my second 1TB because the first 1TB actually never got built.

  • I cannot unassign the 1TB, because it tells me "too many disks missing"!!! (this is the weirdest part) THIS IS THE WORST PART, AS I CANNOT FOLLOW THE "remove disk from pool" procedure.

  • I cannot put back the "old" 500GB (although it expects it, see warning in screen grab below), because the "old" 500GB does not have proper sync any more (because the live 500GB cache runs VMs and containers and they have worked and changed content since the time I removed the "old" 500GB). So I cannot just put it back.

    image.png

  • I cannot unassign both 1TB and 500GB (and "reset" the cache pool somehow), because it again says "too many disks missing".

  • I cannot even start in maintenance mode when the 1TB, or both the 500GB and the 1TB, are unassigned!!!

    ...in other words the 1TB is empty, BUT STILL NEEDS IT!!!

So now I cannot just "reset" my cache pool (without losing cache contents)!!!

What can I do???

I suspect that I POSSIBLY need to make this a "single" disk pool? (but make sure to keep the 500GB that is in "Cache 2" position now) AND THEN re-add the 1TB?

AND THEN how to replace the second one? Because as it seems, the official instructions ARE NOT VALID.

Right now, so that the system stays in production for now, I just let it work with 1 500GB cache (Cache 2) and 1 1TB (empty and as you can see from first screen grab above... actually unused as there are no reads/writes to it).
The VMs are online and the containers work fine. But there is no cache redundancy, and the space issue is not solved.

I include diags. Help please?

I hope to fix this within the weekend.


Note that the attached diags might show parity rebuilding. This is expected: since the cache issue could not be resolved, I at least swapped a data disk that I wanted to replace with a larger one, so that rebuild finishes by the time I know how to fix the cache.

Note also that my array has plenty of space and in theory I can move VMs and containers (?) to the (slower) array and remove cache fully temporarily if needed?

spezierixl-diagnostics-20260501-2122.zip

Edited by NLS

Solved by JorgeB

  • Community Expert

Was the pool raid1 when you started the replacement?

  • Author
2 hours ago, JorgeB said:

Was the pool raid1 when you started the replacement?

As I say above, before replacing any disk (when there were two 500GB disks) I checked "balance" and it was clearly showing "no balance needed" and "raid1". I didn't keep a screen capture of this because I expected it would work fine.
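
For anyone who wants to record the pool profile from the command line before pulling a disk, here is a minimal sketch. It assumes the pool is mounted at /mnt/cache and that `btrfs filesystem df` prints its usual "Data, RAID1: total=..., used=..." lines; the helper name `all_raid1` is made up for illustration and is not part of Unraid or btrfs-progs.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) only if every Data, Metadata and
# System line in `btrfs filesystem df` output reports the RAID1 profile.
# It reads the command output on stdin, so it also works on saved text.
# GlobalReserve is ignored because btrfs always reports it as "single".
all_raid1() {
    awk -F'[,:]' '
        BEGIN { seen = 0; bad = 0 }
        /^(Data|Metadata|System),/ {
            seen++
            gsub(/ /, "", $2)          # " RAID1" -> "RAID1"
            if ($2 != "RAID1") bad++
        }
        END { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}

# Usage on a live system (assumption: pool mounted at /mnt/cache):
#   btrfs filesystem df /mnt/cache | all_raid1 && echo "pool is raid1"
```

Saving the raw output of `btrfs filesystem df /mnt/cache` to a file before the swap would also give something to compare against afterwards.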

What I need now and I suspect is the solution is this (please tell me if this plan is correct):

  • Actually make it a single disk pool (the one 500GB disk that remained on the machine and has all the latest content - make it NOT need the 1TB disk that is unused after all or the old 500GB).

  • Re-add the 1TB disk and actually this time MAKE IT get a mirror of the 500GB (so maybe make single pool again multi disk pool).

  • Then replace the original 500GB with the second 1TB and again "balance". (but verify it doesn't show as "missing" this time)

I think the instructions are not correct. I probably should have started the array after unassigning one of the 500GB disks and then stopped it again (like we do for array disk replacements), or maybe it needed to balance TWICE? (once with the second disk unassigned and AGAIN with the new disk added?)

What I need now, is a way out, as I have the system not-expanded and not-redundant (so worse than when I started) and for some reason it still "needs" (!?!?!?) the unused 1TB disk. This is very wrong.

image.png

Edited by NLS

  • Community Expert
  • Solution

The instructions are correct; I would need the diags from before rebooting to see what happened. But to address the issue now, you can do this:

With the array running, type:

btrfs device remove missing /mnt/cache

Then stop the array and reimport the pool with the remaining device only.

on main click on the first device for that pool and then "remove pool"

back on main, create a new pool with the same name and 1 slot

assign the good pool device (nvme0n1), leave the filesystem set to auto

start the array to import the pool

Now to create a mirror

Stop array

change slots to 2

Assign the other device

start array to create a mirror.
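
Before running the removal command, it may help to confirm that btrfs itself reports a missing device. The sketch below is an illustration only: it assumes `btrfs filesystem show` marks the condition with a "*** Some devices missing" line (or a MISSING tag on the device line), and the function names are hypothetical. It only prints the command from the steps above and never runs it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if `btrfs filesystem show` output (read on
# stdin) reports a missing pool member.
has_missing_device() {
    grep -Eq 'Some devices missing|MISSING'
}

# Print (do not execute) the removal command only when a device is missing.
# $1 is the pool mount point; the `show` output is read on stdin.
suggest_fix() {
    mountpoint=$1
    if has_missing_device; then
        echo "btrfs device remove missing $mountpoint"
    else
        echo "no missing device reported for $mountpoint"
    fi
}

# Usage on a live system (assumption: pool mounted at /mnt/cache):
#   btrfs filesystem show /mnt/cache | suggest_fix /mnt/cache
```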

  • Author

Thanks I will try these.

  • Author
1 hour ago, JorgeB said:

...

I did and it seems to go ok. This time it doesn't show anything missing and DOES write things to 1TB disk while the contents are actually active (VMs are running and containers).

This will probably take around 1-1:30 h.
Can you follow up (regardless of instructions in documentation) and tell me exactly how to properly remove the (remaining, active) 500G and replace it with the (second, empty) 1TB and re-sync the raid1?

Thank you.

image.png

...attached progress.
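
The same progress can also be read from a terminal. This is a rough sketch that assumes `btrfs balance status` prints its usual "N out of about M chunks balanced (K considered), P% left" line; the helper names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the "P% left" number from `btrfs balance status`
# output read on stdin. Prints nothing when no balance is running.
balance_percent_left() {
    sed -n 's/.* \([0-9][0-9]*\)% left.*/\1/p'
}

# Convert "percent left" into "percent done"; prints nothing if no balance
# is in progress.
balance_percent_done() {
    left=$(balance_percent_left)
    [ -n "$left" ] && echo $(( 100 - left ))
    return 0
}

# Usage on a live system (assumption: pool mounted at /mnt/cache):
#   btrfs balance status /mnt/cache | balance_percent_done
```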

Edited by NLS

  • Author

...btw why are the IDs 2 and 3? Is this correct?

It finishes in 20-30 min. If you can quickly like above give me the correct process to then swap the 500GB with the second 1TB, it would be great.

(worst case scenario, I will follow official docs, in other words, stop, remove disk, add disk, start, without any bringing online of array in the middle steps - unlike when replacing array disk)

(and if it does again the same problem with the second disk, I will do the same recovery steps that you say above, using the first 1TB as the "mother")

Edited by NLS

  • Author

Final (?) update.

After rebooting (with new second 1TB) and assigning the new disk, it started rebuilding the raid1 as it seems (see screen shots). Remember this DID NOT happen when I did the first switch (and why I started this thread).

So I think we are good now, except one thing: For some reason when I had 500GB with data and 1TB empty but system confused, it showed me a total of 750GB size???

Now that I am with 2 x 1TB cache, rebuilding the second 1TB, it still shows 750GB...
Not 500GB, not 1TB. I want to believe that after the syncing, it will rise to 1TB.

after-reboot-1.png

after-reboot-2.png

after-reboot.zip

rebuilding-1-why-750.png

rebuilding-2.png

  • Author

One more screenshot, while it rebuilds...

image.png

...it shows the 500GB as missing. It doesn't show here that balancing IS taking place.

I am waiting to see what will happen after it finishes the balancing.

  • Author

And yes there is another update:

image.png

Now it shows properly as 1TB the cache and DOES show that it is balancing.

It just needed a few minutes to "catch up" with the changes?

  • Community Expert
17 hours ago, NLS said:

Now it shows properly as 1TB the cache and DOES show that it is balancing.

Yep, it's normal to show wrong stats during the balance when one of the older devices was smaller.
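
The thread does not say how the GUI arrived at 750GB. One plausible reading, stated here purely as an assumption, is that it is half of the raw total of a 500GB and a 1TB device, whereas a two-device raid1 can only hold as much as its smaller member. The arithmetic, with hypothetical function names and sizes in GB:

```shell
#!/bin/sh
# Usable capacity of a TWO-device raid1 pool: every block is stored on both
# devices, so the smaller device is the limit.
raid1_usable_gb() {
    if [ "$1" -lt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Naive estimate: half of the raw total. ASSUMPTION: this is only a guess at
# how a 750GB figure could arise; the thread does not confirm it.
half_of_total_gb() {
    echo $(( ($1 + $2) / 2 ))
}

raid1_usable_gb 500 1000    # 500  (mixed pool, real limit)
half_of_total_gb 500 1000   # 750  (the figure seen during the swap)
raid1_usable_gb 1000 1000   # 1000 (after both disks are 1TB)
```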
