Is my cache pool working properly?

July 21, 2015

I just took two brand new 500GB SSDs and installed them in my server, created two cache slots, assigned the SSDs and started the array. It said the first drive in the pool was unmountable and needed formatting so I did that. I started using the cache to set up some dockers, etc. but then I looked at the dashboard and something doesn't seem right.

It shows that the size of the pool is 1TB (if this is RAID-1 shouldn't it be 500GB?). Used and free space for cache drive look correct but on the "pool of two devices" line it shows 516GB used which makes no sense.

Also, I tried stopping the array and unassigning the cache 2 drive. When I start the array, it shows the remaining assigned cache disk as unmountable and asks me to format it again. This can't be right. When I assign cache 2 again, everything is "working" again.

I tried running a balance command which seems to run ok but it has no effect on the above. After running the balance, if I go back to the cache page where I execute the balance utility it says "No balance found on '/mnt/cache'".

Seems to me that based on this behavior, if a cache drive were to fail, I would not have a working cache!

Could someone please let me know what's going on here? Log and screenshot attached! Thanks!

log.txt

July 21, 2015

Makes sense to me, actually. But hey, I'm weird!

The pool comprises two disks, 500GB each, so it has the size of 500GB+500GB=1000GB. How the 1000GB are used doesn't make any difference, there are still 1000GB in total in the pool.

Those 1000GB can be used in different ways. Since the pool is set to RAID-1 redundancy in this case, and the pool has two disks of equal size, half of that space is used for redundancy. This leaves 500GB to use for storage, and you use about 16GB of that for dockers and stuff, which leaves a total of 500GB-16GB=484GB free and 500GB+16GB=516GB used.
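The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the explanation in this post: the function name is made up, and it assumes a two-disk RAID-1 pool of equal-size disks where the dashboard counts the mirror copy as "used" space.

```python
def raid1_pool_summary(disk_gb, data_gb):
    """Reproduce the dashboard figures for a two-disk RAID-1 pool.

    Assumption (from the explanation above, not from unRAID itself):
    the pool "size" is the raw sum of both disks, and the half reserved
    for the mirror copy is reported as used.
    """
    raw_total = disk_gb * 2        # 500 + 500 = 1000
    usable = raw_total // 2        # half of the raw space holds the mirror
    free = usable - data_gb        # space still available for new data
    used = raw_total - free        # mirror half plus the data actually stored
    return {"size": raw_total, "used": used, "free": free}


summary = raid1_pool_summary(disk_gb=500, data_gb=16)
print(summary)
```

With 500GB disks and about 16GB of data, this gives a size of 1000, 516 used and 484 free, matching the numbers on the dashboard.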

July 21, 2015

Totally normal.

July 21, 2015

Author

Ok that logic makes sense... but why does the first cache drive show as unmountable and the system wants to format it when I unassign the cache 2 drive? Shouldn't the data still be available on cache 1? Isn't that the point of a redundant cache pool?

July 21, 2015

Author

Further to my question above, how can I simulate a drive failure in the cache pool to test that it's working? When I unassign the cache 2 drive and restart the array, the other cache disk comes up as unmountable and unRAID asks to format it. I was led to believe by Limetech that the cache would still operate in a degraded state if a drive were to fail.

Hoping someone can help me out with some answers here...

July 21, 2015

Community Expert

If you want to simulate a failure, then you should not unassign the disk. Instead, leave it assigned and do something like disconnecting its power or SATA cable.

July 21, 2015

Author

Ok I can do that... will try it shortly.

So then, by unassigning the cache 2 disk, when I restart the array and it shows the other cache disk as unmountable, is this normal, expected behavior?

July 22, 2015

Author

I stopped the array, powered down and disconnected cache 2. Then I booted back up and the cache was functional. The log showed that a balance was underway, and when it finished it printed a line stating "Tower kernel: BTRFS info (device sdf1): disk deleted missing", which I believe is correct.

HOWEVER, I then powered back down, reconnected cache 2 and powered back up. Although the Main tab shows both drives as operational, as before, when I click on the cache link it states the following under the Pool information heading:

Label: none uuid: 0f3b251d-d4f8-4786-9cc4-e79c77251a11

Total devices 1 FS bytes used 15.17GiB

devid 1 size 465.76GiB used 20.03GiB path /dev/sdh1

btrfs-progs v4.0.1

...and when I try to run a balance it prints the following in the log:

Jul 21 20:20:28 Tower php: /sbin/btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache &>/dev/null &

Jul 21 20:20:28 Tower kernel: BTRFS error (device sdh1): unable to start balance with target data profile 16

Why can't I run the balance on the pool? Don't I need to re-balance now that I've added the second drive back in to the pool?
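One possible reading of the kernel message, offered as a sketch rather than a confirmed diagnosis: the "16" in "target data profile 16" appears to be the btrfs block group flag for RAID1 (bit 4 in the kernel's btrfs headers), and RAID1 needs at least two devices, while the pool information above reports "Total devices 1". The helper names below are hypothetical and only illustrate that reasoning.

```python
# Block group profile bits as defined in the Linux kernel's btrfs headers.
BLOCK_GROUP_PROFILES = {
    1 << 3: "raid0",
    1 << 4: "raid1",
    1 << 5: "dup",
    1 << 6: "raid10",
}

# RAID1 keeps two copies on different devices, so it needs at least two.
RAID1_MIN_DEVICES = 2


def decode_profile(value):
    """Map the number from the kernel log to a profile name, if known."""
    return BLOCK_GROUP_PROFILES.get(value, "unknown")


def can_convert_to_raid1(total_devices):
    """True if the pool has enough member devices for a RAID1 conversion."""
    return total_devices >= RAID1_MIN_DEVICES


print(decode_profile(16))        # profile named in the kernel error
print(can_convert_to_raid1(1))   # pool currently reports "Total devices 1"
```

If this reading is right, the reconnected disk is no longer a member of the pool after "disk deleted missing", so a RAID1 balance has nothing to mirror onto until the second device is added back.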

July 22, 2015

Author

So I decided to just blow away the cache pool and start over again since I had just started playing with dockers and didn't have much to lose. I assigned one drive to cache, formatted it, stopped the array, assigned cache 2, started the array and then ran a balance. I still had dockers enabled to use a 15G image so that's why some of the space is used.

Now I see the following:

Label: none uuid: c15451c8-8f62-45a4-b6b1-c3eba515dc09

Total devices 2 FS bytes used 15.00GiB

devid 1 size 465.76GiB used 17.03GiB path /dev/sdh1

devid 2 size 465.76GiB used 17.03GiB path /dev/sdm1

btrfs-progs v4.0.1
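For anyone who wants to check this kind of output after a simulated failure, here is a minimal sketch that parses the pool information text shown above and reports how many member devices it lists. The function names are made up for illustration, and the sample text is the output pasted in this post. It only checks pool membership, not the redundancy profile.

```python
import re

# Pool information as pasted above (blank lines removed).
SAMPLE = """\
Label: none uuid: c15451c8-8f62-45a4-b6b1-c3eba515dc09
Total devices 2 FS bytes used 15.00GiB
devid 1 size 465.76GiB used 17.03GiB path /dev/sdh1
devid 2 size 465.76GiB used 17.03GiB path /dev/sdm1
"""

DEVID_RE = re.compile(r"devid\s+(\d+)\s+size\s+(\S+)\s+used\s+(\S+)\s+path\s+(\S+)")


def parse_pool_info(text):
    """Extract the device count and the per-device lines from the text."""
    total = re.search(r"Total devices (\d+)", text)
    devices = [
        {"devid": int(m.group(1)), "size": m.group(2),
         "used": m.group(3), "path": m.group(4)}
        for m in DEVID_RE.finditer(text)
    ]
    return {
        "total_devices": int(total.group(1)) if total else None,
        "devices": devices,
    }


def looks_like_two_disk_pool(info):
    """True when the header and the devid lines both show two members."""
    return info["total_devices"] == 2 and len(info["devices"]) == 2


info = parse_pool_info(SAMPLE)
print(info["total_devices"], [d["path"] for d in info["devices"]])
print(looks_like_two_disk_pool(info))
```

Run against the output from my previous post, which shows "Total devices 1" and a single devid line, the same check would report that the pool no longer has two members.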

This makes me nervous about recovering from a failed drive in the cache pool. When I simulated the failure, the cache was still fine, but when I reconnected the "failed" drive, the pool no longer included it, as I explained in my previous post.

Can anyone provide some insight into this? Before I start installing and configuring production dockers and VMs I want to be sure that this all works properly and I can recover from a drive failure. Did I do something wrong when I did my drive failure simulation?

July 22, 2015

Community Expert

There was some discussion over here. Don't know of anybody that has posted about success.

July 22, 2015

Author

LOL I was the one asking those questions :-)

Hoping Limetech can chime in on this one again!
