ptr727 Posted May 10, 2020

Hi, I'm wondering if BTRFS is the right solution for a resilient cache / storage solution? I run two Unraid servers: primary 40TB disk plus 2TB cache (4 x 1TB SSD), secondary 26TB disk plus 2TB cache (4 x 1TB SSD).

On two occasions I've lost my entire cache volume due to one of the drives "failing". I say failing, but really both times it was my own fault: I didn't want to shut down, I pulled the wrong drive, and I immediately plugged it back in. But this is no different to a drive failing or a connection failing. Pulling disks during certification of large resilient storage systems is a perfectly good test.

One would expect the loss of 1 disk in a 4-disk BTRFS RAID10 config to be a non-issue. Not so. First the log started showing BTRFS corruption errors, which apparently were not being fixed automatically. Then I ran a cache scrub: no errors reported, but still errors in the log. I ran a scrub with repair, which reported the errors as repaired. Then I started getting docker write failures; it seems my cache had become read-only and BTRFS was corrupt. In both cases I resorted to rebuilding the cache from scratch and restoring appdata backups, and I lost the VMs (unlike docker stop/restart, there is no easy way to back up VMs).

I've run hardware RAID for a long time, including hardware that uses SSD caching. I've lost disks and pulled disks, but in all cases the array eventually came back on its own. I simply do not have the same trust in Unraid's cache. I think it is fragile, and unreliable to the point where it needs to be backed up constantly.

I'd like to see Unraid/Limetech publish their resiliency test and performance plans. What is tested for, what are the known failure scenarios, what are the known recoverable scenarios, and are my expectations of resiliency and performance unfounded?
And this is not about BTRFS, this is about Unraid. I don't care what Unraid uses for the cache volume: it could have supported SSDs in data volumes so that no cache would be required, or it could have used ZFS and we would have different problems. BTRFS was an Unraid choice, and I find it fragile. What are your experiences with cache resiliency?
administrator Posted June 5, 2020

I was using BTRFS for 3 years with no problems, but now I've had corruption twice, about 5 months apart. It's time-consuming to format and restore the data, so I won't be using BTRFS for the foreseeable future.