LittleMike Posted August 12, 2021
So I'm sure I screwed up something somewhere, and I'm looking for some assistance.
I had 2x 512GB SSDs in a cache pool (default BTRFS RAID1). One of the drives died a few months back. I was able to remove the bad drive with no errors, and everything seemed okay (though under the hood, it may not have been). Recently I bought a 1TB Samsung SSD, put it in, and added it to the cache pool. Everything still seemed okay. Then the second 512GB drive started giving me errors. While screwing around trying to remove it and just use the 1TB, I started getting BTRFS pool profile errors. I ran a balance and the error went away.
However, my 1TB drive is showing as 70% used. Looking at btrfs fi, I'm not sure if it's correct. I am not versed well enough in btrfs to know for sure, but it looks like some of my data is possibly duplicated, though not all of it. Maybe a balance didn't complete and I screwed something up? The original 512GB drive is still in the machine but is now an unassigned device. I was going to try to re-add it to the pool, but I got the warning that all data would be wiped, so I decided against it.
Could someone just take a peek and see if this looks right?
Data, RAID1: total=75.00GiB, used=55.18GiB
Data, single: total=194.00GiB, used=147.93GiB
System, RAID1: total=32.00MiB, used=64.00KiB
System, single: total=32.00MiB, used=0.00B
Metadata, RAID1: total=1.00GiB, used=455.95MiB
Metadata, single: total=12.00GiB, used=9.98GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
My concern is the RAID1 entries. I should be Single now, I think. But that's where I probably screwed up.
Attaching diagnostics in case it helps: blackpearl-diagnostics-20210811-2218.zip
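Editor's note: the listing above can be summarised per profile to see how much data sits in each one. The following is a minimal Python sketch, not part of the original thread. It assumes the text is in the `total=…, used=…` layout shown in the post; the names `parse_df`, `used_by_profile`, and `ChunkUsage` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict, namedtuple

# Binary unit multipliers as they appear in the listing above.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# One parsed line: block group kind (Data, Metadata, ...), profile, sizes in bytes.
ChunkUsage = namedtuple("ChunkUsage", "kind profile total used")

LINE_RE = re.compile(
    r"^(\w+),\s*(\w+):\s*"
    r"total=([\d.]+)((?:[KMGT]i)?B),\s*"
    r"used=([\d.]+)((?:[KMGT]i)?B)$"
)

# The numbers posted in the thread, copied verbatim.
SAMPLE = """\
Data, RAID1: total=75.00GiB, used=55.18GiB
Data, single: total=194.00GiB, used=147.93GiB
System, RAID1: total=32.00MiB, used=64.00KiB
System, single: total=32.00MiB, used=0.00B
Metadata, RAID1: total=1.00GiB, used=455.95MiB
Metadata, single: total=12.00GiB, used=9.98GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""


def parse_df(text):
    """Parse lines of the form 'Kind, profile: total=X, used=Y'; skip others."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        kind, profile, total, tunit, used, uunit = match.groups()
        rows.append(ChunkUsage(kind, profile,
                               float(total) * UNITS[tunit],
                               float(used) * UNITS[uunit]))
    return rows


def used_by_profile(rows, kind="Data"):
    """Return {profile: used bytes} for one block group kind."""
    usage = defaultdict(float)
    for row in rows:
        if row.kind == kind:
            usage[row.profile] += row.used
    return dict(usage)


if __name__ == "__main__":
    data = used_by_profile(parse_df(SAMPLE), "Data")
    total = sum(data.values())
    for profile, used in sorted(data.items()):
        print(f"Data {profile}: {used / 1024**3:.2f} GiB "
              f"({used / total:.0%} of used data)")
```

More than one profile showing up for the same kind, as here, suggests a profile conversion that did not finish, which matches the aborted balance discussed in the replies below.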
JorgeB Posted August 12, 2021
sdg is failing, and because of that the balance aborted. Since there's a lot of data using the single profile on that device, you can't just remove it. Copy everything you can from the pool, then re-format with just the good device.
LittleMike Posted August 12, 2021
5 hours ago, JorgeB said: "sdg is failing, and because of that the balance aborted. Since there's a lot of data using the single profile on that device, you can't just remove it. Copy everything you can from the pool, then re-format with just the good device."
When you say copy everything from the pool, do you mean both devices? My concern is that neither contains all the data.
JorgeB Posted August 12, 2021
Pool is still both devices.
trurl Posted August 12, 2021
And you can't work with them separately.
LittleMike Posted August 12, 2021
7 minutes ago, JorgeB said: "Pool is still both devices."
2 minutes ago, trurl said: "And you can't work with them separately."
Understood. So forgive the noobish question: what's the best way to do that? And more importantly, what's the best way to restore it? If I do an rsync of /mnt/cache, is that going to grab everything? And then format the good drive, remove the bad one, then rsync back? Will it then see it as one profile? Is the profile just metadata, and if so, how do I prevent it from being restored along with the data?
Oh, and when you say format, did you mean the cache drive or the unRAID OS drive, just to clarify?
JorgeB Posted August 12, 2021
1 hour ago, LittleMike said: "If I do an rsync /mnt/cache is that going to grab everything?"
That depends on the state of the failing device. Also, if you use the array as the destination, make sure you rsync to a disk, or use /mnt/user0/share.
1 hour ago, LittleMike said: "And then format the good drive, remove the bad one, then rsync back? Will it then see it as one profile?"
Make a new pool with only the remaining device and format it; you can wipe it first.
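Editor's note: the copy step described above could be scripted roughly as follows. This is a minimal Python sketch, not something posted in the thread. The destination `/mnt/disk1/cache_backup` is a hypothetical path chosen for illustration, and `build_rsync_cmd` and `run_backup` are made-up helper names. The rsync options used (`-a`, `-H`, `--progress`, `--dry-run`) are standard, but the exact set is an assumption about what suits this pool.

```python
import subprocess


def build_rsync_cmd(src, dest, dry_run=False):
    """Build an rsync argument list that copies the contents of src into dest.

    A trailing slash on the source tells rsync to copy the directory's
    contents rather than the directory itself.
    """
    cmd = ["rsync", "-a", "-H", "--progress"]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied, change nothing
    cmd += [src.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


def run_backup(src, dest, dry_run=True):
    """Run rsync and return its exit code without raising on failure.

    With a failing device some files may be unreadable, in which case rsync
    finishes with a non-zero code (23 means a partial transfer), so the code
    is returned for the caller to inspect.
    """
    result = subprocess.run(build_rsync_cmd(src, dest, dry_run), check=False)
    return result.returncode


if __name__ == "__main__":
    # Hypothetical destination on an array disk; adjust to a real disk path.
    print(" ".join(build_rsync_cmd("/mnt/cache", "/mnt/disk1/cache_backup")))
```

Running with `dry_run=True` first gives a chance to check the source and destination paths before any data is written.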
LittleMike Posted August 12, 2021
6 minutes ago, JorgeB said: "Depends on the state of the failing device, also and if using the array as destination make sure you rsync to a disk, or use /mnt/user0/share. Make a new pool of the remaining device only and format it, you can wipe it first."
Okay. So let me see if I got this right:
1. Copy everything from /mnt/cache somewhere (rsync to /mnt/user0/share or to a disk, or WinSCP to another machine, whatever, correct?)
2. Create a new cache pool of 1 device using the new/working drive
3. Format new drive
4. Copy everything backed up to new drive
5. Remove old pool
6. Restart docker services/VMs etc.
Is that it? Just copying back all of the data will line everything up correctly? Is that because the configuration is on the OS drive? Or is the profile information stored on the cache pool instead? I'm just trying to figure out how this happened in the first place so I can prevent it from happening again.
trurl Posted August 12, 2021
13 minutes ago, LittleMike said:
1. Remove old pool
2. Create a new cache pool of 1 device using the new/working drive
3. Format new drive
4. Copy everything backed up to new drive
Changed the order for you. You don't want a different pool (name) in the end, or you will have to reconfigure some things to use a different pool.
LittleMike Posted August 12, 2021
3 minutes ago, trurl said: "Changed the order for you. You don't want a different pool (name) in the end or you will have to deal with reconfiguring some things to use a different pool."
Okay, so remove the old pool first. I didn't even realize you can name the pools. I should name the new one just "Cache" like the existing one. Is that why you suggest removing the old one first, because the defaults should then do what I want?
trurl Posted August 12, 2021
If you have more than one pool, they must have different names so you can work with them separately. "Cache" is the name of the pool from pre-6.9 releases, so that is what your shares are likely configured to use, along with any path to cache you might have specified explicitly.
LittleMike Posted August 12, 2021
1 minute ago, trurl said: "If you have more than one pool they must have different names so you can work with them separately. "Cache" is the name of the pool from pre 6.9 releases, so that is what your shares are likely configured to use and any path to cache you might have specified explicitly."
Ah, that totally makes sense. Yeah, this is on 6.9.2 but was set up about 3 years ago on whatever release was current then, so definitely pre-6.9.
Okay, I'm currently backing up the contents of /mnt/cache. I should just need appdata, domains, downloads, and system, right? Do I need to back up from anywhere else, like /mnt/user0?
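Editor's note: before wiping the pool, it may be worth checking that the backup really holds every file from /mnt/cache. The following is a minimal Python sketch of such a check, not part of the original thread. It compares file names and sizes only (not contents), and `list_files` and `compare_trees` are hypothetical helper names. The backup path in the usage example is likewise an assumption.

```python
import os


def list_files(root):
    """Map each file's path, relative to root, to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # lstat does not follow symlinks, so broken links do not raise.
            sizes[os.path.relpath(full, root)] = os.lstat(full).st_size
    return sizes


def compare_trees(src_root, backup_root):
    """Return (missing, mismatched) lists of relative paths.

    missing:    files present under src_root but absent from backup_root
    mismatched: files present in both whose sizes differ
    """
    src = list_files(src_root)
    backup = list_files(backup_root)
    missing = sorted(path for path in src if path not in backup)
    mismatched = sorted(path for path in src
                        if path in backup and src[path] != backup[path])
    return missing, mismatched


if __name__ == "__main__":
    # Hypothetical backup location; replace with wherever the copy was made.
    missing, mismatched = compare_trees("/mnt/cache", "/mnt/disk1/cache_backup")
    print(f"{len(missing)} missing, {len(mismatched)} with a different size")
```

A size match does not prove the contents are intact, so this is only a quick sanity check, not a substitute for a checksum comparison.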
LittleMike Posted August 12, 2021
Okay, so I think there were some steps missing. I deleted the pool, started and stopped the array, created a new pool, and started the array again. Nothing had changed, and it didn't give me the option to format. So I stopped the array, clicked on the cache pool, and selected Erase. When I restarted the array, it yelled at me: "Unmountable: Unsupported partition layout". Going to try messing around to get it formatted.
*EDIT* What worked was a combination of steps: delete the pool, start/stop the array, create the pool with no drive assigned, start/stop the array, then assign the drive. At that point the Format checkbox appeared. Now to copy my data back over and cross my fingers.
Edited August 12, 2021 by LittleMike