DrJake Posted December 29, 2021

Hi all, I currently have 2 drives assigned to the cache pool:
- 1TB SATA SSD
- 500GB NVMe SSD (which I intend to remove from the pool)

I read the FAQ on removing a cache drive, but I've run into issues and I'm not sure how to proceed. I'm running Unraid 6.9.2 and followed the instructions to the letter:
1. stop the array
2. unassign the pool disk to remove
3. did not do any re-ordering
4. start the array (after checking the "I'm sure" box next to the start array button)

The system is asking me if I want to format the cache2 drive, which is the one I intend to keep in the pool. The drive that was removed from the pool could not be mounted as an unassigned device; I had planned to do that later, after the cache pool was sorted. Having encountered this problem, I didn't really want to deal with a server issue during the holidays... I tried to add the removed drive back into the pool, but it seems I can't anymore. I read on the forum about physically removing the drive, but that was for an older version of Unraid, so I wanted to check with the pros before shutting down the server. Help... what are my options at this stage?
DrJake Posted December 29, 2021 (edited)

So I guess I'm just trying to recover the data on my cache drive at this point. The 2 pooled drives were set up as raid1 in case one of them failed, but at the moment I can't figure out or find information about how to recover the data. I just saw this thread and think I might be one of these cases. I can't remember when I set up the redundancy, but I recall it was the Unraid version where the pooled cache feature was first introduced. I believe I have not done anything irreversible (have not formatted any of the drives, have not rebooted the server), but I think I need some expert help...

tower-diagnostics-20211229-1432.zip

Edited December 29, 2021 by DrJake
JorgeB Posted December 29, 2021

The pool wasn't configured correctly before, i.e., only the NVMe device was part of it, and by unassigning it the device was wiped:

Dec 29 11:31:17 Tower emhttpd: shcmd (1202764): /sbin/wipefs -a /dev/nvme1n1p1
Dec 29 11:31:17 Tower root: /dev/nvme1n1p1: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d

This usually works in this situation; type in the console:

btrfs-select-super -s 1 /dev/nvme1n1p1

If the command is successful (there's no error), then reset the pool:
- if Docker/VM services are using the cache pool, disable them
- unassign all cache devices
- start the array to make Unraid "forget" the current cache config
- stop the array
- reassign only the NVMe device (there can't be an "All existing data on this device will be OVERWRITTEN when array is Started" warning for any cache device)
- re-enable Docker/VMs if needed
- start the array
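A side note on why this recovery works: the log excerpt above shows wipefs erasing only 8 bytes at offset 0x00010040, which is where btrfs keeps the magic string of its primary superblock. Those hex bytes decode to the ASCII string "_BHRfS_M" (this is the standard btrfs magic, not anything specific to this system), which is why copying a backup superblock over the primary with btrfs-select-super restores the filesystem:

```shell
# Decode the 8 bytes wipefs reported erasing: they are the btrfs
# superblock magic "_BHRfS_M", stored at offset 0x10040 of the partition.
# Since only this magic was wiped, btrfs-select-super -s 1 can copy the
# intact backup superblock back over the primary.
magic=$(printf '\x5f\x42\x48\x52\x66\x53\x5f\x4d')
echo "$magic"
# prints: _BHRfS_M
```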
DrJake Posted December 29, 2021

Oh wow, thank you so much Jorge. The server is back to life now, with VMs and Dockers all working fine. You've saved me a lot of time reconfiguring my VMs and security cameras.

So, back to what I was trying to do in the beginning. I'm trying to swap out the 500GB NVMe the cache data is currently on, and replace it with a 1TB NVMe (which currently has data and is used as an unassigned device) pooled with a 1TB SATA SSD. Following what I originally planned, I guess:
1. add the 1TB SATA SSD to the cache pool (resulting in the redundant cache pool I should have had from the beginning)
2. remove the 500GB NVMe from the cache pool (the 1TB SATA SSD should still function, right?)
3. do my data transfer, so the 1TB NVMe is free
4. add the 1TB NVMe to the cache pool

Is there a better way? I'm a bit worried about steps 2 and 4 because of what I just went through. Would the 1TB SATA SSD be a functional cache drive by itself after step 2?
JorgeB Posted December 29, 2021

3 hours ago, DrJake said:
remove the 500GB NVMe from the cache pool (the 1TB SATA ssd should still function, right?)

Yes, as long as the device is correctly added to the pool. If in doubt you can post diags after doing it; it usually works without issues, but sometimes it doesn't.
DrJake Posted December 30, 2021

Hi Jorge, something is not right. I'm only at step 1, but I don't think the device was correctly added to the pool (for reasons unknown to me). It says the 1TB SATA SSD is "part of a pool". I tried running "balance" in the GUI for Cache (nvme1n1) twice; each time it completed without complaint. FYI, it would not allow me to "convert to raid1 mode" here. Afterwards, running "btrfs fi usage -T /mnt/cache", it doesn't even look like the device is in the pool. I tried to run "btrfs balance start -mconvert=raid1 /mnt/cache" as posted in the other thread and got an error; syslog says:

Dec 30 11:48:38 Tower kernel: BTRFS error (device nvme1n1p1): balance: invalid convert metadata profile raid1

P.S. I tried stopping the array and adding the 2nd cache device twice; same issues. syslog and new diagnostics attached.

syslog.txt tower-diagnostics-20211230-1151.zip
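The syslog error above is consistent with the failed device add: btrfs rejects a -mconvert=raid1 balance when the pool only contains a single device, since raid1 needs two copies on two devices. One way to see how many devices btrfs actually has in the pool is to count "devid" lines in `btrfs filesystem show`. The sample output below is purely illustrative (made-up label, uuid, and sizes), not taken from the diagnostics:

```shell
# On the live system you would run: btrfs filesystem show /mnt/cache
# Illustrative sample output for a single-device pool (values made up):
sample='Label: none  uuid: 00000000-0000-0000-0000-000000000000
        Total devices 1 FS bytes used 40.00GiB
        devid    1 size 465.76GiB used 42.00GiB path /dev/nvme1n1p1'

# Each pool member gets its own "devid" line, so counting them gives
# the number of devices btrfs sees.
ndev=$(echo "$sample" | grep -c 'devid')
echo "devices in pool: $ndev"
# prints: devices in pool: 1
# With only 1 device, "btrfs balance start -mconvert=raid1" fails with
# "invalid convert metadata profile raid1", matching the syslog error.
```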
JorgeB Posted December 30, 2021

Yes, it failed to add the device to the pool. Try this:
- stop the array
- unassign cache2
- start the array
- to completely wipe the device, type in the console: blkdiscard /dev/sdc
- reboot
- try again; post new diags if it still fails
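Since blkdiscard irreversibly discards everything on the target device, a small guard before running it doesn't hurt. This is only a sketch: /dev/sdc is taken from the post above (always confirm the device letter in lsblk first, as letters can change between boots), and the fake mount table exists purely so the check can be demonstrated without touching real hardware:

```shell
# Refuse to wipe a device that still appears in the mount table.
confirm_unmounted() {
  dev=$1
  table=${2:-/proc/mounts}   # second arg lets us demo against a fake table
  if grep -q "^$dev" "$table"; then
    echo "refusing: $dev is mounted"
    return 1
  fi
  echo "ok to wipe $dev"
}

# Demonstration against a fake mount table (no real devices touched):
printf '/dev/sda1 / ext4 rw 0 0\n' > /tmp/fake_mounts
confirm_unmounted /dev/sdc /tmp/fake_mounts    # prints: ok to wipe /dev/sdc

# The real step from the post above would then be:
# confirm_unmounted /dev/sdc && blkdiscard /dev/sdc
```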
Solution DrJake Posted December 30, 2021

Thanks for getting back to me Jorge. I suspected it was an issue with the cache pool; something is still bugging out. So I ended up taking the long way around:
1. transferred all the cache data back to the array (for peace of mind as well)
2. deleted the cache pool
3. rebooted the server
4. recreated the cache pool (at some point I needed to use the "btrfs-select-super -s 1 /dev/nvme1n1p1" command again, because the system was preventing me from mounting the drive as an unassigned device)
5. now I think everything is in working order; the mover is still moving data from the array back onto the new cache pool

So just to confirm, this means the cache pool is working in a RAID1 config, right? Does the ID number matter as to which drive data will be read from or written to? Because one is SATA and one is NVMe.
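For anyone wanting to verify the raid1 question themselves: `btrfs filesystem df /mnt/cache` reports the profile per chunk type, and a fully converted two-device pool shows RAID1 for data, metadata, and system. The sample output below is illustrative (sizes made up), not from this server:

```shell
# On the live pool you would run: btrfs filesystem df /mnt/cache
# Illustrative sample output for a converted raid1 pool (sizes made up):
sample='Data, RAID1: total=100.00GiB, used=40.00GiB
Metadata, RAID1: total=2.00GiB, used=512.00MiB
System, RAID1: total=32.00MiB, used=16.00KiB'

# All three chunk types should report RAID1; a line still showing
# "single" would mean a balance convert is needed.
raid1_lines=$(echo "$sample" | grep -c 'RAID1')
echo "RAID1 chunk types: $raid1_lines"
# prints: RAID1 chunk types: 3
```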
JorgeB Posted December 30, 2021

4 hours ago, DrJake said:
this means the cache pool is working in RAID1 config right?

Yep.

4 hours ago, DrJake said:
Does the ID number matter as to which drive data will be read/written from/to?

No. With raid1 it will always be written to both devices, and reads alternate between both devices according to even/odd PIDs.
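The even/odd PID behaviour mentioned above is btrfs's simple raid1 read-balancing rule: the reading process's PID modulo the number of copies selects which mirror is read. A toy illustration of the selection rule (this is not an actual btrfs command, just the arithmetic):

```shell
# raid1 read selection: mirror index = reader PID % number of copies
# (2 copies in a two-device raid1 pool).
pick_mirror() {
  echo $(( $1 % 2 ))
}

pick_mirror 1234   # even PID -> mirror 0
pick_mirror 1235   # odd PID  -> mirror 1
# A mix of reader PIDs therefore spreads reads across both the SATA and
# NVMe devices; the pool slot/ID number doesn't control which copy is read.
```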
DrJake Posted December 30, 2021 (edited)

Thank you again Jorge, you can mark this one as resolved. Lucky I encountered this issue without actually losing the cache data.

Edited December 30, 2021 by DrJake