cammelspit Posted July 2, 2021
Hey, so my cache drive started failing almost immediately after I installed a brand new drive and set up a BTRFS RAID1 pool. I was under the impression that this meant the data is wholly complete on both drives. However, when I pull the bad drive, the array won't start. I tried putting the good new drive in the pool alone, but it still uses the old drive and won't start without it plugged in. Even though Unraid doesn't have the old drive mounted, for some reason it is still using it in the BTRFS RAID1, and I don't know what to do to make my cache pool a single drive using ONLY the new drive. IMO, the GUI lets me set it as a single drive again, so it seems like a pretty big oversight that I can't un-raid the pool again. I assume I need to do some terminal-fu to fix this, but I don't know what to do for that. Oh, and the old drive is in right now, and it does boot up and work temporarily, but once it starts having write errors the whole server freezes. Since the server runs my pfSense VM for my internet and I work from home, this is a major issue I must have fixed as immediately as humanly possible. Please and thank you to anyone who can help me here!
trurl Posted July 3, 2021
https://wiki.unraid.net/Manual/Storage_Management#Removing_disks_from_a_multi-device_pool
cammelspit Posted July 3, 2021
I really wish that worked. As I previously stated, I already did this, and I just tested it again one step at a time based on the link you posted, and no dice. Simply put, the drive is unmountable when the old failing drive is removed from the pool, and there is nothing I can do about that. I am at my wits' end here. The data is there, I just can't get rid of this failing drive and it's driving me bananas.
JorgeB Posted July 3, 2021
Please post current diagnostics: Tools -> Diagnostics
cammelspit Posted July 3, 2021
connollyserver-diagnostics-20210703-0226.zip
BTW, now it's totally unmountable. That's it, it's just unable to mount anything at all. I even tried remaking the pool, which worked last time, and totally nothing. Following the instructions on how to remove a pool device has now rendered everything unmountable. Fun times...
cammelspit Posted July 3, 2021
Is there any way to use the command line to pull the data off? I mean, redoing every docker from scratch is just a ludicrous amount of work I shouldn't have to do at all. Is this a bug in Unraid where the GUI is unable to remove a drive from a pool? Because I work tomorrow, I had to pull the old bedroom HTPC and install pfSense on that, because the server is 100% down. I know the data is there, which is why I won't be doing anything else at all till I can get it mounted and at least pull the data off of it.
JorgeB Posted July 3, 2021
It's not mounting because you converted the pool to single profile and then removed a device. That's not possible, you can only remove devices from a redundant pool. This might work:
-stop array
-unassign all cache devices
-start array
-type on the console (if you rebooted since the diags make sure the ADATA SSD is still sdb):
    btrfs-select-super -s 1 /dev/sdb1
-stop array
-assign both cache devices, there can't be an "all data on this device will be deleted" warning for any of the cache devices
-start array
-post new diags.
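The "make sure the ADATA SSD is still sdb" step can be checked from the console before running anything against `/dev/sdb1`. This is only a sketch: `find_disk_by_model` is a hypothetical helper name, and it assumes the drive's model string as reported by `lsblk` contains the text "ADATA".

```shell
# Hypothetical helper (not part of Unraid or btrfs-progs): print the kernel
# name (e.g. sdb) of every disk whose model string contains the given text.
# It reads lines in the `lsblk -dn -o NAME,MODEL` layout from stdin, so it can
# also be tried on saved output.
find_disk_by_model() {
    awk -v pat="$1" '
        {
            model = ""
            if (NF > 1) {
                model = $0
                sub(/^[^ ]+ +/, "", model)   # drop the NAME column
            }
            # case-insensitive substring match on the model only
            if (index(toupper(model), toupper(pat)) > 0) print $1
        }
    '
}

# Usage on the server (read-only, changes nothing):
#   lsblk -dn -o NAME,MODEL | find_disk_by_model ADATA
```

If this prints anything other than `sdb`, the device letter changed after the reboot and the `/dev/sdb1` in the commands has to be adjusted accordingly.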
JorgeB Posted July 3, 2021
1 minute ago, cammelspit said: is this a bug in unraid where the GUI is unable to remove a drive from a pool?
No, there is some problem with the pool, probably because of the failing ADATA device, but I can't see what it was in the diags posted.
cammelspit Posted July 3, 2021
connollyserver-diagnostics-20210703-0244.zip
Whatever I just did, it mounted at least. Looking at the command, I assume it forced the ADATA back to being the primary drive? Also, holy heck, I just realized it's almost 3AM and I work in the morning... I am gonna go hit the shower and check back in here once more tonight, but I had to be in bed, like, two hours ago. lol Thanks a bunch for trying to help me out here, it is appreciated! I just know so very little about BTRFS.
JorgeB Posted July 3, 2021
Since the pool is now in single mode and has a possible failing device, you can try to remove it now instead of converting back to raid1 and then removing. But removing a device from a single profile pool can only be done manually. Before starting it's a good idea to make sure backups are up to date, then:
-with the array started type in the console:
    btrfs dev del /dev/sdb1 /mnt/cache
-if the command aborts with errors post new diags, if the command completes without errors and you get the cursor back stop the array
-unassign both cache devices
-start array
-stop array
-assign the Samsung cache device only
-start array
-done
cammelspit Posted July 3, 2021
As much as I would LOVE to do this right now, it's bedtime for me. At least I have the internet working on the bedroom HTPC, which, you may find it interesting to hear, was the pfSense box long ago, before I ever put the server together and before I knew what Unraid even was. It's like homelab resurrection. At least I can work without fear in the morning, then once I'm off and all that jazz, I will do exactly what you recommend. Question though: should I balance as RAID1 again and then re-follow the proper steps, or should I do it this way from the command line as you suggest? Is there any reason it might be better to balance as RAID1 again? Either way, you have gotten it mounted for me, and I will confirm that all my backups are truly up to date. I have been slacking a bit on this and my most recent backup on cloud storage is over a month old. No bueno... I'll let you know how it goes, thanks my dude!
JorgeB Posted July 3, 2021
34 minutes ago, cammelspit said: I do it this way from the command line as you suggest?
I would suggest this since there's a suspect device, so the quicker it's done the better. Note that if there are read errors there will be problems, but it would be the same if you tried to convert to raid1.
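One way to get a rough idea of whether read errors have already been recorded is `btrfs device stats`, which prints per-device error counters for a mounted pool. The following is a sketch only: `devices_with_errors` is a hypothetical helper name, it assumes the usual `[/dev/sdb1].read_io_errs    12` output layout, and a clean result does not guarantee that the removal will succeed, since unread bad sectors are not counted yet.

```shell
# Hypothetical helper: sum the non-zero btrfs error counters per device.
# Reads `btrfs device stats <mountpoint>` output from stdin, with lines like
#   [/dev/sdb1].read_io_errs    12
# and prints "<device> <total errors>" for each device that has any.
devices_with_errors() {
    awk '
        $2 + 0 > 0 {
            dev = $1
            sub(/^\[/, "", dev)        # drop the leading "["
            sub(/\]\..*$/, "", dev)    # drop "].counter_name"
            total[dev] += $2
        }
        END { for (d in total) print d, total[d] }
    ' | sort
}

# Usage on the server (read-only, changes nothing):
#   btrfs device stats /mnt/cache | devices_with_errors
```

No output means all counters are zero. Any line for the suspect device is a hint that `btrfs dev del` may abort partway through.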
cammelspit Posted July 5, 2021
Hey, so I tried running the command, and it seems that it thinks it's in RAID1:
ERROR: error removing device '/dev/sdb1': unable to go below two devices on raid1
Which is weird, because I swear I converted it to single. However, I just noticed, now that it is mounted, that I have this. So is it confused as to whether this is RAID1 or single? Sorry it took so long to try and test this, but I've been real busy at work, and I also forgot how long it takes to do a backup of my appdata. With Plex and its ten and a half billion little files it takes forever, especially with a library as big as mine. Also, here is a new log for good measure.
connollyserver-diagnostics-20210704-2011.zip
JorgeB Posted July 5, 2021
5 hours ago, cammelspit said: Hey so i tried running the command, and it seems that it thinks its in RAID1
Sorry, my fault, I forgot about the metadata, it's still raid1. First convert it to single as well:
    btrfs balance start -f -mconvert=single /mnt/cache
Then do the above.
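The mismatch described here (data already single, metadata still raid1) shows up in `btrfs filesystem df`, which prints one line per chunk type with its profile. As a rough sketch, the hypothetical helper below lists every chunk type that is not yet single. It assumes the common `Metadata, RAID1: total=..., used=...` output layout and skips the `GlobalReserve` line, which is not a real chunk type.

```shell
# Hypothetical helper: list btrfs chunk types whose profile is not "single".
# Reads `btrfs filesystem df <mountpoint>` output from stdin, with lines like
#   Metadata, RAID1: total=2.00GiB, used=1.10GiB
# and prints "<type> <profile>" for each non-single chunk type.
non_single_profiles() {
    awk -F'[,:]' '
        NF >= 2 && $1 != "GlobalReserve" {
            profile = $2
            gsub(/ /, "", profile)     # strip the space after the comma
            if (tolower(profile) != "single") print $1, profile
        }
    '
}

# Usage on the server (read-only, changes nothing):
#   btrfs filesystem df /mnt/cache | non_single_profiles
```

Any line of output (for example `Metadata RAID1`) would explain an "unable to go below two devices on raid1" error from `btrfs dev del`. Empty output means data, metadata and system chunks all report the single profile.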
cammelspit Posted July 5, 2021
I'm running that right now. I will update you when something comes up, because I have no idea how long it will take. *crosses fingers*
cammelspit Posted July 5, 2021
root@connollyserver:~# btrfs balance start -f -mconvert=single /mnt/cache
Done, had to relocate 9 out of 739 chunks
root@connollyserver:~# btrfs dev del /dev/sdb1 /mnt/cache
ERROR: error removing device '/dev/sdb1': Input/output error
connollyserver-diagnostics-20210705-0233.zip
I dunno about you, but for me this feels like it should be way more straightforward. Like, drive broke, take out drive, done! Right?
itimpi Posted July 5, 2021
1 minute ago, cammelspit said: I dunno about you, but for me this feels like it should be way more straightforward. Like, drive broke, take out drive, done! Right?
It cannot be quite that simple, as the default assumption would always be that a 'failed' drive will be replaced.
cammelspit Posted July 5, 2021
Sure, I see your point, but that really isn't a reason for things to not be straightforward as a rule. Maybe I am looking at it from the 'filthy pleb' sort of perspective. It is just weird that one can't simply remove the drive. All things being equal, it does sound pretty simple. I got that Samsung drive barely a couple of weeks ago, so I am confident it's reasonably workable, and I do intend on getting another for a RAID1 pair, but I kinda had my TV die and a few other unforeseen expenses. I don't think it's unusual to assume someone may want to pull a drive out of a pool at some point, or have a very good reason for doing so, but hey, that's just me I guess. 🤷‍♂️
JorgeB Posted July 5, 2021
2 hours ago, cammelspit said: ERROR: error removing device '/dev/sdb1': Input/output error
There are read errors on the failing device, and because of that some data can't be moved to the other one. There's still a lot of data remaining on that device, so you'll need to back up whatever you can to the array or another device and then recreate the cache.
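"Back up whatever you can" implies a copy that keeps going when individual files hit read errors instead of stopping at the first one. A minimal sketch of that idea, using only `find` and `cp`, is below. `salvage_copy` is a hypothetical helper, the destination and log paths in the usage line are placeholders, and file names containing newlines are not handled. Tools such as `rsync` are likely a better fit for a large appdata share.

```shell
# Hypothetical helper: copy every file under SRC to DST, continue past files
# that cannot be read, and record the failed relative paths in LOG.
salvage_copy() {
    src=$1 dst=$2 log=$3
    : > "$log"                                   # start with an empty log
    ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
        mkdir -p "$dst/$(dirname "$rel")"
        if ! cp -p "$src/$rel" "$dst/$rel" 2>/dev/null; then
            rm -f "$dst/$rel"                    # discard a partial copy
            printf '%s\n' "$rel" >> "$log"       # remember what failed
        fi
    done
}

# Usage (destination and log paths are placeholders):
#   salvage_copy /mnt/cache/appdata /mnt/disk1/cache_backup/appdata /tmp/failed.log
```

Afterwards the log lists the files that have to be restored from an older backup or recreated.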
cammelspit Posted July 5, 2021
I was thinking that might be what was needed now. I just have to say thanks though. You have been very helpful and have gone above and beyond in helping, and for that I am grateful. I already have my appdata backed up, which is the important bit, and there are maybe one or two small things I will copy off for convenience, then I'll just recreate the cache from scratch. Again, you have been great. 👍