DanielPetrica Posted August 11, 2022 (edited)

Hi everyone, I messed up my Unraid cache today while trying to remove a disk from my array.

System info:
Unraid version: 6.10.3
Diagnostics zip: tower-diagnostics-20220811-1301.zip

I was trying to follow the "Clear Drive Then Remove Drive" method from the Shrink array guide here: https://wiki.unraid.net/Shrink_array#The_.22Remove_Drives_Then_Rebuild_Parity.22_Method. While preparing disk3 I changed its filesystem and wrongly chose btrfs. Then I started the array and formatted disk3, as it was the only disk appearing under the format option. I think this is the step that messed up my system, since btrfs is the same filesystem as my cache.

Then I created the user script as described in the guide and ran it, but it wouldn't clear the disk even though the CLI showed only the specified "clear-me" folder. Here's the code I placed inside it:

#!/bin/bash
# A script to clear an unRAID array drive.  It first checks the drive is completely empty,
# except for a marker indicating that the user desires to clear the drive.  The marker is
# that the drive is completely empty except for a single folder named 'clear-me'.
#
# Array must be started, and drive mounted.  There's no other way to verify it's empty.
# Without knowing which file system it's formatted with, I can't mount it.
#
# Quick way to prep drive: format with ReiserFS, then add 'clear-me' folder.
#
# 1.0  first draft
# 1.1  add logging, improve comments
# 1.2  adapt for User.Scripts, extend wait to 60 seconds
# 1.3  add progress display; confirm by key (no wait) if standalone; fix logger
# 1.4  only add progress display if unRAID version >= 6.2

version="1.4"
marker="clear-me"
found=0
wait=60
p=${0%%$P}   # dirname of program
p=${p:0:18}
q="/tmp/user.scripts/"

echo -e "*** Clear an unRAID array data drive *** v$version\n"

# Check if array is started
ls /mnt/disk[1-9]* 1>/dev/null 2>/dev/null
if [ $? -ne 0 ]
then
   echo "ERROR: Array must be started before using this script"
   exit
fi

# Look for array drive to clear
n=0
echo -n "Checking all array data drives (may need to spin them up) ... "
if [ "$p" == "$q" ]   # running in User.Scripts
then
   echo -e "\n"
   c="<font color=blue>"
   c0="</font>"
else   # set color teal
   c="\x1b[36;01m"
   c0="\x1b[39;49;00m"
fi

for d in /mnt/disk[1-9]*
do
   x=`ls -A $d`
   z=`du -s $d`
   y=${z:0:1}
#   echo -e "d:"$d "x:"${x:0:20} "y:"$y "z:"$z

   # the test for marker and emptiness
   if [ "$x" == "$marker" -a "$y" == "0" ]
   then
      found=1
      break
   fi
   let n=n+1
done

#echo -e "found:"$found "d:"$d "marker:"$marker "z:"$z "n:"$n

# No drives found to clear
if [ $found == "0" ]
then
   echo -e "\rChecked $n drives, did not find an empty drive ready and marked for clearing!\n"
   echo "To use this script, the drive must be completely empty first, no files"
   echo "or folders left on it.  Then a single folder should be created on it"
   echo "with the name 'clear-me', exactly 8 characters, 7 lowercase and 1 hyphen."
   echo "This script is only for clearing unRAID data drives, in preparation for"
   echo "removing them from the array.  It does not add a Preclear signature."
   exit
fi

# check unRAID version
v1=`cat /etc/unraid-version`
# v1 is 'version="6.2.0-rc5"' (fixme if 6.10.* happens)
v2="${v1:9:1}${v1:11:1}"
if [[ $v2 -ge 62 ]]
then
   v=" status=progress"
else
   v=""
fi
#echo -e "v1=$v1  v2=$v2  v=$v\n"

# First, warn about the clearing, and give them a chance to abort
echo -e "\rFound a marked and empty drive to clear: $c Disk ${d:9} $c0 ( $d )"
echo -e "* Disk ${d:9} will be unmounted first."
echo "* Then zeroes will be written to the entire drive."
echo "* Parity will be preserved throughout."
echo "* Clearing while updating Parity takes a VERY long time!"
echo "* The progress of the clearing will not be visible until it's done!"
echo "* When complete, Disk ${d:9} will be ready for removal from array."
echo -e "* Commands to be executed:\n***** $c umount $d $c0\n***** $c dd bs=1M if=/dev/zero of=/dev/md${d:9} $v $c0\n"

if [ "$p" == "$q" ]   # running in User.Scripts
then
   echo -e "You have $wait seconds to cancel this script (click the red X, top right)\n"
   sleep $wait
else
   echo -n "Press ! to proceed. Any other key aborts, with no changes made. "
   ch=""
   read -n 1 ch
   echo -e -n "\r                                                            \r"
   if [ "$ch" != "!" ]; then exit; fi
fi

# Perform the clearing
logger -tclear_array_drive "Clear an unRAID array data drive v$version"
echo -e "\rUnmounting Disk ${d:9} ..."
logger -tclear_array_drive "Unmounting Disk ${d:9} (command: umount $d ) ..."
umount $d
echo -e "Clearing Disk ${d:9} ..."
logger -tclear_array_drive "Clearing Disk ${d:9} (command: dd bs=1M if=/dev/zero of=/dev/md${d:9} $v ) ..."
dd bs=1M if=/dev/zero of=/dev/md${d:9} $v
#logger -tclear_array_drive "Clearing Disk ${d:9} (command: dd bs=1M if=/dev/zero of=/dev/md${d:9} status=progress count=1000 seek=1000 ) ..."
#dd bs=1M if=/dev/zero of=/dev/md${d:9} status=progress count=1000 seek=1000

# Done
logger -tclear_array_drive "Clearing Disk ${d:9} is complete"
echo -e "\nA message saying \"error writing ... no space left\" is expected, NOT an error.\n"
echo -e "Unless errors appeared, the drive is now cleared!"
echo -e "Because the drive is now unmountable, the array should be stopped,"
echo -e "and the drive removed (or reformatted)."
exit

Now I tried stopping and restarting the array, and the cache disks appeared in the format option; I can no longer access them, as they show as unmountable with the message "Unmountable: Invalid pool config". I've tried setting disk3 back to XFS, but I still see disk3 and the cache SSDs in the format option. Since some of my shares are configured with cache "Yes" or "Prefer" (including the appdata share, which uses "Prefer"), I'm worried that a format may delete that data. Can you please tell me what options I have to recover this data? The attached image shows the current array status and the array operations.
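For reference, the drive prep the script expects (per its own comments) boils down to an empty disk plus the marker folder. A minimal sketch, using disk3 from this thread as the example path:

   # Sketch: prep a mounted, emptied array drive for the clearing script.
   ls -A /mnt/disk3            # should print nothing: the disk must be completely empty
   mkdir /mnt/disk3/clear-me   # the single 'clear-me' marker folder the script looks for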
JorgeB Posted August 11, 2022

The diags you posted don't show the problem. Please reboot and post new diags after array start.
DanielPetrica (Author) Posted August 11, 2022

Here are the updated diagnostics after the restart: tower-diagnostics-20220811-1425.zip

The problem is still present, as shown in these new screenshots.
trurl Posted August 11, 2022

Not directly related to your problem, but why were you trying to clear anything? Unraid only requires a clear disk in one scenario: when adding a data disk to a new slot in an array that already has valid parity. This is so parity will remain valid, since a clear disk is all zeros and so has no effect on parity.

The script you were trying to use is a special case, used when you want fewer disks in your array. The script clears a disk while it is in the array so you can remove it without invalidating parity. Parity is updated during the clear, just as it is with all write operations, so at the end, removing the cleared disk has no effect on parity, since parity is already in sync with it.

As far as I understand, none of this applies to your situation.
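To illustrate why an all-zero disk drops out of the parity calculation, here is a toy sketch (not Unraid code; the byte values are made up):

   # Toy illustration: parity is the XOR of the data disks at each position,
   # so a disk of all zeros contributes nothing and can be removed without
   # changing parity.
   d1=0xA7; d2=0x3C; cleared=0x00
   printf 'parity with cleared disk:    0x%02X\n' $(( d1 ^ d2 ^ cleared ))
   printf 'parity without cleared disk: 0x%02X\n' $(( d1 ^ d2 ))

Both lines print the same value, which is why a cleared disk can simply be dropped from the array.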
DanielPetrica (Author) Posted August 11, 2022 (edited)

I was trying to remove the disk as described on the wiki: https://wiki.unraid.net/Shrink_array#The_.22Remove_Drives_Then_Rebuild_Parity.22_Method

My guess is that when setting disk3 to the same format as the cache pool, it got mixed up with the pool. Also, I still haven't run the "Retain current configuration" option yet.
trurl Posted August 11, 2022

Sorry, I didn't read carefully. I haven't used the script in a long time. There are updates to the script near the end of the scripts thread, but I don't think they would make any difference.

13 minutes ago, DanielPetrica said:
My guess is that when setting the disk3 to the same format as the cache pool it got mixed up with them.

And I don't think this is likely. The script code only works on md devices, which are disks in the parity array.

I never bother with that script anyway. Simpler and more reliable to just use the remove-then-rebuild-parity method.
trurl Posted August 11, 2022

1 minute ago, trurl said:
I don't think this is likely. The script code only works on md devices, which are disks in the parity array.

And the fact that disk3 is now unmountable is further evidence that it is the disk that was cleared.
DanielPetrica (Author) Posted August 11, 2022

Just now, trurl said:
I never bother with that anyway. Simpler and more reliable to just use the remove then rebuild parity method.

I'm open to doing that, but is there any way to mount the cache and export the data so I don't lose it? I can't find the cache under /mnt/
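If manual recovery were needed, one common approach is to mount a pool member read-only and copy the data off. This is only a hedged sketch; the device name and paths are placeholders, not taken from the diagnostics:

   # Sketch: read-only rescue mount of a btrfs pool member (all names are placeholders).
   mkdir -p /temp-cache /mnt/disk1/rescue
   mount -o ro,degraded /dev/sdY1 /temp-cache    # 'degraded' lets a multi-device pool mount with a member missing
   cp -a /temp-cache/appdata /mnt/disk1/rescue/  # copy shares to an array disk before any format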
trurl Posted August 11, 2022

I'll have to let @JorgeB advise on cache. I can't imagine it, but is it possible something somehow put disk3 in the cache pool? And it was still md3? No, I don't think so.
DanielPetrica (Author) Posted August 11, 2022

Yep, initially when changing the disk format I chose btrfs, the same as the cache pool. I think this is what broke the cache.
trurl Posted August 11, 2022

I understood that part, but I don't think it is likely.
JorgeB Posted August 11, 2022 (Solution)

It's happening because you had a single btrfs array drive. It's a known issue, since it makes parity appear to have an invalid btrfs filesystem. To fix:

- unassign both pool devices
- start array
- format disk3 using xfs
- stop array
- re-assign both cache devices
- start array

All should be good; if it isn't, post new diags.
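After the final array start, a quick way to confirm the pool recovered (hedged commands run from the console; the share name is just this thread's example):

   # Sketch: verify the cache pool came back after the fix.
   btrfs filesystem show   # the cache pool should list both SSDs again
   ls /mnt/cache           # shares such as appdata should be visible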
trurl Posted August 11, 2022

2 minutes ago, JorgeB said:
you had a single btrfs array drive, it's a known issue

Aren't any and all btrfs disks in the array single btrfs drives? How does cache get involved? Where has this issue been reported?
DanielPetrica (Author) Posted August 11, 2022

@JorgeB thank you very much. Now my cache is working, and I still have all my data.
trurl Posted August 11, 2022

So the script didn't clear the cache pool, which it shouldn't have, since pool devices are not md devices. Still wondering how formatting an array disk btrfs can break a cache pool, though. Would other btrfs pools also be affected? I guess it has nothing really to do with clearing, then? Simply formatting an array disk btrfs can break btrfs pools?
JorgeB Posted August 11, 2022

11 minutes ago, trurl said:
Aren't any and all btrfs in the array single btrfs drives?

Yes. The problem is having a single btrfs array disk: parity will hold part of the superblock info, so parity looks like it has an invalid btrfs filesystem during a btrfs scan, and pools get confused by that. This should be fixed soon by parity no longer using a partition.
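For the curious, a stray signature like that could be spotted from the console. A hedged sketch, where /dev/sdX1 stands in for the parity partition and is not a device from this thread:

   # Sketch: look for a leftover btrfs signature on the parity partition.
   blkid /dev/sdX1             # may report TYPE="btrfs" even though the device is parity
   wipefs --no-act /dev/sdX1   # lists filesystem signatures without erasing anything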
trurl Posted August 11, 2022

It does bring up another question with regard to this specific case. Assuming the script did clear md3, does this parity-disk issue mean that removing disk3 would invalidate parity? It isn't supposed to invalidate parity, since the disk was cleared in the array.
Kilrah Posted August 11, 2022 (edited)

From what I understand, the clear was never actually executed, since the OP first wanted to format their drive and that broke things / caused them to halt.
trurl Posted August 11, 2022

4 minutes ago, Kilrah said:
clear disk was never executed yet

3 hours ago, DanielPetrica said:
ran it but i couldn't clear disk

Not clear if it refused to run or what, but the question still seems relevant. And if the disk wasn't clear, why was it unmountable? Does simply formatting a disk btrfs in the array result in an unmountable disk?
DanielPetrica (Author) Posted August 11, 2022 (edited)

5 minutes ago, trurl said:
Not clear if it refused to run or what. But question still seems relevant. And if disk wasn't clear, why was it unmountable? Does simply formatting a disk btrfs in the array result in an unmountable disk?

The script refused to clear the disk, saying no disk was empty. My current understanding is that disk3 still had btrfs written to it, but I had chosen the new format in the array config, so it prompted me to format it. But it was also prompting me to format the cache disks, so I stopped before messing up my system too much 😅
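The refusal is consistent with the script's emptiness test, isolated below for illustration. Note the script's own comments assume ReiserFS; on other filesystems, du -s can report a nonzero size for an otherwise empty disk, which would make the test fail (an assumption, not something confirmed from the diagnostics):

   # The script's qualifying test, extracted (disk3 as the example path): a disk
   # passes only if 'ls -A' shows exactly the marker AND 'du -s' starts with 0.
   d=/mnt/disk3
   x=`ls -A $d`
   z=`du -s $d`
   y=${z:0:1}
   if [ "$x" == "clear-me" -a "$y" == "0" ]
   then echo "disk qualifies for clearing"
   else echo "refused: x='$x' y='$y'"
   fi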
trurl Posted August 11, 2022

So you hadn't yet formatted it btrfs? But shouldn't it still have been mountable with the filesystem it already had? And if it hadn't been formatted btrfs, why did the other problems arise? Curiouser and curiouser.
trurl Posted August 11, 2022

So I guess it was unmountable because Unraid was expecting it to mount btrfs, and the next step would have been to format it so it would mount. But since it hadn't been formatted, why did parity get involved at all and cause all this confusion?
JorgeB Posted August 11, 2022

4 minutes ago, trurl said:
But shouldn't it still have been mountable with the filesystem it already had?

He formatted the disk btrfs and it mounted; then he changed the fs back to xfs. That's why it didn't mount, but parity was still reflecting btrfs.
trurl Posted August 11, 2022

OK, it's all there in the first post.
trurl Posted August 11, 2022

3 hours ago, DanielPetrica said:
couldn't clear disk even if the cli was showing only the specified "clear-me" folder.

This implies clear-me was created after the disk was formatted btrfs, and that the array disk was mounted so the folder could be created. So the script should have worked? I don't really care about the script so much, though. The cache problem happened when the array disk was formatted btrfs?

4 hours ago, DanielPetrica said:
formatted the disk3 as it was the only disk appearing under the format disk option

Did cache show unmountable as soon as the array disk was formatted btrfs, and was it unmountable while you were trying to set things up for the script? Or was it just not noticed until trying to reformat the array disk XFS?