February 4, 2024
Yesterday, one of the two SSDs in my pool failed, leaving me with a dead drive. I replaced it with a new SSD of the same size, but BTRFS isn't initiating the RAID repair on this pool. I'm also receiving emails saying that the second SSD in the pool is 'missing', but only when the array is stopped. I can restart the array, but whenever I attempt to move the Vdisk to the array to replace the other SSD, the transfer speed drops drastically to 2-5 Mb/s with an estimated time of 1600 hours, as the Vdisk image is 2TB.
For troubleshooting so far I have:
- Tried different cables and ports on my motherboard
- Tried rebooting the server
- Tried using the file explorer to move the Vdisk
- Tried using the CLI to copy the Vdisk
All attempts to move the data to the array start at normal speeds, then drop to 2-5 Mb/s after about a minute.
It's worth noting that when I check the status using 'btrfs replace status /mnt/pool_name', it shows 'never started' for the pool in question.
Any thoughts or advice on resolving this issue would be greatly appreciated. I'm concerned about the data on this Vdisk and want to ensure it's not lost. Thank you.
tower-diagnostics-20240204-1624.zip
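A quick sanity check on the numbers in this post: a 1600-hour estimate for a 2TB image implies an average rate of roughly 0.35 MB/s, which is about 2.8 megabits per second, so the reported 2-5 Mb/s is consistent if it is read as megabits. The sketch below does that arithmetic. `implied_rate` is a hypothetical helper name, and it assumes 2TB means 2 x 10^12 bytes.

```shell
# implied_rate SIZE_BYTES HOURS
# Hypothetical helper: prints the average transfer rate implied by an ETA,
# in megabytes per second and megabits per second (decimal units).
implied_rate() {
  awk -v size="$1" -v hours="$2" 'BEGIN {
    bytes_per_s = size / (hours * 3600)
    printf "%.2f MB/s (%.1f Mbit/s)\n", bytes_per_s / 1e6, bytes_per_s * 8 / 1e6
  }'
}

# Assumption: the 2TB Vdisk image is 2 x 10^12 bytes; 1600 hours is the ETA quoted above.
implied_rate 2000000000000 1600
```

Either way, a rate this low on SSDs points at a device or connection problem rather than normal copy overhead.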
February 5, 2024 Author
3 hours ago, JorgeB said: Post the output of
Output of command:
Label: none uuid: e03c4769-3927-4173-8e19-cac6dcb664a4
    Total devices 4 FS bytes used 1.32TiB
    devid 3 size 0 used 0 path MISSING
    devid 4 size 1.82TiB used 905.03GiB path /dev/sdc1
    devid 5 size 1.82TiB used 760.00GiB path /dev/sdb1
    devid 6 size 1.82TiB used 450.00GiB path /dev/sdd1
February 5, 2024 Community Expert
Reboot to clear the logs, start the array with the pool as it currently is, and post new diags after 5 minutes.
February 5, 2024 Author
Rebooted the server and let it sit for 5 minutes. New diags attached.
tower-diagnostics-20240205-1534.zip
February 5, 2024 Author
I was getting normal-ish speeds on the pool for a while after the reboot, but it is currently stuck at about 655 Kb/s. Attached are logs from when I noticed the speeds drop.
tower-diagnostics-20240205-1619.zip
February 6, 2024 Community Expert
There are what look like power/connection issues with the PNY pool device. Replace the cables and try again.
February 6, 2024 Author
I swapped the SATA power cable, connected the drive to a different SATA port with a new cable, and restarted Unraid.
tower-diagnostics-20240206-0636.zip
February 6, 2024 Community Expert
No errors so far, let's see if the missing device can be deleted now.
February 6, 2024 Author
I ran "btrfs fi show" again, here is the result.
Label: none uuid: e03c4769-3927-4173-8e19-cac6dcb664a4
    Total devices 4 FS bytes used 1.32TiB
    devid 3 size 0 used 0 path MISSING
    devid 4 size 0 used 0 path /dev/sdaa1 MISSING
    devid 5 size 1.82TiB used 833.00GiB path /dev/sdb1
    devid 6 size 1.82TiB used 649.00GiB path /dev/sdc1
I also noticed that the 2TB PNY is now showing both in the pool and under unassigned devices?
Disk Log for the 2TB PNY
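When comparing "btrfs fi show" output between reboots, as in the two listings in this thread, it can help to pull out only the devices flagged as missing. The following is a minimal sketch, not part of Unraid or btrfs-progs: `missing_devids` is a hypothetical helper, and it assumes device lines start with `devid` and carry the word MISSING as they do in the output posted here.

```shell
# missing_devids: reads `btrfs fi show` output on stdin and prints the devid
# of every device line flagged MISSING, one per line.
# Hypothetical helper; assumes the line layout shown in this thread.
missing_devids() {
  awk '$1 == "devid" && /MISSING/ { print $2 }'
}

# Typical use on a live system (not run here):
#   btrfs fi show | missing_devids
```

Against the first listing in the thread this would print only devid 3, and against the listing above it would print 3 and 4, which makes the newly missing device easy to spot.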
February 6, 2024 Community Expert
If the device still has issues with new cables, it may be a device problem.
February 6, 2024 Author
Rebooted and it cleared up the duplicated devices. The BTRFS pool is doing something. I'm pretty sure that PNY SSD is in the process of dying... I thought it was strange that they both decided to bite the bullet at the same time.
Is there a way to check what operation is being run on the pool, or to stop it?
I tried to run "btrfs dev del missing /mnt/plexmeta":
ERROR: unable to start device remove, another exclusive operation 'device remove' in progress
February 6, 2024 Community Expert
Just now, Stanui said: ERROR: unable to start device remove, another exclusive operation 'device remove' in progress
Unraid starts removing the device automatically after array start, and AFAIK it's not possible to cancel that. You could try mounting the pool read-only and copying what you can.
February 6, 2024 Author
Gave that a try. It's running, but at 1.45-1.56 MB/s with an ETA of 13 days. Should I let the transfer run to see if it errors out? I was using the pool for a Vdisk, so I can't really grab one file instead of the entire Vdisk image.
February 6, 2024 Community Expert
Is that with the pool mounted read-only, or while it is still also removing the missing device?
February 6, 2024 Author
With the array started normally, I followed the instructions for mounting the pool read-only and mounted it to the temp directory. Unraid is still showing that a BTRFS operation is running.
February 6, 2024 Community Expert
50 minutes ago, Stanui said: Unraid is still showing a BTRFS operation is running
OK, I wasn't sure if it would continue with the pool in read-only mode.
February 6, 2024 Author
Gotcha. The logs show the pool is doing the following:
    Feb 6 10:56:33 Tower kernel: BTRFS info (device sdz1): found 4784 extents, stage: update data pointers
    Feb 6 10:57:00 Tower kernel: BTRFS info (device sdz1): relocating block group 2025308815360 flags data|raid1
I'm assuming it's trying to rebuild since the original second drive died, but I'm not sure if it's actually making progress.
February 6, 2024 Community Expert
It's doing a balance. This is normal, and as long as new similar lines keep appearing, it's making progress.
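One way to apply that advice is to count the "relocating block group" lines in the kernel log and see whether the count grows between checks. This is only a sketch: `balance_progress` is a hypothetical helper, it matches the log wording quoted in this thread, and the syslog path in the usage comment is an assumption about where the log lives on the server.

```shell
# balance_progress: reads kernel log lines on stdin and prints how many
# "relocating block group" messages were seen plus the most recent block group.
# Hypothetical helper; matches the BTRFS info lines quoted in this thread.
balance_progress() {
  awk '/BTRFS info/ && /relocating block group/ {
         count++
         # the block group number is the field right after the word "group"
         for (i = 1; i <= NF; i++) if ($i == "group") last = $(i + 1)
       }
       END {
         printf "relocated=%d last_block_group=%s\n", count, (last == "" ? "none" : last)
       }'
}

# Typical use (log path is an assumption, not run here):
#   balance_progress < /var/log/syslog
```

Running it a few minutes apart and comparing the two results shows whether new block groups are still being relocated. If both the count and the last block group stay the same for a long time, the operation has probably stalled.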
February 19, 2024 Author
To close this out in case anyone finds it in the future: I ended up pulling both PNY SSDs from the cache pool, as they were completely dead and were no longer even being recognized. Unfortunately, I lost the Vdisk on the pool. I configured the pool with some new WD Blue SSDs.
TLDR: the drives were dead, the rebuild time went from 6 weeks to 3 months, and I said forget it.