February 5 (edited February 5 by PassTheSalt: title edit)

This started with a docker image corruption, and then I noticed BTRFS errors on my BTRFS NVMe cache pool. While the array was running, it showed both devices as green. When I stopped the array (after moving everything off the cache so I could change to ZFS), it showed my second NVMe drive as "no device". I also noticed it is showing up under Unassigned Devices. If it can see the device, why can't it be used in the pool?

I'm not sure whether this means my drive is dead or the filesystem just got corrupted and it doesn't know what to do with it. Like I said, no data is on the pool. Any suggestions for moving forward? I've attached the current diagnostics and the diagnostics I pulled after the docker image corruption.

Attachments: crusty-diagnostics-20260205-1542.zip, crusty-diagnostics-20260203-0904.zip

February 6 (Community Expert)

One of the devices dropped offline. Try this: on Main, click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view", and add this to your default boot option, after "append initrd=/bzroot":

    nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off

e.g.:

    append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off

Reboot (or power cycle the server if just a reboot doesn't bring the device back) and then see if it makes a difference.

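One way to confirm that the boot options actually took effect after the reboot is to look at the running kernel's command line. The following is only a sketch: `check_cmdline` is a made-up helper name (not an Unraid tool), and it assumes the command line is readable at /proc/cmdline, which is standard on Linux.

```shell
# Hypothetical helper: print every expected kernel parameter that is
# absent from the given command line string (prints nothing if all are set).
check_cmdline() {
    cmdline=" $1 "
    for param in nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off; do
        case "$cmdline" in
            *" $param "*) ;;                 # parameter present, nothing to report
            *) echo "missing: $param" ;;     # parameter not found
        esac
    done
}

# On the server, after rebooting, check the live command line.
if [ -r /proc/cmdline ]; then
    check_cmdline "$(cat /proc/cmdline)"
fi
```

If the function prints nothing, all three options are present on the running kernel; any "missing:" line suggests the Syslinux edit was not saved or a different boot entry was used.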
February 6 (Author)

I gave that a try; it's still not showing. I won't be physically at its location to try a complete power cycle for a few days, but I did reboot. It shows as unassigned but unmountable. I guess that means it's just dead? Fun time for a 2TB drive to die lol

February 7 (Community Expert)

11 hours ago, PassTheSalt said: "but I did reboot"

It's normal to require a power cycle to get the device back when this happens.

February 10 (Author, edited February 10 by PassTheSalt)

On 2/7/2026 at 2:40 AM, JorgeB said: "It's normal to require a power cycle to get the device back when this happens."

OK, I was able to shut down the machine and bring it back up (I even left it unplugged for a bit). It's still showing as unmountable. Luckily the drive is still under warranty, so if it's dead, hopefully I can get it replaced.

February 10 (Author)

22 minutes ago, JorgeB said: "Please post current diags."

Attachment: crusty-diagnostics-20260210-1356.zip

February 11 (Community Expert)

Device is failing to initialize:

    Feb 10 12:57:15 Crusty kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
    Feb 10 12:57:15 Crusty kernel: nvme nvme0: Does your device have a faulty power saving mode enabled?
    Feb 10 12:57:15 Crusty kernel: nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
    Feb 10 12:57:15 Crusty kernel: nvme0n1: I/O Cmd(0x2) @ LBA 0, 8 blocks, I/O Error (sct 0x3 / sc 0x71)
    Feb 10 12:57:15 Crusty kernel: I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
    Feb 10 12:57:15 Crusty kernel: nvme 0000:02:00.0: enabling device (0000 -> 0002)
    Feb 10 12:57:15 Crusty kernel: nvme nvme0: Disabling device after reset failure: -19

You are already using those kernel options; try swapping M.2 slots for both devices and see if the other one now fails to initialize.

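For anyone wanting to spot the same failure in their own logs, a small filter along these lines might help. It is a sketch only: `failed_nvme_controllers` is a hypothetical helper name, and the pattern is based solely on the "Disabling device after reset failure" line quoted in the post above, so other kernel versions may word the message differently.

```shell
# Hypothetical helper: read kernel log text on stdin and print the NVMe
# controllers the kernel gave up on, one per line, without duplicates.
failed_nvme_controllers() {
    sed -n 's/.*nvme \(nvme[0-9][0-9]*\): Disabling device after reset failure.*/\1/p' | sort -u
}

# Example input: two of the log lines quoted in the thread.
failed_nvme_controllers <<'EOF'
Feb 10 12:57:15 Crusty kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
Feb 10 12:57:15 Crusty kernel: nvme nvme0: Disabling device after reset failure: -19
EOF
```

On a running server the same function could be fed from the kernel ring buffer, for example `dmesg | failed_nvme_controllers`.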
February 12 (Author)

I took the drive that wasn't working and moved it to a third M.2 port on my motherboard. It's still not showing. (The other drive is under some "SSD armor" under the CPU heatsink and would be a pain to get to.) I double-checked that the new M.2 slot's lanes aren't being used for other SATA ports. I'm going to take the drive out and see if I can see it in Windows, I guess. New diagnostics attached.

Attachment: crusty-diagnostics-20260212-1240.zip

February 12 (Community Expert)

Still the same issue. This is not typically a device problem, but it can be, so if you can test with another PC, it would be good to confirm.

February 12 (Author, edited February 12 by PassTheSalt)

46 minutes ago, JorgeB said: "Still the same issue, this is not typically a device problem, but it can be, so if you can test with another PC, it would be good to confirm."

It shows in Windows Disk Management, but the only option I have is to delete the partition on it. Maybe I'll try reformatting, or maybe try to upgrade the firmware?

I have it on a USB M.2 adapter. SanDisk/WD's own Dashboard software can't find it. I'm going to see if I can get SanDisk to help or replace it.

February 12 (Author)

OK, I got the SSD onto a Windows machine, deleted the volume in Disk Management, and created a new NTFS volume. I installed it into the server and it's now showing up as mountable. My problem now is that I'm not sure how to make Unraid reformat the drive to use BTRFS so I can add it to my cache. Screenshot and diagnostics attached.

Attachment: crusty-diagnostics-20260212-1612.zip

February 12 (Author)

OK, I did, seemingly, the same thing again, and this time it seems to have formatted the drive when I started the array. I think I'm good to go. I'm posting diagnostics again; could you check that everything looks OK?

Attachment: crusty-diagnostics-20260212-1634.zip

February 13 (Community Expert)

The device wasn't correctly added to the pool. Since the pool was degraded, it may require some manual intervention. First, reimport the pool with just nvme0n1:

1. On Main, click on the first device for that pool and then "remove pool".
2. Back on Main, create a new pool with the same name and 1 slot.
3. Assign the pool device nvme0n1 and leave the filesystem set to auto.
4. Start the array to import the pool.

It should show degraded again; post new diags to confirm status.

February 16 (Author)

OK, I believe I did that as instructed. Even though I made it 1 slot, it still says its not an uninstalled nvme2?

Attachment: crusty-diagnostics-20260216-1107.zip

February 16 (Community Expert)

32 minutes ago, PassTheSalt said: "even though i made it 1 slot it still says its not an uninstalled nvme2?"

This is expected. Now, with the array running, type:

    btrfs balance start -f -dconvert=single -mconvert=dup /mnt/nvmecache

When that finishes, type:

    btrfs dev remove missing /mnt/nvmecache

Then reimport the pool again, same as above, and it should now only show 1 device.

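The two commands above can be wrapped in a small script that enforces the order (balance first, device removal only if the balance succeeded). This is a sketch, not an official Unraid procedure: the `run` and `convert_and_drop_missing` names and the DRY_RUN switch are assumptions added for illustration, the extra `btrfs filesystem show` step is not in the original post, and `btrfs device remove` is simply the long form of `btrfs dev remove`. By default the script only prints what it would do.

```shell
# Mount point used in the thread; override POOL_MOUNT for a different pool.
POOL_MOUNT="${POOL_MOUNT:-/mnt/nvmecache}"
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

convert_and_drop_missing() {
    # Show the pool members first, to confirm which pool is being changed.
    run btrfs filesystem show "$POOL_MOUNT" || return 1
    # Step 1: convert data to the single profile and metadata to dup.
    run btrfs balance start -f -dconvert=single -mconvert=dup "$POOL_MOUNT" || return 1
    # Step 2: reached only if the balance succeeded; drop the missing device.
    run btrfs device remove missing "$POOL_MOUNT"
}

convert_and_drop_missing
```

Running it once with the default dry-run setting and reading the printed commands before switching to DRY_RUN=0 keeps a typo in the mount point from touching the wrong pool.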
February 16 (Author)

OK, I did that, then removed the pool, re-added the pool with 1 slot, and assigned nvme0n1. It started and seems fine.

Attachment: crusty-diagnostics-20260216-1523.zip

February 17 (Community Expert)

Now try adding the other device to the pool again. It's best to wipe it first. Confirm the identifier is still the same (it can change with a reboot), then:

    blkdiskcard -f /dev/nvme1n1

Now stop the array, change pool slots to 2, assign that device, and start the array.

February 17 (Author, edited February 17 by PassTheSalt)

    root@Crusty:~# blkdiskcard -f /dev/nvme1n1
    bash: blkdiskcard: command not found

I tried that and got the above. I'm just typing that in the terminal window you get to from the >_ button at the top right?

Edit: I found what you meant: blkdiscard. I did the rest and attached diagnostics. It looks like that worked to me! Thanks for your help with this!

Attachment: crusty-diagnostics-20260217-0937.zip

February 17 (Community Expert)

53 minutes ago, PassTheSalt said: "I found what you meant- blkdiscard* did the rest and attached diag. looks like that worked to me! Thanks for your help with this!"

Yes, sorry for the typo.

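Because the wipe step in this thread is destructive and the device identifier can change across reboots, a guard like the following could be used before running blkdiscard. It is a sketch under stated assumptions: `serial_matches` is a hypothetical helper, EXPECTED_SERIAL is a placeholder you would fill in from the drive's label, and the script only prints the wipe command rather than running it.

```shell
# Hypothetical helper: succeed only when a non-empty expected serial
# equals the serial actually reported for the device.
serial_matches() {
    expected="$1"
    actual="$2"
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

DEVICE=/dev/nvme1n1           # identifier from the thread; re-check before every run
EXPECTED_SERIAL="CHANGE-ME"   # placeholder: serial printed on the drive to be wiped

# lsblk -dno SERIAL prints only the serial of the named device
# (left empty here if lsblk or the device is unavailable).
actual_serial="$(lsblk -dno SERIAL "$DEVICE" 2>/dev/null || true)"

if serial_matches "$EXPECTED_SERIAL" "$actual_serial"; then
    echo "Serial matches; to wipe, run: blkdiscard -f $DEVICE"
else
    echo "Serial mismatch for $DEVICE (got '$actual_serial'); not wiping."
fi
```

Printing the command instead of executing it keeps the final, irreversible step a deliberate manual action.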