Adam Kesher Posted December 22, 2021

Hi everyone,

I had a weird sequence of events happen with my cache disk. I got a "Cache pool BTRFS missing device(s)" warning for it. I ran a scrub with "fix errors" checked, and the error still showed. I then stopped the array and restarted it, and the cache disk disappeared. After that I restarted my whole Unraid system, and now the disk is showing up in Unassigned Devices. I'm hoping it can be remounted or salvaged somehow.

Diagnostics are attached, and I screenshotted how it appears. It's the dev3 device. Any help would be great, thank you.

zigplex2-diagnostics-20211222-1430.zip
dlandon Posted December 22, 2021

You might try reassigning it back to the array, but I'd probably do a UD file system check first. You'd have to mount it, then click on the check icon. It looks mountable in UD if you'd rather just mount it and unload the data. I'd suggest setting it to read only to stop any write activity: click on the three gears and set 'Read Only' to On.
Adam Kesher Posted December 22, 2021

I tried this, ran the scrub, and got the following in the log. I changed the last digits of the local IPs, but the address ending in XX is an old IP that this server had before I gave it a dedicated IP on my router at YY. Could this be part of the issue, or is it just a problem running the scrub?

Dec 22 12:09:31 Zigplex2 nginx: 2021/12/22 12:09:31 [error] 7434#7434: *7170 upstream timed out (110: Connection timed out) while reading upstream, client: 192.168.1.XX, server: , request: "GET /plugins/unassigned.devices/include/fsck.php?device=/dev/nvme0n1p1&fs=btrfs&luks=&serial=Samsung_SSD_970_EVO_1TB_S467NX0M827957N&check_type=ro&type=Done HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.1.YY", referrer: "http://192.168.1.YY/Main"
dlandon Posted December 22, 2021

OK, looking at the message again, I see where the UD fsck had a problem. I assume the scrub didn't show any results? Close any browsers you had been using to access the server, then start the browser again and retry.
Adam Kesher Posted December 22, 2021

OK, I ran it again. The same upstream timeout error popped up, but this time I also got:

Dec 22 15:53:30 Zigplex2 kernel: BTRFS info (device nvme0n1p1): scrub: finished on devid 1 with status: 0

The pop-up window for the scrub only says this (though it did say 'transferring data from IP' for a bit):

FS: btrfs
Executing file system scrub: /sbin/btrfs scrub start -B -R -d -r /dev/nvme0n1p1 2>&1

Attaching diagnostics taken after all this.

zigplex2-diagnostics-20211222-1859.zip

Edited December 23, 2021 by Adam Kesher (missing info, adding diagnostics)
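When the UD pop-up times out like this, the kernel log is the remaining place to see how the scrub ended. The following is a minimal sketch, not part of Unraid or the UD plugin, that pulls the device, devid, and status out of lines shaped like the one quoted above. The function name `scrub_results` is hypothetical, and the pattern assumes the message format stays exactly as it appears in this thread.

```shell
#!/bin/sh
# Sketch only: scrub_results is a hypothetical helper name.
# Reads syslog text on stdin and prints "<device> <devid> <status>"
# for every BTRFS "scrub: finished" kernel message it finds.
scrub_results() {
  sed -n 's/.*BTRFS info (device \([^)]*\)): scrub: finished on devid \([0-9]*\) with status: \(-\{0,1\}[0-9]*\).*/\1 \2 \3/p'
}
```

It could be fed from the system log, for example `scrub_results < /var/log/syslog`, assuming that is where the log lives on the server. A status of 0 most likely only means the scrub ran to completion; any checksum or read errors it found are reported separately, so this is a quick check rather than a full health report.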
JorgeB Posted December 23, 2021

If it was a single device cache you can just re-assign it.
Adam Kesher Posted December 23, 2021

It is a single cache device. When I stop the array to re-assign it, this is how it appears in Unassigned Devices. If I click the blue disk name, I get this. When I assign it to the cache pool, this Historical Devices entry shows up.

Is this something I need to be worried about before starting the array with the disk put back in the cache pool (formatting or anything like that)? Do I need to change the disk mount point first, or do I just assign the disk?

Thanks for all your help, everyone. I am just paranoid.
JorgeB Posted December 23, 2021

Adam Kesher said: "is this something i need to be worried about before starting the array with the disk put back in the cache pool"

No, that's just about previous UD devices, and you can safely remove it.
JorgeB Posted December 23, 2021

Adam Kesher said: "or do I just assign the disk?"

The pool is showing 3 slots. If only one device was assigned, that's still OK, but make sure you've started the array once beforehand without any device assigned to the pool. When you then assign the device, there must not be an "all data on this device will be deleted" warning for the pool device.
Adam Kesher Posted December 24, 2021

Thanks for this. Re-assigning it worked, and I was up and running just fine yesterday, but today I woke up to the same "Cache pool BTRFS missing device(s)" warning. Is there anything else I can do to prevent this from happening? When I ran the scrub yesterday there were no errors. Very frustrating, but all the help has been great.

zigplex2-diagnostics-20211224-1116.zip

Edited December 24, 2021 by Adam Kesher (added diagnostics)
Adam Kesher Posted December 24, 2021

I stopped the array and the cache disappeared again, but after a full shutdown and starting back up it re-appeared and I got this. It auto-mounted and was correctly assigned again. This is super confusing. When I had the issue from the previous post, the "missing devices" warning appeared overnight with the machine running, with no restart or shutdown. Diagnostics from this boot are included.

zigplex2-diagnostics-20211224-1218.zip
JorgeB Posted December 25, 2021 (Solution)

The NVMe device dropped offline:

Dec 23 22:48:36 Zigplex2 kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
Dec 23 22:48:36 Zigplex2 kernel: nvme 0000:08:00.0: enabling device (0000 -> 0002)
Dec 23 22:48:36 Zigplex2 kernel: nvme nvme0: Removing after probe failure status: -19

Look for a BIOS update. The setting below also helps sometimes; failing that, try a different brand/model of device or board.

Some NVMe devices have issues with power states on Linux. Try this: on the main GUI page click on flash, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (on the top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0

Reboot and see if it makes a difference.
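For anyone following the same steps, here is a minimal sketch of two checks that may help confirm both the diagnosis and the fix. The helper names `nvme_drop_events` and `has_nvme_ps_override` are hypothetical, not Unraid or kernel tools. The grep patterns are taken only from the kernel messages quoted in this thread, so other drop scenarios may log something different.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.

# Print syslog lines (read from stdin) that indicate an NVMe controller drop.
# The two patterns come from the kernel messages quoted in this thread.
nvme_drop_events() {
  grep -E 'nvme.*(controller is down|Removing after probe failure)'
}

# Succeed if a kernel command line contains the power-state override.
# $1: command line to inspect; defaults to the running kernel's /proc/cmdline.
has_nvme_ps_override() {
  cmdline="${1:-$(cat /proc/cmdline)}"
  case " $cmdline " in
    *" nvme_core.default_ps_max_latency_us=0 "*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage would look something like `nvme_drop_events < /var/log/syslog` to see when the device dropped (assuming the log is at that path), and `has_nvme_ps_override && echo "override active"` after the reboot to confirm the boot option was actually picked up. The live value should also be readable from /sys/module/nvme_core/parameters/default_ps_max_latency_us on kernels where nvme_core exposes it.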