Alchemist Zim Posted January 6, 2023

This has been an ongoing issue that has gotten worse with 6.11.x. Every couple of days/weeks my NVMe drive loses its connection with unRAID: the Docker service stops, VMs stop, but shares and the GUI still work. I have to reseat the drive before it is recognized by unRAID again after a restart. Because of this the array never fully stops, so on restart it runs a parity check, which takes about a day with 14TB parity drives.

Any help is appreciated. Diagnostics attached: tesseract-diagnostics-20230106-1721.zip
trurl Posted January 6, 2023

Have you tried updating the NVMe firmware?
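If nvme-cli happens to be available on the console (an assumption; it is not guaranteed on a stock install, though smartmontools does ship with unRAID), the currently installed firmware revision can be checked before hunting for an update on the vendor's site:

nvme list                            # model, serial, and firmware revision of each NVMe device
nvme id-ctrl /dev/nvme0 | grep -iw fr   # just the "fr" (firmware revision) field
smartctl -i /dev/nvme0               # alternative: also reports the firmware version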
Alchemist Zim Posted January 6, 2023 (Author)

trurl said: "Have you tried updating the NVMe firmware?"

Didn't even know that was a thing until just now. Will try that and get back to you. If it happens again, I'll try to grab diagnostics before restarting.
JorgeB Posted January 7, 2023 (Solution)

This can sometimes help: on the main GUI page click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

Reboot and see if it makes a difference.

Alchemist Zim said: "I have to reseat the drive before it is recognized by unRAID after a restart."

Most likely just power cycling the server will bring it back; just a reboot usually won't.
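For context, those two parameters disable NVMe autonomous power state transitions (APST) and PCIe Active State Power Management, both common culprits when NVMe drives drop off the bus under Linux. The edited stanza ends up looking roughly like this in /boot/syslinux/syslinux.cfg (a sketch; menu entries and label text vary by install, and only the append line matters here):

label Unraid OS
  menu default
  kernel /bzimage
  append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off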
Alchemist Zim Posted January 8, 2023 (Author)

Happened again overnight; diagnostics posted pre- and post-reboot.

JorgeB said: "This can sometimes help, on the main GUI page click on the flash drive, scroll down to 'Syslinux Configuration', make sure it's set to 'menu view' (top right) and add this to your default boot option, after 'append initrd=/bzroot'."

Will try this if it fails again.

JorgeB said: "Most likely just power cycling the server will bring it back, just a reboot usually won't."

I think I tried this previously and it didn't work. Will try again.

pre-reboot: tesseract-diagnostics-20230108-0516.zip
post-reboot: tesseract-diagnostics-20230108-0530.zip
Alchemist Zim Posted January 12, 2023 (Author)

Just happened again. Adding the nvme_core.default_ps_max_latency_us=0 pcie_aspm=off parameters to my boot options now. Syslog from the failure:

Jan 11 19:27:54 TESSERACT emhttpd: shcmd (3450934): umount /mnt/cache-nvme
Jan 11 19:27:54 TESSERACT root: umount: /mnt/cache-nvme: target is busy.
Jan 11 19:27:54 TESSERACT emhttpd: shcmd (3450934): exit status: 32
Jan 11 19:27:54 TESSERACT emhttpd: Retry unmounting disk share(s)...
Jan 11 19:27:59 TESSERACT emhttpd: Unmounting disks...
Jan 11 19:27:59 TESSERACT emhttpd: shcmd (3450935): umount /mnt/cache-nvme
Jan 11 19:27:59 TESSERACT root: umount: /mnt/cache-nvme: target is busy.
Jan 11 19:27:59 TESSERACT emhttpd: shcmd (3450935): exit status: 32
Jan 11 19:27:59 TESSERACT emhttpd: Retry unmounting disk share(s)...
Jan 11 19:28:04 TESSERACT kernel: btrfs_dev_stat_print_on_error: 25 callbacks suppressed
Jan 11 19:28:04 TESSERACT kernel: BTRFS error (device nvme0n1p1: state EA): bdev /dev/nvme0n1p1 errs: wr 63, rd 187186, flush 0, corrupt 0, gen 0
Jan 11 19:28:04 TESSERACT kernel: I/O error, dev loop2, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
Jan 11 19:28:04 TESSERACT kernel: BTRFS error (device nvme0n1p1: state EA): bdev /dev/nvme0n1p1 errs: wr 63, rd 187187, flush 0, corrupt 0, gen 0
Jan 11 19:28:04 TESSERACT kernel: I/O error, dev loop2, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Jan 11 19:28:04 TESSERACT kernel: Buffer I/O error on dev loop2, logical block 0, async page read
Jan 11 19:28:04 TESSERACT emhttpd: Unmounting disks...
Jan 11 19:28:04 TESSERACT emhttpd: shcmd (3450937): umount /mnt/cache-nvme
Jan 11 19:28:04 TESSERACT root: umount: /mnt/cache-nvme: target is busy.
Jan 11 19:28:04 TESSERACT emhttpd: shcmd (3450937): exit status: 32
Jan 11 19:28:04 TESSERACT emhttpd: Retry unmounting disk share(s)...
Jan 11 19:28:06 TESSERACT kernel: BTRFS error (device nvme0n1p1: state EA): bdev /dev/nvme0n1p1 errs: wr 63, rd 187188, flush 0, corrupt 0, gen 0
Jan 11 19:28:06 TESSERACT kernel: I/O error, dev loop3, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
Jan 11 19:28:06 TESSERACT kernel: BTRFS error (device nvme0n1p1: state EA): bdev /dev/nvme0n1p1 errs: wr 63, rd 187189, flush 0, corrupt 0, gen 0
Jan 11 19:28:06 TESSERACT kernel: I/O error, dev loop3, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Jan 11 19:28:06 TESSERACT kernel: Buffer I/O error on dev loop3, logical block 0, async page read
Jan 11 19:28:06 TESSERACT kernel: BTRFS error (device nvme0n1p1: state EA): bdev /dev/nvme0n1p1 errs: wr 63, rd 187190, flush 0, corrupt 0, gen 0
Jan 11 19:28:06 TESSERACT kernel: I/O error, dev loop2, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Jan 11 19:28:06 TESSERACT kernel: Buffer I/O error on dev loop2, logical block 0, async page read
Jan 11 19:28:06 TESSERACT kernel: BTRFS error (device nvme0n1p1: state EA): bdev /dev/nvme0n1p1 errs: wr 63, rd 187191, flush 0, corrupt 0, gen 0
Jan 11 19:28:06 TESSERACT kernel: I/O error, dev loop3, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Jan 11 19:28:06 TESSERACT kernel: Buffer I/O error on dev loop3, logical block 0, async page read
Jan 11 19:28:09 TESSERACT emhttpd: Unmounting disks...
Jan 11 19:28:09 TESSERACT emhttpd: shcmd (3450938): umount /mnt/cache-nvme
Jan 11 19:28:09 TESSERACT root: umount: /mnt/cache-nvme: target is busy.
Jan 11 19:28:09 TESSERACT emhttpd: shcmd (3450938): exit status: 32
Jan 11 19:28:09 TESSERACT emhttpd: Retry unmounting disk share(s)...
Jan 11 19:28:14 TESSERACT emhttpd: Unmounting disks...
A full power cycle by turning off the power at the PSU did restore the NVMe without reseating it. Thanks, JorgeB.

tesseract-diagnostics-20230111-1918.zip
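One side note on the log above: those BTRFS per-device error counters (wr, rd, flush, corrupt, gen) are cumulative and persist across mounts, so once the drive is stable it can be worth reading and resetting them with standard btrfs-progs; a sketch, assuming the pool is still mounted at /mnt/cache-nvme as in the log:

btrfs device stats /mnt/cache-nvme      # print the error counters for each device in the pool
btrfs device stats -z /mnt/cache-nvme   # print the counters, then reset them to zero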
Alchemist Zim Posted January 20, 2023 (Author)

Adding nvme_core.default_ps_max_latency_us=0 pcie_aspm=off seems to have fixed it; been up for a week with no issues. Thanks.
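As a quick sanity check after this kind of change, the active kernel command line and the resulting module parameter can be verified from the console; a sketch, assuming the controller is nvme0:

cat /proc/cmdline                                                # should include both added parameters
cat /sys/module/nvme_core/parameters/default_ps_max_latency_us   # should now read 0
lspci -vv | grep -i aspm                                         # LnkCtl lines should report "ASPM Disabled"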