April 21, 2021
Hi, I have a Threadripper 1950X on an X399 Zenith Extreme (previously an X399 Aorus Pro), a bunch of NVMe SSDs and three Nvidia 1070s. I originally used Samsung PM983 and PM963 SSDs. After a month or so, one SSD suddenly stopped being detected by the motherboard, and soon after so did all the others. They work perfectly in other PCs. I RMAed the Aorus board and got a refund, then bought the new Zenith board, without much success. So I switched to Kioxia Exceria SSDs, and after a week or so one SSD started giving me btrfs errors:

Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): bdev /dev/nvme2n1p1 errs: wr 1324434, rd 888470, flush 21855, corrupt 0, gen 0
Apr 21 12:27:01 Unraidbox kernel: BTRFS warning (device nvme2n1p1): lost page write due to IO error on /dev/nvme2n1p1 (-5)
Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): bdev /dev/nvme2n1p1 errs: wr 1324435, rd 888470, flush 21855, corrupt 0, gen 0
Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): error writing primary super block to device 1

Now I have lost a second SSD to the same error. I found that Threadrippers were affected by PCIe bugs, so I tried switching to PCIe Gen 2, but that did not solve my problem. I feel like the problem is the CPU; unfortunately, these days it is nearly impossible to get hold of another TR4 CPU. Any ideas what to try?
April 22, 2021 Community Expert
Sometimes this helps with NVMe devices dropping: on the main GUI page click on the flash device, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append" and before "initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0

Reboot and see if it makes a difference.
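For anyone who prefers to script the edit described above, here is a minimal Python sketch of where the parameter ends up on the "append" line. Setting this parameter to 0 disables NVMe APST (autonomous power state transitions). The helper name `add_kernel_param` and the sample line `append initrd=/bzroot` are assumptions for illustration only; check your own Syslinux configuration before changing anything.

```python
# Hypothetical helper: places a kernel parameter after "append" and
# before the "initrd=" token of a single Syslinux "append" line.
PARAM = "nvme_core.default_ps_max_latency_us=0"


def add_kernel_param(append_line: str, param: str = PARAM) -> str:
    """Return the append line with `param` inserted, if it is not there yet."""
    # Preserve the original leading indentation of the line.
    indent = append_line[: len(append_line) - len(append_line.lstrip())]
    tokens = append_line.split()
    if not tokens or tokens[0] != "append":
        raise ValueError("not an 'append' line")
    if param in tokens:
        return append_line  # already present, leave the line untouched
    for i, tok in enumerate(tokens):
        if tok.startswith("initrd="):
            tokens.insert(i, param)  # goes right before initrd=...
            break
    else:
        tokens.append(param)  # no initrd= token found, add at the end
    return indent + " ".join(tokens)


if __name__ == "__main__":
    # Assumed sample line, not taken from the thread.
    print(add_kernel_param("  append initrd=/bzroot"))
```

Running the function a second time on its own output returns the line unchanged, so the parameter is not duplicated.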
April 23, 2021 Author
It is too soon to tell whether it solved the issue, but from what I have found, it seems to be the right fix for my issue. I might ask for this to be added to the Fix Common Problems plugin.
April 23, 2021 Author
Unfortunately, the issue still persists even though I have added "nvme_core.default_ps_max_latency_us=0":

Apr 22 23:39:42 Unraidbox kernel: BTRFS error (device nvme3n1p1): bdev /dev/nvme3n1p1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
Apr 22 23:39:42 Unraidbox kernel: BTRFS: error (device nvme3n1p1) in btrfs_commit_transaction:2377: errno=-5 IO failure (Error while writing out transaction)
Apr 22 23:39:42 Unraidbox kernel: BTRFS info (device nvme3n1p1): forced readonly
Apr 22 23:39:42 Unraidbox kernel: BTRFS warning (device nvme3n1p1): Skipping commit of aborted transaction.
Apr 22 23:39:42 Unraidbox kernel: BTRFS: error (device nvme3n1p1) in cleanup_transaction:1942: errno=-5 IO failure

Edited April 23, 2021 by Maor
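To compare drives or track how fast the counters grow, the "errs:" lines in the logs above can be pulled apart with a short script. This is only a sketch: the regular expression is written against the exact line format quoted in this thread, and the function name `parse_btrfs_errs` is made up for illustration.

```python
import re

# Matches the per-device counter lines quoted in this thread, e.g.
# "BTRFS error (device nvme3n1p1): bdev /dev/nvme3n1p1 errs: wr 1, rd 0, ..."
ERRS_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)\s]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for a counter line, or None for other lines."""
    match = ERRS_RE.search(line)
    if match is None:
        return None
    counters = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "dev"
    }
    return match.group("dev"), counters


if __name__ == "__main__":
    sample = (
        "Apr 22 23:39:42 Unraidbox kernel: BTRFS error (device nvme3n1p1): "
        "bdev /dev/nvme3n1p1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0"
    )
    print(parse_btrfs_errs(sample))
```

In both logs the "corrupt" and "gen" counters stay at 0 while the write, read and flush counters climb, which points at I/O to the device failing rather than at data being silently corrupted.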
April 23, 2021 Community Expert
Then it's most likely a hardware issue. Look for a BIOS update if you haven't already.
April 23, 2021 Author
I am already on the latest BIOS. I have noticed that early after boot, dmesg shows this for the failing drive:

nvme nvme4: failed to set APST feature (-19)

Edited April 23, 2021 by Maor
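The number in parentheses in that dmesg line is a negative Linux errno value, and 19 is ENODEV ("No such device"). A rough Python sketch for decoding such lines follows; the function name `parse_apst_failure` is hypothetical and the pattern is written only against the message quoted above.

```python
import errno
import re

# Pattern for the dmesg message quoted above:
# "nvme nvme4: failed to set APST feature (-19)"
APST_RE = re.compile(
    r"nvme (?P<ctrl>nvme\d+): failed to set APST feature \((?P<code>-?\d+)\)"
)


def parse_apst_failure(line):
    """Return (controller, errno name) for an APST failure line, else None."""
    match = APST_RE.search(line)
    if match is None:
        return None
    code = abs(int(match.group("code")))  # the kernel prints -errno
    return match.group("ctrl"), errno.errorcode.get(code, "unknown")


if __name__ == "__main__":
    print(parse_apst_failure("nvme nvme4: failed to set APST feature (-19)"))
```

An ENODEV this early in boot would be consistent with the controller not responding when the driver tried to configure it, though that is an interpretation and not something the log states directly.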
April 23, 2021 Community Expert
Then I don't have any other suggestions, other than using different hardware.