Maor Posted April 21, 2021
Hi, I have a Threadripper 1950X on an X399 Zenith Extreme; I used to have an X399 Aorus Pro. I have a bunch of NVMe SSDs and three Nvidia 1070s. I originally had Samsung PM983 and PM963 SSDs, and after a month or so one SSD suddenly stopped being detected by the motherboard, followed soon after by all of them. They work perfectly in other PCs. I RMAed the Aorus board and got refunded, so I bought a new Zenith board, without much success. I then switched to Kioxia Exceria SSDs, and after a week or so one SSD started giving me btrfs errors:
Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): bdev /dev/nvme2n1p1 errs: wr 1324434, rd 888470, flush 21855, corrupt 0, gen 0
Apr 21 12:27:01 Unraidbox kernel: BTRFS warning (device nvme2n1p1): lost page write due to IO error on /dev/nvme2n1p1 (-5)
Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): bdev /dev/nvme2n1p1 errs: wr 1324435, rd 888470, flush 21855, corrupt 0, gen 0
Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): error writing primary super block to device 1
Now I have lost a second SSD to the same error. I found that Threadrippers were affected by PCIe bugs, so I tried switching to PCIe Gen 2, but that does not solve my problem. I feel like the problem is the CPU; unfortunately, these days it is nearly impossible to get hold of another TR4 CPU. Any ideas what to try?
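For anyone comparing several drives, the per-device counters in the "errs:" lines above can be pulled out of the syslog with a small script. This is only a sketch: the regular expression is modelled on the exact log lines quoted in this post, and the names `ERRS_RE` and `parse_btrfs_errs` are made up for illustration. Other kernel versions may format the message differently.

```python
import re

# Pattern modelled on the "errs:" lines quoted in the post above (assumption:
# other kernel versions may word this message differently).
ERRS_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)\s]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_btrfs_errs(line):
    """Return (device, counters) for an 'errs:' line, or None for any other line."""
    match = ERRS_RE.search(line)
    if match is None:
        return None
    counters = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "dev"
    }
    return match.group("dev"), counters


if __name__ == "__main__":
    sample = (
        "Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): "
        "bdev /dev/nvme2n1p1 errs: wr 1324434, rd 888470, flush 21855, "
        "corrupt 0, gen 0"
    )
    print(parse_btrfs_errs(sample))
```

Here the write and read counters are large while "corrupt" and "gen" stay at 0, which fits the drive dropping off the bus rather than returning bad data.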
JorgeB Posted April 22, 2021
Sometimes this helps with NVMe devices dropping: on the main GUI page click on the flash device, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append" and before "initrd=/bzroot":
nvme_core.default_ps_max_latency_us=0
Reboot and see if it makes a difference.
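To make the placement concrete, here is a minimal sketch of the same edit done on the configuration text instead of through the GUI. The helper name `add_kernel_param` and the sample configuration are assumptions for illustration (your syslinux.cfg may contain more entries or options), and the code assumes "menu default" appears before the "append" line within the default entry.

```python
PARAM = "nvme_core.default_ps_max_latency_us=0"

# Hypothetical sample; a real syslinux.cfg may differ.
SAMPLE_CFG = """\
label Unraid OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
label Unraid OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
"""


def add_kernel_param(cfg_text, param):
    """Insert `param` right after `append` in the entry marked `menu default`."""
    lines = cfg_text.splitlines()
    in_default = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("label "):
            in_default = False  # a new boot entry starts here
        elif stripped == "menu default":
            in_default = True  # the current entry is the default one
        elif in_default and stripped.startswith("append "):
            if param not in stripped.split():
                indent = line[: len(line) - len(line.lstrip())]
                rest = stripped[len("append "):]
                lines[i] = f"{indent}append {param} {rest}"
            break  # only the default entry is touched
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(add_kernel_param(SAMPLE_CFG, PARAM))
```

With the sample above, the default entry's line becomes "append nvme_core.default_ps_max_latency_us=0 initrd=/bzroot" and the GUI Mode entry is left unchanged.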
Maor Posted April 23, 2021
It is too soon to tell whether it solved the issue, but from what I found, it seems to be the right fix for my problem. I might ask for this to be added to the Fix Common Problems plugin.
Maor Posted April 23, 2021
Unfortunately, the issue still persists after I added "nvme_core.default_ps_max_latency_us=0":
Apr 22 23:39:42 Unraidbox kernel: BTRFS error (device nvme3n1p1): bdev /dev/nvme3n1p1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
Apr 22 23:39:42 Unraidbox kernel: BTRFS: error (device nvme3n1p1) in btrfs_commit_transaction:2377: errno=-5 IO failure (Error while writing out transaction)
Apr 22 23:39:42 Unraidbox kernel: BTRFS info (device nvme3n1p1): forced readonly
Apr 22 23:39:42 Unraidbox kernel: BTRFS warning (device nvme3n1p1): Skipping commit of aborted transaction.
Apr 22 23:39:42 Unraidbox kernel: BTRFS: error (device nvme3n1p1) in cleanup_transaction:1942: errno=-5 IO failure
Edited April 23, 2021 by Maor
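Before ruling the setting out, it may be worth confirming that the kernel actually booted with it. The sketch below reads /proc/cmdline and, where available, the module parameter under /sys/module/nvme_core/parameters/. The helper names are made up for illustration, and it is an assumption that your kernel build exposes that sysfs file; a missing file is reported as None rather than treated as an error.

```python
from pathlib import Path

PARAM_NAME = "nvme_core.default_ps_max_latency_us"
# Assumption: the running kernel exposes this module parameter in sysfs.
SYSFS_PARAM = "/sys/module/nvme_core/parameters/default_ps_max_latency_us"


def has_kernel_param(cmdline, name, value):
    """True if `name=value` appears as its own token on the kernel command line."""
    return f"{name}={value}" in cmdline.split()


def read_text_or_none(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text_or_none("/proc/cmdline") or ""
    print("on kernel command line:", has_kernel_param(cmdline, PARAM_NAME, "0"))
    print("value seen by nvme_core:", read_text_or_none(SYSFS_PARAM))
```

If the first line prints False, the edit most likely went into a boot entry other than the one being used.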
JorgeB Posted April 23, 2021
Then it's most likely a hardware issue; look for a BIOS update if you aren't already on the latest version.
Maor Posted April 23, 2021
I am already on the latest BIOS. I have noticed that early after boot, dmesg shows this for the failing drive:
nvme nvme4: failed to set APST feature (-19)
Edited April 23, 2021 by Maor
JorgeB Posted April 23, 2021
Then I don't have any other suggestions, other than using different hardware.