Threadripper 1950X errors with NVMe SSDs



Hi,

I have a Threadripper 1950X on an X399 Zenith Extreme (previously an X399 Aorus Pro), with a bunch of NVMe SSDs and three NVIDIA 1070 cards.
I originally had Samsung PM983 and PM963 SSDs. After a month or so, one SSD suddenly stopped being detected by the motherboard, and soon after so did all the others. They work perfectly in other PCs. I RMAed the Aorus board and got a refund, then bought a new Zenith board, but that did not help.
So I switched to Kioxia Exceria SSDs, and after a week or so one of them started giving me btrfs errors:
 

Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): bdev /dev/nvme2n1p1 errs: wr 1324434, rd 888470, flush 21855, corrupt 0, gen 0
Apr 21 12:27:01 Unraidbox kernel: BTRFS warning (device nvme2n1p1): lost page write due to IO error on /dev/nvme2n1p1 (-5)
Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): bdev /dev/nvme2n1p1 errs: wr 1324435, rd 888470, flush 21855, corrupt 0, gen 0
Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): error writing primary super block to device 1
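For anyone comparing logs like these, here is a minimal sketch that pulls the per-device error counters out of the kernel log lines. The regular expression is only fitted to the line format shown above, and the function name `latest_counters` is made up for illustration. The wr, rd and flush counters are I/O errors, while corrupt and gen count checksum and generation mismatches, so large wr/rd values with corrupt 0 point at the device or link dropping rather than at bad data coming back. `btrfs device stats` reports the same counters.

```python
import re

# Fitted to the counter lines shown in this thread; other kernels may format differently.
COUNTER_RE = re.compile(
    r"BTRFS error \(device (?P<dev>\S+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def latest_counters(lines):
    """Return {device: {counter: value}}, keeping the last matching line per device."""
    result = {}
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            fields = match.groupdict()
            dev = fields.pop("dev")
            result[dev] = {name: int(value) for name, value in fields.items()}
    return result


# Two of the log lines from the post above.
sample = [
    "Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): "
    "bdev /dev/nvme2n1p1 errs: wr 1324434, rd 888470, flush 21855, corrupt 0, gen 0",
    "Apr 21 12:27:01 Unraidbox kernel: BTRFS error (device nvme2n1p1): "
    "bdev /dev/nvme2n1p1 errs: wr 1324435, rd 888470, flush 21855, corrupt 0, gen 0",
]

for dev, counters in latest_counters(sample).items():
    print(dev, counters)
```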

Now I have lost a second SSD to the same error.

I have found that Threadrippers were affected by PCIe bugs, so I tried switching to PCIe Gen 2, but that did not solve my problem.
I feel like the problem is the CPU; unfortunately, it is nearly impossible to get hold of another TR4 CPU these days.
Any ideas what to try?


Sometimes this helps with NVMe devices dropping: on the main GUI page click on the flash device, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add the following to your default boot option, after "append" and before "initrd=/bzroot":

 

nvme_core.default_ps_max_latency_us=0

Reboot and see if it makes a difference.
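After the reboot it is worth confirming the option was actually picked up. A minimal sketch, assuming a Linux system where `/proc/cmdline` is readable and the `nvme_core` parameter is exposed under `/sys/module` (the helper name `cmdline_value` is made up for illustration):

```python
from pathlib import Path

PARAM = "nvme_core.default_ps_max_latency_us"


def cmdline_value(cmdline, name=PARAM):
    """Return the value of name=value from a kernel command line string, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name and sep:
            value = val  # keep the last occurrence if the option is repeated
    return value


if __name__ == "__main__":
    # What the kernel was booted with.
    print("cmdline:", cmdline_value(Path("/proc/cmdline").read_text()))
    # What the nvme_core module is currently using, if the sysfs entry exists.
    sysfs = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")
    if sysfs.exists():
        print("sysfs:", sysfs.read_text().strip())
```

If both report 0, the power-state latency setting is in effect and the drops are probably caused by something else.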

 

 


Unfortunately the issue still persists after adding "nvme_core.default_ps_max_latency_us=0":

Apr 22 23:39:42 Unraidbox kernel: BTRFS error (device nvme3n1p1): bdev /dev/nvme3n1p1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
Apr 22 23:39:42 Unraidbox kernel: BTRFS: error (device nvme3n1p1) in btrfs_commit_transaction:2377: errno=-5 IO failure (Error while writing out transaction)
Apr 22 23:39:42 Unraidbox kernel: BTRFS info (device nvme3n1p1): forced readonly
Apr 22 23:39:42 Unraidbox kernel: BTRFS warning (device nvme3n1p1): Skipping commit of aborted transaction.
Apr 22 23:39:42 Unraidbox kernel: BTRFS: error (device nvme3n1p1) in cleanup_transaction:1942: errno=-5 IO failure

 

Edited by Maor

I am already on the latest BIOS. I have noticed that early after boot, dmesg shows this for the failing drive:

nvme nvme4: failed to set APST feature (-19)
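The numbers in parentheses in these kernel messages are negated errno values. A small sketch for decoding them, assuming it runs on Linux where the numbering matches the kernel's (the helper name `describe` is made up for illustration):

```python
import errno
import os


def describe(neg_errno):
    """Translate a negative kernel return code such as -5 into (name, message)."""
    code = abs(neg_errno)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


# -5 appears in the btrfs messages, -19 in the APST message.
for value in (-5, -19):
    name, message = describe(value)
    print(value, name, message)
```

On Linux, -5 is EIO (I/O error) and -19 is ENODEV (no such device), which would fit the controller no longer responding by the time the driver tries to configure APST.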

 

Edited by Maor