Samsung 990 PRO 4TB disappeared, for the 3rd time.



Hi all,
I upgraded my Unraid server 3 months ago to the following:

  • ASRock Z690 Extreme purchase date: 26 nov 2023
  • 13th Gen Intel® Core™ i3-13100F purchase date: 26 nov 2023
  • 64 GiB DDR4 purchase date: 26 nov 2023
  • GeForce GTX 1050 Ti 
  • Samsung_SSD_980_1TB (old ssd but working fine) purchase date: 4 years ago 
  • Samsung 990 PRO 4TB (Not working, missing) purchase date: 07 jan 2024 

  • 5x 4TB HDD, 3 of which are WD Red and the others are Seagate Constellation ES.3 (refurbished)

 

Hey, so my setup was all good initially, but after about 1 or 2 weeks, I started getting this annoying filesystem read-only error. And guess what? After a reboot, Unraid says my SSD is missing. The weird thing is, when I jiggle it around a bit (yes, I'm talking about reseating it), it magically works again for another couple of weeks. But hey, I'm on vacation right now and can't do that.

So, to avoid the headache, I just removed the SSD for now. It's not like it's doing anything super important anyway.


The first problem listed in the syslog is:
Feb 21 08:00:02 Tower kernel: btrfs_dev_stat_inc_and_print: 66 callbacks suppressed
Feb 21 08:00:02 Tower kernel: BTRFS error (device nvme1n1p1: state EA): bdev /dev/nvme1n1p1 errs: wr 2, rd 585789, flush 0, corrupt 0, gen 0

Yeah, that's the culprit, spamming my logs over and over again. Annoying as heck!
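For anyone who wants to track those counters instead of reading them by eye, here is a minimal Python sketch that pulls the numbers out of a line like the one above. The function name `parse_btrfs_errs` is made up for illustration, and the pattern is only matched against the log format shown in this post, so it may need adjusting for other kernel versions.

```python
import re

# Pattern for the per-device counters as they appear in the syslog line above.
# Assumption: the field order (wr, rd, flush, corrupt, gen) matches this post's log.
BTRFS_ERRS = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

def parse_btrfs_errs(line):
    """Return the device and its error counters, or None if the line does not match."""
    m = BTRFS_ERRS.search(line)
    if m is None:
        return None
    counters = {k: int(v) for k, v in m.groupdict().items() if k != "dev"}
    return {"dev": m.group("dev"), **counters}

sample = (
    "Feb 21 08:00:02 Tower kernel: BTRFS error (device nvme1n1p1: state EA): "
    "bdev /dev/nvme1n1p1 errs: wr 2, rd 585789, flush 0, corrupt 0, gen 0"
)
print(parse_btrfs_errs(sample))
```

In the sample line the read counter is far larger than the write counter, which fits a device that stopped answering reads rather than one that is corrupting data (corrupt and gen are both 0).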

 


I don't see the device dropping in those diags. If it happens again, save new ones. In the meantime you can try this:

 

On the main GUI page click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right) and add this to your default boot option, after "append initrd=/bzroot"

nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off


Reboot and see if it makes a difference.
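After the reboot it can be worth confirming that the options actually reached the kernel. The sketch below is one way to do that, assuming Python is available on the server (it is not part of a stock install as far as I know). It reads `/proc/cmdline`, which is where Linux exposes the boot command line. The helper name `missing_params` is hypothetical.

```python
from pathlib import Path

# The two options suggested above.
WANTED = ("nvme_core.default_ps_max_latency_us=0", "pcie_aspm=off")

def missing_params(cmdline, wanted=WANTED):
    """Return the wanted parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [p for p in wanted if p not in present]

if __name__ == "__main__":
    cmdline_path = Path("/proc/cmdline")
    if cmdline_path.exists():
        missing = missing_params(cmdline_path.read_text())
        print("missing:", missing if missing else "none")
```

If anything is reported as missing, the append line was probably edited under a boot entry other than the default one. When the nvme_core module is loaded, the value in effect should also be readable from `/sys/module/nvme_core/parameters/default_ps_max_latency_us`.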

 


syslog-20240220-184239.txt This is the log from yesterday.

Feb 20 02:12:55 Tower kernel: nvme nvme1: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
Feb 20 02:12:55 Tower kernel: nvme nvme1: Does your device have a faulty power saving mode enabled?
Feb 20 02:12:55 Tower kernel: nvme nvme1: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
Feb 20 02:12:55 Tower kernel: nvme1n1: I/O Cmd(0x2) @ LBA 709300912, 32 blocks, I/O Error (sct 0x3 / sc 0x71) 
Feb 20 02:12:55 Tower kernel: I/O error, dev nvme1n1, sector 709300912 op 0x0:(READ) flags 0x80700 phys_seg 3 prio class 2
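To find every such drop in a saved syslog without scrolling, something like the following Python sketch could be used. The name `find_controller_resets` is made up for illustration, and the pattern is written against the exact message format in the lines above. The `all_ones` flag rests on my understanding that register reads returning all 1s (CSTS=0xffffffff, PCI_STATUS=0xffff) usually mean the device stopped responding on the PCIe bus; treat that as an assumption, not a diagnosis.

```python
import re

# Matches the "controller is down" message as it appears in the log above.
CONTROLLER_DOWN = re.compile(
    r"nvme (?P<ctrl>nvme\d+): controller is down; will reset: "
    r"CSTS=(?P<csts>0x[0-9a-fA-F]+), PCI_STATUS=(?P<pci>0x[0-9a-fA-F]+)"
)

def find_controller_resets(lines):
    """Collect one record per 'controller is down' message found in the given lines."""
    events = []
    for line in lines:
        m = CONTROLLER_DOWN.search(line)
        if m is None:
            continue
        csts = int(m.group("csts"), 16)
        pci = int(m.group("pci"), 16)
        events.append({
            "ctrl": m.group("ctrl"),
            "csts": csts,
            "pci_status": pci,
            # Assumption: all-ones reads suggest the device fell off the bus.
            "all_ones": csts == 0xFFFFFFFF and pci == 0xFFFF,
        })
    return events

sample_log = [
    "Feb 20 02:12:55 Tower kernel: nvme nvme1: controller is down; will reset: "
    "CSTS=0xffffffff, PCI_STATUS=0xffff",
    "Feb 20 02:12:55 Tower kernel: nvme nvme1: Does your device have a faulty "
    "power saving mode enabled?",
]
for event in find_controller_resets(sample_log):
    print(event["ctrl"], "all-ones read:", event["all_ones"])
```

To run it over a real file, pass an open file object instead of `sample_log`, since the function only needs something that yields lines.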

I have used powertop to lower the power consumption a bit, but I can't see it in there now.

Also, the reboot did nothing. It did not come back, at least.
