(Solved) Read Errors on new Sata Drives and Lost NVME Cache


Sundune
Go to solution Solved by JorgeB,

Hello Forum, 

 

Last week I installed two new Seagate 8TB drives in my server and expanded the array, and everything worked fine.

During the initialization there was a read error on one of the disks, which Unraid resolved by itself. I didn't think much of it, considering it was a one-time occurrence during a period of heavy load.

But since then, read errors on the new disks have become more common.

Since then the NVMe cache drive has also occasionally been lost, which makes me guess the problem may not be related to the SATA/SAS controller but rather to the mainboard/CPU being overloaded?

I checked the cables and made sure the SATA/SAS connectors are dust-free, as recommended in another thread.

As of writing this, a parity check is running after the cache drive was lost.

The attached diagnostics are from right after the cache drive was lost and before the restart.

 

PS: The drives are brand new and under warranty in case I need to return them.

 

Here is a bit more information about the hardware in use:

MB: SuperMicro X11SSH-LN4F
CPU: XEON E3-1245 v6
RAM: Kingston 9965643-006.A01G 8GB
RAID-Controller: LSI SAS2008 flashed in IT Mode

anon-unraid-diagnostics-20220131-1752.zip

  • Solution

For the disk issue see here. The NVMe issue is unrelated; the device dropped offline:

 

Jan 31 17:46:20 Unraid kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
Jan 31 17:46:20 Unraid kernel: nvme 0000:06:00.0: enabling device (0000 -> 0002)
Jan 31 17:46:20 Unraid kernel: nvme nvme0: Removing after probe failure status: -19
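If you want to check whether the same drop shows up again later, the log can be searched for the messages quoted above. This is only a minimal sketch: the two patterns are copied from those three lines, the function name `find_nvme_drops` is made up for illustration, and other kernel versions may word the messages differently.

```python
import re

# Patterns copied from the kernel messages quoted in this thread.
# Other kernel versions may phrase these differently (assumption).
NVME_DROP_PATTERNS = (
    re.compile(r"nvme (nvme\d+): controller is down; will reset"),
    re.compile(r"nvme (nvme\d+): Removing after probe failure status"),
)


def find_nvme_drops(lines):
    """Return (line_number, device, text) for every line matching a drop pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for pattern in NVME_DROP_PATTERNS:
            match = pattern.search(line)
            if match:
                hits.append((number, match.group(1), line.rstrip("\n")))
                break  # one hit per line is enough
    return hits


# The three lines from the diagnostics, used here as sample input.
SAMPLE = [
    "Jan 31 17:46:20 Unraid kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10",
    "Jan 31 17:46:20 Unraid kernel: nvme 0000:06:00.0: enabling device (0000 -> 0002)",
    "Jan 31 17:46:20 Unraid kernel: nvme nvme0: Removing after probe failure status: -19",
]

for number, device, text in find_nvme_drops(SAMPLE):
    print(f"line {number}: {device} dropped: {text}")
```

To scan a real log, pass an open file instead of `SAMPLE`, for example `find_nvme_drops(open("/var/log/syslog"))`, assuming that is where your syslog lives.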

 

Look for a BIOS update. This can also help sometimes:

 

Some NVMe devices have issues with power states on Linux. Try this: on the main GUI page click on flash, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (on the top right), and add this to your default boot option, after "append initrd=/bzroot":

nvme_core.default_ps_max_latency_us=0

e.g.:

append initrd=/bzroot nvme_core.default_ps_max_latency_us=0


Reboot and see if it makes a difference.
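After the reboot it may be worth confirming that the option actually reached the kernel. A minimal sketch, assuming a standard Linux `/proc/cmdline`; the helper names `has_kernel_param` and `check_running_kernel` are hypothetical and only for illustration.

```python
PARAM = "nvme_core.default_ps_max_latency_us=0"


def has_kernel_param(cmdline, param):
    """Return True if param appears as a whole token in a kernel command line."""
    return param in cmdline.split()


def check_running_kernel(path="/proc/cmdline"):
    """Check the command line the running kernel was booted with."""
    with open(path) as handle:
        return has_kernel_param(handle.read(), PARAM)
```

Calling `check_running_kernel()` on the server should return `True` once the edited boot entry is in use. If it returns `False`, the change was probably added to a boot entry other than the default one.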

Link to comment

Thank you very much for the helpful link! I will try that after the parity check is completed, so that will be tomorrow.

There is actually a new BIOS update available, so that will also be on the checklist.

The issues with the SSD dropping started after I installed the HDDs; of course that could be a coincidence. But I will also give the changes you suggested a try.

 

I forgot to mention that I had to change the syslinux.cfg file to get the HDDs running in the first place, see below:

  append initrd=/bzroot intel_iommu=on iommu=pt
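Since this append line already carries extra options, the NVMe parameter has to go on the same line rather than replace it. A rough sketch of how the two could be merged, assuming the options are plain space-separated tokens; `add_kernel_param` is a made-up helper, not something Unraid provides.

```python
def add_kernel_param(append_line, param):
    """Return append_line with param added exactly once, keeping existing options."""
    # Keep the leading indentation used in syslinux.cfg.
    indent = append_line[: len(append_line) - len(append_line.lstrip())]
    key = param.split("=", 1)[0]
    # Drop any earlier setting of the same key so the new value wins.
    tokens = [t for t in append_line.split() if t.split("=", 1)[0] != key]
    tokens.append(param)
    return indent + " ".join(tokens)


current = "  append initrd=/bzroot intel_iommu=on iommu=pt"
print(add_kernel_param(current, "nvme_core.default_ps_max_latency_us=0"))
```

Running it on the resulting line a second time leaves the line unchanged, so the parameter does not pile up if the edit is repeated.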

 

I will keep this thread updated.
