NVME drives throwing errors, filling logs instantly. How to resolve?


On 2/6/2023 at 9:51 AM, David Bott said:

Hi... I added...

 

append initrd=/bzroot pcie_aspm=off

 

...to the UnRAID OS config area.  (Click on Flash Drive to get to the right area.)  Add it in as a new line, save and reboot.

 

screenshot_3677.thumb.png.e7660e3f829f3e247caa2353d15e9af8.png

I am having the same issue. How do you add this to the config area, and where is it? Thanks

5 hours ago, trurl said:

Please see screenshot in the post you quoted above

I added it under Unraid OS but am still getting the errors, and I have rebooted several times as well. I'll post the screenshots. Let me know if I did something wrong. Thanks!

 

I am running on Version: 6.12.0-rc5
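One way to check whether the parameter was actually picked up after a reboot is to look at the running kernel's command line, which Linux exposes in /proc/cmdline. A minimal sketch follows; `kparam_value` is a made-up helper name for illustration, not something Unraid ships.

```shell
# Print the value of a kernel parameter found in a command-line string.
# Returns non-zero if the parameter is not present.
# Usage on a live system (assumption: run from the Unraid terminal):
#   kparam_value "$(cat /proc/cmdline)" pcie_aspm
kparam_value() {
    printf '%s\n' "$1" | awk -v name="$2" '
        {
            # Parameters are whitespace-separated "name=value" tokens.
            for (i = 1; i <= NF; i++)
                if (index($i, name "=") == 1) {
                    print substr($i, length(name) + 2)
                    found = 1
                    exit
                }
        }
        END { exit !found }'
}
```

If this prints nothing for `pcie_aspm`, the edit was probably saved under a different boot entry than the one that was booted, or was not saved at all.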

 

 

Screenshot 2023-05-16 at 3.47.44 PM.png

Screenshot 2023-05-16 at 3.48.00 PM.png

On 12/17/2022 at 9:59 AM, Nuke said:

Did I change things correctly?

image.png.6a3c2c24f128ab1cc74e9731ba9599b0.png


Hello, 

 

I appreciate all the posts from everyone here. These are the errors I was seeing:

 

Oct 21 03:08:01 UnRAID emhttpd: read SMART /dev/sdf
Oct 21 03:09:47 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 03:09:47 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 03:09:47 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000
Oct 21 03:09:47 UnRAID kernel: pcieport 0000:00:01.0:    [ 6] BadTLP                
Oct 21 03:16:14 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 03:16:14 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 03:16:14 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000
Oct 21 03:16:14 UnRAID kernel: pcieport 0000:00:01.0:    [ 6] BadTLP                
Oct 21 03:19:37 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 03:19:37 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 03:19:37 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000
Oct 21 03:19:37 UnRAID kernel: pcieport 0000:00:01.0:    [ 6] BadTLP                
Oct 21 03:49:51 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 03:49:51 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 03:49:51 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000
Oct 21 03:49:51 UnRAID kernel: pcieport 0000:00:01.0:    [ 6] BadTLP                
Oct 21 04:03:13 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 04:03:13 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 04:03:13 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000
Oct 21 04:03:13 UnRAID kernel: pcieport 0000:00:01.0:    [ 6] BadTLP                
Oct 21 04:07:04 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 04:07:04 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 04:07:04 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000
Oct 21 04:07:04 UnRAID kernel: pcieport 0000:00:01.0:    [ 6] BadTLP                
Oct 21 04:10:54 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 04:10:54 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 04:10:54 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000
Oct 21 04:10:54 UnRAID kernel: pcieport 0000:00:01.0:    [ 6] BadTLP                
Oct 21 04:19:07 UnRAID emhttpd: read SMART /dev/sde
Oct 21 04:19:07 UnRAID emhttpd: read SMART /dev/sdo
Oct 21 05:04:47 UnRAID kernel: pcieport 0000:00:01.0: AER: Multiple Corrected error received: 0000:00:01.0
Oct 21 05:04:47 UnRAID kernel: pcieport 0000:00:01.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Oct 21 05:04:47 UnRAID kernel: pcieport 0000:00:01.0:   device [8086:6f02] error status/mask=00000040/00002000

unraid-error.thumb.jpg.059944049a2cff6b80be910edb477ee6.jpg
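To see at a glance which port is reporting and how often, the "PCIe Bus Error" lines can be tallied per device, severity and layer. This is only a sketch that assumes the message format shown in the log above; `aer_summary` is a hypothetical helper name, and /var/log/syslog as the log location is an assumption about the system.

```shell
# Count "PCIe Bus Error" lines per device address, severity and error type.
# Reads syslog text on stdin and prints "<count> <device> <severity> <type>".
# Usage (log path is an assumption):
#   aer_summary < /var/log/syslog
aer_summary() {
    awk '
        /PCIe Bus Error/ {
            # The device address is the field just before "PCIe",
            # e.g. "0000:00:01.0:" (trailing colon stripped below).
            dev = ""
            for (i = 2; i <= NF; i++)
                if ($i == "PCIe") { dev = $(i - 1); break }
            sub(/:$/, "", dev)

            sev = $0; sub(/.*severity=/, "", sev); sub(/,.*/, "", sev)
            typ = $0; sub(/.*type=/, "", typ);     sub(/,.*/, "", typ)

            count[dev " " sev " " typ]++
        }
        END { for (k in count) print count[k], k }
    ' | sort
}
```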

 

Added the following:


Select > Main
Select > Flash
Scroll down > Syslinux Configuration
 

Unraid OS

kernel /bzimage
append pcie_aspm=off, initrd=/bzroot pci=noaer

image.png.3463b468a9cb0642b8fdf1e27bf402dd.png
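A note for anyone copying the append line: the kernel splits its command line on whitespace, so punctuation typed after a value (such as a comma) stays attached to that value and the option may then not be recognized. A small sketch for spotting that before rebooting; `lint_append` is a hypothetical helper name.

```shell
# Print every whitespace-separated token that ends in a comma or semicolon.
# Returns non-zero if at least one such token is found.
# Usage: lint_append 'append initrd=/bzroot pcie_aspm=off'
lint_append() {
    printf '%s\n' "$1" | awk '
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /[,;]$/) { print $i; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }'
}
```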
 

Also referenced this post regarding pci=noaer

 

Since the reboot, there have been no further errors.
Appreciate the assist here everyone!
Thank you!


The "pcie_aspm=off" kernel parameter hides a problem.  I would really like to fix the underlying problem so the parameter isn't needed.

 

If anybody is willing to help fix it, please open a bug report at https://bugzilla.kernel.org/, product Drivers/PCI, mention the hardware platform, and attach:

  • complete dmesg log (I assume this will include some Correctable Errors)
  • output of "sudo lspci -vv"

Try booting with the "pcie_aspm=off" kernel parameter to see if it makes any difference. If it does, please also attach similar dmesg and lspci output for this boot.
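Collecting the two attachments for each boot could look roughly like this. It is only a sketch: `collect_pci_report`, the output directory and the file names are arbitrary choices for illustration, and it should be run as root (or via sudo) so that lspci can read the full capability data.

```shell
# Save the dmesg log and "lspci -vv" output for the current boot.
#   $1 = output directory (default: current directory)
#   $2 = tag for this boot, e.g. "default" or "aspm-off"
# Usage: collect_pci_report /boot/pci-report aspm-off
# (the /boot path in the usage line is only an example location)
collect_pci_report() {
    outdir=${1:-.}
    tag=${2:-default}
    mkdir -p "$outdir" || return 1
    dmesg > "$outdir/dmesg-$tag.log"
    lspci -vv > "$outdir/lspci-$tag.log"
}
```

Run it once on a normal boot and once after booting with pcie_aspm=off, using a different tag each time, then attach all four files to the bug report.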

This seems similar to https://bugzilla.kernel.org/show_bug.cgi?id=215027, which we originally thought was related to Intel VMD and/or the Samsung NVMe device you have, but I now suspect we might have an ASPM configuration problem.
