Constant errors on logs after nvme upgrade

Moises · September 9, 2021


Sep 9 18:43:28 Tower kernel: pcieport 0000:00:1b.0: AER: Corrected error received: 0000:02:00.0
Sep 9 18:43:28 Tower kernel: nvme 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Sep 9 18:43:28 Tower kernel: nvme 0000:02:00.0: device [15b7:5006] error status/mask=00000001/0000e000

After upgrading from an old generic SSD to a WD Black SN750, I am seeing this message constantly in my log. Nothing seems to be wrong with the server, but it makes any kind of debugging impossible since the message appears every 5-10 seconds. Why is this happening, and how do I fix it?

SimonF · September 9, 2021

5 minutes ago, Moises said:

After upgrading from an old generic SSD to a WD Black SN750, I am seeing this message constantly in my log. Nothing seems to be wrong with the server, but it makes any kind of debugging impossible since the message appears every 5-10 seconds. Why is this happening, and how do I fix it?

Have a look at this post from Jorge.

https://lime-technology.com/forums/topic/72837-error-log-filled-to-100-in-1day-47min-with-this/?do=findComment&comment=669775

Moises · September 9, 2021

Sadly this did not fix it. I don't have another M.2 slot on my motherboard, so I can't move the drive, and adding pci=nommconf did not do anything.

Edit: not sure how much this matters, but I am booting in legacy mode on my motherboard. With UEFI boot, Unraid is not able to see my NVMe drive for some reason.

Edited September 10, 2021 by Moises

Mattitude · September 21, 2021

Did you manage to resolve this?

I'm getting the same thing for my NVMe drive as well, and it fills the log file in a matter of hours.

kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:02:00.0
kernel: nvme 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
kernel: nvme 0000:02:00.0: device [144d:a804] error status/mask=00000001/00006000
kernel: nvme 0000:02:00.0: [ 0] RxErr
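
For anyone trying to read the status/mask field in these lines, here is a rough sketch of how it can be decoded. It assumes the value is the PCIe AER correctable error status register printed in hex, and the short bit names are the ones the Linux kernel prints (such as the RxErr line above). The function name `decode_aer_correctable` is made up for this example.

```shell
#!/bin/sh
# Decode a PCIe AER *correctable* error status (or mask) value, given as
# hex digits without the 0x prefix, into the short per-bit names that the
# Linux kernel prints. Hypothetical helper, not part of any tool.
decode_aer_correctable() {
    status=$((0x$1))
    names=""
    # bit:name pairs for the correctable error status register
    for entry in 0:RxErr 6:BadTLP 7:BadDLLP 8:Rollover 12:Timeout \
                 13:NonFatalErr 14:CorrIntErr 15:HeaderOF; do
        bit=${entry%%:*}
        name=${entry#*:}
        if [ $(( ($status >> $bit) & 1 )) -eq 1 ]; then
            names="${names:+$names }$name"
        fi
    done
    echo "$names"
}

decode_aer_correctable 00000001   # status value from the log
decode_aer_correctable 00006000   # mask value from the log
```

With the values from the log, the status decodes to RxErr (a receiver error on the link that the hardware already corrected), and the mask covers NonFatalErr and CorrIntErr, meaning those two error types are masked from reporting.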

Mattitude · September 21, 2021

I've managed to find a solution by using a different kernel parameter in the syslinux configuration.

append initrd=/bzroot pci=noaer

Apparently pci=noaer disables PCIe Advanced Error Reporting. Unfortunately it's a bit like pulling the bulb behind the check engine light in your car, but I'm not sure how else to silence these warnings. There are some other options as well, which I might try to see if they also work:

pci=noaer

or

pci=nomsi

or

pci=nommconf

Found this info here:

See more details here:

And Here: https://askubuntu.com/questions/1104219/what-does-pci-noaer-or-pci-nomsi-mean

JonasH · January 26, 2022

You need to add this to your syslinux configuration. It prevents the NVMe drive from entering its deepest power-saving state. Remove pci=noaer, because it only hides the problem.

nvme_core.default_ps_max_latency_us=5500
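
If you would rather script the change than edit the file by hand, something along these lines might work. It is only a sketch: the helper name `add_kernel_param` is made up, and it assumes the config has `append` lines like the `append initrd=/bzroot` one shown earlier in this thread. It prints the modified config to stdout and does not touch the original file.

```shell
#!/bin/sh
# Add a kernel parameter to every "append" line of a syslinux config,
# skipping lines that already contain it. The result is written to
# stdout; the input file is not modified. Hypothetical helper.
add_kernel_param() {
    cfg=$1
    param=$2
    awk -v p="$param" '
        # Only touch append lines that do not already carry the parameter.
        $1 == "append" && index(" " $0 " ", " " p " ") == 0 { $0 = $0 " " p }
        { print }
    ' "$cfg"
}

# Example (the path is an assumption; on Unraid the file normally lives
# on the flash drive, so check where yours is before using it):
# add_kernel_param /boot/syslinux/syslinux.cfg \
#     nvme_core.default_ps_max_latency_us=5500 > /tmp/syslinux.cfg.new
```

Review the output before copying it over the real file. After a reboot, `cat /proc/cmdline` should show the parameter, and `cat /sys/module/nvme_core/parameters/default_ps_max_latency_us` should report the value the nvme_core module actually picked up.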

Edited January 26, 2022 by JonasH
