March 4, 2018
I have started having crashes/freezes of the entire Unraid server. They started while I was using Plex, though not under load, so I'm not sure it's related. This is the error log, but I don't understand it. Any idea where I should start troubleshooting this?
Mar 4 07:52:17 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:52:17 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:52:17 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:52:17 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:52:50 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:52:50 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:52:50 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:52:50 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:53:45 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:53:45 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:53:45 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:53:45 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:54:10 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:54:10 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:54:10 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:54:10 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:54:14 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:54:14 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:54:14 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:54:14 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:54:47 MediaCenter root: Fix Common Problems Version 2018.02.18
Mar 4 07:54:49 MediaCenter root: Fix Common Problems: Error: Default docker appdata location is not a cache-only share
Mar 4 07:54:49 MediaCenter root: Fix Common Problems Version 2018.02.18
Mar 4 07:54:50 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:54:50 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:54:50 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:54:50 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:54:51 MediaCenter root: Fix Common Problems: Error: Default docker appdata location is not a cache-only share
Mar 4 07:54:51 MediaCenter root: Fix Common Problems: Error: unclean shutdown detected of your server
Mar 4 07:54:51 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:54:51 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:54:51 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:54:51 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:54:53 MediaCenter root: Fix Common Problems: Error: unclean shutdown detected of your server
Mar 4 07:55:03 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:55:03 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:55:03 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:55:03 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:55:17 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:55:17 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:55:17 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 4 07:55:17 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 4 07:55:22 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 4 07:55:22 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 4 07:55:22 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Edited March 4, 2018 by L0rdRaiden
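For readers puzzled by the `error status/mask=00001000/00006000` field in the log above, here is a minimal sketch of how the two hexadecimal values can be read as bit fields. The bit names follow the PCIe AER Correctable Error Status register as I understand it from the specification; the function names are made up for this illustration, and the kernel's own wording for each bit may differ between versions.

```python
# Bit positions of the PCIe AER Correctable Error Status register.
# Assumption: names taken from the PCIe specification; unknown bits are
# reported as reserved rather than guessed.
AER_CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_bits(value):
    """Return a label for every bit that is set in a 32-bit register value."""
    labels = []
    for bit in range(32):
        if (value >> bit) & 1:
            name = AER_CORRECTABLE_BITS.get(bit, "Unknown/reserved")
            labels.append(f"[{bit}] {name}")
    return labels


def decode_status_mask(field):
    """Split a 'status/mask' string from the log and decode both halves."""
    status_hex, mask_hex = field.split("/")
    return {
        "status": decode_bits(int(status_hex, 16)),
        "mask": decode_bits(int(mask_hex, 16)),
    }


if __name__ == "__main__":
    decoded = decode_status_mask("00001000/00006000")
    print("status:", decoded["status"])
    print("mask:  ", decoded["mask"])
```

The status value has only bit 12 set, which matches the `[12] Replay Timer Timeout` line the kernel prints right after it. The mask lists error types that are suppressed from reporting, so it does not describe anything that actually happened.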
March 4, 2018 Author
1 minute ago, Squid said: You really should post the entire diagnostics to put the snippet into perspective.
I have attached the file. Where should I start to look? Thanks
mediacenter-diagnostics-20180304-0808.zip
March 4, 2018 Author
It happened again after I unpacked a RAR file from my PC directly into one of the Unraid shares over SMB. Could this be the issue? How can I fix it?
Edited March 4, 2018 by L0rdRaiden
March 4, 2018 Author
Please, I need help. It happened again while I was just using Radarr and sending files to ruTorrent. All I run are Docker containers, and a failing Docker container isn't supposed to kill the host.
Edited March 4, 2018 by L0rdRaiden
March 4, 2018
Seems to me that you've got a hardware problem and/or BIOS problem. Perhaps someone like @johnnie.black might know more info. I myself haven't seen a dummy host bridge, and that's what your logs are referencing.
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:15d3]
Kernel driver in use: pcieport
March 4, 2018 Author
1 hour ago, Squid said: Seems to me that you've got a hardware problem and/or BIOS problem. Perhaps someone like @johnnie.black might know more info. I myself haven't seen a dummy host bridge, and that's what your logs are referencing.
00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
00:01.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:15d3]
Kernel driver in use: pcieport
Thanks for the help. These are the devices:
IOMMU group 1:[1022:15d3] 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 15d3
IOMMU group 0:[1022:1452] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
IOMMU group 2:[1022:1452] 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge
I'm only using one PCIe slot, which holds an Ethernet card with two Intel NICs (I350).
I had already applied this before the freezes started:
Quote
We ported a simplified version of the zenstates.py utility to C (to avoid including python in bzroot) which may be used to disable Ryzen C6 states (as workaround for Ryzen idle freeze issue). We have found that sometimes bios option to disable C6 does not exist or does not do the right thing. If you want to use this utility, we suggest that you edit the config/go file on your USB flash device. Add this line just before emhttp is invoked:
/usr/local/sbin/zenstates --c6-disable
Edited March 4, 2018 by L0rdRaiden
March 5, 2018
18 hours ago, L0rdRaiden said: Mar 4 07:52:50 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
These errors can usually be fixed with a BIOS update or by using the offending PCIe card in a different slot, preferably swapping from a CPU slot to a chipset slot, or vice versa.
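To find out which card is the "offending" one, it can help to see what actually sits behind the bridge named in the log. The following is a rough sketch that relies on the usual Linux sysfs layout, where devices downstream of a bridge appear as subdirectories of the bridge's own directory. The bridge address `0000:00:01.2` comes from the log in this thread; the function names and the `sysfs_root` parameter are made up for illustration, and nothing in the thread confirms which device is attached there.

```python
import os
import re

# PCI addresses look like 0000:01:00.0 (domain:bus:device.function).
BDF = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def _read(directory, name):
    """Read a small sysfs attribute file, returning '' if it is missing."""
    try:
        with open(os.path.join(directory, name)) as handle:
            return handle.read().strip()
    except OSError:
        return ""


def devices_behind(bridge, sysfs_root="/sys/bus/pci/devices"):
    """List (address, vendor id, device id) for devices directly behind a bridge."""
    bridge_dir = os.path.join(sysfs_root, bridge)
    found = []
    for entry in sorted(os.listdir(bridge_dir)):
        # Skip attribute files and port-service entries; keep real PCI addresses.
        if not BDF.match(entry):
            continue
        dev_dir = os.path.join(bridge_dir, entry)
        found.append((entry, _read(dev_dir, "vendor"), _read(dev_dir, "device")))
    return found


if __name__ == "__main__":
    bridge = "0000:00:01.2"  # address taken from the AER messages above
    if os.path.isdir(os.path.join("/sys/bus/pci/devices", bridge)):
        for address, vendor, device in devices_behind(bridge):
            print(address, vendor, device)
    else:
        print("bridge", bridge, "not present on this machine")
```

If the NIC shows up in that list, moving it to another slot as suggested above is a cheap test. If something else shows up, such as an onboard controller, the slot swap may not change anything.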
March 5, 2018 Author
11 minutes ago, johnnie.black said: These errors can usually be fixed with a BIOS update or by using the offending PCIe card in a different slot, preferably swapping from a CPU slot to a chipset slot, or vice versa.
For now I have changed a BIOS setting related to the PSU idle state (Ryzen here). It looks like it's working; we will see.
March 7, 2018 Author
On 3/5/2018 at 11:12 AM, johnnie.black said: These errors can usually be fixed with a BIOS update or by using the offending PCIe card in a different slot, preferably swapping from a CPU slot to a chipset slot, or vice versa.
I'm getting the freeze again. Could you please confirm whether this is related to the Ryzen C6 state issue or not? Or should I try moving the PCIe card to another slot? If it's a problem with the PCIe card, what is the root cause? A hardware failure?
@limetech Please help me here, I'm about to RMA the processor and get something else.
Ryzen C6 state bug: https://bugzilla.kernel.org/show_bug.cgi?id=196683#c194
Mar 7 23:54:50 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 7 23:54:50 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 7 23:54:50 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 7 23:55:02 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 7 23:55:02 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 7 23:55:02 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 7 23:55:02 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 7 23:55:07 MediaCenter kernel: vethfda2b9d: renamed from eth0
Mar 7 23:55:08 MediaCenter login[6512]: ROOT LOGIN on '/dev/pts/1'
Mar 7 23:55:09 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 7 23:55:09 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 7 23:55:09 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 7 23:55:09 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 7 23:55:15 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 7 23:55:15 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 7 23:55:15 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
Mar 7 23:55:15 MediaCenter kernel: pcieport 0000:00:01.2: [12] Replay Timer Timeout
Mar 7 23:55:19 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
Mar 7 23:55:19 MediaCenter kernel: pcieport 0000:00:01.2: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=000a(Transmitter ID)
Mar 7 23:55:19 MediaCenter kernel: pcieport 0000:00:01.2: device [1022:15d3] error status/mask=00001000/00006000
mediacenter-diagnostics-20180307-2357.zip
Edited March 7, 2018 by L0rdRaiden
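When comparing before and after a BIOS change or a slot swap, it may be easier to count the corrected errors than to eyeball the syslog. Below is a small sketch that tallies `AER: Corrected error received` lines per device and per minute. The regular expression is written against the line format shown in this thread and is an assumption about the format in general; `count_corrected` is a name invented for this example.

```python
import re
from collections import Counter

# Matches lines such as:
# Mar 7 23:55:02 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008
AER_LINE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+ \d{2}:\d{2}):\d{2} \S+ kernel: "
    r"pcieport (?P<dev>\S+): AER: Corrected error received"
)


def count_corrected(lines):
    """Count corrected-error reports keyed by (device address, minute)."""
    counts = Counter()
    for line in lines:
        match = AER_LINE.match(line)
        if match:
            counts[(match["dev"], match["minute"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Mar 7 23:55:02 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008",
        "Mar 7 23:55:07 MediaCenter kernel: vethfda2b9d: renamed from eth0",
        "Mar 7 23:55:09 MediaCenter kernel: pcieport 0000:00:01.2: AER: Corrected error received: id=0008",
    ]
    for (device, minute), total in sorted(count_corrected(sample).items()):
        print(device, minute, total)
```

To run it against a real log, pass an open file object instead of the sample list, for example `count_corrected(open(path))` where `path` points at the syslog file from the diagnostics zip. A count that drops to zero after a change is a better signal than the absence of a freeze over a short period.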
March 7, 2018
7 minutes ago, L0rdRaiden said: I'm getting the freeze again. Could you please confirm whether this is related to the Ryzen C6 state issue or not? Or should I try moving the PCIe card to another slot?
I can't confirm anything. Those errors might be unrelated to the crash, but they are obviously not good, and I already told you what you can do to try to get rid of them. If a BIOS update and/or changing slots doesn't help, the only remaining fix is a different motherboard or a replacement for whichever PCIe card is causing them. This usually isn't bad hardware but a compatibility issue.
March 7, 2018 Author
2 minutes ago, johnnie.black said: I can't confirm anything. Those errors might be unrelated to the crash, but they are obviously not good, and I already told you what you can do to try to get rid of them. If a BIOS update and/or changing slots doesn't help, the only remaining fix is a different motherboard or a replacement for whichever PCIe card is causing them. This usually isn't bad hardware but a compatibility issue.
Thanks. I already have the latest BIOS, so I will try another PCIe slot.
It seems that I'm not the only one: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1521173
Edited March 7, 2018 by L0rdRaiden