Muath Posted November 5, 2019 (edited)
Hi all,
This week I have been struggling with my system and couldn't find the issue. I'm also new to Unraid, so maybe there is some way to investigate and diagnose errors that I don't know about.
My issue is a weird one, I think: the system keeps shutting down, but not the PC itself. Only the OS stops, and the activity light on the USB flash drive stops blinking. On version 6.7 it happened after a while of use (less than 24 hours). I upgraded to 6.8 rc5 hoping the issue would fix itself, but it got worse: now, about 10 minutes after I start the array, the system first becomes very slow and then shuts itself down (only the OS; the PC keeps running).
A diagnostics file taken before the system shut down is attached. I also noticed the following in the system log:
Nov 5 19:42:09 MoathCenterr kernel: ACPI: Early table checksum verification disabled
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:03:02.0: BAR 14: failed to assign [mem size 0x00600000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 3: failed to assign [mem size 0x00040000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 1: failed to assign [mem size 0x00004000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:02:00.0: BAR 14: failed to assign [mem size 0x00c00000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:03:01.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:03:02.0: BAR 14: failed to assign [mem size 0x00600000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:03:03.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:03:08.0: BAR 14: failed to assign [mem size 0x00200000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:03:09.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:03:0a.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 3: failed to assign [mem size 0x00040000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 1: failed to assign [mem size 0x00004000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:05:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:06:00.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:07:05.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:0a:00.0: BAR 0: failed to assign [mem size 0x00020000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:0a:00.0: BAR 3: failed to assign [mem size 0x00004000]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:0c:00.1: BAR 0: failed to assign [mem size 0x00100000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:0c:00.3: BAR 0: failed to assign [mem size 0x00100000 64bit]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:0d:00.0: BAR 5: failed to assign [mem size 0x00000800]
Nov 5 19:42:09 MoathCenterr kernel: pci 0000:0e:00.0: BAR 5: failed to assign [mem size 0x00000800]
Nov 5 19:42:09 MoathCenterr kernel: floppy0: no floppy controllers found
Nov 5 19:42:09 MoathCenterr kernel: random: 7 urandom warning(s) missed due to ratelimiting
Nov 5 19:42:09 MoathCenterr kernel: ccp 0000:11:00.1: SEV: failed to get status. Error: 0x0
Nov 5 19:42:10 MoathCenterr rpc.statd[2043]: Failed to read /var/lib/nfs/state: Success
Nov 5 19:42:10 MoathCenterr ntpd[2073]: bind(19) AF_INET6 fe80::dcb6:50ff:fe5a:f97b%10#123 flags 0x11 failed: Cannot assign requested address
Nov 5 19:42:10 MoathCenterr ntpd[2073]: failed to init interface for address fe80::dcb6:50ff:fe5a:f97b%10
Nov 5 19:42:17 MoathCenterr ntpd[2073]: bind(22) AF_INET6 2001:16a2:953:5d00:7285:c2ff:fed3:fabb#123 flags 0x11 failed: Cannot assign requested address
Nov 5 19:42:17 MoathCenterr ntpd[2073]: failed to init interface for address 2001:16a2:953:5d00:7285:c2ff:fed3:fabb
Nov 5 19:42:33 MoathCenterr avahi-daemon[6713]: WARNING: No NSS support for mDNS detected, consider installing nss-mdns!
For now I will return to the previous version, 6.7; at least there I could watch one or two movies 🙄.
Thank you in advance for helping.
Attached: moathcenterr-diagnostics-20191105-1444.zip
Edited November 5, 2019 by Muath
Squid Posted November 5, 2019
Have you run a memtest yet?
Sent from my NSA monitored device
Muath Posted November 5, 2019
1 hour ago, Squid said: "Have you run a memtest yet? Sent from my NSA monitored device"
Thank you for your quick reply. I tried to run it after you mentioned it, but it won't work. I changed the RAM and used different slots, but it still doesn't work: once I select memtest86+, the screen goes black for 2 seconds and then returns to the motherboard manufacturer's logo!
John_M Posted November 5, 2019
5 minutes ago, Muath said: "once i select memtest86+ it goes black for 2sec then returns to motherboard manufacturer's logo!"
That's a symptom of UEFI booting. You need to legacy boot to be able to select MemTest86+ from the boot menu.
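For anyone who wants to confirm which mode a running Linux system was booted in before trying MemTest86+ again, here is a minimal sketch. It relies on the kernel exposing the directory /sys/firmware/efi only when it was started by UEFI firmware; the function name and the `sysfs_root` parameter are illustrative and not part of Unraid.

```python
from pathlib import Path


def booted_via_uefi(sysfs_root: str = "/sys") -> bool:
    """Return True if the running Linux kernel was started by UEFI firmware.

    sysfs_root is a hypothetical parameter so the check can be exercised
    against a scratch directory; on a real system leave it at "/sys".
    """
    # The kernel creates <sysfs>/firmware/efi only for UEFI boots.
    return (Path(sysfs_root) / "firmware" / "efi").is_dir()


if __name__ == "__main__":
    mode = "UEFI" if booted_via_uefi() else "legacy BIOS (CSM)"
    print(f"Boot mode: {mode}")
```

If this reports UEFI, the bundled MemTest86+ entry is expected to bounce back to the firmware logo as described above, and switching the BIOS to legacy/CSM boot is the thing to try.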
Muath Posted November 6, 2019
20 hours ago, John_M said: "That's a symptom of UEFI booting. You need to legacy boot to be able to select MemTest86+ from the boot menu."
Thank you for pointing that out, it worked. Now, based on the photo, did my RAM pass the test? Should I leave it running longer? Or is there a specific test I should try?
John_M Posted November 6, 2019
No errors so far, which is good. I'd let it run for 24 hours or more.
Muath Posted November 8, 2019 (edited)
I ran the test for more than 24 hours with no errors, but I'm still facing more or less the same issue. I'm now on v6.7, which I can at least use for a while, but it still hangs after less than 24 hours. When it hangs I can still navigate the UNRAID page, but I can't run Firefox, and when I issue the reboot command the system doesn't respond or reboot! Could the USB flash drive be causing this, even though it's new and USB 2.0, which is the recommended type? When I logged out, it showed the UNRAID logo without an option to log in (see the attached photo).
Edited November 8, 2019 by Muath
John_M Posted November 8, 2019
The next thing to test is the USB stick itself. Shut down your server, plug the USB stick into a PC and check/repair the file system on it. With Windows you can right-click and choose Properties, then Tools, and check/repair the device. With macOS you can use Disk Utility.
Muath Posted November 17, 2019 (edited)
After scanning the USB stick with Windows, which reported no issues, I returned it to the server and it worked! It lasted for a week, so I thought the issue had been fixed. But when I upgraded to v6.8 rc6, it ran for an hour, which is better than before, and then went down again 😔. I ran the Fix Common Problems plug-in and it found no issues. I have the following hardware, which may be causing the issue:
1- APC UPS connected via USB.
2- SAS9211-8I 8PORT Int 6GB Sata+SAS Pcie 2.0.
The logs still show the same warnings:
Nov 17 17:45:48 MoathCenterr kernel: ACPI: Early table checksum verification disabled
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:03:02.0: BAR 14: failed to assign [mem size 0x00600000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 6: failed to assign [mem size 0x00080000 pref]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 3: failed to assign [mem size 0x00040000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 1: failed to assign [mem size 0x00004000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:02:00.0: BAR 14: failed to assign [mem size 0x00c00000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:03:01.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:03:02.0: BAR 14: failed to assign [mem size 0x00600000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:03:03.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:03:08.0: BAR 14: failed to assign [mem size 0x00200000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:03:09.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:03:0a.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:04:00.0: BAR 0: failed to assign [mem size 0x00004000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 3: failed to assign [mem size 0x00040000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 9: failed to assign [mem size 0x00400000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 1: failed to assign [mem size 0x00004000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:05:00.0: BAR 7: failed to assign [mem size 0x00040000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:06:00.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:07:05.0: BAR 14: failed to assign [mem size 0x00100000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:0a:00.0: BAR 0: failed to assign [mem size 0x00020000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:0a:00.0: BAR 3: failed to assign [mem size 0x00004000]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:0c:00.1: BAR 0: failed to assign [mem size 0x00100000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:0c:00.3: BAR 0: failed to assign [mem size 0x00100000 64bit]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:0d:00.0: BAR 5: failed to assign [mem size 0x00000800]
Nov 17 17:45:48 MoathCenterr kernel: pci 0000:0e:00.0: BAR 5: failed to assign [mem size 0x00000800]
Nov 17 17:45:48 MoathCenterr kernel: floppy0: no floppy controllers found
Nov 17 17:45:48 MoathCenterr kernel: random: 7 urandom warning(s) missed due to ratelimiting
Nov 17 17:45:48 MoathCenterr kernel: ccp 0000:11:00.1: SEV: failed to get status. Error: 0x0
Nov 17 17:45:49 MoathCenterr rpc.statd[2051]: Failed to read /var/lib/nfs/state: Success
Nov 17 17:45:49 MoathCenterr ntpd[2081]: bind(19) AF_INET6 fe80::a861:8fff:fe9b:94c9%10#123 flags 0x11 failed: Cannot assign requested address
Nov 17 17:45:49 MoathCenterr ntpd[2081]: failed to init interface for address fe80::a861:8fff:fe9b:94c9%10
Nov 17 17:46:08 MoathCenterr avahi-daemon[6501]: WARNING: No NSS support for mDNS detected, consider installing nss-mdns!
Nov 17 19:10:05 MoathCenterr root: error: /webGui/include/ProcessStatus.php: wrong csrf_token
Nov 17 19:10:10 MoathCenterr kernel: XFS (md1): Per-AG reservation for AG 7 failed. Filesystem may run out of space.
Nov 17 19:47:03 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 19:47:03 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 19:47:03 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:10 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:10 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:10 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:23 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:23 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:23 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:31 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:31 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:31 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:51 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:51 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:51 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:59 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:59 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:59 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:02:02 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:02:02 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:02:02 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:02:03 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:02:03 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:02:03 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:02:09 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:02:09 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:02:09 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:02:12 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:02:12 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:02:12 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
There are new errors now, which are:
Nov 17 20:02:12 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:02:12 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Attached: moathcenterr-diagnostics-20191117-1734.zip
Edited November 17, 2019 by Muath
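With this many repeated AER entries it can help to see how often they actually occur. The following is a rough sketch, not an Unraid tool: it counts the "AER: Corrected error received" lines per PCIe port and per minute. The regular expression only assumes the syslog layout shown in the post above, and `count_corrected_errors` and `SAMPLE` are names made up for this illustration.

```python
import re
from collections import Counter

# Matches the first line of each AER report, in the layout seen in this thread:
# "<Mon> <day> <hh:mm:ss> <host> kernel: pcieport <port>: AER: Corrected error received: ..."
AER_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d\d:\d\d):\d\d\s+\S+\s+kernel:\s+"
    r"pcieport\s+(?P<port>[0-9a-f:.]+):\s+AER:\s+Corrected error received"
)


def count_corrected_errors(lines):
    """Count corrected AER events, keyed by (port, minute)."""
    counts = Counter()
    for line in lines:
        match = AER_RE.match(line)
        if match:
            counts[(match.group("port"), match.group("minute"))] += 1
    return counts


# A few lines copied from the log in the post above.
SAMPLE = [
    "Nov 17 19:47:03 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1",
    "Nov 17 19:47:03 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)",
    "Nov 17 20:01:10 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1",
    "Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1",
    "Nov 17 20:01:13 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1",
]

if __name__ == "__main__":
    for (port, minute), n in sorted(count_corrected_errors(SAMPLE).items()):
        print(f"{port}  {minute}  {n} corrected error(s)")
```

To run it against a live system, pass an open syslog file instead of `SAMPLE` (on Unraid the log is usually /var/log/syslog, but treat that path as an assumption). All of the events in the post point at the same port, 0000:00:01.1, so whatever sits behind that port is the place to look.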
John_M Posted November 17, 2019
This thread might help with the PCIe Bus Error:
Muath Posted November 25, 2019
On 11/17/2019 at 11:07 PM, John_M said: "This thread might help with the PCIe Bus Error:"
Thank you. It seems this issue fixed itself, but the main issue still remains. Do the logs below help to figure out the issue?
Nov 17 20:01:23 MoathCenterr rsyslogd: [origin software="rsyslogd" swVersion="8.1903.0" x-pid="1934" x-info="https://www.rsyslog.com"] start
Nov 17 20:01:23 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:23 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:23 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:23 MoathCenterr kernel: pcieport 0000:00:01.1: [ 6] BadTLP
Nov 17 20:01:31 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Nov 17 20:01:31 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Nov 17 20:01:31 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Nov 17 20:01:31 MoathCenterr kernel: pcieport 0000:00:01.1: [ 6] BadTLP
Nov 17 20:01:35 MoathCenterr ool www[26632]: /usr/local/emhttp/plugins/dynamix/scripts/rsyslog_config
Nov 25 16:54:36 MoathCenterr rsyslogd: [origin software="rsyslogd" swVersion="8.1903.0" x-pid="2153" x-info="https://www.rsyslog.com"] start
Nov 25 16:54:48 MoathCenterr ntpd[2044]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
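The status/mask=00000040/00006000 field in these AER lines is a pair of hexadecimal bitmasks, and the "[ 6] BadTLP" line is the kernel decoding bit 6 of the status value. As a worked example, the sketch below decodes both values using the bit positions of the PCIe Correctable Error Status register. The descriptive names follow the PCIe specification rather than the kernel's short labels, and the function names are invented for this illustration.

```python
# Bit positions in the PCIe AER Correctable Error Status / Mask registers.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_bits(value_hex):
    """Return the names of all known bits set in a hex string such as '00000040'."""
    value = int(value_hex, 16)
    return [name for bit, name in sorted(CORRECTABLE_BITS.items()) if (value >> bit) & 1]


def decode_status_mask(field):
    """Split a 'status/mask' field as printed by the kernel and decode both halves."""
    status_hex, mask_hex = field.split("/")
    return decode_bits(status_hex), decode_bits(mask_hex)


if __name__ == "__main__":
    status, mask = decode_status_mask("00000040/00006000")
    print("reported:", status)  # 0x40 is 1 << 6
    print("masked:  ", mask)    # 0x6000 is (1 << 13) | (1 << 14)
```

So the log is reporting one thing only: a Bad TLP, a link-level packet that failed its integrity check and was retransmitted. The link recovered each time, which is why the severity is "Corrected"; the mask value just lists error types the port is configured not to report.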
Muath Posted November 29, 2019
Well .. I've updated the BIOS and now it's not working at all 😔💔
trurl Posted November 29, 2019
22 minutes ago, Muath said: "Well .. I've updated the BIOS and now it's not working at all 😔💔"
What do you get when you try to boot now? Possibly it is just a case of the BIOS resetting so it isn't trying to boot from the flash now.
Muath Posted November 29, 2019 (edited)
36 minutes ago, trurl said: "What do you get when you try to boot now? Possibly it is just a case of the BIOS resetting so it isn't trying to boot from the flash now."
I've set it to boot from the flash drive, but after the motherboard logo, when it attempts to boot UNRAID, it goes to a black screen and all the fans behave strangely: some speed up, others slow down or even stop. When I try to force a shutdown with the power button, it does not respond, so the only way to shut it down is at the PSU! I'm thinking maybe all these issues come from the motherboard (ASRock X570 Steel Legend).
*I remember now: any fan connected to the motherboard doesn't work properly. As the picture shows, the CPU fan doesn't work at all after booting UNRAID. The GPU fans work fine.
Edited November 29, 2019 by Muath
John_M Posted December 7, 2019 (edited)
Maybe the BIOS is corrupt. I'd first try resetting the CMOS to see if that helps. Then redo the BIOS update, if it doesn't. The CPU fan should not stop when you boot an OS! Many consumer BIOSes have user-selectable fan profiles so maybe something got messed up.
Edited December 7, 2019 by John_M
Muath Posted December 7, 2019
4 hours ago, John_M said: "Maybe the BIOS is corrupt. I'd first try resetting the CMOS to see if that helps. Then redo the BIOS update, if it doesn't. The CPU fan should not stop when you boot an OS! Many consumer BIOSes have user-selectable fan profiles so maybe something got messed up."
Sorry, I didn't post an update on my current situation. I tried clearing the CMOS and then updating the BIOS, but that didn't work. Then I tried updating the BIOS to a different version, and it still didn't work. Finally I went back to the release version, and it worked and the system booted up! (BIOS Update page.)
Then the main issue happened again. After some searching, it seems there is an issue with ASRock motherboards and Linux in general, and there are some suggested fixes, one of which is adding pcie_aspm=off to the "Syslinux Configuration" (I'm not sure I did it correctly; kindly check the picture). Now the server works for a couple of days or so, and then the OS stops (the main issue).
I'm starting to believe the issue is caused by the motherboard, so I ordered another one from ASUS, and I hope it fixes the issue and that the motherboard is the real cause. By the way, is there anything required before changing the motherboard?
As for the logs, the three error lines below keep repeating:
Dec 7 08:47:10 MoathCenterr kernel: pcieport 0000:00:01.1: AER: Corrected error received: 0000:00:01.1
Dec 7 08:47:10 MoathCenterr kernel: pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Dec 7 08:47:10 MoathCenterr kernel: pcieport 0000:00:01.1: device [1022:1483] error status/mask=00000040/00006000
Muath Posted December 7, 2019 (edited)
I noticed something just now: CPU usage went very high (~75%), which is very rare, and then the OS stopped!
*Syslog attached.
Attached: syslog 2019-12-07
Edited December 7, 2019 by Muath
raidserver Posted December 9, 2019 (edited)
My original issue with the PCIe x4 slot, which I use for the HBA card/HDDs, never caused me any problems other than spamming the log with errors. It's a power management issue with the HBA trying to save power. Turning PCIe ASPM off for the x4 slot solved the log errors. I hope the new motherboard fixes your issue, @Muath.
Edit: Switched off ASPM via the BIOS.
Edited December 9, 2019 by raidserver for clarity
trurl Posted December 9, 2019
On 12/7/2019 at 9:29 AM, Muath said: "pcie_aspm=off to 'Syslinux Configuration' (not sure if I did it correctly kindly see the picture if it's correct or not)"
Not sure, but I think that would need to be added to the append line.
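To illustrate what "added to the append line" means: in a Syslinux configuration, kernel parameters go on the `append` line of a boot entry, after the existing arguments. The sketch below shows the text transformation only. The stanza in `EXAMPLE_CFG` is an assumed, typical-looking entry and not a copy of anyone's real file, and `add_kernel_param` is a hypothetical helper, so compare it with your own syslinux.cfg before changing anything.

```python
def add_kernel_param(cfg_text: str, param: str) -> str:
    """Append `param` to every 'append' line of a Syslinux config, once only."""
    out = []
    for line in cfg_text.splitlines():
        words = line.split()
        # Only touch lines whose first keyword is 'append' (case-insensitive),
        # and skip lines that already carry the parameter.
        if words and words[0].lower() == "append" and param not in words[1:]:
            line = f"{line.rstrip()} {param}"
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed example stanza; the real file on the flash drive may differ.
EXAMPLE_CFG = """label Unraid OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
"""

if __name__ == "__main__":
    print(add_kernel_param(EXAMPLE_CFG, "pcie_aspm=off"))
```

With the assumed stanza, the `append` line becomes `append initrd=/bzroot pcie_aspm=off`. Note that this helper edits every boot entry in the file; if only the default entry should change, edit that one line by hand instead. After a reboot, /proc/cmdline shows the parameters the kernel actually received.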
Muath Posted December 13, 2019 (edited)
On 12/9/2019 at 4:24 PM, raidserver said: "My original issue with the PCIE x4 port which i use for the HBA card/HDD`s never gave me any issues other than spamming the log with errors. Its a power management issue with the HBA trying to save power. Turning the pcie aspm off for the x4 socket solved the log errors. Hope new motherboard swap fixes your issue @Muath Edit Switched off aspm via BIOS"
Thank you very much. Unfortunately, swapping the motherboard didn't help 😔. With the ASRock motherboard I couldn't find how to disable ASPM in the BIOS; I will look for it now on the new motherboard.
On 12/9/2019 at 6:12 PM, trurl said: "Not sure but I think that would need to be added to the append line."
At first I did add it to the append line, but then the OS wouldn't boot.
So now I have swapped the motherboard and upgraded unRAID to 6.8, and the OS went down again 😔. The weird thing is that the power draw didn't change much, so it looks more like the OS is hung (when the OS is up and running, the flash stick usually blinks). Could it be that the USB flash drive needs to be replaced, or is the SAS controller the cause? I just want to enjoy my movies 😔
By the way, this issue started after I bought a second NVMe M.2 drive and configured the pair as RAID 0. Do you think this may be the cause?
Edited December 13, 2019 by Muath
Muath Posted February 16, 2020 (edited)
A quick update on my issue: it is still occurring roughly every two weeks (since the server works fine the rest of the time, I have adapted to it). After the last hang, a parity check was triggered by the forced shutdown, and now the parity check is stuck!!
*This is not the first time this has happened to me; when I try to reboot, it hangs.
Attached: moathcenterr-diagnostics-20200216-1927.zip
Edited February 16, 2020 by Muath
JorgeB Posted February 17, 2020
I can't see anything in the log showing why it's stuck. What happens if you pause and then resume?
Muath Posted February 19, 2020 (edited)
On 2/17/2020 at 10:40 AM, johnnie.black said: "Can't see nothing on the log to why it's stuck, what happens if you pause and then resume?"
The link below is a recording of what happened when I tried to pause it. When I rebooted or shut down from within the system, it hung, so I then forced a shutdown.
https://drive.google.com/open?id=1I_bq1_zauobcCLPmcEK_nWyzWH2SOBYL
The log entries shown at the end are:
Feb 19 00:03:51 MoathCenterr nginx: 2020/02/19 00:03:51 [error] 5804#5804: *3929950 connect() to unix:/var/run/emhttpd.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: 192.168.100.35, server: , request: "POST /update.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket:/update.htm", host: "moathcenterr", referrer: "http://moathcenterr/Main"
Feb 19 00:03:57 MoathCenterr nginx: 2020/02/19 00:03:57 [error] 5804#5804: *3929942 connect() to unix:/var/run/emhttpd.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: 192.168.100.35, server: , request: "POST /update.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket:/update.htm", host: "moathcenterr", referrer: "http://moathcenterr/Main"
Feb 19 00:04:03 MoathCenterr nginx: 2020/02/19 00:04:03 [error] 5804#5804: *3930103 connect() to unix:/var/run/emhttpd.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: 192.168.100.35, server: , request: "POST /update.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket:/update.htm", host: "moathcenterr", referrer: "http://moathcenterr/Main"
Feb 19 00:04:12 MoathCenterr nginx: 2020/02/19 00:04:12 [error] 5804#5804: *3930061 connect() to unix:/var/run/emhttpd.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: 192.168.100.35, server: , request: "POST /logging.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket:/logging.htm", host: "moathcenterr", referrer: "http://moathcenterr/Main"
Feb 19 00:04:23 MoathCenterr nginx: 2020/02/19 00:04:23 [error] 5804#5804: *3930278 connect() to unix:/var/run/emhttpd.socket failed (11: Resource temporarily unavailable) while connecting to upstream, client: 192.168.100.35, server: , request: "POST /logging.htm HTTP/1.1", upstream: "http://unix:/var/run/emhttpd.socket:/logging.htm", host: "moathcenterr", referrer: "http://moathcenterr/Tools"
Attached: moathcenterr-diagnostics-20200219-0019.zip, syslog
Edited February 19, 2020 by Muath
JorgeB Posted February 19, 2020
That looks like an Nginx problem, which is outside my knowledge; maybe someone else can help. Rebooting should fix it, but the parity check will restart.
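One possible reading of the nginx lines quoted above: "(11: Resource temporarily unavailable)" is errno 11 (EAGAIN on Linux), which for a Unix socket usually means the listening process is not accepting connections and its queue is full. That would suggest the process behind /var/run/emhttpd.socket is the one that is stuck, with nginx only reporting it. This is an interpretation, not something confirmed in the thread. A small probe like the sketch below can show whether the socket accepts a connection at all; `probe_unix_socket` is a made-up helper name.

```python
import errno
import socket


def probe_unix_socket(path: str, timeout: float = 2.0) -> str:
    """Try to connect to a Unix stream socket and report the outcome as text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError as exc:
            # Translate the numeric errno (e.g. 11) into its symbolic name;
            # fall back to the exception class for timeouts, which carry no errno.
            name = errno.errorcode.get(exc.errno, type(exc).__name__)
            return f"failed: {name}"
        return "ok"


if __name__ == "__main__":
    # Socket path taken from the nginx log lines in this thread.
    print(probe_unix_socket("/var/run/emhttpd.socket"))
```

If the probe connects while the web page is still unresponsive, the problem is more likely in how requests are handled than in the socket itself, so treat the result as one data point only.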
Dissones4U Posted February 19, 2020
On 12/13/2019 at 10:17 AM, Muath said: "btw, this issue happen after I bought second nvme m.2. and configure it as RAID 0, do you think this may cause this issue?"
This may be elementary, but did you try removing the new hardware and reverting to the prior "working" configuration?