December 21, 2017

Once again my tower locked up, this time while it was streaming Plex content to my Apple TV. Every time this happens (and it has been alarmingly frequent since upgrading to 6.3.5), I lose control of my dedicated console keyboard and screen, and the network drops as well. After a hard reboot, once I get my array up and running again, I find this in the syslog:

Dec 21 16:39:42 mothership kernel: ------------[ cut here ]------------
Dec 21 16:39:42 mothership kernel: WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x181/0x1dc
Dec 21 16:39:42 mothership kernel: NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out
Dec 21 16:39:42 mothership kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables vhost_net tun vhost macvtap macvlan xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat md_mod mxm_wmi i2c_i801 i2c_smbus x86_pkg_temp_thermal coretemp i2c_core kvm_intel kvm ahci e1000 libahci video wmi backlight [last unloaded: md_mod]
Dec 21 16:39:42 mothership kernel: CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.9.30-unRAID #1
Dec 21 16:39:42 mothership kernel: Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F22c 12/01/2017
Dec 21 16:39:42 mothership kernel: ffff88084ed83db0 ffffffff813a4a1b ffff88084ed83e00 ffffffff819aa12f
Dec 21 16:39:42 mothership kernel: ffff88084ed83df0 ffffffff8104d0d9 0000013c4ed83e68 ffff880826ff0000
Dec 21 16:39:42 mothership kernel: ffff880826d2a800 ffff880826ff03a0 0000000000000003 0000000000000001
Dec 21 16:39:42 mothership kernel: Call Trace:
Dec 21 16:39:42 mothership kernel: <IRQ>
Dec 21 16:39:42 mothership kernel: [<ffffffff813a4a1b>] dump_stack+0x61/0x7e
Dec 21 16:39:42 mothership kernel: [<ffffffff8104d0d9>] __warn+0xb8/0xd3
Dec 21 16:39:42 mothership kernel: [<ffffffff8104d13a>] warn_slowpath_fmt+0x46/0x4e
Dec 21 16:39:42 mothership kernel: [<ffffffff815a848d>] dev_watchdog+0x181/0x1dc
Dec 21 16:39:42 mothership kernel: [<ffffffff815a830c>] ? qdisc_rcu_free+0x39/0x39
Dec 21 16:39:42 mothership kernel: [<ffffffff815a830c>] ? qdisc_rcu_free+0x39/0x39
Dec 21 16:39:42 mothership kernel: [<ffffffff81090ccc>] call_timer_fn.isra.5+0x17/0x6b
Dec 21 16:39:42 mothership kernel: [<ffffffff81090da5>] expire_timers+0x85/0x98
Dec 21 16:39:42 mothership kernel: [<ffffffff81090ea5>] run_timer_softirq+0x69/0x8f
Dec 21 16:39:42 mothership kernel: [<ffffffff8103642b>] ? lapic_next_deadline+0x21/0x27
Dec 21 16:39:42 mothership kernel: [<ffffffff8109b347>] ? clockevents_program_event+0xd0/0xe8
Dec 21 16:39:42 mothership kernel: [<ffffffff81050f59>] __do_softirq+0xbb/0x1af
Dec 21 16:39:42 mothership kernel: [<ffffffff810511fd>] irq_exit+0x53/0x94
Dec 21 16:39:42 mothership kernel: [<ffffffff81036e19>] smp_trace_apic_timer_interrupt+0x7b/0x88
Dec 21 16:39:42 mothership kernel: [<ffffffff81036e2f>] smp_apic_timer_interrupt+0x9/0xb
Dec 21 16:39:42 mothership kernel: [<ffffffff81680172>] apic_timer_interrupt+0x82/0x90
Dec 21 16:39:42 mothership kernel: <EOI>
Dec 21 16:39:42 mothership kernel: [<ffffffff815533e4>] ? cpuidle_enter_state+0xfe/0x156
Dec 21 16:39:42 mothership kernel: [<ffffffff8155345e>] cpuidle_enter+0x12/0x14
Dec 21 16:39:42 mothership kernel: [<ffffffff8107c545>] call_cpuidle+0x33/0x35
Dec 21 16:39:42 mothership kernel: [<ffffffff8107c727>] cpu_startup_entry+0x13a/0x1b2
Dec 21 16:39:42 mothership kernel: [<ffffffff81035482>] start_secondary+0xf5/0xf8
Dec 21 16:39:42 mothership kernel: ---[ end trace 83c675411c5afb19 ]---

Also, just below this, the following messages keep occurring:

Dec 21 16:39:42 mothership kernel: e1000 0000:04:01.0 eth0: Reset adapter
Dec 21 16:39:42 mothership kernel: br0: port 1(eth0) entered disabled state
Dec 21 16:39:42 mothership kernel: br0: topology change detected, propagating
Dec 21 16:39:46 mothership kernel: e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec 21 16:39:46 mothership kernel: br0: port 1(eth0) entered blocking state
Dec 21 16:39:46 mothership kernel: br0: port 1(eth0) entered listening state
Dec 21 16:39:46 mothership kernel: dmar_fault: 215 callbacks suppressed
Dec 21 16:39:46 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:46 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffc40000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: br0: port 1(eth0) entered learning state
Dec 21 16:39:48 mothership kernel: br0: port 1(eth0) entered forwarding state
Dec 21 16:39:48 mothership kernel: br0: topology change detected, sending tcn bpdu
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffcde000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffc20000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffd74000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffc4c000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr fff90000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ff61b000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffc2b000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ff5f1000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:48 mothership kernel: DMAR: DRHD: handling fault status reg 3
Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffde6000 [fault reason 06] PTE Read access is not set
Dec 21 16:39:51 mothership kernel: dmar_fault: 4749 callbacks suppressed

Whether this is relevant to my troubles, I do not know. I've just about had it with 6.3.5. If anyone would care to give me some pointers on how to get to the bottom of this, I am all ears. If downgrading to 6.1.9 (which I believe was automatically backed up onto my unRAID USB stick on upgrade) is an option, I'd also appreciate a guide on how to go about doing it.
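For anyone who wants to see at a glance how often these events show up in a saved syslog, here is a minimal Python sketch that tallies the DMAR fault lines and the NETDEV WATCHDOG timeouts. The regular expressions are inferred only from the excerpt above and may not match other kernel versions; the function name `summarize` and the `SAMPLE` list are hypothetical, made up for illustration. It only counts matching lines and does not diagnose anything.

```python
import re
from collections import Counter

# Assumption: patterns are modeled on the log lines quoted in this post only.
DMAR_RE = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\] Request device \[(?P<dev>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+) \[fault reason (?P<reason>\d+)\]"
)
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\)")


def summarize(lines):
    """Count DMAR faults per (device, operation, reason) and
    watchdog timeouts per (interface, driver)."""
    faults = Counter()
    timeouts = Counter()
    for line in lines:
        match = DMAR_RE.search(line)
        if match:
            faults[(match["dev"], match["op"], match["reason"])] += 1
            continue
        match = WATCHDOG_RE.search(line)
        if match:
            timeouts[(match["iface"], match["driver"])] += 1
    return faults, timeouts


# A few lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "Dec 21 16:39:42 mothership kernel: NETDEV WATCHDOG: eth0 (e1000): transmit queue 0 timed out",
    "Dec 21 16:39:46 mothership kernel: DMAR: DRHD: handling fault status reg 3",
    "Dec 21 16:39:46 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffc40000 [fault reason 06] PTE Read access is not set",
    "Dec 21 16:39:48 mothership kernel: DMAR: [DMA Read] Request device [04:00.0] fault addr ffcde000 [fault reason 06] PTE Read access is not set",
]

faults, timeouts = summarize(SAMPLE)
for key, count in faults.most_common():
    print("DMAR fault", key, count)
for key, count in timeouts.most_common():
    print("watchdog timeout", key, count)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize(open("syslog.txt"))`, where `syslog.txt` is a placeholder for wherever the log was saved. Note that the kernel suppresses many of these messages ("callbacks suppressed"), so the counts are a lower bound.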
December 21, 2017 Community Expert

There's another user with a Gigabyte board and similar errors, so this is most likely board related. You could try v6.4, which uses a newer kernel.
December 21, 2017

@blodfjert I have had similar issues and upgraded to v6.4 this morning (click here for details), which seems to have stopped the [DMA Read] errors, but the system is still hanging. I am going to change to an LSI hard drive controller and see if this solves the issue, as the Supermicro Marvell controller that I presently use is an issue. I am re-enabling VT-d on the motherboard to see if this solves the new issue before purchasing a new controller.
December 21, 2017 Author

Hi johnnie.black and trinikojak

This is most interesting! The cause of our problems may very well be the same. Just for reference, this is my build:

Model: Own build
M/B: Gigabyte Technology Co., Ltd. - Z170XP-SLI-CF
CPU: Intel® Core™ i5-6600 CPU @ 3.30GHz
HVM: Enabled
IOMMU: Enabled
Cache: 256 kB, 1024 kB, 6144 kB
Memory: 32 GB (max. installable capacity 64 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
Kernel: Linux 4.9.30-unRAID x86_64
OpenSSL: 1.0.2k

Reading that you have upgraded to v6.4 yet still encounter instability, I'd rather go the other way and downgrade. As I stated in my first post, my system started acting very unstable the moment I upgraded from 6.1.9 to 6.3.5. Prior to upgrading, it was rock stable (with the exact same hardware, except for the GPU, which was switched from an ATI Radeon HD 5850 to an EVGA GeForce 1060 3GB the same day). My system would have an uptime of several months at a time, running dockers and VMs with passthrough and whatnot.

Also, to my knowledge there are no "Marvell" chipsets or controllers on my motherboard. I may be wrong.

I'll see if I can find any information on downgrading to 6.1.9 without jeopardizing my setup. Let's keep each other up to speed on developments.