What's going on with my server?



Hi guys,

 

I keep getting these errors whenever I run disk-intensive tasks, usually while writing to the array. Nothing has changed on the server, and I am on the latest version of unRAID:

 

Jun 23 16:51:43 Tower kernel: ------------[ cut here ]------------
Jun 23 16:51:43 Tower kernel: WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x181/0x1dc
Jun 23 16:51:43 Tower kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Jun 23 16:51:43 Tower kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables vhost_net tun vhost macvtap macvlan xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat md_mod bonding hid_logitech_hidpp mxm_wmi x86_pkg_temp_thermal coretemp kvm_intel kvm e1000e i2c_i801 i2c_smbus i2c_core ptp hid_logitech_dj nvme pps_core ahci libahci nvme_core wmi
Jun 23 16:51:43 Tower kernel: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.30-unRAID #1
Jun 23 16:51:43 Tower kernel: Hardware name: ASUS All Series/X99-A II, BIOS 1701 03/31/2017
Jun 23 16:51:43 Tower kernel: ffff880c2f203db0 ffffffff813a4a1b ffff880c2f203e00 ffffffff819aa12f
Jun 23 16:51:43 Tower kernel: ffff880c2f203df0 ffffffff8104d0d9 0000013c2f203e68 ffff880c25064000
Jun 23 16:51:43 Tower kernel: ffff880c23aa7800 ffff880c250643a0 0000000000000000 0000000000000001
Jun 23 16:51:43 Tower kernel: Call Trace:
Jun 23 16:51:43 Tower kernel: <IRQ> 
Jun 23 16:51:43 Tower kernel: [<ffffffff813a4a1b>] dump_stack+0x61/0x7e
Jun 23 16:51:43 Tower kernel: [<ffffffff8104d0d9>] __warn+0xb8/0xd3
Jun 23 16:51:43 Tower kernel: [<ffffffff8104d13a>] warn_slowpath_fmt+0x46/0x4e
Jun 23 16:51:43 Tower kernel: [<ffffffff815a848d>] dev_watchdog+0x181/0x1dc
Jun 23 16:51:43 Tower kernel: [<ffffffff815a830c>] ? qdisc_rcu_free+0x39/0x39
Jun 23 16:51:43 Tower kernel: [<ffffffff815a830c>] ? qdisc_rcu_free+0x39/0x39
Jun 23 16:51:43 Tower kernel: [<ffffffff81090ccc>] call_timer_fn.isra.5+0x17/0x6b
Jun 23 16:51:43 Tower kernel: [<ffffffff81090da5>] expire_timers+0x85/0x98
Jun 23 16:51:43 Tower kernel: [<ffffffff81090ea5>] run_timer_softirq+0x69/0x8f
Jun 23 16:51:43 Tower kernel: [<ffffffff8103642b>] ? lapic_next_deadline+0x21/0x27
Jun 23 16:51:43 Tower kernel: [<ffffffff8109b347>] ? clockevents_program_event+0xd0/0xe8
Jun 23 16:51:43 Tower kernel: [<ffffffff81050f59>] __do_softirq+0xbb/0x1af
Jun 23 16:51:43 Tower kernel: [<ffffffff810511fd>] irq_exit+0x53/0x94
Jun 23 16:51:43 Tower kernel: [<ffffffff81036e19>] smp_trace_apic_timer_interrupt+0x7b/0x88
Jun 23 16:51:43 Tower kernel: [<ffffffff81036e2f>] smp_apic_timer_interrupt+0x9/0xb
Jun 23 16:51:43 Tower kernel: [<ffffffff81680172>] apic_timer_interrupt+0x82/0x90
Jun 23 16:51:43 Tower kernel: <EOI> 
Jun 23 16:51:43 Tower kernel: [<ffffffff815533e4>] ? cpuidle_enter_state+0xfe/0x156
Jun 23 16:51:43 Tower kernel: [<ffffffff8155345e>] cpuidle_enter+0x12/0x14
Jun 23 16:51:43 Tower kernel: [<ffffffff8107c545>] call_cpuidle+0x33/0x35
Jun 23 16:51:43 Tower kernel: [<ffffffff8107c727>] cpu_startup_entry+0x13a/0x1b2
Jun 23 16:51:43 Tower kernel: [<ffffffff816746d8>] rest_init+0x7f/0x82
Jun 23 16:51:43 Tower kernel: [<ffffffff81ccbe8e>] start_kernel+0x3cf/0x3dc
Jun 23 16:51:43 Tower kernel: [<ffffffff81ccb120>] ? early_idt_handler_array+0x120/0x120
Jun 23 16:51:43 Tower kernel: [<ffffffff81ccb2d6>] x86_64_start_reservations+0x2a/0x2c
Jun 23 16:51:43 Tower kernel: [<ffffffff81ccb3be>] x86_64_start_kernel+0xe6/0xf3
Jun 23 16:51:43 Tower kernel: ---[ end trace d37413df375134d1 ]---
Jun 23 16:51:43 Tower kernel: e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly
Jun 23 16:51:43 Tower kernel: bond0: link status definitely down for interface eth0, disabling it
Jun 23 16:51:43 Tower kernel: device eth0 left promiscuous mode
Jun 23 16:51:43 Tower kernel: bond0: now running without any active interface!
Jun 23 16:51:44 Tower kernel: br0: port 1(bond0) entered disabled state
Jun 23 16:51:47 Tower kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jun 23 16:51:47 Tower kernel: bond0: link status definitely up for interface eth0, 1000 Mbps full duplex
Jun 23 16:51:47 Tower kernel: bond0: making interface eth0 the new active one
Jun 23 16:51:47 Tower kernel: device eth0 entered promiscuous mode
Jun 23 16:51:47 Tower kernel: bond0: first active interface up!
Jun 23 16:51:47 Tower kernel: br0: port 1(bond0) entered blocking state
Jun 23 16:51:47 Tower kernel: br0: port 1(bond0) entered forwarding state

Any help would be highly appreciated. Thanks

Edit: I have attached the full diagnostics. Here are the issues I am having:

 

- When I copy a file to the array, the transfer starts off really fast and then slows to a crawl. Server utilization goes up to 100%, and then the UI hangs: when I browse to http://tower, it just keeps trying to connect. Dockers hang too, and stopping or starting a docker hangs the UI.

- Randomly, my dockers make the web UI unresponsive. This is similar to the problem above but does not necessarily involve copying a file. When I start or stop a docker, the UI becomes unresponsive and I have to manually reboot the server for it to come back.

- Many of these issues involve the UI hanging, which makes restarting the server impossible. I can SSH in and run reboot/shutdown, but it has no effect: it says it is shutting down, but nothing happens.
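
To get a feel for how often the network drops line up with the hangs, something like the following could count the relevant messages in a saved syslog. This is only a rough sketch: the regular expressions are based on the three message formats in the excerpt above, and the function name `count_nic_events` and the event labels are made up for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the kernel messages quoted in the excerpt above.
# The labels on the left are illustrative names, not kernel terminology.
PATTERNS = {
    "tx_timeout": re.compile(r"NETDEV WATCHDOG: \S+ \(\S+\): transmit queue \d+ timed out"),
    "adapter_reset": re.compile(r"\S+: Reset adapter unexpectedly"),
    "link_up": re.compile(r"\S+ NIC Link is Up"),
}


def count_nic_events(lines):
    """Count how many lines match each NIC-related pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

It accepts any iterable of lines, so an open file handle for the syslog from the diagnostics zip can be passed in directly. If the timeout and reset counts climb only during large copies, that would suggest the NIC resets and the slow transfers are related, though it would not show which one causes the other.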

 

tower-diagnostics-20170623-2209.zip

Edited by machineshake123