Error: Call Traces found on your server



5 minutes ago, Dhagon said:

Hi, I just got a "Call Traces" error as well. It seems to have happened when I accessed the webUI and manually updated my dockers after seeing they had failed to auto-update for some reason. I recently set up Pi-hole and forgot to change the server's DNS settings, so my guess is that when the auto-update ran, it shut down Pi-hole, lost its DNS, and the update failed.

 

The server seems to work as usual as far as I can tell, but I don't know exactly what the message means; hopefully someone can clear it up for me. I've attached a txt with the call traces isolated, along with the diagnostics zip, if that makes it easier on you.

 

Possibly a corrupt docker image; since it's easy to do, it's better to delete and re-create it.


Checked Fix Common Problems today and I got this error.

Your server has issued one or more call traces. This could be caused by a Kernel Issue, Bad Memory, etc.

That was followed by a message telling me to post my diagnostics on the forum and get help.

Thanks in advance for taking a look at my diagnostics file. Hopefully my system isn't in too bad of a situation.

 

tower-diagnostics-20180410-1414.zip

On 10/04/2018 at 8:19 PM, blinside995 said:

Checked Fix Common Problems today and I got this error.

Your server has issued one or more call traces. This could be caused by a Kernel Issue, Bad Memory, etc.

That was followed by a message telling me to post my diagnostics on the forum and get help.

Thanks in advance for taking a look at my diagnostics file. Hopefully my system isn't in too bad of a situation.

 

tower-diagnostics-20180410-1414.zip

Related to dockers with custom IP addresses; try updating to the latest RC.

11 hours ago, johnnie.black said:

Related to dockers with custom IP addresses; try updating to the latest RC.

 

I get these IP/macvlan call traces regularly. As noted, they are related to dockers with custom IP addresses. It doesn't matter which docker it is, just that one or more have their own IP address. I have never found a way to eliminate them completely, but changing the NIC used for unRAID and upgrading to unRAID 6.5.1 RC6 has reduced them to every 2-3 days rather than every 2-3 hours.

Apr 18 17:35:59 MediaNAS kernel: CPU: 0 PID: 22366 Comm: kworker/0:2 Not tainted 4.14.34-unRAID #1
Apr 18 17:35:59 MediaNAS kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./C236 WSI, BIOS P2.50 12/12/2017
Apr 18 17:35:59 MediaNAS kernel: Workqueue: events macvlan_process_broadcast [macvlan]
Apr 18 17:35:59 MediaNAS kernel: task: ffff8807ec62c880 task.stack: ffffc90008698000
Apr 18 17:35:59 MediaNAS kernel: RIP: 0010:__nf_conntrack_confirm+0x97/0x4d6
Apr 18 17:35:59 MediaNAS kernel: RSP: 0018:ffff88086dc03d30 EFLAGS: 00010202
Apr 18 17:35:59 MediaNAS kernel: RAX: 0000000000000188 RBX: 0000000000004773 RCX: 0000000000000001
Apr 18 17:35:59 MediaNAS kernel: RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffff81c093cc
Apr 18 17:35:59 MediaNAS kernel: RBP: ffff8807470d5e00 R08: 0000000000000101 R09: ffff88079edce400
Apr 18 17:35:59 MediaNAS kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81c8b080
Apr 18 17:35:59 MediaNAS kernel: R13: 000000000000f97e R14: ffff8806b497cdc0 R15: ffff8806b497ce18
Apr 18 17:35:59 MediaNAS kernel: FS:  0000000000000000(0000) GS:ffff88086dc00000(0000) knlGS:0000000000000000
Apr 18 17:35:59 MediaNAS kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 18 17:35:59 MediaNAS kernel: CR2: 0000146bfedda000 CR3: 0000000001c0a004 CR4: 00000000003606f0
Apr 18 17:35:59 MediaNAS kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 18 17:35:59 MediaNAS kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 18 17:35:59 MediaNAS kernel: Call Trace:
Apr 18 17:35:59 MediaNAS kernel: <IRQ>
Apr 18 17:35:59 MediaNAS kernel: ipv4_confirm+0xac/0xb4 [nf_conntrack_ipv4]
Apr 18 17:35:59 MediaNAS kernel: nf_hook_slow+0x37/0x96
Apr 18 17:35:59 MediaNAS kernel: ip_local_deliver+0xab/0xd3
Apr 18 17:35:59 MediaNAS kernel: ? inet_del_offload+0x3e/0x3e
Apr 18 17:35:59 MediaNAS kernel: ip_rcv+0x311/0x346
Apr 18 17:35:59 MediaNAS kernel: ? ip_local_deliver_finish+0x1b8/0x1b8
Apr 18 17:35:59 MediaNAS kernel: __netif_receive_skb_core+0x6ba/0x733
Apr 18 17:35:59 MediaNAS kernel: ? enqueue_task_fair+0x94/0x42c
Apr 18 17:35:59 MediaNAS kernel: process_backlog+0x8c/0x12d
Apr 18 17:35:59 MediaNAS kernel: net_rx_action+0xfb/0x24f
Apr 18 17:35:59 MediaNAS kernel: __do_softirq+0xcd/0x1c2
Apr 18 17:35:59 MediaNAS kernel: do_softirq_own_stack+0x2a/0x40
Apr 18 17:35:59 MediaNAS kernel: </IRQ>
Apr 18 17:35:59 MediaNAS kernel: do_softirq+0x46/0x52
Apr 18 17:35:59 MediaNAS kernel: netif_rx_ni+0x21/0x35
Apr 18 17:35:59 MediaNAS kernel: macvlan_broadcast+0x117/0x14f [macvlan]
Apr 18 17:35:59 MediaNAS kernel: ? __switch_to_asm+0x24/0x60
Apr 18 17:35:59 MediaNAS kernel: macvlan_process_broadcast+0xe4/0x114 [macvlan]
Apr 18 17:35:59 MediaNAS kernel: process_one_work+0x14c/0x23f
Apr 18 17:35:59 MediaNAS kernel: ? rescuer_thread+0x258/0x258
Apr 18 17:35:59 MediaNAS kernel: worker_thread+0x1c3/0x292
Apr 18 17:35:59 MediaNAS kernel: kthread+0x111/0x119
Apr 18 17:35:59 MediaNAS kernel: ? kthread_create_on_node+0x3a/0x3a
Apr 18 17:35:59 MediaNAS kernel: ? SyS_exit_group+0xb/0xb
Apr 18 17:35:59 MediaNAS kernel: ret_from_fork+0x35/0x40
Apr 18 17:35:59 MediaNAS kernel: Code: 48 c1 eb 20 89 1c 24 e8 24 f9 ff ff 8b 54 24 04 89 df 89 c6 41 89 c5 e8 a9 fa ff ff 84 c0 75 b9 49 8b 86 80 00 00 00 a8 08 74 02 <0f> 0b 4c 89 f7 e8 03 ff ff ff 49 8b 86 80 00 00 00 0f ba e0 09 
Apr 18 17:35:59 MediaNAS kernel: ---[ end trace 9c114a22f8d955d0 ]---

I have seen 3-4 other users experiencing these macvlan call traces, but the devs have not been able to reproduce them.
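If you want to gauge how often they're happening (for example, to see whether a NIC change or an RC upgrade actually helps), a rough way is to count the macvlan trace lines in the syslog per day. This is just a minimal Python sketch, assuming the live syslog is at /var/log/syslog and matching on the macvlan_broadcast frame that shows up in each of these traces; adjust both for your setup:

# Count macvlan-related call trace lines in the syslog, grouped by day,
# to see whether their frequency changes after a config or version change.
# The syslog path and the "macvlan_broadcast" match string are assumptions.
from collections import Counter

counts = Counter()
with open("/var/log/syslog", errors="replace") as log:
    for line in log:
        if "macvlan_broadcast" in line:
            day = " ".join(line.split()[:2])  # e.g. "Apr 18" from the timestamp
            counts[day] += 1

# Sorted alphabetically by month/day, which is good enough for a quick look.
for day, total in sorted(counts.items()):
    print(f"{day}: {total} macvlan trace line(s)")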

 

 


Hi, I'm having a call trace I could use help with as well:

Apr 22 00:12:29 Tower kernel: Call Trace:
Apr 22 00:12:29 Tower kernel: i915_driver_load+0xd43/0x133e [i915]
Apr 22 00:12:29 Tower kernel: local_pci_probe+0x3c/0x7a
Apr 22 00:12:29 Tower kernel: pci_device_probe+0x11b/0x154
Apr 22 00:12:29 Tower kernel: driver_probe_device+0x142/0x2a6
Apr 22 00:12:29 Tower kernel: ? driver_allows_async_probing+0x27/0x27
Apr 22 00:12:29 Tower kernel: bus_for_each_drv+0x6e/0x7d
Apr 22 00:12:29 Tower kernel: __device_attach+0x96/0xf5
Apr 22 00:12:29 Tower kernel: bus_rescan_devices_helper+0x2a/0x4a
Apr 22 00:12:29 Tower kernel: store_drivers_probe+0x28/0x40
Apr 22 00:12:29 Tower kernel: kernfs_fop_write+0xf4/0x136
Apr 22 00:12:29 Tower kernel: __vfs_write+0x1e/0x109
Apr 22 00:12:29 Tower kernel: ? kernfs_put_open_node.isra.3+0x81/0x8d
Apr 22 00:12:29 Tower kernel: ? rcu_is_watching+0xc/0x1e
Apr 22 00:12:29 Tower kernel: vfs_write+0xc3/0x166
Apr 22 00:12:29 Tower kernel: SyS_write+0x48/0x81
Apr 22 00:12:29 Tower kernel: do_syscall_64+0x6d/0xfe
Apr 22 00:12:29 Tower kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Apr 22 00:12:29 Tower kernel: RIP: 0033:0x14d8720beeb7
Apr 22 00:12:29 Tower kernel: RSP: 002b:000014d86efc4890 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Apr 22 00:12:29 Tower kernel: RAX: ffffffffffffffda RBX: 0000000000000016 RCX: 000014d8720beeb7
Apr 22 00:12:29 Tower kernel: RDX: 000000000000000c RSI: 000014d83409e7d4 RDI: 0000000000000016
Apr 22 00:12:29 Tower kernel: RBP: 000014d83409e7d4 R08: 0000000000000000 R09: 0000000000000000
Apr 22 00:12:29 Tower kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000000c
Apr 22 00:12:29 Tower kernel: R13: 0000000000000000 R14: 0000000000000016 R15: 000014d850002540
Apr 22 00:12:29 Tower kernel: Code: 25 06 00 00 04 76 26 83 ce ff e8 1d 1f 01 00 85 c0 74 1a 83 f8 ed 74 15 48 c7 c6 7a b0 45 a0 48 c7 c7 38 ad 45 a0 e8 57 f0 ca e0 <0f> 0b c3 41 54 55 53 48 89 fb e8 29 f7 fe ff 4c 8d 63 70 48 89 
Apr 22 00:12:29 Tower kernel: ---[ end trace 30523a58b7e7efb3 ]---

tower-diagnostics-20180422-0023 (1).zip

On 4/29/2018 at 9:32 AM, Johan76 said:

Some call trace warnings.

 

You should start your own thread because call traces have many causes and yours is different from the OP's.

Apr 28 00:40:06 Lagret kernel: ------------[ cut here ]------------
Apr 28 00:40:06 Lagret kernel: WARNING: CPU: 3 PID: 2833 at drivers/gpu/drm/i915/i915_drv.c:242 i915_driver_load+0x769/0x133e [i915]

Yes. It looks as though yours is caused by the i915 driver. Try undoing the setting and see if that fixes it.

 

 
You should start your own thread because call traces have many causes and yours is different from the OP's.
Apr 28 00:40:06 Lagret kernel: ------------[ cut here ]------------
Apr 28 00:40:06 Lagret kernel: WARNING: CPU: 3 PID: 2833 at drivers/gpu/drm/i915/i915_drv.c:242 i915_driver_load+0x769/0x133e [i915]

Yes. It looks as though yours is caused by the i915 driver. Try undoing the setting and see if that fixes it.
 

Thanks for the info. I haven't had any problems and haven't had this error for a few days, so it seems to have fixed itself.

I need this option on to get hardware acceleration in Plex. I will leave it as it is as long as I don't run into any problems with my system.

Thanks for confirming it was this setting that likely triggered the call traces. Now I know it's not something else that's wrong.

Sent from my SM-G950F via Tapatalk


I seem to be getting NIC interruptions as well, but only since the latest 6.5.2 and the RCs, from what I can tell.

 

I'm seeing this numerous times today:

 

[Wed May 16 09:32:51 2018] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
                             TDH                  <60>
                             TDT                  <7a>
                             next_to_use          <7a>
                             next_to_clean        <5f>
                           buffer_info[next_to_clean]:
                             time_stamp           <1023d45d8>
                             next_to_watch        <60>
                             jiffies              <1023d6fc1>
                             next_to_watch.status <0>
                           MAC Status             <40080083>
                           PHY Status             <796d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
[Wed May 16 09:32:52 2018] NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
[Wed May 16 09:32:52 2018] ------------[ cut here ]------------
[Wed May 16 09:32:52 2018] WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x157/0x1b8
[Wed May 16 09:32:52 2018] Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables vhost_net tun vhost tap xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs nfsd lockd grace sunrpc md_mod nct6775 hwmon_vid x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper e1000e cryptd intel_cstate ahci wmi_bmof libahci wmi ptp intel_uncore pps_core video i2c_i801 intel_rapl_perf i2c_core ie31200_edac backlight thermal button fan
[Wed May 16 09:32:52 2018] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.14.40-unRAID #1
[Wed May 16 09:32:52 2018] Hardware name: System manufacturer System Product Name/P8Z77-I DELUXE, BIOS 1201 06/20/2014
[Wed May 16 09:32:52 2018] task: ffff88040d5b5280 task.stack: ffffc90001928000
[Wed May 16 09:32:52 2018] RIP: 0010:dev_watchdog+0x157/0x1b8
[Wed May 16 09:32:52 2018] RSP: 0018:ffff88041fb43e98 EFLAGS: 00010292
[Wed May 16 09:32:52 2018] RAX: 000000000000003a RBX: ffff880409b1c000 RCX: 0000000000000000
[Wed May 16 09:32:52 2018] RDX: 0000000000040001 RSI: 00000000000000f6 RDI: 0000000000000300
[Wed May 16 09:32:52 2018] RBP: 0000000000000000 R08: 0000000000000003 R09: ffffffff81fe7200
[Wed May 16 09:32:52 2018] R10: 0000000000000082 R11: 000000000000005c R12: ffff880409b1c39c
[Wed May 16 09:32:52 2018] R13: 0000000000000005 R14: 0000000000000001 R15: ffff88040d267880
[Wed May 16 09:32:52 2018] FS:  0000000000000000(0000) GS:ffff88041fb40000(0000) knlGS:0000000000000000
[Wed May 16 09:32:52 2018] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed May 16 09:32:52 2018] CR2: 00001529a39fa000 CR3: 0000000004c0a001 CR4: 00000000001626e0
[Wed May 16 09:32:52 2018] Call Trace:
[Wed May 16 09:32:52 2018]  <IRQ>
[Wed May 16 09:32:52 2018]  ? qdisc_rcu_free+0x39/0x39
[Wed May 16 09:32:52 2018]  ? qdisc_rcu_free+0x39/0x39
[Wed May 16 09:32:52 2018]  call_timer_fn.isra.4+0x14/0x70
[Wed May 16 09:32:52 2018]  expire_timers+0x82/0x95
[Wed May 16 09:32:52 2018]  run_timer_softirq+0x62/0xe5
[Wed May 16 09:32:52 2018]  ? tick_sched_timer+0x33/0x61
[Wed May 16 09:32:52 2018]  ? recalibrate_cpu_khz+0x6/0x6
[Wed May 16 09:32:52 2018]  ? ktime_get+0x3a/0x8d
[Wed May 16 09:32:52 2018]  __do_softirq+0xcd/0x1c2
[Wed May 16 09:32:52 2018]  irq_exit+0x4f/0x8e
[Wed May 16 09:32:52 2018]  smp_apic_timer_interrupt+0x7a/0x85
[Wed May 16 09:32:52 2018]  apic_timer_interrupt+0x7d/0x90
[Wed May 16 09:32:52 2018]  </IRQ>
[Wed May 16 09:32:52 2018] RIP: 0010:cpuidle_enter_state+0xe3/0x135
[Wed May 16 09:32:52 2018] RSP: 0018:ffffc9000192bef8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
[Wed May 16 09:32:52 2018] RAX: ffff88041fb60980 RBX: 0000000000000000 RCX: 000000000000001f
[Wed May 16 09:32:52 2018] RDX: 0000227425969198 RSI: 0000000000020180 RDI: 0000000000000000
[Wed May 16 09:32:52 2018] RBP: ffff88041fb68700 R08: 00007944edee82d9 R09: 0000000000000060
[Wed May 16 09:32:52 2018] R10: ffffc9000192bed8 R11: 000000000c7ac3f4 R12: 0000000000000004
[Wed May 16 09:32:52 2018] R13: 0000227425969198 R14: ffffffff81c59258 R15: 0000227424d0f9bb
[Wed May 16 09:32:52 2018]  ? cpuidle_enter_state+0xbb/0x135
[Wed May 16 09:32:52 2018]  do_idle+0x11a/0x179
[Wed May 16 09:32:52 2018]  cpu_startup_entry+0x18/0x1a
[Wed May 16 09:32:52 2018]  secondary_startup_64+0xa5/0xb0
[Wed May 16 09:32:52 2018] Code: 3d a5 44 7c 00 00 75 35 48 89 df c6 05 99 44 7c 00 01 e8 6c 2b fe ff 89 e9 48 89 de 48 c7 c7 70 71 b8 81 48 89 c2 e8 ca 61 ba ff <0f> 0b eb 0e ff c5 48 05 40 01 00 00 39 cd 75 9d eb 13 48 8b 83 
[Wed May 16 09:32:52 2018] ---[ end trace e0df5652918bcbb6 ]---
[Wed May 16 09:32:52 2018] e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly

 

[Wed May 16 09:36:39 2018] e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
                             TDH                  <38>
                             TDT                  <57>
                             next_to_use          <57>
                             next_to_clean        <37>
                           buffer_info[next_to_clean]:
                             time_stamp           <10240c4ee>
                             next_to_watch        <38>
                             jiffies              <10240ea80>
                             next_to_watch.status <0>
                           MAC Status             <40080083>
                           PHY Status             <796d>
                           PHY 1000BASE-T Status  <3800>
                           PHY Extended Status    <3000>
                           PCI Status             <10>
[Wed May 16 09:36:40 2018] e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly

 

2 hours ago, spyd4r said:

I seem to be getting NIC interruptions as well.

 

Syslog fragments don't really help with diagnosis. The complete diagnostics zip is needed. It could be a disabled IRQ. Try searching your syslog for "nobody cared" and see if the IRQ number matches that of your NIC.
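If you want to check that yourself before posting, something along these lines will do it. It's only a rough sketch, assuming the live syslog is at /var/log/syslog and that your NIC shows up as eth0 in /proc/interrupts; adjust both for your system:

# Rough check: find "nobody cared" IRQ complaints in the syslog and see
# whether the IRQ number matches the one the NIC is using.
# The syslog path and the "eth0" interface name are assumptions.
import re

nobody_cared = set()
with open("/var/log/syslog", errors="replace") as log:
    for line in log:
        m = re.search(r"irq (\d+): nobody cared", line, re.IGNORECASE)
        if m:
            nobody_cared.add(m.group(1))

nic_irqs = set()
with open("/proc/interrupts") as f:
    for line in f:
        if "eth0" in line:
            nic_irqs.add(line.split(":")[0].strip())

print("IRQs the kernel complained about:", sorted(nobody_cared) or "none")
print("NIC IRQs:", sorted(nic_irqs) or "none")
if nobody_cared & nic_irqs:
    print("Match: the NIC's IRQ is the one the kernel complained about.")

If they match, the kernel disabled the NIC's interrupt at some point, which would line up with the transmit queue timeouts you're seeing.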

4 hours ago, Tise said:

Installed Fix Common Problems and got the call traces warning.

 

According to your syslog,

May 25 12:46:29 Tower kernel: Your BIOS is broken; DMAR reported at address fed90000 returns all ones!
May 25 12:46:29 Tower kernel: BIOS vendor: American Megatrends Inc.; Ver: 1703   ; Product Version: System Version

I'd suggest updating your BIOS if there were an update available, but I think 1703 is already the latest for that motherboard. The warning appears to be related to the IOMMU, though your CPU doesn't support VT-d anyway. There's more here, but it's an old thread. Essentially, your BIOS really is broken and the kernel tries to work around it. You ought to update unRAID to the latest version, 6.5.2, as the newer kernel might help. If things work as expected then I suppose you can ignore it, as your options are limited. Maybe disable VMs, since you don't really have enough RAM to run any alongside your dockers and plugins.

