gdeyoung


  1. Server 3 just panicked again. Again, this is a 10G server, and it is also on its second 10G Intel NIC. The panics appear to happen more often under large file-copy loads on the 10G connection. I will move it back to 1G to see if that makes a difference.
     Jan 19 16:27:05 Homeserver kernel: Call Trace:
     Jan 19 16:27:05 Homeserver kernel: <IRQ>
     Jan 19 16:27:05 Homeserver kernel: dump_stack+0x67/0x83
     Jan 19 16:27:05 Homeserver kernel: nmi_cpu_backtrace+0x71/0x83
     Jan 19 16:27:05 Homeserver kernel: ? lapic_can_unplug_cpu+0x97/0x97
     Jan 19 16:27:05 Homeserver kernel: nmi_trigger_cpumask_back
  2. OK, to update this thread: I tried going back to 6.8.3 on the 2nd and 3rd of my 4 servers that are kernel panicking, and they are still having panics and crashes daily. The only server not experiencing any issues is my 4th one, which is 1G connected. All of my 10G servers are panicking, even though I have replaced the NICs with Intel server-class 10G NICs. I finally took my 2nd server back to a 1G connection to see if it stays stable. I have more log snippets from the 10G servers. It looks like they also show a native_queued_spin_lock_slowpath error in the panic. Call Trac
  3. So my second server just crashed with a kernel panic. All three are having panics, and they are all different hardware. Any ideas from this trace?
     Jan 15 22:36:42 Mediaserver kernel: rcu: INFO: rcu_sched self-detected stall on CPU
     Jan 15 22:36:42 Mediaserver kernel: rcu: #0110-....: (59999 ticks this GP) idle=e7a/1/0x4000000000000000 softirq=11770626/11770626 fqs=14993
     Jan 15 22:36:42 Mediaserver kernel: #011(t=60000 jiffies g=13660245 q=3404623)
     Jan 15 22:36:42 Mediaserver kernel: NMI backtrace for cpu 0
     Jan 15 22:36:42 Mediaserver kernel: CPU: 0 PID: 28592 Comm: kworker/u2
  4. Ok, I figured it out. I was going about it backwards. In the network settings you can assign the MAC address of each NIC to the Eth port you want. I just moved the port 0 MAC address to the Eth0 configuration. To simplify the networking, I turned off the bond for Eth0-2, which was set to active-passive (that was the Unraid default, BTW). I'm betting it was bouncing since I only had 10G port 1 (Eth1) plugged in. I will report back on the stability.
  5. So I have a MB with an integrated 1G Ethernet port that is mounted as Eth0. I have an Intel 10G 2-port SFP+ card that is mounted as Eth1 and Eth2, with a single DAC cable in Eth1 of the 10G card. It is configured as an active bridge on br0 for Eth0, Eth1, and Eth2. This is all the default config. I went into the BIOS and turned off the built-in 1G card mounted as Eth0, because I wanted the system to default to port 0 of the 10G card as Eth0. On bootup it reports an error that Eth0 can't be found. How do I make the server forget the disabled 1G port and make the 10G port 0 as Et
  6. Yes, I have the GPU stat plugin installed. Any insight on the kernel panic trace above?
  7. Do the above traces connect with the Nvidia driver at all? I'm seeing this repeated a lot in the log this morning after a reboot.
     Jan 15 10:12:30 Homeserver kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
     Jan 15 10:12:30 Homeserver kernel: caller _nv000709rm+0x1af/0x200 [nvidia] mapping multiple BARs
     Jan 15 10:12:32 Homeserver kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
     J
  8. Happened again 24 hrs later. Page faulted:
     Jan 11 08:30:00 Homeserver kernel: BUG: unable to handle page fault for address: 00000000000053d8
     Whole trace:
     Jan 11 04:07:30 Homeserver kernel: br0: port 1(bond0) entered forwarding state
     Jan 11 04:08:25 Homeserver flash_backup: adding task: php /usr/local/emhttp/plugins/dynamix.unraid.net/include/UpdateFlashBackup.php update
     Jan 11 04:15:05 Homeserver kernel: br0: received packet on bond0 with own address as source address (addr:30:9c:23:af:51:e0, vlan:0)
     Jan 11 04:15:05 Homeserver kernel: br0: received pac
  9. Can anyone take a look and let me know what is causing this kernel panic? Here is the trace from the syslog. I'm actually getting these somewhat regularly on three different 6.9-RC@ Unraid servers.
     Jan 12 07:15:08 Homeserver kernel: ------------[ cut here ]------------
     Jan 12 07:15:08 Homeserver kernel: WARNING: CPU: 0 PID: 0 at net/netfilter/nf_conntrack_core.c:1120 __nf_conntrack_confirm+0x99/0x1e1
     Jan 12 07:15:08 Homeserver kernel: Modules linked in: xt_CHECKSUM ipt_REJECT macvlan ip6table_mangle ip6table_nat iptable_mangle ip6table_filter ip6_tables vhost
  10. Having one of my servers crash daily. Any ideas on what is generating the docker networking issues below? I see them constantly in the logs, then the server will freeze or throw a kernel error and crash. I will see docker0: port 2-7 in the errors. The server is an Intel 9700K with an Intel 10Gb NIC. The NIC and the SFP+ DAC cable have both been replaced and I still get the error. I have swapped most of the hardware out to check for a hardware fault, with no luck (new proc/MB/PSU/RAID card). I have also moved data off each drive, rebuilt the partition on each drive, and moved the data back to account for
  11. So I'm trying out the RC branch for the first time and installed RC5. I went to set up WireGuard with the Dynamix plugin and ran into a showstopper up front. I use a DDNS with a .network FQDN. When I enter my .network domain in the local endpoint input box, it generates an error that my .network domain is not a true FQDN. I assume someone just hard-coded the vanilla .com / .net / .org in the error checking for that field. Can we get that fixed?
  12. So I'm trying out the RC branch for the first time. I went to set up WireGuard with the Dynamix plugin and ran into a showstopper up front. I use a DDNS with a .network FQDN. When I enter my .network domain in the local endpoint input box, it generates an error that my .network domain is not a true FQDN. I assume someone just hard-coded the vanilla .com / .net / .org in the error checking for that field. Can we get that fixed?
  13. I found that earlier. However, it does not have an example of the actual string that needs to be used in the field. I understand I need to mount it as a slave. I'm looking for the actual command to use.
  14. Can someone give me an example of what to put in the host 5 field in place of /mnt/disks? The mount path I'm using is /mnt/disks/SSD . I'm wrestling with the same error.
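
For the slave-mount question in the last two posts, here is a rough sketch of how the volume string might be put together. It assumes the standard Docker `-v host:container:options` syntax, where `slave` is one of Docker's documented bind-propagation options. The container path `/data` and the `alpine` image are placeholders I made up for illustration, and `bind_mount_arg` is a hypothetical helper, not anything shipped with Unraid or Docker. The host path `/mnt/disks/SSD` is the one from the post.

```python
import shlex

# Access modes and bind-propagation options accepted by Docker's -v flag.
MODES = {"rw", "ro"}
PROPAGATION = {"private", "rprivate", "shared", "rshared", "slave", "rslave"}


def bind_mount_arg(host_path, container_path, mode="rw", propagation="slave"):
    """Build the value for `docker run -v` (hypothetical helper).

    The result has the form host:container:mode,propagation.
    """
    if mode not in MODES:
        raise ValueError(f"unknown access mode: {mode}")
    if propagation not in PROPAGATION:
        raise ValueError(f"unknown propagation option: {propagation}")
    return f"{host_path}:{container_path}:{mode},{propagation}"


# /data and alpine are assumptions for the example, not from the thread.
volume = bind_mount_arg("/mnt/disks/SSD", "/data")
command = ["docker", "run", "--rm", "-v", volume, "alpine", "ls", "/data"]
print(shlex.join(command))
```

In the Unraid container template this would roughly correspond to leaving the host path as `/mnt/disks/SSD` and picking the read/write slave access mode for that path, rather than typing the option into the path field itself. Treat that mapping as my reading of the thread, not a confirmed answer.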