March 12, 2022

Hi everyone. I moved my Unraid server from a custom-built Norco case with a Mellanox 10G card to a Dell R720xd with the 57800S 2-port SFP+ daughter card. At the same time I installed a Quadro P2000 for Plex transcoding. I also have a 9102-16E HBA card connected to a 45-bay Supermicro chassis (with the built-in port expander backplane) and to a 24-bay SFF Supermicro chassis with a standard backplane (6x SAS connectors), which is connected via an SFF-8088 to SFF-8087 pass-through bracket.

I'm now getting kernel panics at random times of the day, at least twice a week, and each time I need to perform a hard reboot. I finally got diagnostics, and I THINK it's caused by the network driver, but I can't be 100% sure. Can anyone verify? If you require more logs, let me know. I'm new to storage chassis and SFF-8088 cards/cables, but I think everything is OK there.

Mar 11 21:24:15 Tower2 kernel: WARNING: CPU: 24 PID: 24271 at lib/vsprintf.c:2556 vsnprintf+0x30/0x4ef
Mar 11 21:24:15 Tower2 kernel: Modules linked in: nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) drm backlight agpgart ip6table_filter ip6_tables iptable_filter ip_tables x_tables bnx2x mdio sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd ipmi_ssif i2c_core glue_helper rapl intel_cstate input_leds mpt3sas intel_uncore nvme acpi_power_meter led_class raid_class scsi_transport_sas nvme_core wmi ipmi_si button [last unloaded: mdio]
Mar 11 21:24:15 Tower2 kernel: CPU: 24 PID: 24271 Comm: ethtool Tainted: P O 5.10.28-Unraid #1
Mar 11 21:24:15 Tower2 kernel: Hardware name: Dell Inc. PowerEdge R720xd/0020HJ, BIOS 2.9.0 12/06/2019
Mar 11 21:24:15 Tower2 kernel: RIP: 0010:vsnprintf+0x30/0x4ef
Mar 11 21:24:15 Tower2 kernel: Code: 41 54 55 53 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 48 81 fe ff ff ff 7f 48 c7 44 24 08 00 00 00 00 76 07 <0f> 0b e9 94 04 00 00 48 89 fd 49 89 fc 49 89 f5 48 01 f5 49 89 d0
Mar 11 21:24:15 Tower2 kernel: RSP: 0018:ffffc900285e7a90 EFLAGS: 00010296
Mar 11 21:24:15 Tower2 kernel: RAX: 0000000000000000 RBX: 0000000000070a0b RCX: ffffc900285e7ae0
Mar 11 21:24:15 Tower2 kernel: RDX: ffffffffa00cba0f RSI: fffffffffffffff8 RDI: ffffc900285e7bc8
Mar 11 21:24:15 Tower2 kernel: RBP: ffffc900285e7b30 R08: 000000000000000a R09: 000000000000000b
Mar 11 21:24:15 Tower2 kernel: R10: 0000000000000007 R11: 0000000000000000 R12: 0000000000000020
Mar 11 21:24:15 Tower2 kernel: R13: ffffc900285e7b54 R14: ffff888124299db8 R15: 00000000ffffffed
Mar 11 21:24:15 Tower2 kernel: FS: 0000149bb1295740(0000) GS:ffff889fffb00000(0000) knlGS:0000000000000000
Mar 11 21:24:15 Tower2 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 11 21:24:15 Tower2 kernel: CR2: 0000000000428670 CR3: 0000000144506005 CR4: 00000000000606e0
Mar 11 21:24:15 Tower2 kernel: Call Trace:
Mar 11 21:24:15 Tower2 kernel: snprintf+0x49/0x60
Mar 11 21:24:15 Tower2 kernel: ? prep_new_page+0x25/0x71
Mar 11 21:24:15 Tower2 kernel: bnx2x_fill_fw_str+0xc7/0xf9 [bnx2x]
Mar 11 21:24:15 Tower2 kernel: ? get_page_from_freelist+0x8e0/0xbd4
Mar 11 21:24:15 Tower2 kernel: bnx2x_get_drvinfo+0xf1/0x14c [bnx2x]
Mar 11 21:24:15 Tower2 kernel: ethtool_get_drvinfo+0x6e/0x1b5
Mar 11 21:24:15 Tower2 kernel: dev_ethtool+0x59a/0x2126
Mar 11 21:24:15 Tower2 kernel: ? ___slab_alloc+0x23a/0x4aa
Mar 11 21:24:15 Tower2 kernel: ? sk_prot_alloc.isra.0+0x26/0xad
Mar 11 21:24:15 Tower2 kernel: ? inet_ioctl+0x17d/0x1a6
Mar 11 21:24:15 Tower2 kernel: ? page_add_file_rmap+0xc9/0xd4
Mar 11 21:24:15 Tower2 kernel: ? set_pte+0x5/0x8
Mar 11 21:24:15 Tower2 kernel: ? alloc_set_pte+0x2f0/0x301
Mar 11 21:24:15 Tower2 kernel: ? full_name_hash+0x12/0x6c
Mar 11 21:24:15 Tower2 kernel: ? dev_name_hash+0x23/0x3a
Mar 11 21:24:15 Tower2 kernel: dev_ioctl+0x2d7/0x3d5
Mar 11 21:24:15 Tower2 kernel: sock_do_ioctl+0xd9/0x12a
Mar 11 21:24:15 Tower2 kernel: ? __do_sys_copy_file_range+0x178/0x18f
Mar 11 21:24:15 Tower2 kernel: sock_ioctl+0x314/0x33b
Mar 11 21:24:15 Tower2 kernel: ? alloc_file_pseudo+0xba/0xfd
Mar 11 21:24:15 Tower2 kernel: vfs_ioctl+0x19/0x26
Mar 11 21:24:15 Tower2 kernel: __do_sys_ioctl+0x51/0x74
Mar 11 21:24:15 Tower2 kernel: do_syscall_64+0x5d/0x6a
Mar 11 21:24:15 Tower2 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 11 21:24:15 Tower2 kernel: RIP: 0033:0x149bb13a4417
Mar 11 21:24:15 Tower2 kernel: Code: 00 00 90 48 8b 05 79 2a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 49 2a 0d 00 f7 d8 64 89 01 48
Mar 11 21:24:15 Tower2 kernel: RSP: 002b:00007fff8c5cfd28 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
Mar 11 21:24:15 Tower2 kernel: RAX: ffffffffffffffda RBX: 00007fff8c5cffa8 RCX: 0000149bb13a4417
Mar 11 21:24:15 Tower2 kernel: RDX: 00007fff8c5cfe30 RSI: 0000000000008946 RDI: 0000000000000003
Mar 11 21:24:15 Tower2 kernel: RBP: 00007fff8c5cfe20 R08: 00007fff8c5cfe30 R09: 0000000000000003
Mar 11 21:24:15 Tower2 kernel: R10: 0000000000401387 R11: 0000000000000206 R12: 000000000040abe0
Mar 11 21:24:15 Tower2 kernel: R13: 0000000000000001 R14: 0000000000435043 R15: 000000000043504b
Mar 11 21:24:15 Tower2 kernel: ---[ end trace 033002dfefbd6b3e ]---
March 12, 2022 · Community Expert

8 minutes ago, bdowden said: "it's caused by the network driver"

Correct, but not the Mellanox, likely the onboard NICs.
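For anyone wanting to check a trace like this themselves: call-trace frames that belong to a loadable module carry the module name in square brackets, and here the only tagged frames are `[bnx2x]`. The following is a rough Python sketch of that check, not an official Unraid or kernel tool. The function name `modules_in_trace` and the regular expression are illustrative assumptions, and the sample is a short excerpt of the log posted above. Frames prefixed with `?` are skipped because the kernel marks those as unreliable.

```python
import re
from collections import Counter

# Matches frames such as "bnx2x_fill_fw_str+0xc7/0xf9 [bnx2x]".
# Group 1: optional "? " marker (unreliable frame), group 2: symbol,
# group 3: optional module name in square brackets.
FRAME_RE = re.compile(
    r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)

def modules_in_trace(log_text):
    """Count reliable call-trace frames per kernel module (hypothetical helper)."""
    counts = Counter()
    for match in FRAME_RE.finditer(log_text):
        unreliable, _symbol, module = match.groups()
        if module and not unreliable:
            counts[module] += 1
    return counts

# Excerpt from the syslog in the first post.
sample = """\
Mar 11 21:24:15 Tower2 kernel: Call Trace:
Mar 11 21:24:15 Tower2 kernel: snprintf+0x49/0x60
Mar 11 21:24:15 Tower2 kernel: ? prep_new_page+0x25/0x71
Mar 11 21:24:15 Tower2 kernel: bnx2x_fill_fw_str+0xc7/0xf9 [bnx2x]
Mar 11 21:24:15 Tower2 kernel: bnx2x_get_drvinfo+0xf1/0x14c [bnx2x]
Mar 11 21:24:15 Tower2 kernel: ethtool_get_drvinfo+0x6e/0x1b5
"""

if __name__ == "__main__":
    for module, frames in modules_in_trace(sample).most_common():
        print(f"{module}: {frames} frame(s)")
```

On this excerpt the only module reported is `bnx2x`, which matches the reading that the warning originates in the NIC driver while it handles an `ethtool` driver-info request.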
March 12, 2022 · Author

OK, cool. I had removed the Mellanox when I installed the daughter card. Do you have any suggestions on next steps? I'm still a Linux noob (even though I've been using Unraid for almost a decade).
March 13, 2022 · Community Expert

20 hours ago, bdowden said: "Do you have any suggestions on next steps?"

Use different NICs or wait for a newer release with a newer driver, or if you're on v6.9.2 try v6.10-rc3.