courteous-ox7459 Posted November 26, 2023
Heya, I'm having trouble installing the RadeonTOP plugin with my AMD APU:
[1002:1636] 05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Renoir (rev c9)
Whenever I try to install it, this comes up:
plugin: downloading: radeontop-2023.02.22.txz ... done
+==============================================================================
| Installing new package /boot/config/plugins/radeontop/radeontop-2023.02.22.txz
+==============================================================================
Verifying package radeontop-2023.02.22.txz.
Installing package radeontop-2023.02.22.txz:
PACKAGE DESCRIPTION:
Package radeontop-2023.02.22.txz installed.
---------Enabling AMDGPU Kernel Module---------
------Something went wrong! Can't enable-------
----AMDGPU Kernel Module, removing package!----
Removing package: radeontop-2023.02.22
Removing files:
--> Deleting /usr/bin/radeontop
--> Deleting /usr/local/emhttp/plugins/radeontop/bin/radeontop
--> Deleting /usr/local/emhttp/plugins/radeontop/images/radeontop.png
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm.so
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm.so.2
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm.so.2.4.0
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm_amdgpu.so
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm_amdgpu.so.1
--> Deleting /usr/local/emhttp/plugins/radeontop/lib/libdrm_amdgpu.so.1.0.0
--> Deleting /usr/share/libdrm/amdgpu.ids
--> Deleting empty directory /usr/share/libdrm/
--> Deleting empty directory /usr/local/emhttp/plugins/radeontop/lib/
--> Deleting empty directory /usr/local/emhttp/plugins/radeontop/images/
--> Deleting empty directory /usr/local/emhttp/plugins/radeontop/bin/
WARNING: Unique directory /usr/local/emhttp/plugins/radeontop/ contains new files
plugin: run failed: '/bin/bash' returned 1
Executing hook script: post_plugin_checks
I tried to enable the amdgpu kernel module and I get this:
modprobe -v amdgpu
insmod /lib/modules/6.1.49-Unraid/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz
modprobe: ERROR: could not insert 'amdgpu': Invalid argument
Any idea how I can solve this? You can find my diagnostics attached as well. Thanks!
brotherman-diagnostics-20231126-2127.zip
ich777 Posted November 26, 2023 Author
14 minutes ago, courteous-ox7459 said: Any idea how I can solve this? You can find my diagnostics attached as well. Thanks!
Your kernel module crashed while the plugin tried to enable it. However, this seems to be related to a MACVLAN issue, so please solve that issue first:
Nov 26 20:49:30 Brotherman kernel: ------------[ cut here ]------------
Nov 26 20:49:30 Brotherman kernel: WARNING: CPU: 4 PID: 19190 at net/netfilter/nf_conntrack_core.c:1210 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
Nov 26 20:49:30 Brotherman kernel: Modules linked in: wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha tun bluetooth ecdh_generic ecc veth macvlan xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag ipmi_devintf nct6775 nct6775_core hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs af_packet 8021q garp mrp bridge stp llc bonding tls edac_mce_amd edac_core intel_rapl_msr intel_rapl_common iosf_mbi gpu_sched drm_buddy kvm_amd i2c_algo_bit drm_ttm_helper ttm drm_display_helper drm_kms_helper kvm drm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd wmi_bmof cryptd agpgart nvme rapl i2c_piix4 syscopyarea
Nov 26 20:49:30 Brotherman kernel: i2c_core r8169 nvme_core sysfillrect ccp k10temp joydev sysimgblt ahci fb_sys_fops realtek libahci tpm_crb tpm_tis video tpm_tis_core wmi tpm backlight acpi_cpufreq button unix
Nov 26 20:49:30 Brotherman kernel: CPU: 4 PID: 19190 Comm: kworker/u64:2 Tainted: P O 6.1.49-Unraid #1
Nov 26 20:49:30 Brotherman kernel: Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P2.60 02/07/2023
Nov 26 20:49:30 Brotherman kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
Nov 26 20:49:30 Brotherman kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
Nov 26 20:49:30 Brotherman kernel: Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
Nov 26 20:49:30 Brotherman kernel: RSP: 0018:ffffc900002acd98 EFLAGS: 00010202
Nov 26 20:49:30 Brotherman kernel: RAX: 0000000000000001 RBX: ffff88817b7fcf00 RCX: 614dcb50d5e59f96
Nov 26 20:49:30 Brotherman kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88817b7fcf00
Nov 26 20:49:30 Brotherman kernel: RBP: 0000000000000001 R08: 3807721c7df55c84 R09: 26b204f7336a83ad
Nov 26 20:49:30 Brotherman kernel: R10: 808473b5991f4ea0 R11: ffffc900002acd60 R12: ffffffff82a11d00
Nov 26 20:49:30 Brotherman kernel: R13: 000000000003f226 R14: ffff888015853500 R15: 0000000000000000
Nov 26 20:49:30 Brotherman kernel: FS: 0000000000000000(0000) GS:ffff88842e100000(0000) knlGS:0000000000000000
Nov 26 20:49:30 Brotherman kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 26 20:49:30 Brotherman kernel: CR2: 00007fffb1435084 CR3: 000000018251e000 CR4: 0000000000350ee0
Nov 26 20:49:30 Brotherman kernel: Call Trace:
Nov 26 20:49:30 Brotherman kernel: <IRQ>
Nov 26 20:49:30 Brotherman kernel: ? __warn+0xab/0x122
Nov 26 20:49:30 Brotherman kernel: ? report_bug+0x109/0x17e
Nov 26 20:49:30 Brotherman kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
Nov 26 20:49:30 Brotherman kernel: ? handle_bug+0x41/0x6f
Nov 26 20:49:30 Brotherman kernel: ? exc_invalid_op+0x13/0x60
Nov 26 20:49:30 Brotherman kernel: ? asm_exc_invalid_op+0x16/0x20
Nov 26 20:49:30 Brotherman kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
Nov 26 20:49:30 Brotherman kernel: ? __nf_conntrack_confirm+0x9e/0x2b0 [nf_conntrack]
Nov 26 20:49:30 Brotherman kernel: ? nf_nat_inet_fn+0x126/0x1a8 [nf_nat]
Nov 26 20:49:30 Brotherman kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
Nov 26 20:49:30 Brotherman kernel: nf_hook_slow+0x3d/0x96
Nov 26 20:49:30 Brotherman kernel: ? ip_protocol_deliver_rcu+0x164/0x164
Nov 26 20:49:30 Brotherman kernel: NF_HOOK.constprop.0+0x79/0xd9
Nov 26 20:49:30 Brotherman kernel: ? ip_protocol_deliver_rcu+0x164/0x164
Nov 26 20:49:30 Brotherman kernel: __netif_receive_skb_one_core+0x77/0x9c
Nov 26 20:49:30 Brotherman kernel: process_backlog+0x8c/0x116
Nov 26 20:49:30 Brotherman kernel: __napi_poll.constprop.0+0x2b/0x124
Nov 26 20:49:30 Brotherman kernel: net_rx_action+0x159/0x24f
Nov 26 20:49:30 Brotherman kernel: __do_softirq+0x129/0x288
Nov 26 20:49:30 Brotherman kernel: do_softirq+0x7f/0xab
Nov 26 20:49:30 Brotherman kernel: </IRQ>
Nov 26 20:49:30 Brotherman kernel: <TASK>
Nov 26 20:49:30 Brotherman kernel: __local_bh_enable_ip+0x4c/0x6b
Nov 26 20:49:30 Brotherman kernel: netif_rx+0x52/0x5a
Nov 26 20:49:30 Brotherman kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Nov 26 20:49:30 Brotherman kernel: ? _raw_spin_unlock+0x14/0x29
Nov 26 20:49:30 Brotherman kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
Nov 26 20:49:30 Brotherman kernel: process_one_work+0x1ab/0x295
Nov 26 20:49:30 Brotherman kernel: worker_thread+0x18b/0x244
Nov 26 20:49:30 Brotherman kernel: ? rescuer_thread+0x281/0x281
Nov 26 20:49:30 Brotherman kernel: kthread+0xe7/0xef
Nov 26 20:49:30 Brotherman kernel: ? kthread_complete_and_exit+0x1b/0x1b
Nov 26 20:49:30 Brotherman kernel: ret_from_fork+0x22/0x30
Nov 26 20:49:30 Brotherman kernel: </TASK>
Nov 26 20:49:30 Brotherman kernel: ---[ end trace 0000000000000000 ]---
Please either disable the bridge in your network settings and make sure to select MACVLAN in the Docker settings (if you need MACVLAN), or switch to IPVLAN in your Docker settings.
Please also remove this from your syslinux.config (click on the blue text Flash on your Main page):
nomodeset
since Unraid should try to enable the GPU first, not the plugin.
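To confirm whether `nomodeset` is still in effect, a small shell check along these lines may help. This is only a sketch: the function name `has_nomodeset` is made up for illustration, and the path `/boot/syslinux/syslinux.cfg` in the example is an assumption about where the Flash page stores the boot configuration. `/proc/cmdline` shows the parameters the running kernel was actually booted with.

```shell
# has_nomodeset FILE
# Returns 0 if FILE contains "nomodeset" as a separate word, non-zero otherwise.
# (Hypothetical helper, not part of Unraid or the plugin.)
has_nomodeset() {
    grep -qw -- 'nomodeset' "$1"
}

# Example calls (the second path is an assumption):
#   has_nomodeset /proc/cmdline && echo "running kernel was booted with nomodeset"
#   has_nomodeset /boot/syslinux/syslinux.cfg && echo "nomodeset is still in the boot config"
```

If the check against `/proc/cmdline` still succeeds after editing the configuration, the server has most likely not been rebooted since the change.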
courteous-ox7459 Posted November 28, 2023
On 11/26/2023 at 9:50 PM, ich777 said: Your kernel module crashed while the plugin tried to enable it. However, this seems to be related to a MACVLAN issue, so please solve that issue first: Please either disable the bridge in your network settings and make sure to select MACVLAN in the Docker settings (if you need MACVLAN), or switch to IPVLAN in your Docker settings. Please also remove this from your syslinux.config (click on the blue text Flash on your Main page): nomodeset since Unraid should try to enable the GPU first, not the plugin.
Thanks a lot, it has been working flawlessly since I made these changes.
florxy Posted December 7, 2023
On 9/18/2022 at 3:28 PM, ericswpark said: Finally managed to get the temperature to show up in the plugin. Turns out the "detect" button is broken and does not scan available drivers properly. Following this comment: I had to create a `drivers.conf` file in `/boot/config/plugins/dynamix.system.temp` and add the following two lines:
it87
k10temp
Then, once I went back to the temperature plugin settings, I was able to select the CPU/MB temperature from the dropdown. One thing to note (already mentioned in the linked comment, but just to make sure): don't click on "Detect", or else it will wipe out your changes and you'll have to start over. The commenter in the link had to do the `modprobe force_id` thing, but I didn't have to thanks to this plugin. You probably shouldn't need it if you have this it87 plugin installed.
This helped me a lot. Final result: done, after about 4 hours of searching. Thanks again!
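The manual step from the quoted post could be sketched as a small shell function. The function name `write_drivers_conf` is made up for illustration. The directory and the two module names in the example are the ones given in the post.

```shell
# write_drivers_conf DIR MODULE...
# Writes one module name per line to DIR/drivers.conf, creating DIR if needed.
# (Hypothetical helper; it overwrites an existing drivers.conf.)
write_drivers_conf() {
    dir=$1
    shift
    mkdir -p "$dir" || return 1
    printf '%s\n' "$@" > "$dir/drivers.conf"
}

# Example, using the path and modules from the post:
#   write_drivers_conf /boot/config/plugins/dynamix.system.temp it87 k10temp
```

As the post notes, clicking "Detect" afterwards would reportedly overwrite the file again.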
mikeyosm Posted December 12, 2023
Any chance we can get the Mellanox temperature into the main Unraid dashboard instead of having to go to the plugin each time?
ich777 Posted December 12, 2023 Author
8 minutes ago, mikeyosm said: Any chance we can get the Mellanox temperature into the main Unraid dashboard instead of having to go to the plugin each time?
I'm not convinced that this is necessary, since the NIC temperature is usually consistent on Mellanox cards, and I currently have no plans to implement that.
Wizard_ Posted December 15, 2023
Does this plugin work properly with Intel 13th-gen CPUs? I can install the plugin, but something goes wrong when I reboot the system. However, I did see "card render0" in /dev/dri a few times.
向导-服务器-诊断-20231215-1722.zip
ich777 Posted December 15, 2023 Author
23 minutes ago, Wizard_ said: Does this plugin work properly with Intel 13th-gen CPUs?
Yes, but in your case it can't because your iGPU is blacklisted. Please remove the file /boot/config/modprobe.d/i915.conf and reboot your system. The plugin is just the application intel_gpu_top, and it makes sure that your iGPU is enabled (of course only when the iGPU is not blacklisted).
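Before removing the file, it can be worth checking what it actually contains. The following is a rough sketch, assuming the file uses the standard modprobe.d `blacklist <module>` syntax. The function name `is_blacklisted` is made up for illustration.

```shell
# is_blacklisted FILE MODULE
# Returns 0 if FILE has an uncommented "blacklist MODULE" line, non-zero otherwise.
# (Hypothetical helper; MODULE is used as a regular expression, so keep it a plain name.)
is_blacklisted() {
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Example, using the path from the post:
#   is_blacklisted /boot/config/modprobe.d/i915.conf i915 && echo "i915 is blacklisted"
```

After a reboot, `ls /dev/dri` should list the card and render nodes if the driver loaded successfully.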
Wizard_ Posted December 16, 2023
18 hours ago, ich777 said: Yes, but in your case it can't because your iGPU is blacklisted. Please remove the file /boot/config/modprobe.d/i915.conf and reboot your system. The plugin is just the application intel_gpu_top, and it makes sure that your iGPU is enabled (of course only when the iGPU is not blacklisted).
Thanks for answering my question! I have removed i915.conf and rebooted the system, but it seems nothing happened? By the way, I can't shut down the server normally (using the "shutdown" button in the web UI). It gets stuck somewhere and I have to press the power button manually.
wizard-server-diagnostics-20231216-1204.zip
ich777 Posted December 16, 2023 Author
2 hours ago, Wizard_ said: I have removed i915.conf and rebooted the system, but it seems nothing happened?
You have a call trace in your syslog which is most likely the cause of the issue:
Dec 16 11:57:06 Wizard-Server kernel: i915 0000:00:02.0: [drm] VT-d active for gfx access
Dec 16 11:57:06 Wizard-Server kernel: BUG: kernel NULL pointer dereference, address: 0000000000000020
Dec 16 11:57:06 Wizard-Server kernel: #PF: supervisor read access in kernel mode
Dec 16 11:57:06 Wizard-Server kernel: #PF: error_code(0x0000) - not-present page
Dec 16 11:57:06 Wizard-Server kernel: PGD 105f4f067 P4D 105f4f067 PUD 108c01067 PMD 0
Dec 16 11:57:06 Wizard-Server kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Dec 16 11:57:06 Wizard-Server kernel: CPU: 0 PID: 1148 Comm: udevd Tainted: P O 6.1.64-Unraid #1
Dec 16 11:57:06 Wizard-Server kernel: Hardware name: Default string Default string/MS-WS W680 D4, BIOS H4.2G 11/26/2022
Dec 16 11:57:06 Wizard-Server kernel: RIP: 0010:klist_put+0x16/0x74
Dec 16 11:57:06 Wizard-Server kernel: Code: 03 00 31 c0 48 89 03 5b 89 e8 5d 41 5c 41 5d c3 cc cc cc cc 41 55 41 54 41 89 f4 55 53 48 8b 2f 48 89 fb 48 83 e5 fe 48 89 ef <4c> 8b 6d 20 e8 d2 9b 03 00 45 84 e4 74 10 48 8b 03 a8 01 74 02 0f
Dec 16 11:57:06 Wizard-Server kernel: RSP: 0018:ffffc9000103bab8 EFLAGS: 00010246
Dec 16 11:57:06 Wizard-Server kernel: RAX: ffff888135074b80 RBX: ffff888135074ba8 RCX: ffff888135074b80
Dec 16 11:57:06 Wizard-Server kernel: RDX: ffff888103c4b410 RSI: 0000000000000001 RDI: 0000000000000000
Dec 16 11:57:06 Wizard-Server kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff829513f0
Dec 16 11:57:06 Wizard-Server kernel: R10: 00003fffffffffff R11: fefefefefefefeff R12: 0000000000000001
Dec 16 11:57:06 Wizard-Server kernel: R13: ffff8881010cc000 R14: ffff888105d19b50 R15: ffff8881010cc0d0
Dec 16 11:57:06 Wizard-Server kernel: FS: 000014e9445d8240(0000) GS:ffff88903f400000(0000) knlGS:0000000000000000
Dec 16 11:57:06 Wizard-Server kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 16 11:57:06 Wizard-Server kernel: CR2: 0000000000000020 CR3: 000000010368e000 CR4: 0000000000750ef0
Dec 16 11:57:06 Wizard-Server kernel: PKRU: 55555554
Dec 16 11:57:06 Wizard-Server kernel: Call Trace:
Dec 16 11:57:06 Wizard-Server kernel: <TASK>
Dec 16 11:57:06 Wizard-Server kernel: ? __die_body+0x1a/0x5c
Dec 16 11:57:06 Wizard-Server kernel: ? page_fault_oops+0x329/0x376
Dec 16 11:57:06 Wizard-Server kernel: ? do_user_addr_fault+0x12e/0x48d
Dec 16 11:57:06 Wizard-Server kernel: ? exc_page_fault+0xfb/0x11d
Dec 16 11:57:06 Wizard-Server kernel: ? asm_exc_page_fault+0x22/0x30
Dec 16 11:57:06 Wizard-Server kernel: ? klist_put+0x16/0x74
Dec 16 11:57:06 Wizard-Server kernel: device_del+0xb6/0x31d
Dec 16 11:57:06 Wizard-Server kernel: ? i915_ggtt_probe_hw+0x593/0x5be [i915]
Dec 16 11:57:06 Wizard-Server kernel: platform_device_del+0x21/0x70
Dec 16 11:57:06 Wizard-Server kernel: platform_device_unregister+0xf/0x19
Dec 16 11:57:06 Wizard-Server kernel: sysfb_disable+0x2b/0x54
Dec 16 11:57:06 Wizard-Server kernel: aperture_remove_conflicting_pci_devices+0x1e/0x82
Dec 16 11:57:06 Wizard-Server kernel: i915_driver_probe+0x83f/0xc19 [i915]
Dec 16 11:57:06 Wizard-Server kernel: ? slab_free_freelist_hook.constprop.0+0x3b/0xaf
Dec 16 11:57:06 Wizard-Server kernel: local_pci_probe+0x3d/0x81
Dec 16 11:57:06 Wizard-Server kernel: pci_device_probe+0x197/0x1eb
Dec 16 11:57:06 Wizard-Server kernel: ? sysfs_do_create_link_sd+0x71/0xb7
Dec 16 11:57:06 Wizard-Server kernel: really_probe+0x115/0x282
Dec 16 11:57:06 Wizard-Server kernel: __driver_probe_device+0xc0/0xf2
Dec 16 11:57:06 Wizard-Server kernel: driver_probe_device+0x1f/0x77
Dec 16 11:57:06 Wizard-Server kernel: ? __device_attach_driver+0x97/0x97
Dec 16 11:57:06 Wizard-Server kernel: __driver_attach+0xd7/0xee
Dec 16 11:57:06 Wizard-Server kernel: ? __device_attach_driver+0x97/0x97
Dec 16 11:57:06 Wizard-Server kernel: bus_for_each_dev+0x6e/0xa7
Dec 16 11:57:06 Wizard-Server kernel: bus_add_driver+0xd8/0x1d0
Dec 16 11:57:06 Wizard-Server kernel: driver_register+0x99/0xd7
Dec 16 11:57:06 Wizard-Server kernel: i915_init+0x1f/0x7f [i915]
Dec 16 11:57:06 Wizard-Server kernel: ? 0xffffffffa2257000
Dec 16 11:57:06 Wizard-Server kernel: do_one_initcall+0x82/0x19f
Dec 16 11:57:06 Wizard-Server kernel: ? kmalloc_trace+0x43/0x52
Dec 16 11:57:06 Wizard-Server kernel: do_init_module+0x4b/0x1d4
Dec 16 11:57:06 Wizard-Server kernel: __do_sys_init_module+0xb6/0xf9
Dec 16 11:57:06 Wizard-Server kernel: do_syscall_64+0x68/0x81
Dec 16 11:57:06 Wizard-Server kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
Dec 16 11:57:06 Wizard-Server kernel: RIP: 0033:0x14e944aeadfa
Dec 16 11:57:06 Wizard-Server kernel: Code: 48 8b 0d 21 20 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ee 1f 0d 00 f7 d8 64 89 01 48
Dec 16 11:57:06 Wizard-Server kernel: RSP: 002b:00007ffe72d55f08 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
Dec 16 11:57:06 Wizard-Server kernel: RAX: ffffffffffffffda RBX: 0000000000468c70 RCX: 000014e944aeadfa
Dec 16 11:57:06 Wizard-Server kernel: RDX: 000014e944bdfaad RSI: 00000000004b1868 RDI: 000014e943cc0010
Dec 16 11:57:06 Wizard-Server kernel: RBP: 000014e944bdfaad R08: 0000000000000007 R09: 0000000000464e80
Dec 16 11:57:06 Wizard-Server kernel: R10: 0000000000000005 R11: 0000000000000246 R12: 000014e943cc0010
Dec 16 11:57:06 Wizard-Server kernel: R13: 0000000000000000 R14: 0000000000459c30 R15: 0000000000000000
Dec 16 11:57:06 Wizard-Server kernel: </TASK>
Dec 16 11:57:06 Wizard-Server kernel: Modules linked in: kvm_intel(+) znvpair(PO) i915(+) spl(O) kvm iosf_mbi drm_buddy i2c_algo_bit ttm crct10dif_pclmul crc32_pclmul drm_display_helper crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 aesni_intel drm_kms_helper mei_hdcp mei_pxp crypto_simd cryptd rapl intel_cstate wmi_bmof drm mpt3sas intel_uncore ahci mei_me intel_gtt i2c_i801 nvme agpgart raid_class i2c_smbus hid_apple input_leds syscopyarea r8125(O) i2c_core nvme_core scsi_transport_sas joydev mei libahci led_class sysfillrect thermal sysimgblt fb_sys_fops fan tpm_crb video tpm_tis tpm_tis_core wmi tpm backlight intel_pmc_core acpi_pad acpi_tad button unix
Dec 16 11:57:06 Wizard-Server kernel: CR2: 0000000000000020
Dec 16 11:57:06 Wizard-Server kernel: ---[ end trace 0000000000000000 ]---
Dec 16 11:57:06 Wizard-Server kernel: sdg: sdg1
Dec 16 11:57:06 Wizard-Server kernel: sd 2:0:4:0: [sdg] Attached SCSI disk
Dec 16 11:57:06 Wizard-Server kernel: RIP: 0010:klist_put+0x16/0x74
Dec 16 11:57:06 Wizard-Server kernel: Code: 03 00 31 c0 48 89 03 5b 89 e8 5d 41 5c 41 5d c3 cc cc cc cc 41 55 41 54 41 89 f4 55 53 48 8b 2f 48 89 fb 48 83 e5 fe 48 89 ef <4c> 8b 6d 20 e8 d2 9b 03 00 45 84 e4 74 10 48 8b 03 a8 01 74 02 0f
Dec 16 11:57:06 Wizard-Server kernel: RSP: 0018:ffffc9000103bab8 EFLAGS: 00010246
Dec 16 11:57:06 Wizard-Server kernel: RAX: ffff888135074b80 RBX: ffff888135074ba8 RCX: ffff888135074b80
Dec 16 11:57:06 Wizard-Server kernel: RDX: ffff888103c4b410 RSI: 0000000000000001 RDI: 0000000000000000
Dec 16 11:57:06 Wizard-Server kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff829513f0
Dec 16 11:57:06 Wizard-Server kernel: R10: 00003fffffffffff R11: fefefefefefefeff R12: 0000000000000001
Dec 16 11:57:06 Wizard-Server kernel: R13: ffff8881010cc000 R14: ffff888105d19b50 R15: ffff8881010cc0d0
Dec 16 11:57:06 Wizard-Server kernel: FS: 000014e9445d8240(0000) GS:ffff88903f400000(0000) knlGS:0000000000000000
Dec 16 11:57:06 Wizard-Server kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 16 11:57:06 Wizard-Server kernel: CR2: 0000000000000020 CR3: 000000010368e000 CR4: 0000000000750ef0
Dec 16 11:57:06 Wizard-Server kernel: PKRU: 55555554
Do you have a monitor or at least an HDMI dummy plug connected to your iGPU?
Please note that this is not related to my plugin.
Wizard_ Posted December 16, 2023
4 hours ago, ich777 said: You have a call trace in your syslog which is most likely the cause of the issue: [...] Do you have a monitor or at least an HDMI dummy plug connected to your iGPU? Please note that this is not related to my plugin.
Errr... no, but I didn't need such a thing when I was still using 6.11.5+12400. That's kind of weird.
ich777 Posted December 16, 2023 Author
11 minutes ago, Wizard_ said: Errr... no, but I didn't need such a thing when I was still using 6.11.5+12400.
Try to connect a display and see if that changes anything.
DerTom Posted December 16, 2023
Hi! I'm trying to get my Mellanox ConnectX4LX MCX4121A-ACAT working with Unraid. I started in the 'German' and 'General Support' sub-forums before I remembered that I used your plugin to get the ConnectX-3 working. Right now it seems that my ConnectX-4 NIC isn't supported:
mstconfig q
Device type: ConnectX4LX
Name: MCX4121A-ACA_Ax
Description: ConnectX-4 Lx EN network interface card; 25GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
Device: /sys/bus/pci/devices/0000:06:00.0/config
...
-E- Unsupported device
Is there a chance to get this card working with Unraid as well? What do I have to do?
Kind regards, Tom
ich777 Posted December 16, 2023 Author
16 minutes ago, DerTom said: I'm trying to get my Mellanox ConnectX4LX MCX4121A-ACAT working with Unraid.
ConnectX-4 cards are known to work well with Unraid. Without diagnostics I can't say anything.
DerTom Posted December 16, 2023
21 hours ago, ich777 said: ConnectX-4 cards are known to work well with Unraid. Without diagnostics I can't say anything.
Hi ich777! Thank you for your quick reply! That is what I thought as well after reading about ConnectX-4 cards and Unraid. One thing that comes to mind: I just installed the new card and changed the assignment of the first eth port to the new NIC in the network settings. Do I have to 'reset' the network before assigning the NIC? I have added the diagnostics file.
Edited December 17, 2023 by DerTom
ich777 Posted December 16, 2023 Author
1 hour ago, DerTom said: Thank you for your quick reply! That is what I thought as well after reading about ConnectX-4 cards and Unraid.
I don't see why your card should not work:
04:00.0 Ethernet controller [0200]: Mellanox Technologies MT27500 Family [ConnectX-3] [15b3:1003]
    Subsystem: Mellanox Technologies ConnectX-3 10 GbE Single Port SFP+ Adapter [15b3:0055]
    Kernel driver in use: mlx4_core
    Kernel modules: mlx4_core
06:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
    Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT [15b3:0003]
    Kernel driver in use: mlx5_core
    Kernel modules: mlx5_core
06:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
    Subsystem: Mellanox Technologies Stand-up ConnectX-4 Lx EN, 25GbE dual-port SFP28, PCIe3.0 x8, MCX4121A-ACAT [15b3:0003]
    Kernel driver in use: mlx5_core
    Kernel modules: mlx5_core
Both your ConnectX3 and ConnectX4 are detected and running.
1 hour ago, DerTom said: Do I have to 'reset' the network before assigning the NIC?
You can delete network.cfg and network-rules.cfg from /boot/config, reboot and see if that changes anything (keep in mind that your server may get a different IP and may no longer be reachable on its previous IP).
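The network reset described above could be sketched in shell as follows. Note that this variant moves the two files into a backup directory instead of deleting them, so they can be restored later. The function name `reset_network_cfg` and the backup location in the example are made up for illustration and are not part of Unraid.

```shell
# reset_network_cfg CONFIG_DIR BACKUP_DIR
# Moves network.cfg and network-rules.cfg (if present) from CONFIG_DIR to BACKUP_DIR.
# (Hypothetical helper; other files in CONFIG_DIR are left untouched.)
reset_network_cfg() {
    cfg_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for name in network.cfg network-rules.cfg; do
        if [ -f "$cfg_dir/$name" ]; then
            mv -- "$cfg_dir/$name" "$backup_dir/$name" || return 1
        fi
    done
}

# Example (the backup directory is an arbitrary choice), followed by a reboot:
#   reset_network_cfg /boot/config /boot/config/network-backup
```

Keeping a copy makes it possible to put the old configuration back if the server turns out to be unreachable after the reboot.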
DerTom Posted December 17, 2023
19 hours ago, ich777 said: I don't see why your card should not work: [...] Both your ConnectX3 and ConnectX4 are detected and running. You can delete network.cfg and network-rules.cfg from /boot/config, reboot and see if that changes anything (keep in mind that your server may get a different IP and may no longer be reachable on its previous IP).
I had to delete the two network config files. Right now it seems to work.
Random Mike Posted December 19, 2023
I have an issue with the system fans on the IT8792: they randomly turn off after a few hours, with no warning messages.
ich777 Posted December 19, 2023 Author
6 minutes ago, Random Mike said: I have an issue with the system fans on the IT8792: they randomly turn off after a few hours, with no warning messages.
Do you have diagnostics or anything? Is it even related to any of my plugins? Anyway, I don't think that I can change that.
Random Mike Posted December 19, 2023
There are no errors logged. I installed the IT87 drivers and System Temp, and since then, as mentioned, the fans on sensor IT8792 switch off at random, hours later.
Random Mike Posted December 19, 2023
I just checked the server with a monitor attached, and it does show a message: modprobe could not insert it87
ich777 Posted December 19, 2023 Author
17 minutes ago, Random Mike said: I just checked the server with a monitor attached, and it does show a message: modprobe could not insert it87
Still no diagnostics...
cinereus Posted December 19, 2023
Quote: Step 3: Verify Fan Reporting. Check to see if the fans are being correctly reported on your UnRAID dashboard. If it's correctly set up, you should see the fan speed under the hardware status. You might see two other fans that are unrelated. I'd just ignore them as they seem to be harmless.
There is no "hardware status" on my dashboard. This is with a QNAP TS-EC1079 Pro.
ich777 Posted December 19, 2023 Author
19 minutes ago, cinereus said: There is no "hardware status" on my dashboard.
? What have you done? Where is the quote from? I'm not sure if that is a quote from my documentation.
cinereus Posted December 19, 2023
1 hour ago, ich777 said: ? What have you done? Where is the quote from? I'm not sure if that is a quote from my documentation.
It's a quote from your readme at https://github.com/ich777/unraid-qnapec/blob/master/TS464.md
I have looked in /sys/class/hwmon/hwmon0 and /sys/class/hwmon/hwmon1 (via cd), but neither has any pwm entries in it.
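Rather than checking each hwmon directory by hand, a short shell loop can print every hwmon device with its chip name and any fan or PWM files it exposes. This is only a sketch: the function name `list_hwmon` is made up for illustration, and it relies on the standard sysfs layout where each `hwmonN` directory has a `name` file plus optional `pwmN` and `fanN_input` files.

```shell
# list_hwmon [BASE]
# Prints "<hwmonN> <chip name>: <fan and pwm files>" for every device under BASE
# (default: /sys/class/hwmon). Hypothetical helper for illustration.
list_hwmon() {
    base=${1:-/sys/class/hwmon}
    for dev in "$base"/hwmon*; do
        [ -d "$dev" ] || continue
        chip=$(cat "$dev/name" 2>/dev/null)
        # Unmatched patterns make ls print an error, which is discarded here.
        files=$(cd "$dev" && ls -d pwm[0-9] fan[0-9]*_input 2>/dev/null | tr '\n' ' ' | sed 's/ $//')
        echo "$(basename "$dev") ${chip:-unknown}: ${files:-no pwm or fan entries}"
    done
}

# Example:
#   list_hwmon
```

If no device lists any `pwm` files, the driver that exposes fan control for the board is probably not loaded, which matches the symptom described above.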