Crash without any message


I have the crash report now. There was a problem with that Docker container, but my NVIDIA GPU also seems to crash sometimes.

Could this be caused by the driver update that I did without rebooting?

 

Jul 24 06:30:38 littleboy kernel: NVRM: GPU 0000:af:00.0: RmInitAdapter failed! (0x25:0x51:1589)
Jul 24 06:30:38 littleboy kernel: NVRM: GPU 0000:af:00.0: rm_init_adapter failed, device minor number 0
Jul 24 06:31:23 littleboy kernel: BUG: unable to handle page fault for address: ffffc90287112a00
Jul 24 06:31:23 littleboy kernel: #PF: supervisor write access in kernel mode
Jul 24 06:31:23 littleboy kernel: #PF: error_code(0x0002) - not-present page
Jul 24 06:31:23 littleboy kernel: PGD 100000067 P4D 100000067 PUD 3d339b9067 PMD 0 
Jul 24 06:31:23 littleboy kernel: Oops: 0002 [#1] PREEMPT SMP PTI
Jul 24 06:31:23 littleboy kernel: CPU: 50 PID: 2329 Comm: nv_open_q Tainted: P           O       6.1.79-Unraid #1
Jul 24 06:31:23 littleboy kernel: Hardware name: VxRack AS PowerEdge R740xd/0RR8YK, BIOS 2.21.2 02/19/2024
Jul 24 06:31:23 littleboy kernel: RIP: 0010:os_mem_copy_custom+0x2c/0x60 [nvidia]
Jul 24 06:31:23 littleboy kernel: Code: 44 00 00 83 fa 7f 49 89 f8 48 89 f9 76 0b 48 89 f8 48 09 f0 83 e0 03 74 06 89 d2 31 c0 eb 25 89 d1 83 e2 03 c1 e9 02 8b 3c 86 <41> 89 3c 80 48 ff c0 39 c1 75 f2 89 c8 48 c1 e0 02 49 8d 0c 00 48
Jul 24 06:31:23 littleboy kernel: RSP: 0018:ffffc9000f477930 EFLAGS: 00010206
Jul 24 06:31:23 littleboy kernel: RAX: 0000000000000000 RBX: ffff8882faf61cd8 RCX: 0000000000000200
Jul 24 06:31:23 littleboy kernel: RDX: 0000000000000000 RSI: ffff888eb7b24008 RDI: 000000000000000d
Jul 24 06:31:23 littleboy kernel: RBP: ffff88af099aab80 R08: ffffc90287112a00 R09: 0000000000000000
Jul 24 06:31:23 littleboy kernel: R10: 000000000010a804 R11: 0000000000000000 R12: 000000000000000d
Jul 24 06:31:23 littleboy kernel: R13: ffff889924c88008 R14: ffff8885b15a2010 R15: ffff8885b15a2f13
Jul 24 06:31:23 littleboy kernel: FS:  0000000000000000(0000) GS:ffff889fffc40000(0000) knlGS:0000000000000000
Jul 24 06:31:23 littleboy kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 24 06:31:23 littleboy kernel: CR2: ffffc90287112a00 CR3: 00000023a29d6001 CR4: 00000000007706e0
Jul 24 06:31:23 littleboy kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 24 06:31:23 littleboy kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 24 06:31:23 littleboy kernel: PKRU: 55555554
Jul 24 06:31:23 littleboy kernel: Call Trace:
Jul 24 06:31:23 littleboy kernel: <TASK>
Jul 24 06:31:23 littleboy kernel: ? __die_body+0x1a/0x5c
Jul 24 06:31:23 littleboy kernel: ? page_fault_oops+0x329/0x376
Jul 24 06:31:23 littleboy kernel: ? fixup_exception+0x22/0x24b
Jul 24 06:31:23 littleboy kernel: ? exc_page_fault+0xf4/0x11d
Jul 24 06:31:23 littleboy kernel: ? asm_exc_page_fault+0x22/0x30
Jul 24 06:31:23 littleboy kernel: ? os_mem_copy_custom+0x2c/0x60 [nvidia]
Jul 24 06:31:23 littleboy kernel: _nv011987rm+0x5c/0x100 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv011968rm+0x6e/0x1d0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv024663rm+0x1a3/0x290 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv043897rm+0x428/0xad2 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv043916rm+0x148/0x3f0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv043916rm+0x10d/0x3f0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv049576rm+0x6d/0xb0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv049576rm+0x35/0xb0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv014517rm+0x51/0xc0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv044401rm+0x1fd/0x260 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv013557rm+0xa8/0x210 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv044401rm+0x1fd/0x260 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv042458rm+0xd1/0x1d0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv013346rm+0x5a/0xd0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv044401rm+0x1fd/0x260 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv011239rm+0xe1/0x160 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv044401rm+0x1fd/0x260 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv050979rm+0x20/0x2e0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv014771rm+0x50/0x100 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv044401rm+0x1fd/0x260 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv014815rm+0xf1/0x2f0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv044401rm+0x1fd/0x260 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv017529rm+0x35/0x110 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv018649rm+0x13b/0x3d0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv026712rm+0x97/0x1a0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv000773rm+0x1b3/0x313 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _nv000720rm+0x482/0x20e0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? rm_init_adapter+0xcd/0xf0 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? ttwu_queue_wakelist+0x9a/0xcf
Jul 24 06:31:23 littleboy kernel: ? nv_open_device+0x57a/0x869 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? nvidia_open_deferred+0x33/0x7f [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _raw_q_schedule+0x69/0x69 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? _main_loop+0xf1/0x115 [nvidia]
Jul 24 06:31:23 littleboy kernel: ? kthread+0xe4/0xef
Jul 24 06:31:23 littleboy kernel: ? kthread_complete_and_exit+0x1b/0x1b
Jul 24 06:31:23 littleboy kernel: ? ret_from_fork+0x1f/0x30
Jul 24 06:31:23 littleboy kernel: </TASK>
Jul 24 06:31:23 littleboy kernel: Modules linked in: joydev uinput af_packet bluetooth ecdh_generic ecc xt_connmark xt_mark iptable_mangle xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha nvidia_uvm(PO) veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter bridge xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod tcp_diag inet_diag ipmi_devintf ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs 8021q garp mrp stp llc macvtap macvlan tap bonding tls ixgbe xfrm_algo mdio igb intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal nvidia_drm(PO) coretemp nvidia_modeset(PO) kvm_intel zfs(PO) kvm nvidia(PO) zunicode(PO) zzstd(O) crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 mgag200 zlua(O) sha1_ssse3
Jul 24 06:31:23 littleboy kernel: ipmi_ssif video drm_shmem_helper zavl(PO) aesni_intel crypto_simd drm_kms_helper icp(PO) cryptd zcommon(PO) znvpair(PO) drm rapl spl(O) wmi_bmof backlight intel_cstate i2c_i801 acpi_ipmi mei_me syscopyarea nvme i2c_algo_bit i2c_smbus intel_uncore sysfillrect ahci sysimgblt mei i2c_core nvme_core megaraid_sas fb_sys_fops libahci intel_pch_thermal wmi ipmi_si acpi_power_meter button unix [last unloaded: xfrm_algo]
Jul 24 06:31:23 littleboy kernel: CR2: ffffc90287112a00
Jul 24 06:31:23 littleboy kernel: ---[ end trace 0000000000000000 ]---
Jul 24 06:31:23 littleboy kernel: RIP: 0010:os_mem_copy_custom+0x2c/0x60 [nvidia]
Jul 24 06:31:23 littleboy kernel: Code: 44 00 00 83 fa 7f 49 89 f8 48 89 f9 76 0b 48 89 f8 48 09 f0 83 e0 03 74 06 89 d2 31 c0 eb 25 89 d1 83 e2 03 c1 e9 02 8b 3c 86 <41> 89 3c 80 48 ff c0 39 c1 75 f2 89 c8 48 c1 e0 02 49 8d 0c 00 48
Jul 24 06:31:23 littleboy kernel: RSP: 0018:ffffc9000f477930 EFLAGS: 00010206
Jul 24 06:31:23 littleboy kernel: RAX: 0000000000000000 RBX: ffff8882faf61cd8 RCX: 0000000000000200
Jul 24 06:31:23 littleboy kernel: RDX: 0000000000000000 RSI: ffff888eb7b24008 RDI: 000000000000000d
Jul 24 06:31:23 littleboy kernel: RBP: ffff88af099aab80 R08: ffffc90287112a00 R09: 0000000000000000
Jul 24 06:31:23 littleboy kernel: R10: 000000000010a804 R11: 0000000000000000 R12: 000000000000000d
Jul 24 06:31:23 littleboy kernel: R13: ffff889924c88008 R14: ffff8885b15a2010 R15: ffff8885b15a2f13
Jul 24 06:31:23 littleboy kernel: FS:  0000000000000000(0000) GS:ffff889fffc40000(0000) knlGS:0000000000000000
Jul 24 06:31:23 littleboy kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 24 06:31:23 littleboy kernel: CR2: ffffc90287112a00 CR3: 00000023a29d6001 CR4: 00000000007706e0
Jul 24 06:31:23 littleboy kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 24 06:31:23 littleboy kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 24 06:31:23 littleboy kernel: PKRU: 55555554
Jul 24 06:31:23 littleboy kernel: note: nv_open_q[2329] exited with irqs disabled
Jul 24 06:31:26 littleboy kernel: BUG: unable to handle page fault for address: ffffc9000f477df8
Jul 24 06:31:26 littleboy kernel: #PF: supervisor read access in kernel mode
Jul 24 06:31:26 littleboy kernel: #PF: error_code(0x0000) - not-present page
Jul 24 06:31:26 littleboy kernel: PGD 100000067 P4D 100000067 PUD 1001be067 PMD 208b1de067 PTE 0
Jul 24 06:31:26 littleboy kernel: Oops: 0000 [#2] PREEMPT SMP PTI
Jul 24 06:31:26 littleboy kernel: CPU: 78 PID: 50540 Comm: nvidia-smi Tainted: P      D    O       6.1.79-Unraid #1
Jul 24 06:31:26 littleboy kernel: Hardware name: VxRack AS PowerEdge R740xd/0RR8YK, BIOS 2.21.2 02/19/2024
Jul 24 06:31:26 littleboy kernel: RIP: 0010:_nv012504rm+0x3c/0x310 [nvidia]
Jul 24 06:31:26 littleboy kernel: Code: 48 63 47 08 48 01 c2 48 8b 07 48 85 c0 75 1b e9 2b 02 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 8b 48 10 48 85 c9 74 17 48 89 c8 <48> 39 30 77 ef 0f 83 f9 01 00 00 48 8b 48 18 48 85 c9 75 e9 48 89
Jul 24 06:31:26 littleboy kernel: RSP: 0018:ffffc9002a203d98 EFLAGS: 00010082
Jul 24 06:31:26 littleboy kernel: RAX: ffffc9000f477df8 RBX: ffffffffa2910a67 RCX: fffffffdda3e82fd
Jul 24 06:31:26 littleboy kernel: RDX: ffffc9002a203e10 RSI: 000000000000c56c RDI: ffffffffa4f1e5d8
Jul 24 06:31:26 littleboy kernel: RBP: ffff888321f86000 R08: 0000000000000000 R09: ffffc9002a203e38
Jul 24 06:31:26 littleboy kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9002a203dc0
Jul 24 06:31:26 littleboy kernel: R13: ffff888ddf6d1780 R14: 0000000000000048 R15: 000000000000001f
Jul 24 06:31:26 littleboy kernel: FS:  000014e2068e31c0(0000) GS:ffff889ffffc0000(0000) knlGS:0000000000000000
Jul 24 06:31:26 littleboy kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 24 06:31:26 littleboy kernel: CR2: ffffc9000f477df8 CR3: 00000006cfee8004 CR4: 00000000007706e0
Jul 24 06:31:26 littleboy kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 24 06:31:26 littleboy kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 24 06:31:26 littleboy kernel: PKRU: 55555554
Jul 24 06:31:26 littleboy kernel: Call Trace:
Jul 24 06:31:26 littleboy kernel: <TASK>
Jul 24 06:31:26 littleboy kernel: ? __die_body+0x1a/0x5c
Jul 24 06:31:26 littleboy kernel: ? page_fault_oops+0x329/0x376
Jul 24 06:31:26 littleboy kernel: ? fixup_exception+0x22/0x24b
Jul 24 06:31:26 littleboy kernel: ? exc_page_fault+0xf4/0x11d
Jul 24 06:31:26 littleboy kernel: ? asm_exc_page_fault+0x22/0x30
Jul 24 06:31:26 littleboy kernel: ? rm_perform_version_check+0x37/0x150 [nvidia]
Jul 24 06:31:26 littleboy kernel: ? _nv012504rm+0x3c/0x310 [nvidia]
Jul 24 06:31:26 littleboy kernel: ? rm_perform_version_check+0x37/0x150 [nvidia]
Jul 24 06:31:26 littleboy kernel: ? _nv049845rm+0xd6/0x1d0 [nvidia]
Jul 24 06:31:26 littleboy kernel: ? rm_perform_version_check+0x37/0x150 [nvidia]
Jul 24 06:31:26 littleboy kernel: ? nvidia_unlocked_ioctl+0x4b1/0x6c2 [nvidia]
Jul 24 06:31:26 littleboy kernel: ? _raw_spin_unlock+0x14/0x29
Jul 24 06:31:26 littleboy kernel: ? do_fcntl+0x19a/0x569
Jul 24 06:31:26 littleboy kernel: ? vfs_ioctl+0x1b/0x2f
Jul 24 06:31:26 littleboy kernel: ? __do_sys_ioctl+0x52/0x78
Jul 24 06:31:26 littleboy kernel: ? do_syscall_64+0x68/0x81
Jul 24 06:31:26 littleboy kernel: ? entry_SYSCALL_64_after_hwframe+0x64/0xce
Jul 24 06:31:26 littleboy kernel: </TASK>
Jul 24 06:31:26 littleboy kernel: Modules linked in: joydev uinput af_packet bluetooth ecdh_generic ecc xt_connmark xt_mark iptable_mangle xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha nvidia_uvm(PO) veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter bridge xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod tcp_diag inet_diag ipmi_devintf ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs 8021q garp mrp stp llc macvtap macvlan tap bonding tls ixgbe xfrm_algo mdio igb intel_rapl_msr intel_rapl_common iosf_mbi x86_pkg_temp_thermal nvidia_drm(PO) coretemp nvidia_modeset(PO) kvm_intel zfs(PO) kvm nvidia(PO) zunicode(PO) zzstd(O) crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 mgag200 zlua(O) sha1_ssse3
Jul 24 06:31:26 littleboy kernel: ipmi_ssif video drm_shmem_helper zavl(PO) aesni_intel crypto_simd drm_kms_helper icp(PO) cryptd zcommon(PO) znvpair(PO) drm rapl spl(O) wmi_bmof backlight intel_cstate i2c_i801 acpi_ipmi mei_me syscopyarea nvme i2c_algo_bit i2c_smbus intel_uncore sysfillrect ahci sysimgblt mei i2c_core nvme_core megaraid_sas fb_sys_fops libahci intel_pch_thermal wmi ipmi_si acpi_power_meter button unix [last unloaded: xfrm_algo]
Jul 24 06:31:26 littleboy kernel: CR2: ffffc9000f477df8
Jul 24 06:31:26 littleboy kernel: ---[ end trace 0000000000000000 ]---
Jul 24 06:31:26 littleboy kernel: RIP: 0010:os_mem_copy_custom+0x2c/0x60 [nvidia]
Jul 24 06:31:26 littleboy kernel: Code: 44 00 00 83 fa 7f 49 89 f8 48 89 f9 76 0b 48 89 f8 48 09 f0 83 e0 03 74 06 89 d2 31 c0 eb 25 89 d1 83 e2 03 c1 e9 02 8b 3c 86 <41> 89 3c 80 48 ff c0 39 c1 75 f2 89 c8 48 c1 e0 02 49 8d 0c 00 48
Jul 24 06:31:26 littleboy kernel: RSP: 0018:ffffc9000f477930 EFLAGS: 00010206
Jul 24 06:31:26 littleboy kernel: RAX: 0000000000000000 RBX: ffff8882faf61cd8 RCX: 0000000000000200
Jul 24 06:31:26 littleboy kernel: RDX: 0000000000000000 RSI: ffff888eb7b24008 RDI: 000000000000000d
Jul 24 06:31:26 littleboy kernel: RBP: ffff88af099aab80 R08: ffffc90287112a00 R09: 0000000000000000
Jul 24 06:31:26 littleboy kernel: R10: 000000000010a804 R11: 0000000000000000 R12: 000000000000000d
Jul 24 06:31:26 littleboy kernel: R13: ffff889924c88008 R14: ffff8885b15a2010 R15: ffff8885b15a2f13
Jul 24 06:31:26 littleboy kernel: FS:  000014e2068e31c0(0000) GS:ffff889ffffc0000(0000) knlGS:0000000000000000
Jul 24 06:31:26 littleboy kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 24 06:31:26 littleboy kernel: CR2: ffffc9000f477df8 CR3: 00000006cfee8004 CR4: 00000000007706e0
Jul 24 06:31:26 littleboy kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 24 06:31:26 littleboy kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 24 06:31:26 littleboy kernel: PKRU: 55555554
Jul 24 06:31:26 littleboy kernel: note: nvidia-smi[50540] exited with irqs disabled
Jul 24 06:31:26 littleboy kernel: note: nvidia-smi[50540] exited with preempt_count 1
Jul 24 11:04:12 littleboy kernel: mdcmd (36): set md_write_method 1
Jul 24 11:04:12 littleboy kernel: 
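
One way to test the "driver update without a reboot" theory might be to compare the version of the `nvidia` module that is currently loaded with the version of the module file on disk. The sketch below is only an illustration, not a confirmed diagnosis. It assumes that `/sys/module/nvidia/version` is readable, that `modinfo` is available, and that the updated driver has already been installed under the running kernel's module directory (on Unraid this may not hold if the plugin only stages the new driver for the next boot). The function names are my own.

```python
import subprocess
from pathlib import Path
from typing import Optional


def loaded_module_version(sysfs_root: Path = Path("/sys/module")) -> Optional[str]:
    """Version of the nvidia module currently loaded in the kernel, or None."""
    version_file = sysfs_root / "nvidia" / "version"
    try:
        return version_file.read_text().strip()
    except OSError:
        # Module not loaded, or sysfs not available.
        return None


def on_disk_module_version() -> Optional[str]:
    """Version of the nvidia module file that modprobe would load, or None."""
    try:
        result = subprocess.run(
            ["modinfo", "-F", "version", "nvidia"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # modinfo missing, or no nvidia module found for this kernel.
        return None
    return result.stdout.strip() or None


def describe(loaded: Optional[str], on_disk: Optional[str]) -> str:
    """Summarise the comparison in one line."""
    if loaded is None or on_disk is None:
        return "unknown: could not read one of the versions"
    if loaded == on_disk:
        return f"match: {loaded}"
    return f"mismatch: loaded {loaded}, on disk {on_disk}"


if __name__ == "__main__":
    print(describe(loaded_module_version(), on_disk_module_version()))
```

A "mismatch" result would be consistent with the old module still running after the update. A "match" would not rule out other causes, since `RmInitAdapter failed` can also appear for unrelated reasons.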

 
