Unraid crash - NVMe not available after restart



I had an Unraid crash today around noon. I log my Unraid server to my pfSense box (see the attachment for the full logs), but basically:

 

Oct 27 12:33:02 unraid kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
Oct 27 12:33:02 unraid kernel: nvme nvme0: Does your device have a faulty power saving mode enabled?
Oct 27 12:33:02 unraid kernel: nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug

 

The server restarted automatically, but the cache drive (nvme0) was no longer available.

Oct 27 12:36:00 unraid emhttpd: import 30 cache device: no device

 

The cache drive was also missing from the device dropdown, so I did a full power cycle and it all came back.
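For anyone who wants to check their own remote syslog for the same pattern, here is a minimal Python sketch. It assumes the log is plain text in the same format as the lines quoted above; the function name `find_nvme_events` and the event labels are hypothetical, made up for illustration. A CSTS value of all ones usually means register reads from the controller are returning 0xff bytes, which is typically what you see when a device has stopped responding on the PCIe bus, but treat that as a hint rather than a diagnosis.

```python
import re

# Patterns are built from the two log lines quoted in this post.
NVME_DOWN = re.compile(r"nvme (nvme\d+): controller is down; will reset")
CACHE_MISSING = re.compile(r"emhttpd: import \d+ cache device: no device")


def find_nvme_events(lines):
    """Return (kind, device) tuples for the NVMe-related lines found.

    kind is "controller_down" or "cache_missing" (labels are hypothetical);
    device is the NVMe controller name, or None when the line does not name one.
    """
    events = []
    for line in lines:
        match = NVME_DOWN.search(line)
        if match:
            events.append(("controller_down", match.group(1)))
        elif CACHE_MISSING.search(line):
            events.append(("cache_missing", None))
    return events
```

To run it against a saved log, pass an open file object, for example `find_nvme_events(open("syslog.txt"))`, where `syslog.txt` is a placeholder path.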

 

Any ideas what could be going on with my 1TB WD Black SN850 cache drive?

 

See attached crash log and diagnostics after restart. Thank you!
 

unraidnas-diagnostics-20221027-2032.zip pfSense_Unraid_2022-10-27_Crash_nvme.txt


I have updated the Unraid OS boot entry on my flash drive to:

kernel /bzimage
append pcie_acs_override=downstream,multifunction isolcpus=8-11,20-23 initrd=/bzroot video=efifb:off nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

and I am hoping for the best.
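After the next reboot it is worth confirming that the two parameters actually reached the kernel. A small sketch of that check, assuming you feed it the contents of `/proc/cmdline` (the standard Linux file holding the booted kernel command line); the function name `missing_boot_params` is hypothetical.

```python
# The two parameters suggested by the kernel message in the first post.
REQUIRED_PARAMS = (
    "nvme_core.default_ps_max_latency_us=0",
    "pcie_aspm=off",
)


def missing_boot_params(cmdline, required=REQUIRED_PARAMS):
    """Return the required parameters that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [param for param in required if param not in present]
```

On the server itself this could be called as `missing_boot_params(open("/proc/cmdline").read())`; an empty list means both parameters are active for the running kernel.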

 

I wonder whether I have a motherboard, NVMe, or other problem going on.


Another server crash, but it seems unrelated. It looks like the more common [02:00.0] USB hub crash I've been having. I will follow up on the other thread, but thought it was worth mentioning here...

 

Nov  4 07:07:59 unraid kernel: xhci_hcd 0000:02:00.0: WARN Set TR Deq Ptr cmd failed due to incorrect slot or ep state.
Nov  4 07:07:59 unraid kernel: xhci_hcd 0000:02:00.0: WARN Successful completion on short TX
Nov  4 07:07:59 unraid kernel: xhci_hcd 0000:02:00.0: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 1
Nov  4 07:07:59 unraid kernel: xhci_hcd 0000:02:00.0: Looking for event-dma 000000026ad7d410 trb-start 000000026ad7d420 trb-end 000000026ad7d420 seg-start 000000026ad7d000 seg-end 000000026ad7dff0
Nov  4 07:08:00 unraid kernel: general protection fault, probably for non-canonical address 0xcec7beccd6afa8c1: 0000 [#1] PREEMPT SMP NOPTI
Nov  4 07:08:00 unraid kernel: CPU: 6 PID: 16336 Comm: nginx Tainted: P           O      5.19.14-Unraid #1
Nov  4 07:08:00 unraid kernel: Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P2.30 02/24/2022
Nov  4 07:08:00 unraid kernel: RIP: 0010:__kmalloc_node_track_caller+0x126/0x1d9
Nov  4 07:08:00 unraid kernel: Code: 19 8b 54 24 04 4c 89 f9 4c 89 e7 8b 34 24 e8 78 fe ff ff 48 89 44 24 10 eb 2c 41 8b 44 24 28 48 8d 8a 00 01 00 00 49 8b 3c 24 <49> 8b 1c 06 4c 89 f0 65 48 0f c7 0f 0f 94 c0 84 c0 74 87 41 8b 44
Nov  4 07:08:00 unraid kernel: RSP: 0018:ffffc9000448fbf0 EFLAGS: 00010202
Nov  4 07:08:00 unraid kernel: RAX: 0000000000000200 RBX: ffff8881a4a91c00 RCX: 000000153c2fed06
Nov  4 07:08:00 unraid kernel: RDX: 000000153c2fec06 RSI: 00000000ffffffff RDI: 000000000002f650
Nov  4 07:08:00 unraid kernel: RBP: ffff888100042b00 R08: 0000000000082a20 R09: 0000000000000000
Nov  4 07:08:00 unraid kernel: R10: ffffc9000448fe40 R11: 0000000000000000 R12: ffff888100042b00
Nov  4 07:08:00 unraid kernel: R13: 0000000000000280 R14: cec7beccd6afa6c1 R15: ffffffff816b81f4
Nov  4 07:08:00 unraid kernel: FS:  00001513ba288740(0000) GS:ffff88900e980000(0000) knlGS:0000000000000000
Nov  4 07:08:00 unraid kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov  4 07:08:00 unraid kernel: CR2: 000000c0000ea000 CR3: 0000000fd72a8000 CR4: 0000000000350ee0
Nov  4 07:08:00 unraid kernel: Call Trace:
Nov  4 07:08:00 unraid kernel: <TASK>
Nov  4 07:08:00 unraid kernel: kmalloc_reserve+0x2d/0x73
Nov  4 07:08:00 unraid kernel: __alloc_skb+0xb2/0x15e
Nov  4 07:08:00 unraid kernel: ? preempt_latency_start+0x2b/0x46
Nov  4 07:08:00 unraid kernel: __tcp_send_ack+0x3b/0xdc
Nov  4 07:08:00 unraid kernel: tcp_recvmsg_locked+0x6a1/0x6cf
Nov  4 07:08:00 unraid kernel: tcp_recvmsg+0x101/0x1a2
Nov  4 07:08:00 unraid kernel: inet_recvmsg+0x69/0xa9
Nov  4 07:08:00 unraid kernel: __sys_recvfrom+0x97/0xf8
Nov  4 07:08:00 unraid kernel: __x64_sys_recvfrom+0x20/0x27
Nov  4 07:08:00 unraid kernel: do_syscall_64+0x6b/0x81
Nov  4 07:08:00 unraid kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Nov  4 07:08:00 unraid kernel: RIP: 0033:0x1513bca9f5b0
Nov  4 07:08:00 unraid kernel: Code: 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 1d 45 31 c9 45 31 c0 b8 2d 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 68 c3 0f 1f 80 00 00 00 00 55 48 83 ec 20 48
Nov  4 07:08:00 unraid kernel: RSP: 002b:00007ffc6a372808 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
Nov  4 07:08:00 unraid kernel: RAX: ffffffffffffffda RBX: 0000000000001012 RCX: 00001513bca9f5b0
Nov  4 07:08:00 unraid kernel: RDX: 0000000000001012 RSI: 0000560813db8e80 RDI: 0000000000000010
Nov  4 07:08:00 unraid kernel: RBP: 0000151399cb8400 R08: 0000000000000000 R09: 0000000000000000
Nov  4 07:08:00 unraid kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
Nov  4 07:08:00 unraid kernel: R13: 0000560813db8e80 R14: 0000560813aee180 R15: 0000560813d4d4c0
Nov  4 07:08:00 unraid kernel: </TASK>
Nov  4 07:08:00 unraid kernel: Modules linked in: xt_mark af_packet nvidia_uvm(PO) xt_nat veth xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs dm_crypt dm_mod dax md_mod nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls ipv6 wmi_bmof edac_mce_amd edac_core nvidia_drm(PO) nvidia_modeset(PO) kvm_amd nvidia(PO) kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm_kms_helper aesni_intel crypto_simd cryptd rapl drm r8169 i2c_piix4 ccp k10temp nvme backlight realtek i2c_core joydev ahci nvme_core syscopyarea sysfillrect sysimgblt libahci fb_sys_fops wmi tpm_crb tpm_tis tpm_tis_core tpm acpi_cpufreq button unix
Nov  4 07:08:00 unraid kernel: ---[ end trace 0000000000000000 ]---
Nov  4 07:08:00 unraid kernel: RIP: 0010:__kmalloc_node_track_caller+0x126/0x1d9
Nov  4 07:08:00 unraid kernel: Code: 19 8b 54 24 04 4c 89 f9 4c 89 e7 8b 34 24 e8 78 fe ff ff 48 89 44 24 10 eb 2c 41 8b 44 24 28 48 8d 8a 00 01 00 00 49 8b 3c 24 <49> 8b 1c 06 4c 89 f0 65 48 0f c7 0f 0f 94 c0 84 c0 74 87 41 8b 44
Nov  4 07:08:00 unraid kernel: RSP: 0018:ffffc9000448fbf0 EFLAGS: 00010202
Nov  4 07:08:00 unraid kernel: RAX: 0000000000000200 RBX: ffff8881a4a91c00 RCX: 000000153c2fed06
Nov  4 07:08:00 unraid kernel: RDX: 000000153c2fec06 RSI: 00000000ffffffff RDI: 000000000002f650
Nov  4 07:08:00 unraid kernel: RBP: ffff888100042b00 R08: 0000000000082a20 R09: 0000000000000000
Nov  4 07:08:00 unraid kernel: R10: ffffc9000448fe40 R11: 0000000000000000 R12: ffff888100042b00
Nov  4 07:08:00 unraid kernel: R13: 0000000000000280 R14: cec7beccd6afa6c1 R15: ffffffff816b81f4
Nov  4 07:08:00 unraid kernel: FS:  00001513ba288740(0000) GS:ffff88900e980000(0000) knlGS:0000000000000000
Nov  4 07:08:00 unraid kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov  4 07:08:00 unraid kernel: CR2: 000000c0000ea000 CR3: 0000000fd72a8000 CR4: 0000000000350ee0
Nov  4 07:08:00 unraid kernel: general protection fault, probably for non-canonical address 0xcec7beccd6afa8c1: 0000 [#2] PREEMPT SMP NOPTI
Nov  4 07:08:00 unraid kernel: CPU: 6 PID: 3055 Comm: kworker/6:0 Tainted: P      D    O      5.19.14-Unraid #1
Nov  4 07:08:00 unraid kernel: Hardware name: To Be Filled By O.E.M. B550M Pro4/B550M Pro4, BIOS P2.30 02/24/2022
Nov  4 07:08:00 unraid kernel: Workqueue: events efi_pstore_update_entries
Nov  4 07:08:00 unraid kernel: RIP: 0010:__kmalloc+0xf2/0x19e
Nov  4 07:08:00 unraid kernel: Code: 00 48 89 04 24 74 05 48 85 c0 75 17 4c 89 f9 83 ca ff 44 89 e6 48 89 ef e8 50 f9 ff ff 48 89 04 24 eb 25 8b 4d 28 48 8b 7d 00 <48> 8b 1c 08 48 8d 8a 00 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0 74
Nov  4 07:08:00 unraid kernel: RSP: 0018:ffffc90003effd90 EFLAGS: 00010282
Nov  4 07:08:00 unraid kernel: RAX: cec7beccd6afa6c1 RBX: ffff888102eaf000 RCX: 0000000000000200
Nov  4 07:08:00 unraid kernel: RDX: 000000153c2fec06 RSI: 0000000000000dc0 RDI: 000000000002f650
Nov  4 07:08:00 unraid kernel: RBP: ffff888100042b00 R08: 0000000000000dc0 R09: 0000000000000001
Nov  4 07:08:00 unraid kernel: R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000000dc0
Nov  4 07:08:00 unraid kernel: R13: ffff888100042b00 R14: 0000000000000400 R15: ffffffff8168a2f0
Nov  4 07:08:00 unraid kernel: FS:  0000000000000000(0000) GS:ffff88900e980000(0000) knlGS:0000000000000000
Nov  4 07:08:00 unraid kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov  4 07:08:00 unraid kernel: CR2: 000000c0000ea000 CR3: 0000000fd72a8000 CR4: 0000000000350ee0
Nov  4 07:08:00 unraid kernel: Call Trace:
Nov  4 07:08:00 unraid kernel: <TASK>
Nov  4 07:08:00 unraid kernel: efivar_init+0x78/0x330
Nov  4 07:08:00 unraid kernel: ? efi_pstore_read_func+0x275/0x275
Nov  4 07:08:00 unraid kernel: ? efi_pstore_update_entries+0x1c/0x67
Nov  4 07:08:00 unraid kernel: ? kmem_cache_alloc_trace+0x11e/0x149
Nov  4 07:08:00 unraid kernel: efi_pstore_update_entries+0x3c/0x67
Nov  4 07:08:00 unraid kernel: process_one_work+0x1ab/0x295
Nov  4 07:08:00 unraid kernel: worker_thread+0x18b/0x244
Nov  4 07:08:00 unraid kernel: ? rescuer_thread+0x281/0x281
Nov  4 07:08:00 unraid kernel: kthread+0xe7/0xef
Nov  4 07:08:00 unraid kernel: ? kthread_complete_and_exit+0x1b/0x1b
Nov  4 07:08:00 unraid kernel: ret_from_fork+0x22/0x30
Nov  4 07:08:00 unraid kernel: </TASK>
Nov  4 07:08:00 unraid kernel: Modules linked in: xt_mark af_packet nvidia_uvm(PO) xt_nat veth xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs dm_crypt dm_mod dax md_mod nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls ipv6 wmi_bmof edac_mce_amd edac_core nvidia_drm(PO) nvidia_modeset(PO) kvm_amd nvidia(PO) kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm_kms_helper aesni_intel crypto_simd cryptd rapl drm r8169 i2c_piix4 ccp k10temp nvme backlight realtek i2c_core joydev ahci nvme_core syscopyarea sysfillrect sysimgblt libahci fb_sys_fops wmi tpm_crb tpm_tis tpm_tis_core tpm acpi_cpufreq button unix
Nov  4 07:08:00 unraid kernel: ---[ end trace 0000000000000000 ]---
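Since the trace is long, here is a rough Python sketch that boils it down to one entry per general protection fault. It assumes the syslog format shown above; the function name `summarize_faults` and the dictionary keys are hypothetical. It only matches kernel-mode `RIP: 0010:` lines and keeps the first one after each fault header, so the register dump repeated after "end trace" is ignored.

```python
import re

# Patterns are derived from the kernel lines pasted above.
GPF = re.compile(
    r"general protection fault, probably for non-canonical address "
    r"(0x[0-9a-f]+): \d+ \[#(\d+)\]"
)
COMM = re.compile(r"CPU: (\d+) PID: \d+ Comm: (\S+)")
KERNEL_RIP = re.compile(r"RIP: 0010:(\w+)\+")


def summarize_faults(lines):
    """Collect one dict per general protection fault found in the log lines."""
    faults = []
    for line in lines:
        header = GPF.search(line)
        if header:
            faults.append({"address": header.group(1), "seq": int(header.group(2))})
            continue
        if not faults:
            continue
        current = faults[-1]
        comm = COMM.search(line)
        if comm and "comm" not in current:
            current["cpu"] = int(comm.group(1))
            current["comm"] = comm.group(2)
        rip = KERNEL_RIP.search(line)
        if rip and "function" not in current:
            current["function"] = rip.group(1)
    return faults
```

Run over the log above, this should report two faults on CPU 6 with the same non-canonical address, one in `__kmalloc_node_track_caller` (nginx) and one in `__kmalloc` (kworker/6:0), which makes it easier to compare against the crashes in the other thread.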

 

