  • 6.8.3 and 6.9.2 same issues continue


    mfdoom7
    • Urgent

    The problem: after I added an NVMe drive and a 10G NIC, the server experiences random hard crashes. The system locks up so hard that I lose the display and can't use the keyboard, so I have to power the server down with the button. Once I saw a kernel panic message, which I managed to capture. Before that, my server was stable with almost a year of uptime. I need help solving this. I don't really know anything about Linux, which is why I chose Unraid. How do I capture a log when things go bad?

     

    Also, there is a log showing that the mover put the Plex metadata on the NVMe (cache), and then things went bad.

    May  2 12:47:51 Storinator kernel: general protection fault, probably for non-canonical address 0x888100069000: 0000 [#1] SMP NOPTI
    May  2 12:47:51 Storinator kernel: CPU: 1 PID: 11665 Comm: in_use Not tainted 5.10.28-Unraid #1
    May  2 12:47:51 Storinator kernel: Hardware name: Gigabyte Technology Co., Ltd. B450 I AORUS PRO WIFI/B450 I AORUS PRO WIFI-CF, BIOS F61a 03/29/2021
    May  2 12:47:51 Storinator kernel: RIP: 0010:lock_page_memcg+0x1a/0x73
    May  2 12:47:51 Storinator kernel: Code: 89 f8 74 0b 48 8b 87 88 07 00 00 48 8b 40 20 c3 41 54 55 53 e8 d2 ff ff ff 48 89 c3 e8 01 ff ff ff 84 c0 74 32 45 31 e4 eb 51 <41> 8b 84 24 80 09 00 00 85 c0 7e 45 49 8d ac 24 40 04 00 00 48 89
    May  2 12:47:51 Storinator kernel: RSP: 0018:ffffc90004c0fb60 EFLAGS: 00010206
    May  2 12:47:51 Storinator kernel: RAX: 0000000000000000 RBX: ffffea0006f17b40 RCX: 000000ffffffffff
    May  2 12:47:51 Storinator kernel: RDX: ffffea000b280bc8 RSI: 0000000000000000 RDI: ffffea0006f17b40
    May  2 12:47:51 Storinator kernel: RBP: ffffea0006f17b40 R08: ffffea0006f17b40 R09: ffff88813ad59500
    May  2 12:47:51 Storinator kernel: R10: 80000001bc5ed045 R11: 000000000000000c R12: 0000888100069000
    May  2 12:47:51 Storinator kernel: R13: 80000001bc5ed045 R14: ffff88813ad59500 R15: ffff88810775f000
    May  2 12:47:51 Storinator kernel: FS:  00001536b0d9b740(0000) GS:ffff88842ec40000(0000) knlGS:0000000000000000
    May  2 12:47:51 Storinator kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  2 12:47:51 Storinator kernel: CR2: 0000000000529f24 CR3: 00000001053dc000 CR4: 00000000003506e0
    May  2 12:47:51 Storinator kernel: Call Trace:
    May  2 12:47:51 Storinator kernel: page_remove_rmap+0xf/0x1d8
    May  2 12:47:51 Storinator kernel: unmap_page_range+0x3c7/0x648
    May  2 12:47:51 Storinator kernel: unmap_vmas+0x6c/0x9a
    May  2 12:47:51 Storinator kernel: exit_mmap+0xb5/0x155
    May  2 12:47:51 Storinator kernel: __mmput+0x3b/0xcf
    May  2 12:47:51 Storinator kernel: begin_new_exec+0x5ed/0x80a
    May  2 12:47:51 Storinator kernel: load_elf_binary+0x210/0x1217
    May  2 12:47:51 Storinator kernel: ? __kernel_read+0x118/0x15a
    May  2 12:47:51 Storinator kernel: ? __kernel_read+0x118/0x15a
    May  2 12:47:51 Storinator kernel: bprm_execve+0x25f/0x524
    May  2 12:47:51 Storinator kernel: do_execveat_common.isra.0+0x12a/0x156
    May  2 12:47:51 Storinator kernel: __x64_sys_execve+0x34/0x3d
    May  2 12:47:51 Storinator kernel: do_syscall_64+0x5d/0x6a
    May  2 12:47:51 Storinator kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    May  2 12:47:51 Storinator kernel: RIP: 0033:0x1536b0e7aaf7
    May  2 12:47:51 Storinator kernel: Code: Unable to access opcode bytes at RIP 0x1536b0e7aacd.
    May  2 12:47:51 Storinator kernel: RSP: 002b:00007ffd30301f38 EFLAGS: 00000202 ORIG_RAX: 000000000000003b
    May  2 12:47:51 Storinator kernel: RAX: ffffffffffffffda RBX: 0000000000535b68 RCX: 00001536b0e7aaf7
    May  2 12:47:51 Storinator kernel: RDX: 0000000000535208 RSI: 0000000000539a08 RDI: 00000000005399e8
    May  2 12:47:51 Storinator kernel: RBP: 00000000005399e8 R08: 0000000000539a08 R09: 0000000000000003
    May  2 12:47:51 Storinator kernel: R10: 0000000000420cf0 R11: 0000000000000202 R12: 00000000ffffffff
    May  2 12:47:51 Storinator kernel: R13: 0000000000539a08 R14: 0000000000535208 R15: 0000000000535de8
    May  2 12:47:51 Storinator kernel: Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod it87 hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables atlantic igb i2c_algo_bit edac_mce_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper wmi_bmof btusb btrtl btbcm btintel bluetooth ecdh_generic ecc rapl k10temp ahci video nvme i2c_piix4 nvme_core wmi i2c_core ccp libahci backlight acpi_cpufreq button [last unloaded: atlantic]
    May  2 12:47:51 Storinator kernel: ---[ end trace ac9f4e51479d3485 ]---
    May  2 12:47:51 Storinator kernel: RIP: 0010:lock_page_memcg+0x1a/0x73
    May  2 12:47:51 Storinator kernel: Code: 89 f8 74 0b 48 8b 87 88 07 00 00 48 8b 40 20 c3 41 54 55 53 e8 d2 ff ff ff 48 89 c3 e8 01 ff ff ff 84 c0 74 32 45 31 e4 eb 51 <41> 8b 84 24 80 09 00 00 85 c0 7e 45 49 8d ac 24 40 04 00 00 48 89
    May  2 12:47:51 Storinator kernel: RSP: 0018:ffffc90004c0fb60 EFLAGS: 00010206
    May  2 12:47:51 Storinator kernel: RAX: 0000000000000000 RBX: ffffea0006f17b40 RCX: 000000ffffffffff
    May  2 12:47:51 Storinator kernel: RDX: ffffea000b280bc8 RSI: 0000000000000000 RDI: ffffea0006f17b40
    May  2 12:47:51 Storinator kernel: RBP: ffffea0006f17b40 R08: ffffea0006f17b40 R09: ffff88813ad59500
    May  2 12:47:51 Storinator kernel: R10: 80000001bc5ed045 R11: 000000000000000c R12: 0000888100069000
    May  2 12:47:51 Storinator kernel: R13: 80000001bc5ed045 R14: ffff88813ad59500 R15: ffff88810775f000
    May  2 12:47:51 Storinator kernel: FS:  00001536b0d9b740(0000) GS:ffff88842ec40000(0000) knlGS:0000000000000000
    May  2 12:47:51 Storinator kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  2 12:47:51 Storinator kernel: CR2: 00001536b0e7aacd CR3: 00000001053dc000 CR4: 00000000003506e0
    May  2 12:47:51 Storinator kernel: general protection fault, probably for non-canonical address 0x888100069000: 0000 [#2] SMP NOPTI
    May  2 12:47:51 Storinator kernel: CPU: 2 PID: 11661 Comm: in_use Tainted: G      D           5.10.28-Unraid #1
    May  2 12:47:51 Storinator kernel: Hardware name: Gigabyte Technology Co., Ltd. B450 I AORUS PRO WIFI/B450 I AORUS PRO WIFI-CF, BIOS F61a 03/29/2021
    May  2 12:47:51 Storinator kernel: RIP: 0010:lock_page_memcg+0x1a/0x73
    May  2 12:47:51 Storinator kernel: Code: 89 f8 74 0b 48 8b 87 88 07 00 00 48 8b 40 20 c3 41 54 55 53 e8 d2 ff ff ff 48 89 c3 e8 01 ff ff ff 84 c0 74 32 45 31 e4 eb 51 <41> 8b 84 24 80 09 00 00 85 c0 7e 45 49 8d ac 24 40 04 00 00 48 89
    May  2 12:47:51 Storinator kernel: RSP: 0018:ffffc90004b87c90 EFLAGS: 00010206
    May  2 12:47:51 Storinator kernel: RAX: 0000000000000000 RBX: ffffea0006f17b40 RCX: 000000ffffffffff
    May  2 12:47:51 Storinator kernel: RDX: ffffea000b280bc8 RSI: 0000000000000000 RDI: ffffea0006f17b40
    May  2 12:47:51 Storinator kernel: RBP: ffffea0006f17b40 R08: ffffea0006f17b40 R09: ffff888393c01680
    May  2 12:47:51 Storinator kernel: R10: 80000001bc5ed065 R11: 000000000000000c R12: 0000888100069000
    May  2 12:47:51 Storinator kernel: R13: 80000001bc5ed065 R14: ffff888393c01680 R15: ffff8881090df000
    May  2 12:47:51 Storinator kernel: FS:  00001536b0d9b740(0000) GS:ffff88842ec80000(0000) knlGS:0000000000000000
    May  2 12:47:51 Storinator kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  2 12:47:51 Storinator kernel: CR2: 00001536b0f7d718 CR3: 0000000105402000 CR4: 00000000003506e0
    May  2 12:47:51 Storinator kernel: Call Trace:
    May  2 12:47:51 Storinator kernel: page_remove_rmap+0xf/0x1d8
    May  2 12:47:51 Storinator kernel: unmap_page_range+0x3c7/0x648
    May  2 12:47:51 Storinator kernel: unmap_vmas+0x6c/0x9a
    May  2 12:47:51 Storinator kernel: ? lru_add_drain_cpu+0x23/0xf8
    May  2 12:47:51 Storinator kernel: exit_mmap+0xb5/0x155
    May  2 12:47:51 Storinator kernel: __mmput+0x3b/0xcf
    May  2 12:47:51 Storinator kernel: do_exit+0x3b4/0x8eb
    May  2 12:47:51 Storinator kernel: do_group_exit+0x8e/0x8e
    May  2 12:47:51 Storinator kernel: __x64_sys_exit_group+0xf/0xf
    May  2 12:47:51 Storinator kernel: do_syscall_64+0x5d/0x6a
    May  2 12:47:51 Storinator kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    May  2 12:47:51 Storinator kernel: RIP: 0033:0x1536b0e7aac6
    May  2 12:47:51 Storinator kernel: Code: Unable to access opcode bytes at RIP 0x1536b0e7aa9c.
    May  2 12:47:51 Storinator kernel: RSP: 002b:00007ffd303027e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
    May  2 12:47:51 Storinator kernel: RAX: ffffffffffffffda RBX: 00001536b0f79820 RCX: 00001536b0e7aac6
    May  2 12:47:51 Storinator kernel: RDX: 0000000000000001 RSI: 000000000000003c RDI: 0000000000000001
    May  2 12:47:51 Storinator kernel: RBP: 0000000000000001 R08: 00000000000000e7 R09: ffffffffffffff80
    May  2 12:47:51 Storinator kernel: R10: 0000000000000004 R11: 0000000000000246 R12: 00001536b0f79820
    May  2 12:47:51 Storinator kernel: R13: 0000000000000001 R14: 00001536b0f82328 R15: 0000000000000000
    May  2 12:47:51 Storinator kernel: Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod it87 hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables atlantic igb i2c_algo_bit edac_mce_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper wmi_bmof btusb btrtl btbcm btintel bluetooth ecdh_generic ecc rapl k10temp ahci video nvme i2c_piix4 nvme_core wmi i2c_core ccp libahci backlight acpi_cpufreq button [last unloaded: atlantic]
    May  2 12:47:51 Storinator kernel: ---[ end trace ac9f4e51479d3486 ]---
    May  2 12:47:51 Storinator kernel: RIP: 0010:lock_page_memcg+0x1a/0x73
    May  2 12:47:51 Storinator kernel: Code: 89 f8 74 0b 48 8b 87 88 07 00 00 48 8b 40 20 c3 41 54 55 53 e8 d2 ff ff ff 48 89 c3 e8 01 ff ff ff 84 c0 74 32 45 31 e4 eb 51 <41> 8b 84 24 80 09 00 00 85 c0 7e 45 49 8d ac 24 40 04 00 00 48 89
    May  2 12:47:51 Storinator kernel: RSP: 0018:ffffc90004c0fb60 EFLAGS: 00010206
    May  2 12:47:51 Storinator kernel: RAX: 0000000000000000 RBX: ffffea0006f17b40 RCX: 000000ffffffffff
    May  2 12:47:51 Storinator kernel: RDX: ffffea000b280bc8 RSI: 0000000000000000 RDI: ffffea0006f17b40
    May  2 12:47:51 Storinator kernel: RBP: ffffea0006f17b40 R08: ffffea0006f17b40 R09: ffff88813ad59500
    May  2 12:47:51 Storinator kernel: R10: 80000001bc5ed045 R11: 000000000000000c R12: 0000888100069000
    May  2 12:47:51 Storinator kernel: R13: 80000001bc5ed045 R14: ffff88813ad59500 R15: ffff88810775f000
    May  2 12:47:51 Storinator kernel: FS:  00001536b0d9b740(0000) GS:ffff88842ec80000(0000) knlGS:0000000000000000
    May  2 12:47:51 Storinator kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  2 12:47:51 Storinator kernel: CR2: 00001536b0f7d718 CR3: 0000000105402000 CR4: 00000000003506e0
    May  2 12:47:51 Storinator kernel: Fixing recursive fault but reboot is needed!
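A trace like the one above is easier to compare against later crashes once it is reduced to a few fields. The following is a minimal sketch, not an Unraid tool: it assumes the syslog line format shown in this post, and the function name `summarize_oopses` and the regular expressions are hypothetical helpers written for illustration. It pulls out the fault headline, the process name (`Comm`), and the kernel-mode `RIP` symbol for each numbered fault.

```python
import re

# Syslog prefix as seen in this thread: "May  2 12:47:51 Storinator kernel: <message>"
PREFIX = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+kernel:\s+(.*)$")
# Fault headline, e.g. "invalid opcode: 0000 [#1] SMP NOPTI"
HEADLINE = re.compile(r"^(.*?):\s+[0-9a-f]{4}\s+\[#(\d+)\]")
# Process name from the "CPU: ... Comm: in_use ..." line
COMM = re.compile(r"\bComm:\s+(\S+)")
# Kernel-mode instruction pointer (segment 0010); user-mode RIP lines use 0033
RIP = re.compile(r"^RIP:\s+0010:(\S+)")


def summarize_oopses(log_text):
    """Return one dict per numbered fault found in the pasted syslog text."""
    events = []
    current = None
    for line in log_text.splitlines():
        match = PREFIX.match(line.strip())
        if not match:
            continue  # not a kernel syslog line
        stamp, _host, msg = match.groups()
        head = HEADLINE.match(msg)
        if head:
            # A new fault starts here; later lines fill in the details.
            current = {
                "time": stamp,
                "fault": head.group(1),
                "seq": int(head.group(2)),
                "comm": None,
                "rip": None,
            }
            events.append(current)
            continue
        if current is None:
            continue
        comm = COMM.search(msg)
        if comm and current["comm"] is None:
            current["comm"] = comm.group(1)
        rip = RIP.match(msg)
        if rip and current["rip"] is None:
            current["rip"] = rip.group(1)
    return events
```

Run on the excerpt above, this should report two faults (`[#1]` and `[#2]`), both in the `in_use` process and both at `lock_page_memcg+0x1a/0x73`, which makes it quick to see whether a later crash has the same signature.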
     





    Syslogs alone aren't usually that helpful.  Better to post the diagnostics that would have been generated on the flash drive (logs folder)

     

    When you tried to shut down, the reason it hung was because the docker image was still in use.  This implies that the cause of the crash was Plex and it just plain refused to exit.

     

    Have you run a memtest, if only to rule it out?

    34 minutes ago, Squid said:

    Syslogs alone aren't usually that helpful.  Better to post the diagnostics that would have been generated on the flash drive (logs folder)

     

    When you tried to shut down, the reason it hung was because the docker image was still in use.  This implies that the cause of the crash was Plex and it just plain refused to exit.

     

    Have you run a memtest, if only to rule it out?

    I ran memtest overnight when I first built the server. It then ran rock solid for almost a year, about 250 days of uptime.

    Maybe the metadata or something is corrupted? The mover transferred the metadata to the NVMe, then the process hung with one core pegged at 100%.

    100 unraid.png


    So I deleted the Plex container along with its metadata, and turned off caching for the appdata folder. So far no problems after 1 day of uptime. I will update if something changes.


    So it seems like something triggers a hiccup at night. Today the server was offline again, with no display, nothing. During the daytime everything worked fine.

    May  4 02:03:23 Storinator kernel: kernel BUG at fs/inode.c:533!
    May  4 02:03:23 Storinator kernel: invalid opcode: 0000 [#1] SMP NOPTI
    May  4 02:03:23 Storinator kernel: CPU: 1 PID: 836 Comm: kswapd0 Not tainted 5.10.28-Unraid #1
    May  4 02:03:23 Storinator kernel: Hardware name: Gigabyte Technology Co., Ltd. B450 I AORUS PRO WIFI/B450 I AORUS PRO WIFI-CF, BIOS F61a 03/29/2021
    May  4 02:03:23 Storinator kernel: RIP: 0010:clear_inode+0x2a/0x86
    May  4 02:03:23 Storinator kernel: Code: 55 48 8d af 78 01 00 00 53 48 89 fb 48 89 ef e8 68 39 55 00 48 83 bb c8 01 00 00 00 74 02 0f 0b 48 83 bb d0 01 00 00 00 74 02 <0f> 0b 48 89 ef e8 58 fb ff ff fb 66 0f 1f 44 00 00 48 8b 93 f8 01
    May  4 02:03:23 Storinator kernel: RSP: 0018:ffffc900004bbbc0 EFLAGS: 00010006
    May  4 02:03:23 Storinator kernel: RAX: 0000000000000000 RBX: ffff88815a1e8c70 RCX: ffffc900004bbaa0
    May  4 02:03:23 Storinator kernel: RDX: 0000000000000001 RSI: ffffc900004bb9a0 RDI: ffff88815a1e8de8
    May  4 02:03:23 Storinator kernel: RBP: ffff88815a1e8de8 R08: 0000000000000000 R09: 0000000000000000
    May  4 02:03:23 Storinator kernel: R10: ffffc900004bb9a0 R11: ffffc900004bb9a0 R12: ffffffffa03e32c0
    May  4 02:03:23 Storinator kernel: R13: 0000000000000000 R14: 0000000000000636 R15: 00000000000000be
    May  4 02:03:23 Storinator kernel: FS:  0000000000000000(0000) GS:ffff88842ec40000(0000) knlGS:0000000000000000
    May  4 02:03:23 Storinator kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  4 02:03:23 Storinator kernel: CR2: 0000150781bed000 CR3: 00000001470dc000 CR4: 00000000003506e0
    May  4 02:03:23 Storinator kernel: Call Trace:
    May  4 02:03:23 Storinator kernel: evict+0xcd/0x16b
    May  4 02:03:23 Storinator kernel: dispose_list+0x30/0x39
    May  4 02:03:23 Storinator kernel: prune_icache_sb+0x56/0x74
    May  4 02:03:23 Storinator kernel: super_cache_scan+0x11a/0x16d
    May  4 02:03:23 Storinator kernel: do_shrink_slab+0x148/0x1ae
    May  4 02:03:23 Storinator kernel: shrink_slab+0x234/0x29e
    May  4 02:03:23 Storinator kernel: shrink_node+0x32a/0x578
    May  4 02:03:23 Storinator kernel: balance_pgdat+0x25d/0x3dc
    May  4 02:03:23 Storinator kernel: kswapd+0x240/0x28c
    May  4 02:03:23 Storinator kernel: ? init_wait_entry+0x24/0x24
    May  4 02:03:23 Storinator kernel: ? balance_pgdat+0x3dc/0x3dc
    May  4 02:03:23 Storinator kernel: kthread+0xe5/0xea
    May  4 02:03:23 Storinator kernel: ? __kthread_bind_mask+0x57/0x57
    May  4 02:03:23 Storinator kernel: ret_from_fork+0x22/0x30
    May  4 02:03:23 Storinator kernel: Modules linked in: xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod it87 hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables atlantic igb i2c_algo_bit edac_mce_amd kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper wmi_bmof btusb btrtl btbcm btintel bluetooth i2c_piix4 ecdh_generic ecc rapl ccp nvme i2c_core input_leds k10temp wmi nvme_core ahci led_class video libahci backlight button [last unloaded: atlantic]
    May  4 02:03:23 Storinator kernel: ---[ end trace 4ad744b9384204aa ]---
    May  4 02:03:23 Storinator kernel: RIP: 0010:clear_inode+0x2a/0x86
    May  4 02:03:23 Storinator kernel: Code: 55 48 8d af 78 01 00 00 53 48 89 fb 48 89 ef e8 68 39 55 00 48 83 bb c8 01 00 00 00 74 02 0f 0b 48 83 bb d0 01 00 00 00 74 02 <0f> 0b 48 89 ef e8 58 fb ff ff fb 66 0f 1f 44 00 00 48 8b 93 f8 01
    May  4 02:03:23 Storinator kernel: RSP: 0018:ffffc900004bbbc0 EFLAGS: 00010006
    May  4 02:03:23 Storinator kernel: RAX: 0000000000000000 RBX: ffff88815a1e8c70 RCX: ffffc900004bbaa0
    May  4 02:03:23 Storinator kernel: RDX: 0000000000000001 RSI: ffffc900004bb9a0 RDI: ffff88815a1e8de8
    May  4 02:03:23 Storinator kernel: RBP: ffff88815a1e8de8 R08: 0000000000000000 R09: 0000000000000000
    May  4 02:03:23 Storinator kernel: R10: ffffc900004bb9a0 R11: ffffc900004bb9a0 R12: ffffffffa03e32c0
    May  4 02:03:23 Storinator kernel: R13: 0000000000000000 R14: 0000000000000636 R15: 00000000000000be
    May  4 02:03:23 Storinator kernel: FS:  0000000000000000(0000) GS:ffff88842ec40000(0000) knlGS:0000000000000000
    May  4 02:03:23 Storinator kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  4 02:03:23 Storinator kernel: CR2: 0000150781bed000 CR3: 00000001470dc000 CR4: 00000000003506e0
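Whether the crashes really cluster at night can be checked from the timestamps on the first fault of each boot (the line carrying `[#1]`). Below is a rough sketch under stated assumptions: the helper names `first_oops_times` and `is_night` are hypothetical, the line format is the one pasted in this thread, and the 22:00 to 06:00 "night" window is an arbitrary choice, not anything Unraid defines.

```python
import re

# Leading syslog timestamp, e.g. "May  4 02:03:23 "
STAMP = re.compile(r"^(\w{3})\s+(\d+)\s+(\d{2}):(\d{2}):(\d{2})\s")
# The kernel numbers faults since boot; "[#1]" marks the first one
FIRST_OOPS = re.compile(r"\[#1\]")


def first_oops_times(log_text):
    """Return (month, day, hour, minute, second) for every '[#1]' fault line."""
    times = []
    for line in log_text.splitlines():
        line = line.strip()
        if not FIRST_OOPS.search(line):
            continue
        match = STAMP.match(line)
        if match:
            month, day, hh, mm, ss = match.groups()
            times.append((month, int(day), int(hh), int(mm), int(ss)))
    return times


def is_night(hour, start=22, end=6):
    """True if the hour falls in the assumed night window (wraps past midnight)."""
    return hour >= start or hour < end


if __name__ == "__main__":
    import sys
    # Usage: python oops_times.py < syslog.txt
    for month, day, hh, mm, ss in first_oops_times(sys.stdin.read()):
        label = "night" if is_night(hh) else "day"
        print(f"{month} {day:2d} {hh:02d}:{mm:02d}:{ss:02d} ({label})")
```

For the two excerpts posted in this thread, the first faults are stamped 12:47:51 on May 2 and 02:03:23 on May 4, so only the second falls inside that window. More samples would be needed before tying the crashes to a nightly scheduled task.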

    Edited by mfdoom7