• Crashes since updating to v6.11.x for qBittorrent and Deluge users


    JorgeB
    • Closed

    EDIT: issue was traced to libtorrent 2.x, it's not an Unraid problem, more info in this post:

     

    https://forums.unraid.net/bug-reports/stable-releases/crashes-since-updating-to-v611x-for-qbittorrent-and-deluge-users-r2153/?do=findComment&comment=21671

     

     

    Original Post:

     

    I'm creating this to better track an issue that some users have been reporting, where Unraid started crashing after updating to v6.11.x (it happens with both 6.11.0 and 6.11.1). A very similar call trace is logged in all cases, e.g.:

     

    Oct 12 04:18:27 zaBOX kernel: BUG: kernel NULL pointer dereference, address: 00000000000000b6
    Oct 12 04:18:27 zaBOX kernel: #PF: supervisor read access in kernel mode
    Oct 12 04:18:27 zaBOX kernel: #PF: error_code(0x0000) - not-present page
    Oct 12 04:18:27 zaBOX kernel: PGD 0 P4D 0
    Oct 12 04:18:27 zaBOX kernel: Oops: 0000 [#1] PREEMPT SMP PTI
    Oct 12 04:18:27 zaBOX kernel: CPU: 4 PID: 28596 Comm: Disk Tainted: P     U  W  O      5.19.14-Unraid #1
    Oct 12 04:18:27 zaBOX kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS PRO WIFI/Z390 AORUS PRO WIFI-CF, BIOS F12 11/05/2021
    Oct 12 04:18:27 zaBOX kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 12 04:18:27 zaBOX kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b c3 cc cc cc cc <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 12 04:18:27 zaBOX kernel: RSP: 0000:ffffc900070dbcc0 EFLAGS: 00010246
    Oct 12 04:18:27 zaBOX kernel: RAX: 0000000000000082 RBX: 0000000000000082 RCX: 0000000000000082
    Oct 12 04:18:27 zaBOX kernel: RDX: 0000000000000001 RSI: ffff888757426fe8 RDI: 0000000000000082
    Oct 12 04:18:27 zaBOX kernel: RBP: 0000000000000000 R08: 0000000000000028 R09: ffffc900070dbcd0
    Oct 12 04:18:27 zaBOX kernel: R10: ffffc900070dbcd0 R11: ffffc900070dbd48 R12: 0000000000000000
    Oct 12 04:18:27 zaBOX kernel: R13: ffff88824f95d138 R14: 000000000007292c R15: ffff88824f95d140
    Oct 12 04:18:27 zaBOX kernel: FS:  000014ed38204b38(0000) GS:ffff8888a0500000(0000) knlGS:0000000000000000
    Oct 12 04:18:27 zaBOX kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 12 04:18:27 zaBOX kernel: CR2: 00000000000000b6 CR3: 0000000209854005 CR4: 00000000003706e0
    Oct 12 04:18:27 zaBOX kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Oct 12 04:18:27 zaBOX kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Oct 12 04:18:27 zaBOX kernel: Call Trace:
    Oct 12 04:18:27 zaBOX kernel: <TASK>
    Oct 12 04:18:27 zaBOX kernel: __filemap_get_folio+0x98/0x1ff
    Oct 12 04:18:27 zaBOX kernel: ? _raw_spin_unlock_irqrestore+0x24/0x3a
    Oct 12 04:18:27 zaBOX kernel: filemap_fault+0x6e/0x524
    Oct 12 04:18:27 zaBOX kernel: __do_fault+0x2d/0x6e
    Oct 12 04:18:27 zaBOX kernel: __handle_mm_fault+0x9a5/0xc7d
    Oct 12 04:18:27 zaBOX kernel: handle_mm_fault+0x113/0x1d7
    Oct 12 04:18:27 zaBOX kernel: do_user_addr_fault+0x36a/0x514
    Oct 12 04:18:27 zaBOX kernel: exc_page_fault+0xfc/0x11e
    Oct 12 04:18:27 zaBOX kernel: asm_exc_page_fault+0x22/0x30
    Oct 12 04:18:27 zaBOX kernel: RIP: 0033:0x14ed3a0ae7b5
    Oct 12 04:18:27 zaBOX kernel: Code: 8b 48 08 48 8b 32 48 8b 00 48 39 f0 73 09 48 8d 14 08 48 39 d6 eb 0c 48 39 c6 73 0b 48 8d 14 0e 48 39 d0 73 02 0f 0b 48 89 c7 <f3> a4 66 48 8d 3d 59 b7 22 00 66 66 48 e8 d9 d8 f6 ff 48 89 28 48
    Oct 12 04:18:27 zaBOX kernel: RSP: 002b:000014ed38203960 EFLAGS: 00010206
    Oct 12 04:18:27 zaBOX kernel: RAX: 000014ed371aa160 RBX: 000014ed38203ad0 RCX: 0000000000004000
    Oct 12 04:18:27 zaBOX kernel: RDX: 000014c036530000 RSI: 000014c03652c000 RDI: 000014ed371aa160
    Oct 12 04:18:27 zaBOX kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 000014ed38203778
    Oct 12 04:18:27 zaBOX kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
    Oct 12 04:18:27 zaBOX kernel: R13: 000014ed38203b40 R14: 000014ed384fe940 R15: 000014ed38203ac0
    Oct 12 04:18:27 zaBOX kernel: </TASK>
    Oct 12 04:18:27 zaBOX kernel: Modules linked in: macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net vhost vhost_iotlb tap tun veth xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter xfs md_mod kvmgt mdev i915 iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper intel_gtt agpgart hwmon_vid iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls ipv6 nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) x86_pkg_temp_thermal intel_powerclamp drm_kms_helper btusb btrtl i2c_i801 btbcm coretemp gigabyte_wmi wmi_bmof intel_wmi_thunderbolt mxm_wmi kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd
    Oct 12 04:18:27 zaBOX kernel: btintel rapl intel_cstate intel_uncore e1000e i2c_smbus bluetooth drm nvme nvme_core ahci i2c_core libahci ecdh_generic ecc syscopyarea sysfillrect input_leds sysimgblt led_class joydev nzxt_kraken2 intel_pch_thermal fb_sys_fops thermal fan video tpm_crb wmi tpm_tis backlight tpm_tis_core tpm acpi_pad button unix
    Oct 12 04:18:27 zaBOX kernel: CR2: 00000000000000b6
    Oct 12 04:18:27 zaBOX kernel: ---[ end trace 0000000000000000 ]---

     

    Another example with very different hardware:

    Oct 11 21:32:08 Impulse kernel: BUG: kernel NULL pointer dereference, address: 0000000000000056
    Oct 11 21:32:08 Impulse kernel: #PF: supervisor read access in kernel mode
    Oct 11 21:32:08 Impulse kernel: #PF: error_code(0x0000) - not-present page
    Oct 11 21:32:08 Impulse kernel: PGD 0 P4D 0
    Oct 11 21:32:08 Impulse kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
    Oct 11 21:32:08 Impulse kernel: CPU: 1 PID: 5236 Comm: Disk Not tainted 5.19.14-Unraid #1
    Oct 11 21:32:08 Impulse kernel: Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING II, BIOS 4301 03/04/2021
    Oct 11 21:32:08 Impulse kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 11 21:32:08 Impulse kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b e9 cc 5f 86 00 <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 11 21:32:08 Impulse kernel: RSP: 0000:ffffc900026ffcc0 EFLAGS: 00010246
    Oct 11 21:32:08 Impulse kernel: RAX: 0000000000000022 RBX: 0000000000000022 RCX: 0000000000000022
    Oct 11 21:32:08 Impulse kernel: RDX: 0000000000000001 RSI: ffff88801e450b68 RDI: 0000000000000022
    Oct 11 21:32:08 Impulse kernel: RBP: 0000000000000000 R08: 000000000000000c R09: ffffc900026ffcd0
    Oct 11 21:32:08 Impulse kernel: R10: ffffc900026ffcd0 R11: ffffc900026ffd48 R12: 0000000000000000
    Oct 11 21:32:08 Impulse kernel: R13: ffff888428441cb8 R14: 00000000000028cd R15: ffff888428441cc0
    Oct 11 21:32:08 Impulse kernel: FS:  00001548d34fa6c0(0000) GS:ffff88842e840000(0000) knlGS:0000000000000000
    Oct 11 21:32:08 Impulse kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 11 21:32:08 Impulse kernel: CR2: 0000000000000056 CR3: 00000001a3fe6000 CR4: 00000000003506e0
    Oct 11 21:32:08 Impulse kernel: Call Trace:
    Oct 11 21:32:08 Impulse kernel: <TASK>
    Oct 11 21:32:08 Impulse kernel: __filemap_get_folio+0x98/0x1ff
    Oct 11 21:32:08 Impulse kernel: filemap_fault+0x6e/0x524
    Oct 11 21:32:08 Impulse kernel: __do_fault+0x30/0x6e
    Oct 11 21:32:08 Impulse kernel: __handle_mm_fault+0x9a5/0xc7d
    Oct 11 21:32:08 Impulse kernel: handle_mm_fault+0x113/0x1d7
    Oct 11 21:32:08 Impulse kernel: do_user_addr_fault+0x36a/0x514
    Oct 11 21:32:08 Impulse kernel: exc_page_fault+0xfc/0x11e
    Oct 11 21:32:08 Impulse kernel: asm_exc_page_fault+0x22/0x30
    Oct 11 21:32:08 Impulse kernel: RIP: 0033:0x1548dbc04741
    Oct 11 21:32:08 Impulse kernel: Code: 48 01 d0 eb 1b 0f 1f 40 00 f3 0f 1e fa 48 39 d1 0f 82 73 28 fc ff 0f 1f 00 f3 0f 1e fa 48 89 f8 48 83 fa 20 0f 82 af 00 00 00 <c5> fe 6f 06 48 83 fa 40 0f 87 3e 01 00 00 c5 fe 6f 4c 16 e0 c5 fe
    Oct 11 21:32:08 Impulse kernel: RSP: 002b:00001548d34f9808 EFLAGS: 00010202
    Oct 11 21:32:08 Impulse kernel: RAX: 000015480c010d30 RBX: 000015480c018418 RCX: 00001548d34f9a40
    Oct 11 21:32:08 Impulse kernel: RDX: 0000000000004000 RSI: 000015471f8cd50f RDI: 000015480c010d30
    Oct 11 21:32:08 Impulse kernel: RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000000
    Oct 11 21:32:08 Impulse kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
    Oct 11 21:32:08 Impulse kernel: R13: 00001548d34f9ac0 R14: 0000000000000003 R15: 0000154814013d10
    Oct 11 21:32:08 Impulse kernel: </TASK>
    Oct 11 21:32:08 Impulse kernel: Modules linked in: xt_connmark xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha xt_mark xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc ipv6 mlx4_en mlx4_core igb i2c_algo_bit edac_mce_amd edac_core kvm_amd kvm wmi_bmof mxm_wmi asus_wmi_sensors crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mpt3sas aesni_intel crypto_simd nvme cryptd ahci i2c_piix4 raid_class rapl k10temp i2c_core nvme_core ccp scsi_transport_sas libahci wmi button acpi_cpufreq unix [last unloaded: mlx4_core]
    Oct 11 21:32:08 Impulse kernel: CR2: 0000000000000056
    Oct 11 21:32:08 Impulse kernel: ---[ end trace 0000000000000000 ]---

     

    So they always start with this (end address will change):

     

    Oct 11 05:02:02 Cogsworth kernel: BUG: kernel NULL pointer dereference, address: 0000000000000076
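The changing address can be cross-checked against the register dumps posted in this thread. The faulting bytes marked `<8b> 57 34` in each kernel `Code:` line decode to `mov 0x34(%rdi),%edx`, so the address reported in CR2 should equal RDI + 0x34. A minimal Python check of that arithmetic, using RDI and CR2 values copied from the posted traces (the dictionary layout and names are only for illustration):

```python
# Offset taken from the faulting instruction bytes "8b 57 34" (mov 0x34(%rdi),%edx).
FOLIO_FIELD_OFFSET = 0x34

# (RDI, CR2) pairs copied from the oops dumps posted in this thread.
TRACES = {
    "zaBOX": (0x82, 0xB6),
    "Impulse": (0x22, 0x56),
    "Pegasus Oct 26": (0x02, 0x36),
    "Pegasus Oct 27": (0xC2, 0xF6),
}


def fault_address(rdi: int) -> int:
    """Address read by the faulting instruction for a given RDI value."""
    return rdi + FOLIO_FIELD_OFFSET


for host, (rdi, cr2) in TRACES.items():
    status = "matches" if fault_address(rdi) == cr2 else "differs from"
    print(f"{host}: RDI {rdi:#x} + {FOLIO_FIELD_OFFSET:#x} = {fault_address(rdi):#x} ({status} CR2 {cr2:#x})")
```

In each trace RDI holds a small value rather than a real kernel pointer, which would explain why the reported address differs between crashes while the rest of the call trace stays the same.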

     

    and always have this:

     

    Oct 11 05:02:02 Cogsworth kernel: Call Trace:
    Oct 11 05:02:02 Cogsworth kernel: <TASK>
    Oct 11 05:02:02 Cogsworth kernel: __filemap_get_folio+0x98/0x1ff
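These two markers are enough to pick this crash out of a saved syslog automatically. A minimal sketch, assuming the log is available as plain text lines; the function name `find_folio_oops` and the 40-line look-ahead window are arbitrary choices for illustration, not part of any Unraid tool:

```python
import re
from typing import Iterable, List

BUG_RE = re.compile(r"BUG: kernel NULL pointer dereference, address: ([0-9a-f]+)")
FRAME = "__filemap_get_folio+"
WINDOW = 40  # assumed look-ahead, in lines, between the BUG line and the call trace


def find_folio_oops(lines: Iterable[str]) -> List[str]:
    """Return fault addresses of oopses whose following lines mention __filemap_get_folio."""
    lines = list(lines)
    hits = []
    for i, line in enumerate(lines):
        match = BUG_RE.search(line)
        if match and any(FRAME in later for later in lines[i + 1 : i + 1 + WINDOW]):
            hits.append(match.group(1))
    return hits


# Sample lines copied from this report.
sample = [
    "Oct 11 05:02:02 Cogsworth kernel: BUG: kernel NULL pointer dereference, address: 0000000000000076",
    "Oct 11 05:02:02 Cogsworth kernel: Call Trace:",
    "Oct 11 05:02:02 Cogsworth kernel: <TASK>",
    "Oct 11 05:02:02 Cogsworth kernel: __filemap_get_folio+0x98/0x1ff",
]
print(find_folio_oops(sample))
```

To scan a real log, pass an open file object instead of the sample list. A NULL pointer dereference without the `__filemap_get_folio` frame is not reported, since that would be a different crash.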

     

    The fact that it's happening to various users with very different hardware, both Intel and AMD, makes me think it's not a hardware/firmware issue, so we can try to find out whether they are running anything in common. These are the plugins I've found in common between the 4 or 5 cases so far. They are some of the most used plugins, so it's not surprising they are installed in all cases, but they are also easy to rule out:

     

    ca.backup2.plg - 2022.07.23  (Up to date)
    community.applications.plg - 2022.09.30  (Up to date)
    dynamix.active.streams.plg - 2020.06.17  (Up to date)
    file.activity.plg - 2022.08.19  (Up to date)
    fix.common.problems.plg - 2022.10.09  (Up to date)
    unassigned.devices.plg - 2022.10.03  (Up to date)
    unassigned.devices-plus.plg - 2022.08.19  (Up to date)

     

    So anyone having this issue, please try temporarily uninstalling/disabling these plugins to see if there's any difference.




    User Feedback

    Recommended Comments



    I spoke too soon. About 6 days of uptime and another soft lock. The only difference this time was that I couldn't get the webGUI to load at all. CA Appdata Backup did run yesterday, in case that's relevant. I can't see myself removing that, so I'll be downgrading.

    Link to comment

    There's one user with the same issue who can apparently make it crash on demand with high i/o, so it does point to that being the problem. Of course, some servers might be more susceptible than others.

    Link to comment
    2 hours ago, JorgeB said:

    There's one user with the same issue who can apparently make it crash on demand with high i/o, so it does point to that being the problem. Of course, some servers might be more susceptible than others.

     

    False alarm, turns out the server has bad RAM, strange that one time the call trace was exactly the same as these.

    Link to comment
    33 minutes ago, JorgeB said:

     

    False alarm, turns out the server has bad RAM, strange that one time the call trace was exactly the same as these.

    But it probably has to be something related to high I/O load, since it only happens with torrent programs.
     

     

    Link to comment
    10 hours ago, CiscoCoreX said:

    But it probably has to be something related to high I/O load, since it only happens with torrent programs.

    Yes, I agree. The false alarm was about being able to reproduce it on demand, which would have helped a lot with troubleshooting.

    Link to comment

    And here we go again.
    I was downloading, the download finished, and my torrent was seeding.
    At the same time I was looking in APPS to see if there was anything new from the community.

    Then my Unraid GUI stopped responding.

     

    Uptime: 10 days 15 hours.
    But now, look at line 6:

    "Oct 26 23:18:14 Pegasus kernel: CPU: 12 PID: 5398 Comm: qbittorrent-nox Tainted: P O 5.19.14-Unraid #1"

     

    Oct 26 23:18:14 Pegasus kernel: BUG: kernel NULL pointer dereference, address: 0000000000000036
    Oct 26 23:18:14 Pegasus kernel: #PF: supervisor read access in kernel mode
    Oct 26 23:18:14 Pegasus kernel: #PF: error_code(0x0000) - not-present page
    Oct 26 23:18:14 Pegasus kernel: PGD 0 P4D 0 
    Oct 26 23:18:14 Pegasus kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
    Oct 26 23:18:14 Pegasus kernel: CPU: 12 PID: 5398 Comm: qbittorrent-nox Tainted: P           O      5.19.14-Unraid #1
    Oct 26 23:18:14 Pegasus kernel: Hardware name: ASUS System Product Name/PRIME B560M-K, BIOS 1605 05/13/2022
    Oct 26 23:18:14 Pegasus kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 26 23:18:14 Pegasus kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b c3 cc cc cc cc <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 26 23:18:14 Pegasus kernel: RSP: 0000:ffffc900048dfcc0 EFLAGS: 00010246
    Oct 26 23:18:14 Pegasus kernel: RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000002
    Oct 26 23:18:14 Pegasus kernel: RDX: 0000000000000001 RSI: ffff888149542fe8 RDI: 0000000000000002
    Oct 26 23:18:14 Pegasus kernel: RBP: 0000000000000000 R08: 0000000000000004 R09: ffffc900048dfcd0
    Oct 26 23:18:14 Pegasus kernel: R10: ffffc900048dfcd0 R11: ffffc900048dfd48 R12: 0000000000000000
    Oct 26 23:18:14 Pegasus kernel: R13: ffff8882236972f8 R14: 0000000000015107 R15: ffff888223697300
    Oct 26 23:18:14 Pegasus kernel: FS:  000014f2053f8b30(0000) GS:ffff88883c300000(0000) knlGS:0000000000000000
    Oct 26 23:18:14 Pegasus kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 26 23:18:14 Pegasus kernel: CR2: 0000000000000036 CR3: 0000000121e52002 CR4: 00000000007726e0
    Oct 26 23:18:14 Pegasus kernel: PKRU: 55555554
    Oct 26 23:18:14 Pegasus kernel: Call Trace:
    Oct 26 23:18:14 Pegasus kernel: <TASK>
    Oct 26 23:18:14 Pegasus kernel: __filemap_get_folio+0x98/0x1ff
    Oct 26 23:18:14 Pegasus kernel: ? _raw_spin_unlock+0x14/0x29
    Oct 26 23:18:14 Pegasus kernel: filemap_fault+0x6e/0x524
    Oct 26 23:18:14 Pegasus kernel: __do_fault+0x2d/0x6e
    Oct 26 23:18:14 Pegasus kernel: __handle_mm_fault+0x9a5/0xc7d
    Oct 26 23:18:14 Pegasus kernel: handle_mm_fault+0x113/0x1d7
    Oct 26 23:18:14 Pegasus kernel: do_user_addr_fault+0x36a/0x514
    Oct 26 23:18:14 Pegasus kernel: exc_page_fault+0xfc/0x11e
    Oct 26 23:18:14 Pegasus kernel: asm_exc_page_fault+0x22/0x30
    Oct 26 23:18:14 Pegasus kernel: RIP: 0033:0x14f20859db1e
    Oct 26 23:18:14 Pegasus kernel: Code: 8b 48 08 48 8b 32 48 8b 00 48 39 f0 73 09 48 8d 14 08 48 39 d6 eb 0c 48 39 c6 73 0b 48 8d 14 0e 48 39 d0 73 02 0f 0b 48 89 c7 <f3> a4 66 48 8d 3d 00 04 23 00 66 66 48 e8 60 05 f7 ff 48 89 28 48
    Oct 26 23:18:14 Pegasus kernel: RSP: 002b:000014f2053f79a0 EFLAGS: 00010212
    Oct 26 23:18:14 Pegasus kernel: RAX: 000014f05d7bb780 RBX: 000014f2053f7af0 RCX: 0000000000004000
    Oct 26 23:18:14 Pegasus kernel: RDX: 000014eb83d0ba8f RSI: 000014eb83d07a8f RDI: 000014f05d7bb780
    Oct 26 23:18:14 Pegasus kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 000014f2053f7808
    Oct 26 23:18:14 Pegasus kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 000014f2053f7ad8
    Oct 26 23:18:14 Pegasus kernel: R13: 000014f2053f7ae0 R14: 0000000015107a8f R15: 0000000000004000
    Oct 26 23:18:14 Pegasus kernel: </TASK>
    Oct 26 23:18:14 Pegasus kernel: Modules linked in: veth nvidia_uvm(PO) cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry dns_resolver xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat ipvlan iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls ipv6 e1000e r8169 realtek i915 iosf_mbi drm_buddy i2c_algo_bit nvidia_drm(PO) ttm nvidia_modeset(PO) x86_pkg_temp_thermal intel_powerclamp wmi_bmof coretemp kvm_intel kvm nvidia(PO) drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd drm_kms_helper cryptd rapl intel_cstate i2c_i801 intel_gtt ahci i2c_smbus drm intel_uncore agpgart libahci i2c_core joydev input_leds syscopyarea
    Oct 26 23:18:14 Pegasus kernel: led_class sysfillrect corsair_psu sysimgblt fb_sys_fops thermal fan wmi tpm_crb tpm_tis video tpm_tis_core backlight tpm acpi_tad acpi_pad button unix [last unloaded: e1000e]
    Oct 26 23:18:14 Pegasus kernel: CR2: 0000000000000036
    Oct 26 23:18:14 Pegasus kernel: ---[ end trace 0000000000000000 ]---
    Oct 26 23:18:14 Pegasus kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 26 23:18:14 Pegasus kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b c3 cc cc cc cc <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 26 23:18:14 Pegasus kernel: RSP: 0000:ffffc900048dfcc0 EFLAGS: 00010246
    Oct 26 23:18:14 Pegasus kernel: RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000002
    Oct 26 23:18:14 Pegasus kernel: RDX: 0000000000000001 RSI: ffff888149542fe8 RDI: 0000000000000002
    Oct 26 23:18:14 Pegasus kernel: RBP: 0000000000000000 R08: 0000000000000004 R09: ffffc900048dfcd0
    Oct 26 23:18:14 Pegasus kernel: R10: ffffc900048dfcd0 R11: ffffc900048dfd48 R12: 0000000000000000
    Oct 26 23:18:14 Pegasus kernel: R13: ffff8882236972f8 R14: 0000000000015107 R15: ffff888223697300
    Oct 26 23:18:14 Pegasus kernel: FS:  000014f2053f8b30(0000) GS:ffff88883c300000(0000) knlGS:0000000000000000
    Oct 26 23:18:14 Pegasus kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 26 23:18:14 Pegasus kernel: CR2: 0000000000000036 CR3: 0000000121e52002 CR4: 00000000007726e0
    Oct 26 23:18:14 Pegasus kernel: PKRU: 55555554

     

    As always, I can't kill qbittorrent:

    root@Pegasus:~# docker restart qbittorrent
    Error response from daemon: Cannot restart container qbittorrent: tried to kill container, but did not receive an exit event
    root@Pegasus:~# docker restart qbittorrent
    Error response from daemon: Cannot restart container qbittorrent: tried to kill container, but did not receive an exit event
    root@Pegasus:~# docker stop $(docker ps -q)
    646501113ad6
    9d5aafe86ec3
    bf2cc0cee3a5
    ba49d7638669
    f5c24ce1673a
    135f286f6efd
    0cca14d6c79e
    57ea88817780
    998f22a93850
    8c07f687d842
    88b24a5ab208
    ecb1d286d078
    4766342ac2e6
    cad22e8aee84
    fb306c4a68c3
    Error response from daemon: cannot stop container: 215ad9c9de1c: tried to kill container, but did not receive an exit event
    root@Pegasus:~# 


     

    Edited by CiscoCoreX
    Link to comment
    On 10/19/2022 at 9:36 PM, JesterEE said:

    I'm going to restart, remove my deluge docker (but keep the appdata of course!) and reinstall Unassigned Devices. If this doesn't work, I think I'm going to head back to a stable 6.10.3 till LimeTech squares this off. If it does, I'm not sure what my next step will be (suggestions?).

     

    7 days uptime after removing my deluge docker and doing normal tasks with the server (VM, docker, plex streaming, databasing, file storage, parity/integrity checks, etc.). I'm going to install it again and see how long it stays stable. I'd bet, less than 3 days. 🤔

     

     

    If this fails, it would be nice to be able to replicate it without docker in the loop. Is there a known way to issue high IO natively in the Unraid environment (or via a script)? I was thinking stress-ng might be the right thing, but:

    1. I've never used it for IO testing so I don't know if it's doing the right thing(s)
    2. There's no slackware package for it so it would need to be compiled from source and packaged for use in Unraid.
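One way to approximate this kind of load without docker or extra packages is a small script that reads a file through a memory mapping. Every trace in this thread enters the kernel through a page fault (`filemap_fault`) while user code appears to be copying a 0x4000-byte (16 KiB) block. The Python sketch below is an assumption about what might exercise the same path, not a confirmed reproducer; the function name, the `BLOCK` constant, and any file path passed to it are placeholders.

```python
import mmap
import os
import random

BLOCK = 16 * 1024  # 16 KiB, matching the 0x4000 copy size seen in the traces


def mmap_random_reads(path: str, reads: int, seed: int = 0) -> int:
    """Read `reads` random BLOCK-sized chunks of `path` through a read-only mmap.

    Returns the total number of bytes copied out of the mapping.
    """
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be mapped
    rng = random.Random(seed)
    total = 0
    with open(path, "rb") as fh, mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for _ in range(reads):
            offset = rng.randrange(0, max(1, size - BLOCK + 1))
            chunk = mm[offset : offset + BLOCK]  # slicing copies, faulting pages in as needed
            total += len(chunk)
    return total
```

Running it against a file much larger than RAM should keep pages cycling in and out of the page cache. Whether that alone is enough to trigger the oops is untested.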

     

    -JesterEE

     

    Link to comment
    9 hours ago, CiscoCoreX said:

    I was downloading, the download finished, and my torrent was seeding.

    Did it crash during or right after downloading or when it was seeding for a while?

    Link to comment
    8 hours ago, JorgeB said:

    Did it crash during or right after downloading or when it was seeding for a while?

    It was after it had finished. It had been seeding for maybe a couple of hours, 2 maybe.

    Link to comment

    And it happened again.

     

    Oct 27 13:07:12 Pegasus kernel: BUG: kernel NULL pointer dereference, address: 00000000000000f6
    Oct 27 13:07:12 Pegasus kernel: #PF: supervisor read access in kernel mode
    Oct 27 13:07:12 Pegasus kernel: #PF: error_code(0x0000) - not-present page
    Oct 27 13:07:12 Pegasus kernel: PGD 0 P4D 0 
    Oct 27 13:07:12 Pegasus kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
    Oct 27 13:07:12 Pegasus kernel: CPU: 10 PID: 10701 Comm: qbittorrent-nox Tainted: P           O      5.19.14-Unraid #1
    Oct 27 13:07:12 Pegasus kernel: Hardware name: ASUS System Product Name/PRIME B560M-K, BIOS 1605 05/13/2022
    Oct 27 13:07:12 Pegasus kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 27 13:07:12 Pegasus kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b c3 cc cc cc cc <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 27 13:07:12 Pegasus kernel: RSP: 0000:ffffc90003077cc0 EFLAGS: 00010246
    Oct 27 13:07:12 Pegasus kernel: RAX: 00000000000000c2 RBX: 00000000000000c2 RCX: 00000000000000c2
    Oct 27 13:07:12 Pegasus kernel: RDX: 0000000000000001 RSI: ffff88810c454490 RDI: 00000000000000c2
    Oct 27 13:07:12 Pegasus kernel: RBP: 0000000000000000 R08: 0000000000000034 R09: ffffc90003077cd0
    Oct 27 13:07:12 Pegasus kernel: R10: ffffc90003077cd0 R11: ffffc90003077d48 R12: 0000000000000000
    Oct 27 13:07:12 Pegasus kernel: R13: ffff88824fa4e038 R14: 0000000000011937 R15: ffff88824fa4e040
    Oct 27 13:07:12 Pegasus kernel: FS:  000014cd00925b30(0000) GS:ffff88883c280000(0000) knlGS:0000000000000000
    Oct 27 13:07:12 Pegasus kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 27 13:07:12 Pegasus kernel: CR2: 00000000000000f6 CR3: 000000022de3c002 CR4: 00000000007726e0
    Oct 27 13:07:12 Pegasus kernel: PKRU: 55555554
    Oct 27 13:07:12 Pegasus kernel: Call Trace:
    Oct 27 13:07:12 Pegasus kernel: <TASK>
    Oct 27 13:07:12 Pegasus kernel: __filemap_get_folio+0x98/0x1ff
    Oct 27 13:07:12 Pegasus kernel: filemap_fault+0x6e/0x524
    Oct 27 13:07:12 Pegasus kernel: __do_fault+0x2d/0x6e
    Oct 27 13:07:12 Pegasus kernel: __handle_mm_fault+0x9a5/0xc7d
    Oct 27 13:07:12 Pegasus kernel: ? _raw_spin_unlock_irqrestore+0x24/0x3a
    Oct 27 13:07:12 Pegasus kernel: handle_mm_fault+0x113/0x1d7
    Oct 27 13:07:12 Pegasus kernel: do_user_addr_fault+0x36a/0x514
    Oct 27 13:07:12 Pegasus kernel: exc_page_fault+0xfc/0x11e
    Oct 27 13:07:12 Pegasus kernel: asm_exc_page_fault+0x22/0x30
    Oct 27 13:07:12 Pegasus kernel: RIP: 0033:0x14cd02e51b1e
    Oct 27 13:07:12 Pegasus kernel: Code: 8b 48 08 48 8b 32 48 8b 00 48 39 f0 73 09 48 8d 14 08 48 39 d6 eb 0c 48 39 c6 73 0b 48 8d 14 0e 48 39 d0 73 02 0f 0b 48 89 c7 <f3> a4 66 48 8d 3d 00 04 23 00 66 66 48 e8 60 05 f7 ff 48 89 28 48
    Oct 27 13:07:12 Pegasus kernel: RSP: 002b:000014cd009249a0 EFLAGS: 00010216
    Oct 27 13:07:12 Pegasus kernel: RAX: 000014ccefb3aab0 RBX: 000014cd00924af0 RCX: 0000000000004000
    Oct 27 13:07:12 Pegasus kernel: RDX: 000014c76b33bb06 RSI: 000014c76b337b06 RDI: 000014ccefb3aab0
    Oct 27 13:07:12 Pegasus kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 000014cd00924808
    Oct 27 13:07:12 Pegasus kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 000014cd00924ad8
    Oct 27 13:07:12 Pegasus kernel: R13: 000014cd00924ae0 R14: 0000000011937b06 R15: 0000000000004000
    Oct 27 13:07:12 Pegasus kernel: </TASK>
    Oct 27 13:07:12 Pegasus kernel: Modules linked in: nvidia_uvm(PO) cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry dns_resolver xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap ipvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls ipv6 e1000e r8169 realtek i915 iosf_mbi drm_buddy i2c_algo_bit ttm nvidia_drm(PO) drm_display_helper nvidia_modeset(PO) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel wmi_bmof kvm crct10dif_pclmul crc32_pclmul nvidia(PO) crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd drm_kms_helper cryptd rapl intel_cstate i2c_i801 drm intel_uncore intel_gtt i2c_smbus ahci agpgart libahci i2c_core input_leds joydev syscopyarea
    Oct 27 13:07:12 Pegasus kernel: led_class corsair_psu sysfillrect sysimgblt fb_sys_fops thermal fan wmi tpm_crb tpm_tis tpm_tis_core video tpm backlight acpi_tad acpi_pad button unix [last unloaded: e1000e]
    Oct 27 13:07:12 Pegasus kernel: CR2: 00000000000000f6
    Oct 27 13:07:12 Pegasus kernel: ---[ end trace 0000000000000000 ]---
    Oct 27 13:07:12 Pegasus kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 27 13:07:12 Pegasus kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b c3 cc cc cc cc <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 27 13:07:12 Pegasus kernel: RSP: 0000:ffffc90003077cc0 EFLAGS: 00010246
    Oct 27 13:07:12 Pegasus kernel: RAX: 00000000000000c2 RBX: 00000000000000c2 RCX: 00000000000000c2
    Oct 27 13:07:12 Pegasus kernel: RDX: 0000000000000001 RSI: ffff88810c454490 RDI: 00000000000000c2
    Oct 27 13:07:12 Pegasus kernel: RBP: 0000000000000000 R08: 0000000000000034 R09: ffffc90003077cd0
    Oct 27 13:07:12 Pegasus kernel: R10: ffffc90003077cd0 R11: ffffc90003077d48 R12: 0000000000000000
    Oct 27 13:07:12 Pegasus kernel: R13: ffff88824fa4e038 R14: 0000000000011937 R15: ffff88824fa4e040
    Oct 27 13:07:12 Pegasus kernel: FS:  000014cd00925b30(0000) GS:ffff88883c280000(0000) knlGS:0000000000000000
    Oct 27 13:07:12 Pegasus kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 27 13:07:12 Pegasus kernel: CR2: 00000000000000f6 CR3: 000000022de3c002 CR4: 00000000007726e0
    Oct 27 13:07:12 Pegasus kernel: PKRU: 55555554

     

    Link to comment
    16 minutes ago, CiscoCoreX said:

    It was after it had finished. It had been seeding for maybe a couple of hours, 2 maybe.

    That suggests high i/o might not be the cause, unless it's a delayed crash.

    Link to comment
    5 minutes ago, JorgeB said:

    That suggests high i/o might not be the cause, unless it's a delayed crash.

     

    Sure, it could be a lot of things since torrent clients do a lot from networking to file IO. Plus, in Unraid, this includes the docker abstraction layer so that's yet another whole set of potential interactions.

     

    This is why I asked above if there is any testing we can do natively in Unraid. I was hoping a dev would chime in with a unit test or something along those lines so we can start to get some resolution on this.

     

    -JesterEE

    Link to comment

    It's not difficult to generate high i/o, but it would be more difficult to replicate if it's specifically docker with high i/o. Is anyone having this issue without using docker, or at least without using high i/o docker containers like torrents, etc.?

    Link to comment
    3 minutes ago, JorgeB said:

    It's not difficult to generate high i/o, but it would be more difficult to replicate if it's specifically docker with high i/o. Is anyone having this issue without using docker, or at least without using high i/o docker containers like torrents, etc.?

     

    Why would this be any harder to do in docker than on the host?? If you have a preferred method of doing this at a shell prompt, it's easy enough to just run that command in any docker image you want.

    Link to comment
    1 hour ago, JesterEE said:

    If you have a preferred method of doing this at a shell prompt, it's easy enough to just run that command in any docker image you want.

    I'm not an advanced docker user, can you easily install something like fio and run it in a docker?

    Link to comment
    2 minutes ago, JorgeB said:

    I'm not an advanced docker user, can you easily install something like fio and run it in a docker?

     

    Short answer... Yes! I've never used that utility, but I just googled it and if there's value, I could run some tests. 

    Link to comment

    Do you have a way of using that utility in Unraid to see if it comes up without docker?

     

    I couldn't find a Slackware package for it. 

    Link to comment
    3 minutes ago, Rockikone said:

    I don't know if it is related.

    Only if you get the same call trace as posted in the 1st post.

    Link to comment
    On 10/26/2022 at 7:23 PM, JesterEE said:

    7 days uptime after removing my deluge docker and doing normal tasks with the server (VM, docker, plex streaming, databasing, file storage, parity/integrity checks, etc.). I'm going to install it again and see how long it stays stable. I'd bet, less than 3 days. 🤔

     

    Well, it was actually only 26 hours and poof!  This is pretty conclusive to me. Something is messed up, and it's not something I did 😝. I'm not 100% sure if it's high IO, the network interface, or docker, but any way you slice it, it's an OS issue and needs to be addressed by the devs in a future release.

     

    If someone at @limetech posts here (or contacts me privately) by 10/29/22 asking to help debug, I will gladly do so. If not, back to 6.10.3 and a stable server for me.

     

    -JesterEE

    Edited by JesterEE
    Link to comment

    Now reached the heady heights of 14 days uptime, no crashes since the below changes:-

    https://forums.unraid.net/bug-reports/stable-releases/crashes-since-updating-to-v611x-r2153/?do=findComment&comment=21225

     

    I do torrent a fair bit (probably shifted around 1.5TB of data), I have 2 VMs running and actively used, and I have not changed my daily usage of the server, so for now at least it SEEMS stable for me. Time will tell, I guess.

     

    p.s. I am not saying the above is THE fix, simply that it works for MY system.

    Link to comment
    17 hours ago, JesterEE said:

    I couldn't find a Slackware package for it. 

    It was included with the old NerdPack plugin, but I'm not sure it's still compatible with v6.11.

    Link to comment




