• Crashes since updating to v6.11.x for qBittorrent and Deluge users


    JorgeB
    • Closed

    EDIT: the issue was traced to libtorrent 2.x; it's not an Unraid problem. More info in this post:

     

    https://forums.unraid.net/bug-reports/stable-releases/crashes-since-updating-to-v611x-for-qbittorrent-and-deluge-users-r2153/?do=findComment&comment=21671

     

     

    Original Post:

     

    I'm creating this to better track an issue that some users have been reporting where Unraid started crashing after updating to v6.11.x (it happens with both 6.11.0 and 6.11.1). There's a very similar call trace logged in all cases, e.g.:

     

    Oct 12 04:18:27 zaBOX kernel: BUG: kernel NULL pointer dereference, address: 00000000000000b6
    Oct 12 04:18:27 zaBOX kernel: #PF: supervisor read access in kernel mode
    Oct 12 04:18:27 zaBOX kernel: #PF: error_code(0x0000) - not-present page
    Oct 12 04:18:27 zaBOX kernel: PGD 0 P4D 0
    Oct 12 04:18:27 zaBOX kernel: Oops: 0000 [#1] PREEMPT SMP PTI
    Oct 12 04:18:27 zaBOX kernel: CPU: 4 PID: 28596 Comm: Disk Tainted: P     U  W  O      5.19.14-Unraid #1
    Oct 12 04:18:27 zaBOX kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS PRO WIFI/Z390 AORUS PRO WIFI-CF, BIOS F12 11/05/2021
    Oct 12 04:18:27 zaBOX kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 12 04:18:27 zaBOX kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b c3 cc cc cc cc <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 12 04:18:27 zaBOX kernel: RSP: 0000:ffffc900070dbcc0 EFLAGS: 00010246
    Oct 12 04:18:27 zaBOX kernel: RAX: 0000000000000082 RBX: 0000000000000082 RCX: 0000000000000082
    Oct 12 04:18:27 zaBOX kernel: RDX: 0000000000000001 RSI: ffff888757426fe8 RDI: 0000000000000082
    Oct 12 04:18:27 zaBOX kernel: RBP: 0000000000000000 R08: 0000000000000028 R09: ffffc900070dbcd0
    Oct 12 04:18:27 zaBOX kernel: R10: ffffc900070dbcd0 R11: ffffc900070dbd48 R12: 0000000000000000
    Oct 12 04:18:27 zaBOX kernel: R13: ffff88824f95d138 R14: 000000000007292c R15: ffff88824f95d140
    Oct 12 04:18:27 zaBOX kernel: FS:  000014ed38204b38(0000) GS:ffff8888a0500000(0000) knlGS:0000000000000000
    Oct 12 04:18:27 zaBOX kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 12 04:18:27 zaBOX kernel: CR2: 00000000000000b6 CR3: 0000000209854005 CR4: 00000000003706e0
    Oct 12 04:18:27 zaBOX kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Oct 12 04:18:27 zaBOX kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Oct 12 04:18:27 zaBOX kernel: Call Trace:
    Oct 12 04:18:27 zaBOX kernel: <TASK>
    Oct 12 04:18:27 zaBOX kernel: __filemap_get_folio+0x98/0x1ff
    Oct 12 04:18:27 zaBOX kernel: ? _raw_spin_unlock_irqrestore+0x24/0x3a
    Oct 12 04:18:27 zaBOX kernel: filemap_fault+0x6e/0x524
    Oct 12 04:18:27 zaBOX kernel: __do_fault+0x2d/0x6e
    Oct 12 04:18:27 zaBOX kernel: __handle_mm_fault+0x9a5/0xc7d
    Oct 12 04:18:27 zaBOX kernel: handle_mm_fault+0x113/0x1d7
    Oct 12 04:18:27 zaBOX kernel: do_user_addr_fault+0x36a/0x514
    Oct 12 04:18:27 zaBOX kernel: exc_page_fault+0xfc/0x11e
    Oct 12 04:18:27 zaBOX kernel: asm_exc_page_fault+0x22/0x30
    Oct 12 04:18:27 zaBOX kernel: RIP: 0033:0x14ed3a0ae7b5
    Oct 12 04:18:27 zaBOX kernel: Code: 8b 48 08 48 8b 32 48 8b 00 48 39 f0 73 09 48 8d 14 08 48 39 d6 eb 0c 48 39 c6 73 0b 48 8d 14 0e 48 39 d0 73 02 0f 0b 48 89 c7 <f3> a4 66 48 8d 3d 59 b7 22 00 66 66 48 e8 d9 d8 f6 ff 48 89 28 48
    Oct 12 04:18:27 zaBOX kernel: RSP: 002b:000014ed38203960 EFLAGS: 00010206
    Oct 12 04:18:27 zaBOX kernel: RAX: 000014ed371aa160 RBX: 000014ed38203ad0 RCX: 0000000000004000
    Oct 12 04:18:27 zaBOX kernel: RDX: 000014c036530000 RSI: 000014c03652c000 RDI: 000014ed371aa160
    Oct 12 04:18:27 zaBOX kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 000014ed38203778
    Oct 12 04:18:27 zaBOX kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
    Oct 12 04:18:27 zaBOX kernel: R13: 000014ed38203b40 R14: 000014ed384fe940 R15: 000014ed38203ac0
    Oct 12 04:18:27 zaBOX kernel: </TASK>
    Oct 12 04:18:27 zaBOX kernel: Modules linked in: macvlan xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net vhost vhost_iotlb tap tun veth xt_nat xt_tcpudp xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter xfs md_mod kvmgt mdev i915 iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper intel_gtt agpgart hwmon_vid iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls ipv6 nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) x86_pkg_temp_thermal intel_powerclamp drm_kms_helper btusb btrtl i2c_i801 btbcm coretemp gigabyte_wmi wmi_bmof intel_wmi_thunderbolt mxm_wmi kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd
    Oct 12 04:18:27 zaBOX kernel: btintel rapl intel_cstate intel_uncore e1000e i2c_smbus bluetooth drm nvme nvme_core ahci i2c_core libahci ecdh_generic ecc syscopyarea sysfillrect input_leds sysimgblt led_class joydev nzxt_kraken2 intel_pch_thermal fb_sys_fops thermal fan video tpm_crb wmi tpm_tis backlight tpm_tis_core tpm acpi_pad button unix
    Oct 12 04:18:27 zaBOX kernel: CR2: 00000000000000b6
    Oct 12 04:18:27 zaBOX kernel: ---[ end trace 0000000000000000 ]---

     

    Another example with very different hardware:

    Oct 11 21:32:08 Impulse kernel: BUG: kernel NULL pointer dereference, address: 0000000000000056
    Oct 11 21:32:08 Impulse kernel: #PF: supervisor read access in kernel mode
    Oct 11 21:32:08 Impulse kernel: #PF: error_code(0x0000) - not-present page
    Oct 11 21:32:08 Impulse kernel: PGD 0 P4D 0
    Oct 11 21:32:08 Impulse kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
    Oct 11 21:32:08 Impulse kernel: CPU: 1 PID: 5236 Comm: Disk Not tainted 5.19.14-Unraid #1
    Oct 11 21:32:08 Impulse kernel: Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING II, BIOS 4301 03/04/2021
    Oct 11 21:32:08 Impulse kernel: RIP: 0010:folio_try_get_rcu+0x0/0x21
    Oct 11 21:32:08 Impulse kernel: Code: e8 8e 61 63 00 48 8b 84 24 80 00 00 00 65 48 2b 04 25 28 00 00 00 74 05 e8 9e 9b 64 00 48 81 c4 88 00 00 00 5b e9 cc 5f 86 00 <8b> 57 34 85 d2 74 10 8d 4a 01 89 d0 f0 0f b1 4f 34 74 04 89 c2 eb
    Oct 11 21:32:08 Impulse kernel: RSP: 0000:ffffc900026ffcc0 EFLAGS: 00010246
    Oct 11 21:32:08 Impulse kernel: RAX: 0000000000000022 RBX: 0000000000000022 RCX: 0000000000000022
    Oct 11 21:32:08 Impulse kernel: RDX: 0000000000000001 RSI: ffff88801e450b68 RDI: 0000000000000022
    Oct 11 21:32:08 Impulse kernel: RBP: 0000000000000000 R08: 000000000000000c R09: ffffc900026ffcd0
    Oct 11 21:32:08 Impulse kernel: R10: ffffc900026ffcd0 R11: ffffc900026ffd48 R12: 0000000000000000
    Oct 11 21:32:08 Impulse kernel: R13: ffff888428441cb8 R14: 00000000000028cd R15: ffff888428441cc0
    Oct 11 21:32:08 Impulse kernel: FS:  00001548d34fa6c0(0000) GS:ffff88842e840000(0000) knlGS:0000000000000000
    Oct 11 21:32:08 Impulse kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 11 21:32:08 Impulse kernel: CR2: 0000000000000056 CR3: 00000001a3fe6000 CR4: 00000000003506e0
    Oct 11 21:32:08 Impulse kernel: Call Trace:
    Oct 11 21:32:08 Impulse kernel: <TASK>
    Oct 11 21:32:08 Impulse kernel: __filemap_get_folio+0x98/0x1ff
    Oct 11 21:32:08 Impulse kernel: filemap_fault+0x6e/0x524
    Oct 11 21:32:08 Impulse kernel: __do_fault+0x30/0x6e
    Oct 11 21:32:08 Impulse kernel: __handle_mm_fault+0x9a5/0xc7d
    Oct 11 21:32:08 Impulse kernel: handle_mm_fault+0x113/0x1d7
    Oct 11 21:32:08 Impulse kernel: do_user_addr_fault+0x36a/0x514
    Oct 11 21:32:08 Impulse kernel: exc_page_fault+0xfc/0x11e
    Oct 11 21:32:08 Impulse kernel: asm_exc_page_fault+0x22/0x30
    Oct 11 21:32:08 Impulse kernel: RIP: 0033:0x1548dbc04741
    Oct 11 21:32:08 Impulse kernel: Code: 48 01 d0 eb 1b 0f 1f 40 00 f3 0f 1e fa 48 39 d1 0f 82 73 28 fc ff 0f 1f 00 f3 0f 1e fa 48 89 f8 48 83 fa 20 0f 82 af 00 00 00 <c5> fe 6f 06 48 83 fa 40 0f 87 3e 01 00 00 c5 fe 6f 4c 16 e0 c5 fe
    Oct 11 21:32:08 Impulse kernel: RSP: 002b:00001548d34f9808 EFLAGS: 00010202
    Oct 11 21:32:08 Impulse kernel: RAX: 000015480c010d30 RBX: 000015480c018418 RCX: 00001548d34f9a40
    Oct 11 21:32:08 Impulse kernel: RDX: 0000000000004000 RSI: 000015471f8cd50f RDI: 000015480c010d30
    Oct 11 21:32:08 Impulse kernel: RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000000
    Oct 11 21:32:08 Impulse kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000000
    Oct 11 21:32:08 Impulse kernel: R13: 00001548d34f9ac0 R14: 0000000000000003 R15: 0000154814013d10
    Oct 11 21:32:08 Impulse kernel: </TASK>
    Oct 11 21:32:08 Impulse kernel: Modules linked in: xt_connmark xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha xt_mark xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc ipv6 mlx4_en mlx4_core igb i2c_algo_bit edac_mce_amd edac_core kvm_amd kvm wmi_bmof mxm_wmi asus_wmi_sensors crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel mpt3sas aesni_intel crypto_simd nvme cryptd ahci i2c_piix4 raid_class rapl k10temp i2c_core nvme_core ccp scsi_transport_sas libahci wmi button acpi_cpufreq unix [last unloaded: mlx4_core]
    Oct 11 21:32:08 Impulse kernel: CR2: 0000000000000056
    Oct 11 21:32:08 Impulse kernel: ---[ end trace 0000000000000000 ]---

     

    So they always start with this (the address at the end will change):

     

    Oct 11 05:02:02 Cogsworth kernel: BUG: kernel NULL pointer dereference, address: 0000000000000076

     

    and always have this:

     

    Oct 11 05:02:02 Cogsworth kernel: Call Trace:
    Oct 11 05:02:02 Cogsworth kernel: <TASK>
    Oct 11 05:02:02 Cogsworth kernel: __filemap_get_folio+0x98/0x1ff
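    If you want to check whether your own server is hitting this signature, a grep over the syslog for those two markers is enough. A minimal sketch (sample lines are embedded so it runs anywhere; point LOG at your real syslog copy instead):

```shell
# Scan a syslog for the two markers of this crash: the NULL pointer
# dereference BUG line and the __filemap_get_folio frame.
# LOG points at a throwaway sample here; substitute your own syslog copy.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
Oct 11 05:02:02 Cogsworth kernel: BUG: kernel NULL pointer dereference, address: 0000000000000076
Oct 11 05:02:02 Cogsworth kernel: Call Trace:
Oct 11 05:02:02 Cogsworth kernel: __filemap_get_folio+0x98/0x1ff
EOF
matches=$(grep -c -e 'NULL pointer dereference' -e '__filemap_get_folio' "$LOG")
echo "signature lines found: $matches"
```

    If the count is non-zero, your crash is probably the one discussed in this thread.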

     

    The fact that it's happening to various users with very different hardware, both Intel and AMD, makes me think it's not a hardware/firmware issue, so we can try to find out if they are running anything in common. These are the plugins I've found in common between the 4 or 5 cases found so far; they are some of the most used plugins, so it's not surprising they are installed in all of them, but it's also easy to rule them out:

     

    ca.backup2.plg - 2022.07.23  (Up to date)
    community.applications.plg - 2022.09.30  (Up to date)
    dynamix.active.streams.plg - 2020.06.17  (Up to date)
    file.activity.plg - 2022.08.19  (Up to date)
    fix.common.problems.plg - 2022.10.09  (Up to date)
    unassigned.devices.plg - 2022.10.03  (Up to date)
    unassigned.devices-plus.plg - 2022.08.19  (Up to date)

     

    So anyone having this issue, try temporarily uninstalling/disabling these plugins to see if there's any difference.




    User Feedback

    Recommended Comments



    I use Deluge, and in the linuxserver/deluge:latest docker it's pulling libtorrent 2.0.8.0, which causes the same issue. I'm at work, but later I'm going to try to load up some older versions that use libtorrent v1 and see if it persists.

    1 hour ago, JesterEE said:

    and it's an application error rather than a kernel error!

    The question is, though: if it really is purely an application error (a libtorrent bug), then why is the crash not seen in Unraid 6.9.x? I'm not saying it's unrelated to libtorrent, just that it could still be related to kernel/docker changes that are making the libtorrent bug more pronounced. Keep in mind qbittorrent has been using libtorrent 2.x in my image since Jan 10th 2022, with no reported issues until the 6.11.x series came out.

     

    qbittorrentvpn users - if you are using my image and are seeing the crash, then try rolling back to tagged image '4.3.9-2-01'; this will be using libtorrent 1.x. WARNING: you may see loss of some configuration when rolling back due to changes in qbittorrent between v4.3.x and v4.4.x.

     

    23 minutes ago, JesterEE said:

     

    Thank you very much for joining our community forum just to let us know that this is repeatable on other Linux systems and it's an application error rather than a kernel error!

    Maybe this is some bug in the kernel that is exposed by libtorrent. I got exactly the same problem on Gentoo and followed this thread waiting for a possible solution. I swapped to entirely different hardware (CPU, motherboard, memory, HDD), and only recompiling qBittorrent with the 1.2 version of libtorrent solved this.

    17 minutes ago, Altwazar said:

    Maybe this is some bug in the kernel that is exposed by libtorrent.

    you might be onto something there.

     

    17 minutes ago, Altwazar said:

    I swapped to entirely different hardware (CPU, motherboard, memory, HDD), and only recompiling qBittorrent with the 1.2 version of libtorrent solved this.

    Interesting!

     

    Just as an FYI, I am drinking my own champagne here, so I also use qbittorrentvpn (latest) on Unraid 6.11.3, and although I did get one crash early on, it's been rock solid for me since. So it's not triggering the bug for all users; perhaps it's based on torrent load or hardware configuration.


    I think the most important part is that this is seemingly not an Unraid-specific issue, and seems like some funky interaction between libraries.

     

    Does it still make sense to track this issue here?

     

     

    10 minutes ago, JesterEE said:

    Does it still make sense to track this issue here?

    I'm going to close the bug, but the thread will remain unlocked if anyone still wants to add any info.


    UPDATE: I had an issue where the 'old' version of deluge (2.0.5) was causing state corruption on docker power-off. I know this has been resolved more recently (maybe by libtorrent v2, maybe in deluge, I'm not sure), as I have not had this problem in months running the latest LSIO docker. So I looked in the source repo again and found a newer version of deluge (2.1.1) built with libtorrent-rasterbar v1. This may fix the issue; I do not know yet. Either way, it's newer than the link from the original post.

    2.1.1-r3-ls179 - Dockerhub Link
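    For anyone who wants to pin that release, here is a sketch of the repository string to enter in the container's edit screen (the lscr.io registry path is the usual Linuxserver one, but treat it as an assumption if your template differs; the tag itself comes from the post above):

```shell
# Build the pinned repository string for the deluge tag mentioned above.
# The registry path is an assumption; the tag comes from the post.
IMAGE="lscr.io/linuxserver/deluge"
TAG="2.1.1-r3-ls179"
PINNED="${IMAGE}:${TAG}"
echo "$PINNED"
```

    Paste the resulting string into the container's Repository field, the same way the binhex FAQ quoted later in this thread describes.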

     

    ORIGINAL POST:

    For those of us that use Deluge (from Linuxserver), I snooped the source and found that the last libtorrent-rasterbar with a version < 2 was 2.0.5-0202202181752ubuntu20.04.1-ls140 - Dockerhub Link.

     

    This was released 9 months ago! At that point they switched the base image to Alpine and pointed to either 3.16 or edge, which now only has libtorrent-rasterbar v2 in the Alpine package index. I've updated this docker daily since forever, so I have definitely been using libtorrent-rasterbar v2 for months. So @Altwazar and @binhex are probably right that there is some new Linux kernel interaction in the 6.11 series that is triggering the error.
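    Since the whole question is which libtorrent series a build ships, the major number of the version string is the quick tell. A minimal sketch (the 2.0.8.0 value is the one reported earlier in the thread):

```shell
# Classify a libtorrent version string by its major number.
ver="2.0.8.0"      # example value reported earlier in this thread
major=${ver%%.*}   # keep everything before the first dot
if [ "$major" -ge 2 ]; then
  echo "libtorrent v$major series - affected by this issue"
else
  echo "libtorrent v$major series - not affected"
fi
```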

     

    I'll try and roll another container with this version over my lunch break and see if the kernel gets mad.

    Update: Pulled and running. We'll see what we see 😆.

    1 hour ago, binhex said:

    qbittorrentvpn users - if you are using my image and are seeing the crash then try rolling back to tagged image '4.3.9-2-01' this will be using libtorrent 1.x - WARNING you may see loss of some configuration when rolling back due to changes in qbittorrent between v4.3.x and 4.4.x

    And for users with questions on this, @binhex has already provided an answer (from https://github.com/binhex/documentation/blob/master/docker/faq/unraid.md)

    Quote

    Q5. There is an issue with the latest version of an application, how do i roll back to a specific version?

    A5. In order to pull down a specific version of an application you need to specify the tag with the version you want. To find out what tags are available for the docker image, go to the first post in the application's support thread, copy the URL shown after the text "Docker Hub:", append "tags/" to the end of the URL, and paste it into your preferred browser.

    This will return a list of available tag names; make a note of the tag you want (the tag name denotes the version of the application). Then go to the unRAID web interface, left-click the specific Docker container and select "edit", click on the advanced view option (top right), and edit the repository string, adding ":" followed by the tag to the end of the name. E.g., to specify version 1.0.0.0 for couchpotato, the repository would be changed from:

    binhex/arch-couchpotato

    to

    binhex/arch-couchpotato:1.0.0.0

     


    You know, now that this has all come to light ... it makes sense that this is a libtorrent issue, because the only people seeing it are running either deluge or qbittorrent, the 2 most popular clients that use this library (Libtorrent - Projects).

    13 hours ago, ShadyDeth said:

    Also @JesterEE, @CiscoCoreX, @III_D and anyone else reading this having the issue: could you give a little more info on your torrent container, and whether you're using the built-in vpn (openvpn or wireguard) or the "VPN Manager" built into Unraid, if you are using a vpn at all.


    Hi @ShadyDeth, I don't use a VPN on any torrent container, or the built-in VPN manager. But I was using binhex-qbittorrentvpn when I got my first error. After that I tried binhex-delugevpn: same problem.
    I installed linuxserver-qbittorrent and linuxserver-deluge and still the same problem.

    I'm still using linuxserver-qbittorrent now, since it has nothing to do with binhex's versions.

    10 hours ago, JorgeB said:

    Yes, thank you @Altwazar. Is everyone having this specific call trace with Unraid using qBittorrent, or is there anyone not using it and still having this issue?

    I am using neither qBittorrent nor deluge, and my server has crashed multiple times over the last two weeks since updating to 6.11.1. Going to upgrade to 6.11.3 and see if it persists. Let me know if there is any information I can provide to assist. I've attached my diagnostics, but it only goes back to the last time I rebooted, so I attached a larger cut of the syslog as well. Today's crash occurred while I was remotely accessing my server (via WireGuard) and couldn't load the docker page; I went to the docker settings to try to stop docker, and then the UI stopped responding. That was at about noon today.

    citadel-diagnostics-20221118-2026.zip syslog.txt

    4 minutes ago, SergeantCC4 said:

    I am using neither qBittorrent nor deluge, and my server has crashed multiple times over the last two weeks since updating to 6.11.1.

     

    Looking at your syslog, this doesn't look like the same problem as this thread is addressing.

     

    You have a ton of BTRFS errors. Maybe search that on the forum and see what you find.

     

    6 hours ago, SergeantCC4 said:

    Let me know if there is any information I can provide to assist.

    It's not the same issue being discussed here. Start by running memtest, and if issues persist please start a new thread in the general support forum.


    FWIW - for a couple of days I have been starting the binhex qbittorrentvpn docker and only leaving it on for a couple of hours, then turning it off. Since I started doing this, I have not had an Unraid crash. Before this, it would almost certainly crash in <48h.

     

    Thanks to everyone for your attempts to solve this.

    On 11/18/2022 at 11:30 AM, JesterEE said:

    For those of us that use Deluge (from Linuxserver), I snooped the source and found that the last libtorrent-rasterbar with a version < 2 was 2.0.5-0202202181752ubuntu20.04.1-ls140 - Dockerhub Link.

     

    This was released 9 months ago! At that point they switched the base image to Alpine and pointed to either 3.16 or edge, which now only has libtorrent-rasterbar v2 in the Alpine package index. I've updated this docker daily since forever, so I have definitely been using libtorrent-rasterbar v2 for months. So @Altwazar and @binhex are probably right that there is some new Linux kernel interaction in the 6.11 series that is triggering the error.

     

    I'll try and roll another container with this version over my lunch break and see if the kernel gets mad.

    Update: Pulled and running. We'll see what we see 😆.

     

    Thanks, Jester, for finding the specific version. I switched to this version and upgraded back to the 6.11 series. So far so good....


    72 hours of running Deluge on libtorrent v1 and no crashes. In every other test it crashed before this point, so all signs point to this being the true issue.

     

    Looking at this thread on the kernel bug tracker, it looks like some sort of operating-system configuration bug. I guess we are at the mercy of the Linux gods.

     

    What I find interesting is that this should be a very widespread and somewhat prohibitive issue, with an internet full of BitTorrent users on newer Linux distros. Yet the chatter is relatively low on +1s and reports of the same issue. Is it not affecting everyone the same way?


    Day 4, and I got this somewhat related error spammed in the deluge docker container log.

     

    Quote

    terminate called after throwing an instance of 'libtorrent::libtorrent_exception'
      what():  invalid type requested from entry

     

    The client is not accessible, but Unraid itself seems unaffected (WebUI, SSH, etc.). I tried to restart the docker container, and the deluge daemon is not starting correctly, yielding the same error right away.

     

    I'm going to restart the server and see what happens. I'll edit this post when I test out the container after the restart.

     

    POST RESTART UPDATE

     

    The Unraid OS restart stopped my array without issue, so no parity check on reboot! I'm very happy about that; I was getting sick of dirty restarts. After restarting my deluge container, I was still getting the same error as above. I did some debugging, and it looks like my session.state file in the appdata got corrupted (this is a known issue from the before-days which I haven't hit in a while - maybe about a year). Once I pulled in a copy of the state file from the backup, everything returned to normal. FYI for those that don't use deluge: it makes one backup of the settings and state files in the app directory, but in my experience it is not good about backing up only known-good files ... I've had the built-in backup store a corrupted file before, so mileage may vary. It would be a good idea to get the state file from the Unraid CA Backup / Restore Appdata plugin if you have it installed (which, come on, we all should).
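    The restore step described above amounts to copying the known-good backup over the corrupted file. A sketch using a throwaway directory (the real appdata path and backup filename depend on your deluge setup and are assumptions here):

```shell
# Restore a state file from its backup copy, demonstrated in a temp dir.
# On a real system the directory would be your deluge appdata (an assumption).
DEMO=$(mktemp -d)
echo "known-good state" > "$DEMO/session.state.bak"  # stand-in for the backup copy
cp "$DEMO/session.state.bak" "$DEMO/session.state"   # overwrite the corrupted file
cat "$DEMO/session.state"
```

    Stop the container before copying, or deluge may write the corrupted state back over your restore.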

     

    I'll restart my container and see how long it goes till this comes up again. But overall, this was a much nicer bad experience, in that libtorrent didn't trigger a kernel error that took down the server. 😆

     

    First, I'm going to upgrade to 6.11.5 and go from there.


    60+ hrs here. qbittorrent showing over 8TB up and 12TB down. It would have crashed before now under that type of load. (Not to mention I didn't catch that the older version of qbittorrent reset my cache directory and moved over 2TB off of long-term storage and onto NVMe cache drives, which I then had to move back...).

     

    To @JesterEE's point: you'd think some of these large seedbox companies would be shouting from the rooftops that this is broken, but it's almost impossible to find any mention of this issue...

    2 minutes ago, sundown said:

    I didn't catch the older version of qbittorrent reset my cache directory and moved over 2TB off of long term storage and on to NVME cache drives, that I had to then move back..

    Indeed, take heed of my warning posted earlier above:

    WARNING you may see loss of some configuration when rolling back due to changes in qbittorrent between v4.3.x and 4.4.x

    It was the same issue going from 4.3.x to 4.4.x: loss of some (not all) configuration.

    2 minutes ago, JesterEE said:

    @binhex you run one of the more widely used qbittorrent images in the community. Have you seen many more reports on your support channels?

    tbh, nope! But then again, if you see your system crashing it may take you a while to realise it's related to qbittorrent/deluge and libtorrent - very much like this thread 🙂


    Hi,
    I'm on version 6.11.5 and running linuxserver's qbittorrent:

    Repository:  lscr.io/linuxserver/qbittorrent:libtorrentv1

    Still on qbittorrent version 4.4.5, just with libtorrent v1.

     


    Quick note to deluge users now using the older version I linked a couple of days ago: I updated the post with a newer release, since I had an issue with state corruption between restarts. See the updated comment here.

     

    I did a couple of quick tests (starting/stopping/restarting the container) and I don't see the same corruption occurring on the newer version of deluge, still with libtorrent v1.


    An interesting read regarding THP, which looks to be triggering the crash: https://blog.nelhage.com/post/transparent-hugepages/

     

    If you are feeling brave, then you can try the following to disable THP, which SHOULD then prevent the crash without the need to downgrade libtorrent:

     

    Drop to 'terminal' for Unraid (not the container's console) and copy and paste the following:

     

    echo '# temporarily disable hugepages to prevent libtorrent crash with unraid 6.11.x' >> /boot/config/go
    echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /boot/config/go
    echo 'echo never > /sys/kernel/mm/transparent_hugepage/defrag' >> /boot/config/go

    Then reboot your system, and run the following to confirm hugepages/THP are disabled:

     

    grep -i HugePages_Total /proc/meminfo # output should be 0
    cat /proc/sys/vm/nr_hugepages # output should be 0
    cat /sys/kernel/mm/transparent_hugepage/enabled # 'never' should be the bracketed (selected) option

     

    From the article linked above, this MAY actually increase your performance! Or at the very worst you may see a 10% drop in performance (depends on usage).

     

    Keep in mind the above is a temporary hack; libtorrent/kernel/app will no doubt resolve this issue at some point. I'm simply posting this as a workaround for now, so you will then need to remove the above from your go file and reboot to reverse this hack.
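    Reversing the hack later means filtering those three lines back out of the go file. A sketch run against a temp copy (the real file is /boot/config/go; review the filtered result before replacing it):

```shell
# Remove the THP workaround lines from a copy of the go file.
# A temp file stands in for /boot/config/go so this is safe to run anywhere.
GO=$(mktemp)
printf '%s\n' \
  '#!/bin/bash' \
  '# temporarily disable hugepages to prevent libtorrent crash with unraid 6.11.x' \
  'echo never > /sys/kernel/mm/transparent_hugepage/enabled' \
  'echo never > /sys/kernel/mm/transparent_hugepage/defrag' > "$GO"
grep -v -e 'transparent_hugepage' -e 'libtorrent crash' "$GO" > "${GO}.new"
cat "${GO}.new"   # only the untouched lines remain
```

    After checking the .new file looks right, move it over the original and reboot.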





