bastl

Members
  • Posts

    1266
  • Days Won

    3

Report Comments posted by bastl

  1. Small update from my side. As long as I close any active session to Unraid's web UI from my main desktop, the server won't freeze. If I actively manage something on the server, there are no freezes. It only happens when I'm logged in to the web UI from my Win10 PC and the PC isn't really in use. But even when idle, it happens randomly, only every 2-3 days. I'm still not sure how to fix this. 😒

  2. I'm kinda in the same boat as you. Random crashes on all 6.12 releases I tested. I also tried all sorts of combinations with Dockers started or stopped, same with VMs. 14h Memtest with 0 errors. SMART values from the disks show no errors. There is no clear indication what causes the crash for me. Sometimes during the night when idle, sometimes during the day on low load or even when transcoding a video with Tdarr. Sometimes it crashes/freezes 30 min after a fresh reboot; on the next run the server is stable for 3-4 days and, as you experienced, there is nothing in the logs. It's kinda frustrating. I'm back on 6.11.5 and it was stable for 11 days. I had a power outage 3 days ago and since then also no crash.

  3. @JorgeB OK, now the server is also crashing on 6.12.2. I changed nothing else, only rolled back the update within Unraid itself.

     

    full syslog:  syslog.txt

     

    last couple lines:

    Oct 11 16:41:06 mini kernel: divide error: 0000 [#1] PREEMPT SMP NOPTI
    Oct 11 16:41:06 mini kernel: CPU: 7 PID: 0 Comm: swapper/7 Tainted: P           O       6.1.36-Unraid #1
    Oct 11 16:41:06 mini kernel: Hardware name: BESSTAR TECH LIMITED HM90/HM90, BIOS 5.16 10/13/2021
    Oct 11 16:41:06 mini kernel: RIP: 0010:flush_smp_call_function_queue+0x64/0x83
    Oct 11 16:41:06 mini kernel: Code: e8 f6 f5 76 00 bf 01 00 00 00 e8 fb f4 ff ff 48 c7 c7 dd 59 10 82 e8 e0 f5 76 00 65 66 8b 05 ca e0 f2 7e 66 85 c0 74 05 e8 fd <45> f7 ff 0f ba e3 09 73 06 fb 0f 1f 44 00 00 5b e9 ae 34 b0 00 0f
    Oct 11 16:41:06 mini kernel: RSP: 0018:ffffc9000019fee8 EFLAGS: 00010647
    Oct 11 16:41:06 mini kernel: RAX: 0000000000000000 RBX: 0000000000000286 RCX: 00000000000f4240
    Oct 11 16:41:06 mini kernel: RDX: 0000000000000002 RSI: ffffffff821059dd RDI: ffffffff820ba9d5
    Oct 11 16:41:06 mini kernel: RBP: ffffffff823235a0 R08: ffff888712ded470 R09: ffff888712ded470
    Oct 11 16:41:06 mini kernel: R10: 0000000000000000 R11: 0000000000000075 R12: 0000000000000007
    Oct 11 16:41:06 mini kernel: R13: ffff88810090de80 R14: 0000000000000001 R15: 0000000000000000
    Oct 11 16:41:06 mini kernel: FS:  0000000000000000(0000) GS:ffff888712dc0000(0000) knlGS:0000000000000000
    Oct 11 16:41:06 mini kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 11 16:41:06 mini kernel: CR2: 0000001e6780b000 CR3: 0000000154e22000 CR4: 0000000000350ee0
    Oct 11 16:41:06 mini kernel: Call Trace:
    Oct 11 16:41:06 mini kernel: <TASK>
    Oct 11 16:41:06 mini kernel: ? __die_body+0x1a/0x5c
    Oct 11 16:41:06 mini kernel: ? die+0x30/0x49
    Oct 11 16:41:06 mini kernel: ? do_trap+0x7b/0xfe
    Oct 11 16:41:06 mini kernel: ? flush_smp_call_function_queue+0x64/0x83
    Oct 11 16:41:06 mini kernel: ? flush_smp_call_function_queue+0x64/0x83
    Oct 11 16:41:06 mini kernel: ? do_error_trap+0x6e/0x98
    Oct 11 16:41:06 mini kernel: ? flush_smp_call_function_queue+0x64/0x83
    Oct 11 16:41:06 mini kernel: ? exc_divide_error+0x34/0x41
    Oct 11 16:41:06 mini kernel: ? flush_smp_call_function_queue+0x64/0x83
    Oct 11 16:41:06 mini kernel: ? asm_exc_divide_error+0x16/0x20
    Oct 11 16:41:06 mini kernel: ? flush_smp_call_function_queue+0x64/0x83
    Oct 11 16:41:06 mini kernel: ? flush_smp_call_function_queue+0x55/0x83
    Oct 11 16:41:06 mini kernel: do_idle+0x1d5/0x1fb
    Oct 11 16:41:06 mini kernel: cpu_startup_entry+0x1d/0x1f
    Oct 11 16:41:06 mini kernel: start_secondary+0xeb/0xeb
    Oct 11 16:41:06 mini kernel: secondary_startup_64_no_verify+0xce/0xdb
    Oct 11 16:41:06 mini kernel: </TASK>
    Oct 11 16:41:06 mini kernel: Modules linked in: macvlan nfsv3 nfs xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat xt_tcpudp veth ipvlan xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter dm_crypt dm_mod xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) it87 tcp_diag inet_diag hwmon_vid vendor_reset(O) iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs bridge stp llc igc r8169 realtek amdgpu edac_mce_amd edac_core gpu_sched drm_buddy i2c_algo_bit drm_ttm_helper ttm kvm_amd drm_display_helper drm_kms_helper kvm drm crct10dif_pclmul crc32_pclmul crc32c_intel
    Oct 11 16:41:06 mini kernel: ghash_clmulni_intel sha512_ssse3 btusb btrtl btbcm aesni_intel btintel crypto_simd bluetooth cryptd agpgart ahci nvme i2c_piix4 ecdh_generic rapl i2c_core syscopyarea libahci k10temp ecc amd_sfh nvme_core sysfillrect ccp sysimgblt fb_sys_fops tpm_crb video tpm_tis tpm_tis_core wmi tpm backlight acpi_cpufreq button unix [last unloaded: igc]
    Oct 11 16:41:06 mini kernel: ---[ end trace 0000000000000000 ]---
    Oct 11 16:41:06 mini kernel: RIP: 0010:flush_smp_call_function_queue+0x64/0x83
    Oct 11 16:41:06 mini kernel: Code: e8 f6 f5 76 00 bf 01 00 00 00 e8 fb f4 ff ff 48 c7 c7 dd 59 10 82 e8 e0 f5 76 00 65 66 8b 05 ca e0 f2 7e 66 85 c0 74 05 e8 fd <45> f7 ff 0f ba e3 09 73 06 fb 0f 1f 44 00 00 5b e9 ae 34 b0 00 0f
    Oct 11 16:41:06 mini kernel: RSP: 0018:ffffc9000019fee8 EFLAGS: 00010647
    Oct 11 16:41:06 mini kernel: RAX: 0000000000000000 RBX: 0000000000000286 RCX: 00000000000f4240
    Oct 11 16:41:06 mini kernel: RDX: 0000000000000002 RSI: ffffffff821059dd RDI: ffffffff820ba9d5
    Oct 11 16:41:06 mini kernel: RBP: ffffffff823235a0 R08: ffff888712ded470 R09: ffff888712ded470
    Oct 11 16:41:06 mini kernel: R10: 0000000000000000 R11: 0000000000000075 R12: 0000000000000007
    Oct 11 16:41:06 mini kernel: R13: ffff88810090de80 R14: 0000000000000001 R15: 0000000000000000
    Oct 11 16:41:06 mini kernel: FS:  0000000000000000(0000) GS:ffff888712dc0000(0000) knlGS:0000000000000000
    Oct 11 16:41:06 mini kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Oct 11 16:41:06 mini kernel: CR2: 0000001e6780b000 CR3: 0000000154e22000 CR4: 0000000000350ee0
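
    When a saved syslog is long, the crash signature can be pulled out with a short filter. This is only a minimal sketch: it assumes the log was saved to a file such as syslog.txt, and the function name crash_summary is made up for illustration. It prints the first line of each kernel error report plus the RIP line that names the faulting function, with repeated lines removed.

    ```shell
    #!/bin/bash
    # crash_summary: hypothetical helper, not part of Unraid.
    # Prints kernel error headers (divide error, BUG:, Oops:) and the RIP
    # lines naming the faulting function, without the syslog prefix and
    # without duplicates.
    crash_summary() {
        local logfile="$1"
        grep -E 'kernel: (divide error|BUG:|Oops:|RIP:)' "$logfile" \
            | sed -E 's/^.* kernel: //' \
            | awk '!seen[$0]++'
    }

    # Usage: crash_summary syslog.txt
    ```

    The patterns only cover the error types that show up in the traces posted here; other kernel failures would need additional patterns.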

     

  4. 4 hours ago, JorgeB said:

    There are some reports of possible hangs due to OOM issues caused by the kernel failing to invoke the OOM killer. Create the small script below with the User Scripts plugin and schedule it to run hourly; it will output the memory stats to the syslog. Then see if there's anything abnormal in the persistent syslog.

     

    #!/bin/bash
    free -h |& logger &

     

    Thanks. Fresh start of the server and first run of the script:

    Oct  8 16:36:20 mini emhttpd: cmd: /usr/local/emhttp/plugins/user.scripts/backgroundScript.sh /tmp/user.scripts/tmpScripts/OOM_test_script_testing_crashes/script
    Oct  8 16:36:20 mini root:                total        used        free      shared  buff/cache   available
    Oct  8 16:36:20 mini root: Mem:            27Gi       8.9Gi       4.3Gi       160Mi        14Gi        17Gi
    Oct  8 16:36:20 mini root: Swap:             0B          0B          0B

    I will report back 👍
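
    A possible variant of the hourly script, in case the memory lines are hard to find between other syslog entries. This is only a sketch, not JorgeB's original: the tag memstats and the function names are arbitrary choices, and it additionally logs the MemAvailable value from /proc/meminfo as a single number that is easy to compare between runs.

    ```shell
    #!/bin/bash
    # Sketch of a tagged memory logger; "memstats" and the function names
    # are made-up choices, not anything Unraid or the plugin prescribes.

    # Print MemAvailable in kB from a meminfo-style file (default: /proc/meminfo).
    mem_available_kb() {
        awk '/^MemAvailable:/ {print $2}' "${1:-/proc/meminfo}"
    }

    # Send the output of free and the MemAvailable value to the syslog,
    # tagged so the entries can be filtered later.
    log_memstats() {
        free -h 2>&1 | logger -t memstats
        logger -t memstats "MemAvailable: $(mem_available_kb) kB"
    }

    # When scheduled as a user script, call the logger at the end:
    # log_memstats
    ```

    With the tag in place, the hourly entries can be filtered out of the persistent syslog by searching for memstats.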

  5. Another small update:

     

    Still no idea what's causing my server crashes on 6.12.4. I tried a couple of things:

    • disabled all dockers, only 2 VMs running: crash
    • no VMs, only a couple dockers running: random crash
    • no docker no VMs: crash
    • switched the Docker custom network type from macvlan to ipvlan even without having any macvlan call traces before: same issue: crash/freeze

     

    For all the crashes during the last couple of days the server was basically idle and the syslog server didn't catch anything.

     

    Any ideas?

     

  6. And it happened again. To me it looks like there is no mention of KVM this time. The server froze completely, no access at all.

     

     

    Dec 13 01:20:24 UNRAID  emhttpd: spinning down /dev/sdb
    Dec 13 01:30:23 UNRAID  emhttpd: read SMART /dev/sdb
    Dec 13 01:36:17 UNRAID kernel: BUG: unable to handle page fault for address: ffffffff8109fb3c
    Dec 13 01:36:17 UNRAID kernel: #PF: supervisor write access in kernel mode
    Dec 13 01:36:17 UNRAID kernel: #PF: error_code(0x0003) - permissions violation
    Dec 13 01:36:17 UNRAID kernel: PGD 220e067 P4D 220e067 PUD 220f063 PMD 10001e1 
    Dec 13 01:36:17 UNRAID kernel: Oops: 0003 [#1] PREEMPT SMP NOPTI
    Dec 13 01:36:17 UNRAID kernel: CPU: 41 PID: 0 Comm: swapper/41 Not tainted 5.19.17-Unraid #2
    Dec 13 01:36:17 UNRAID kernel: Hardware name: Gigabyte Technology Co., Ltd. TRX40 AORUS XTREME/TRX40 AORUS XTREME, BIOS F4d 03/05/2020
    Dec 13 01:36:17 UNRAID kernel: RIP: 0010:menu_reflect+0x25/0x43
    Dec 13 01:36:17 UNRAID kernel: Code: e9 b5 8a 57 00 0f 1f 44 00 00 41 54 41 89 f4 55 48 89 fd 53 48 c7 c3 20 b1 02 00 e8 13 6b 1a 00 89 c0 48 03 1c c5 e0 6a 16 82 <44> 89 65 10 c7 03 01 00 00 00 e8 47 b7 a6 ff 0f b6 c0 89 43 04 5b
    Dec 13 01:36:17 UNRAID kernel: RSP: 0018:ffffc9000044fed8 EFLAGS: 00010282
    Dec 13 01:36:17 UNRAID kernel: RAX: 0000000000000029 RBX: ffff88902da6b120 RCX: 0000000200000000
    Dec 13 01:36:17 UNRAID kernel: RDX: 0000000000000000 RSI: ffffffff820d7be1 RDI: ffffffff820d80c1
    Dec 13 01:36:17 UNRAID kernel: RBP: ffffffff8109fb2c R08: 0000000000000002 R09: 0000000000000002
    Dec 13 01:36:17 UNRAID kernel: R10: 0000000000000020 R11: 0000000000000072 R12: 0000000000000002
    Dec 13 01:36:17 UNRAID kernel: R13: ffff888100a80000 R14: 0000000000000002 R15: 0000000000000000
    Dec 13 01:36:17 UNRAID kernel: FS:  0000000000000000(0000) GS:ffff88902da40000(0000) knlGS:0000000000000000
    Dec 13 01:36:17 UNRAID kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Dec 13 01:36:17 UNRAID kernel: CR2: ffffffff8109fb3c CR3: 0000000619570000 CR4: 0000000000350ee0
    Dec 13 01:36:17 UNRAID kernel: Call Trace:
    Dec 13 01:36:17 UNRAID kernel: <TASK>
    Dec 13 01:36:17 UNRAID kernel: ? update_curr+0x24/0x14e
    Dec 13 01:36:17 UNRAID kernel: do_idle+0x191/0x1f5
    Dec 13 01:36:17 UNRAID kernel: cpu_startup_entry+0x1d/0x1f
    Dec 13 01:36:17 UNRAID kernel: start_secondary+0xeb/0xeb
    Dec 13 01:36:17 UNRAID kernel: secondary_startup_64_no_verify+0xce/0xdb
    Dec 13 01:36:17 UNRAID kernel: </TASK>
    Dec 13 01:36:17 UNRAID kernel: Modules linked in: nfsv3 nfs dm_mod dax xt_CHECKSUM ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod it87 hwmon_vid ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc ixgbe xfrm_algo mdio btusb btrtl btbcm btintel gigabyte_wmi wmi_bmof mxm_wmi bluetooth edac_mce_amd edac_core kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl ecdh_generic ecc corsair_psu ahci libahci ccp nvme i2c_piix4 input_leds led_class joydev nvme_core i2c_core k10temp thermal wmi button acpi_cpufreq unix [last unloaded: xfrm_algo]
    Dec 13 01:36:17 UNRAID kernel: CR2: ffffffff8109fb3c
    Dec 13 01:36:17 UNRAID kernel: ---[ end trace 0000000000000000 ]---
    Dec 13 01:36:17 UNRAID kernel: RIP: 0010:menu_reflect+0x25/0x43
    Dec 13 01:36:17 UNRAID kernel: Code: e9 b5 8a 57 00 0f 1f 44 00 00 41 54 41 89 f4 55 48 89 fd 53 48 c7 c3 20 b1 02 00 e8 13 6b 1a 00 89 c0 48 03 1c c5 e0 6a 16 82 <44> 89 65 10 c7 03 01 00 00 00 e8 47 b7 a6 ff 0f b6 c0 89 43 04 5b
    Dec 13 01:36:17 UNRAID kernel: RSP: 0018:ffffc9000044fed8 EFLAGS: 00010282
    Dec 13 01:36:17 UNRAID kernel: RAX: 0000000000000029 RBX: ffff88902da6b120 RCX: 0000000200000000
    Dec 13 01:36:17 UNRAID kernel: RDX: 0000000000000000 RSI: ffffffff820d7be1 RDI: ffffffff820d80c1
    Dec 13 01:36:17 UNRAID kernel: RBP: ffffffff8109fb2c R08: 0000000000000002 R09: 0000000000000002
    Dec 13 01:36:17 UNRAID kernel: R10: 0000000000000020 R11: 0000000000000072 R12: 0000000000000002
    Dec 13 01:36:17 UNRAID kernel: R13: ffff888100a80000 R14: 0000000000000002 R15: 0000000000000000
    Dec 13 01:36:17 UNRAID kernel: FS:  0000000000000000(0000) GS:ffff88902da40000(0000) knlGS:0000000000000000
    Dec 13 01:36:17 UNRAID kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Dec 13 01:36:17 UNRAID kernel: CR2: ffffffff8109fb3c CR3: 0000000619570000 CR4: 0000000000350ee0
    Dec 13 10:49:37 UNRAID  wsdd2[8836]: starting.

     

     

     

  7. 20 hours ago, PeZet said:

    Adding an extra PCIe card is not an option for me as I don't have space in my tiny PC case.

    Looks like the "Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller" is the only one on your board. Most boards these days have at least 2 controllers. Kinda bad if you don't have enough space for a simple PCIe USB card. Even a cheap x1 card should work.

     

    Another question: what PSU are you using? Maybe it's at its limit and small spikes are causing connected USB devices to drop.

     

    Depending on how old your board is, you might check the pins/contacts of the USB port in use. Clean them in case there is some corrosion. For some USB devices a small voltage drop is enough to make them unstable. I had an old external USB drive with 2 contacts that differed slightly in colour. Since I cleaned them with alcohol, the contact issues are gone. Just an idea.

  8. @PeZet Usually as long as you don't change settings in your BIOS or add/remove USB devices, the IDs are kinda static. If you have a VM, for example, with a passed-through USB controller which you often restart with changing USB devices connected, sooner or later you will see some IDs changing. As long as Unraid itself always sees the same devices without adding or removing devices, you should be OK.

     

    What you can do is pass through the USB controller that the Conbee is connected to. Keep in mind, all devices connected to it will be disconnected from Unraid and handed over to the VM. Maybe get an extra PCIe card for passthrough if you have no unused/ungrouped controllers.
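
    To see whether the IDs really stay the same across reboots, the device list can be saved and compared later. A rough sketch, assuming lsusb is available on the host; the function names and file paths are made up for illustration.

    ```shell
    #!/bin/bash
    # Hypothetical helper names; only lsusb, sort and diff are real tools.

    # Save the current USB device list, sorted so that later diffs are stable.
    usb_snapshot() {
        lsusb | sort > "$1"
    }

    # Compare two snapshots. Prints the differing lines; the exit status is
    # 0 when the lists are identical and non-zero when something changed.
    usb_compare() {
        diff "$1" "$2"
    }

    # Usage (paths are examples only):
    #   usb_snapshot /tmp/usb_before.txt
    #   ... reboot or restart the VM ...
    #   usb_snapshot /tmp/usb_after.txt
    #   usb_compare /tmp/usb_before.txt /tmp/usb_after.txt
    ```

    Each lsusb line contains the bus number, the device number and the vendor:product ID, so a change in any of them shows up in the diff.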

     

  9. I've noticed the same thing. Syslog in the WebUI starts with some entries with plugin update checks. Would be nice to have the full syslog since boot back in the WebUI.

     

    I guess this was set in the RC builds, where people reported log spam, to avoid the web interface breaking.

  10. @mmmeee15 I have noticed something similar. Not sure what triggers the cursor in the VNC window sometimes disappearing, but it often happens if I change the resolution inside the VM or change some graphics settings. For me it helped to change the cursor appearance in the guest OS itself to inverted or a dark one, and it will always show again. Somehow the cursor only disappears for me if the guest has it set to a white one, which is the default on most operating systems.

  11. 12 hours ago, Marshalleq said:

    I've been googling this one to see what the joke is - please share!

    It was the "best performing" AIO, specially made for first-gen Threadrippers. And indeed it performed really well for 3-4 months. During that time lots of people reported something growing in their rads, which ended in a failing cooling solution. I had 2 of them. One of the first gen, dead after 4 months, and an RMA unit, the V2 version, dead after 5 months.

     

    https://www.youtube.com/watch?v=nttKqzQiZEo

     

    If you search you will find lots of videos and forum threads of broken devices like in this video.
