emnclarke

Members
  • Posts: 9

Everything posted by emnclarke

  1. No crashes since! I am confident that /mnt/user was causing the issue.
  2. You should only need to change it for Organizr if you're just having issues with Organizr.
  3. I set my /config to /mnt/cache/... and I haven’t had a crash since. Try that if yours isn’t set that way.
  4. I'm not crazy! I've been testing more and it's not related to my RAM. I've narrowed it down to only happening when NZBGet is open. Any chance you use that as well? It happens specifically when flipping back and forth between Sonarr/Radarr and NZBGet. I can't seem to trigger it without NZBGet open.
  5. Thanks for the link! The symptoms seem similar enough that it may be related. I'll update after I've tried disabling XMP. It seems strange that it would only be an issue with one docker, though.
  6. Around late September, Unraid (6.9 latest beta, currently beta30) started locking up, and it takes a hard reboot to recover. Initially, Docker shuts down and most CPU cores go to 100%. Within 1-5 minutes, the Unraid UI stops responding and the server no longer responds to pings or SSH. Attempting to reboot or shut down from the UI while it's still responsive does not work; the server just enters the unresponsive state. A hard reset is the only way to fix this.

     I've determined it is extremely likely that it only happens while the Organizr docker is running. It possibly only happens while a browser has Organizr open, but I'm not 100% sure about that. I was having near-daily Unraid crashes, so I spent the last week with Organizr not running and had no crashes. Two nights ago I turned it back on (although I wasn't using it), and yesterday, almost immediately after I started using it, I had another crash.

     In the syslog, crashes always start with the following message or something very similar:

         Oct 29 10:25:30 Mercury kernel: BUG: kernel NULL pointer dereference, address: 0000000000000402
         Oct 29 10:25:30 Mercury kernel: #PF: supervisor read access in kernel mode
         Oct 29 10:25:30 Mercury kernel: #PF: error_code(0x0000) - not-present page
         Oct 29 10:25:30 Mercury kernel: PGD 0 P4D 0
         Oct 29 10:25:30 Mercury kernel: Oops: 0000 [#1] SMP NOPTI
         Oct 29 10:25:30 Mercury kernel: CPU: 6 PID: 118105 Comm: php-fpm7 Tainted: P O 5.8.13-Unraid #1
         Oct 29 10:25:30 Mercury kernel: Hardware name: Gigabyte Technology Co., Ltd. X399 AORUS Gaming 7/X399 AORUS Gaming 7, BIOS F12 12/11/2019
         Oct 29 10:25:30 Mercury kernel: RIP: 0010:fuse_readahead+0x124/0x352

     Does anyone have any ideas about what could be causing this, and any suggestions for how I could fix it so I can keep using Organizr? It's possible the issue has something to do with one of my other dockers being in an iframe, but I don't know why that would be an issue. Yesterday the crash happened while I was looking at NZBGet, NZBHydra, and Radarr v3.

     I posted this on the Organizr Discord, but they seem to think it's an Unraid issue since there are no other reports of similar behaviour. I have a number of Unraid plugins and other dockers running, although I've managed to trigger a crash with most dockers and some plugins disabled. I've confirmed it's not the Unraid Nvidia build (crashes happen on stock). I've also disabled the cachedir plugin, which may have been causing some other issues, but crashes still happen.

     If it is Organizr causing the crashes, how can I prevent a docker from taking down my whole system? Is there perhaps some obscure conflict I'm not aware of? I appreciate any suggestions and can provide any additional info I missed. Thanks so much for any help!

     I've attached diagnostics and the full syslog for yesterday. I've also run several memtests without error. Also attached is a list of hardware and plugins.

     Tagging per request from the Organizr Discord: @Roxedus @tronyx

     Attachments: mercury-diagnostics-20201030-1740.zip, syslog2020-10-29 copy.txt, hardware.txt, plugins.txt
  7. Just had my fastest crash after a reboot and it was on a stock build. Thanks for reminding me to never make assumptions. Guess I have to keep looking for a possible cause...
  8. I'm having multiple Unraid crashes a week, and I believe they're related to the Nvidia plugin. Every crash that I have a syslog for starts with this:

         Oct 12 21:35:22 Mercury kernel: BUG: kernel NULL pointer dereference, address: 0000000000000402
         Oct 12 21:35:22 Mercury kernel: #PF: supervisor read access in kernel mode
         Oct 12 21:35:22 Mercury kernel: #PF: error_code(0x0000) - not-present page
         Oct 12 21:35:22 Mercury kernel: PGD 0 P4D 0
         Oct 12 21:35:22 Mercury kernel: Oops: 0000 [#1] SMP NOPTI
         Oct 12 21:35:22 Mercury kernel: CPU: 9 PID: 97523 Comm: php-fpm7 Tainted: P O 5.8.12-Unraid #1

     When this happens I can still log in for a variable amount of time (about 10 minutes at most) and some things still work, such as changing settings, but Docker and VMs have always gone down. Inevitably the server crashes completely and I cannot even ping or SSH into the machine, so a hard reset is required.

     I believe it's related to the Nvidia plugin because of a repeated line that mentions Nvidia:

         Oct 12 21:35:22 Mercury kernel: Modules linked in: veth nvidia_uvm(O) xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost vhost_iotlb tap xt_nat iptable_filter xfs md_mod it87 hwmon_vid iptable_nat xt_MASQUERADE nf_nat ip_tables wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 libchacha poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic bonding nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) edac_mce_amd crc32_pclmul rapl aesni_intel drm_kms_helper glue_helper btusb btrtl btbcm crypto_simd btintel ghash_clmulni_intel cryptd drm bluetooth kvm backlight syscopyarea sysfillrect sysimgblt fb_sys_fops wmi_bmof mxm_wmi ecdh_generic agpgart ahci alx crct10dif_pclmul mpt3sas nvme i2c_piix4 ecc crc32c_intel k10temp ccp i2c_core libahci mdio raid_class nvme_core button scsi_transport_sas wmi acpi_cpufreq

     I've also seen a couple of Reddit threads with similar-looking (though not identical) logs, but none of the suggested fixes worked. I can't seem to find the threads again, otherwise I would have linked them. I've attached the full syslog; any help would be greatly appreciated. Thanks!

     edit: A little more info: I'm not using the Plex script, but I do run nvidia-smi -pm 1 on first boot to fix the power state issues.

     edit 2: Just had an identical crash on the stock build, so disregard this.

     Attachment: syslog.txt
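
For anyone comparing traces like the ones quoted in these posts, here is a minimal Python sketch that scans syslog lines for the "kernel NULL pointer dereference" header and pulls out the process name (Comm) and the faulting function (RIP) that follow it. The function name `summarize_oopses`, the regular expressions, and the 20-line look-ahead window are all assumptions made for illustration; they are based only on the log lines shown above, and other kernel versions may format these lines differently.

```python
import re

# Patterns modelled on the syslog lines quoted in the posts above (assumption:
# other kernels or log daemons may format these lines differently).
OOPS_RE = re.compile(
    r"kernel: BUG: kernel NULL pointer dereference, address: (?P<addr>[0-9a-f]+)"
)
COMM_RE = re.compile(r"kernel: CPU: \d+ PID: \d+ Comm: (?P<comm>\S+)")
RIP_RE = re.compile(r"kernel: RIP: \d+:(?P<func>[^+\s]+)\+")


def summarize_oopses(lines, window=20):
    """Return one dict per oops header with its address, Comm and RIP function.

    `window` is a hypothetical look-ahead limit: only that many lines after
    each header are searched for the Comm and RIP lines.
    """
    lines = list(lines)
    results = []
    for i, line in enumerate(lines):
        header = OOPS_RE.search(line)
        if header is None:
            continue
        entry = {"address": header.group("addr"), "comm": None, "function": None}
        for follow in lines[i + 1 : i + 1 + window]:
            if OOPS_RE.search(follow):
                break  # the next oops starts here; stop scanning this one
            if entry["comm"] is None:
                comm = COMM_RE.search(follow)
                if comm:
                    entry["comm"] = comm.group("comm")
            if entry["function"] is None:
                rip = RIP_RE.search(follow)
                if rip:
                    entry["function"] = rip.group("func")
        results.append(entry)
    return results
```

Calling `summarize_oopses(open("syslog.txt"))` on a saved log (the file name is just an example) returns a list that makes it easy to see whether every crash names the same process and function. In the Oct 29 trace above, the RIP line points at fuse_readahead rather than at an Nvidia module, and since Unraid user shares under /mnt/user are FUSE mounts, that would be consistent with the later finding that moving /config off /mnt/user stopped the crashes.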