lp0101

Everything posted by lp0101

  1. I ended up solving this, finally. The solution was to turn off SMB on the affected shares. It looks like having both SMB and NFS enabled on the same share somehow led to the errors.
  2. Unfortunately, I use hardlinks pretty extensively, so 1 isn't an option. 2 isn't great either, since I do frequent 1+ GB/s transfers on this share. I don't think it's 3 - I have the same issue across multiple machines with multiple Linux OSes. Right now, all 3 are Ubuntu 22.04 machines, but still. Diagnostics attached. barad-dur-diagnostics-20231022-1710.zip
  3. Output of `mount -v`: Looks like it's NFSv4. I've also tried without setting the flag, same behaviour - that was when mounting with fstab instead of autofs. For completeness' sake, this is the autofs config I'm using to mount it:
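     (The config itself didn't survive in this archive. For illustration only, an autofs setup for an NFS mount like the one described in the next post usually consists of a master-map entry plus a map file; the file names, paths, and options below are assumptions, not the poster's actual config.)

     ```
     # /etc/auto.master.d/nfs.autofs  (hypothetical file name)
     /mnt/autofs  /etc/auto.nfs

     # /etc/auto.nfs  (hypothetical map file)
     media  -rw,soft,vers=4.2  192.168.10.120:/mnt/user/media
     ```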
  4. I've been trying to get NFS to play nice with unRAID for a while, and I'm really not sure where to go from here. All the shares I'm sharing are user shares with `Cache: Yes`. I'm sharing them from my unRAID box to 3 Kubernetes nodes, mounted with autofs. This is a sample output of nfsstat to show it's using the correct version:

     /mnt/autofs/media from 192.168.10.120:/mnt/user/media
     Flags: rw,relatime,vers=4.2,rsize=8192,wsize=8192,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.10.171,local_lock=none,addr=192.168.10.120

     I've also set my `fuse_remember` tunable to -1, which in theory should solve the stale file handle issues, but it doesn't. Whenever I add files to this share, once the mover runs, I get the stale file handle error on my nodes. Applications (containers) continue to work, but if they're restarted, they fail to start with the stale file handle error. As usual, the fix is to unmount and remount, which involves stopping all the applications that are using the share. Is there any advice on possible other fixes? Any flags I should add to the mount?
  5. I'm having some weird issues with speed. I have a gigabit connection. When starting a large torrent, the speed will shoot up to ~80-100 MB/s no problem, stay there for a few seconds, and then plummet back down to the single digits, sometimes as low as 2 MB/s. It'll stay there for the rest of the download, never speeding up. This happens even with slower torrents: a bad torrent might start at around 15 MB/s, then also drop to the single digits after a few percent have finished. I'm using WireGuard with Mullvad. I can confirm the server I'm connecting to is capable of gigabit speeds, so I'm not really sure what the issue is. Connection status is green, so the port is properly forwarded. I also set the protocol to TCP only, but no luck with either TCP or TCP/uTP.

     Edit: Just bought a month of PIA, same issue. The speed will shoot up, then drop significantly.

     Edit 2: Just tested a torrent on my desktop using the same WireGuard config. It's not as aggressive, but it still dropped from 80 MB/s to ~20 MB/s. So I guess it isn't an unRAID-specific issue. Maybe something to do with my ISP? Weird that VPN traffic would be throttled, though.
  6. Interesting. The bug report forum probably isn't the best place to discuss this, then, but what is the recommended way to move a file from a cached share to a non-cached share? Is it recommended to first run a copy, then delete the original?
  7. Say I have two user shares, a and b. a has its cache set to "yes," b has its cache set to "no." I have a file, foo, on a which is currently on the cache, and I run the following command: `mv /mnt/user/a/foo /mnt/user/b/` foo is instantly moved to share b; however, the file itself is still on the cache. If I check the contents of my cache drive, I see `/mnt/cache/b/foo`. Since the cache behaviour of share b is set to "no," this file is now stuck on the cache, unless I manually set b's cache to "yes," invoke the mover, then set it back to "no." Without this manual intervention, my cache drive would fill up endlessly. Is this intended behaviour, or a bug?
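     The copy-then-delete workaround discussed in these two posts can be sketched as a small shell helper. `safe_move` is a hypothetical name (nothing like it ships with unRAID), and the paths in the usage comment are the ones from the post:

     ```shell
     # Copy a file, verify the copy, then delete the original.
     # Avoids `mv` leaving the data stranded on the cache when
     # moving between user shares with different cache settings.
     safe_move() {
       src=$1
       dstdir=$2
       cp -a "$src" "$dstdir"/ || return 1
       cmp -s "$src" "$dstdir/$(basename "$src")" || return 1
       rm "$src"
     }

     # Usage (paths from the post):
     # safe_move /mnt/user/a/foo /mnt/user/b
     ```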
  8. We're on the forum thread for the unraid nvidia plugin. I figured that was implied.
  9. What versions of the nvidia drivers does 6.9.0 beta30 use?
  10. Happened again tonight. Server was actually not doing anything when it crashed, completely idle. Woke up to the same error.
  11. Bump. Would love to move more essential services to my unraid box, but I'm not comfortable doing that until I know what causes this
  12. You don't have `--runtime=nvidia` in the right place. You're passing it as a container env var, which does nothing. Enable "advanced view" in the top right, and you'll see a new field called "extra parameters" under WebUI. Enter it there. It should look like this. Note that `--gpus all` and `--runtime=nvidia` both work.
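     For context, the "extra parameters" field is appended to the `docker run` command unRAID builds for the container, so the effective invocation looks roughly like the following. The image name is a placeholder; `NVIDIA_VISIBLE_DEVICES=all` is the env var the Nvidia runtime uses to expose GPUs:

     ```
     docker run --runtime=nvidia \
       -e NVIDIA_VISIBLE_DEVICES=all \
       some/jellyfin-image:latest
     ```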
  13. I'm running unRAID on an Intel X99 platform. Occasionally, during somewhat heavy loads from VMs/containers (most recently, scanning a whole library for metadata in the Jellyfin docker), I'll get hit with a crash. The only way to recover from this is to hard reboot the machine. Below is the log. (Note: the log is reversed - read it from bottom to top. It's just how my syslog server gave it to me.)

     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,CR2: 000014d1aebde718 CR3: 0000000001e0a002 CR4: 00000000001606e0
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,FS: 0000000000000000(0000) GS:ffff88881fa00000(0000) knlGS:0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,R13: ffff88881f41a000 R14: 0000000000000002 R15: 0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,R10: ffffc90003ba3a48 R11: 0000000000000001 R12: 0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RBP: ffff88881f41a020 R08: ffffc90003ba3cb3 R09: 0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RDX: ffffc90003ba3a50 RSI: 0000000000000000 RDI: ffffea0002e89e80
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RAX: 00000000ffffffea RBX: ffffea0002e89e88 RCX: ffffc90003ba3990
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RSP: 0018:ffffc90003ba3960 EFLAGS: 00010082
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,Code: 53 08 48 89 1a eb 25 48 8b 43 08 48 8b 3b 48 89 47 08 48 89 38 48 8b 45 00 48 89 58 08 48 89 03 48 89 6b 08 48 89 5d 00 eb 02 <0f> 0b 49 ff c7 4c 89 d8 4d 39 dc 49 0f 43 c4 48 3b 04 24 0f 82 cb
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RIP: 0010:isolate_lru_pages.isra.0+0x18b/0x2b9
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,---[ end trace 3842a02541499cc3 ]---
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,Modules linked in: vhost_net tun vhost tap kvm_intel kvm md_mod nvidia_uvm(O) nfsv3 nfs lockd grace xt_CHECKSUM sunrpc ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables xt_nat veth ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs bonding nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) mxm_wmi wmi_bmof intel_wmi_thunderbolt crc32_pclmul intel_rapl_perf intel_uncore pcbc aesni_intel aes_x86_64 glue_helper crypto_simd ghash_clmulni_intel cryptd intel_cstate drm_kms_helper coretemp crct10dif_pclmul intel_powerclamp crc32c_intel drm x86_pkg_temp_thermal syscopyarea sysfillrect sysimgblt fb_sys_fops e1000e i2c_i801 agpgart i2c_core ahci libahci wmi pcc_cpufreq button [last unloaded: md_mod]
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,ret_from_fork+0x35/0x40
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? kthread_park+0x89/0x89
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,kthread+0x10c/0x114
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? collapse_shmem+0xacd/0xacd
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? wait_woken+0x6a/0x6a
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,khugepaged+0xa67/0x1829
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? pagevec_lru_move_fn+0xaa/0xb9
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? __lru_cache_add+0x51/0x51
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,__alloc_pages_nodemask+0x423/0xae1
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,try_to_free_pages+0xb2/0xcd
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,do_try_to_free_pages+0x1a1/0x300
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,shrink_node+0xf1/0x3cb
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? compaction_suitable+0x25/0x61
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? compaction_suitable+0x25/0x61
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? __compaction_suitable+0x77/0x96
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,shrink_node_memcg+0x4c4/0x64a
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,? move_to_new_page+0x169/0x21b
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,shrink_inactive_list+0xd8/0x47e
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,Call Trace:
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,CR2: 000014d1aebde718 CR3: 0000000001e0a002 CR4: 00000000001606e0
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,FS: 0000000000000000(0000) GS:ffff88881fa00000(0000) knlGS:0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,R13: ffff88881f41a000 R14: 0000000000000002 R15: 0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,R10: ffffc90003ba3a48 R11: 0000000000000001 R12: 0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RBP: ffff88881f41a020 R08: ffffc90003ba3cb3 R09: 0000000000000000
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RDX: ffffc90003ba3a50 RSI: 0000000000000000 RDI: ffffea0002e89e80
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RAX: 00000000ffffffea RBX: ffffea0002e89e88 RCX: ffffc90003ba3990
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RSP: 0018:ffffc90003ba3960 EFLAGS: 00010082
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,Code: 53 08 48 89 1a eb 25 48 8b 43 08 48 8b 3b 48 89 47 08 48 89 38 48 8b 45 00 48 89 58 08 48 89 03 48 89 6b 08 48 89 5d 00 eb 02 <0f> 0b 49 ff c7 4c 89 d8 4d 39 dc 49 0f 43 c4 48 3b 04 24 0f 82 cb
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,RIP: 0010:isolate_lru_pages.isra.0+0x18b/0x2b9
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,"Hardware name: ASUS All Series/X99-A, BIOS 4101 07/10/2019"
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,CPU: 8 PID: 349 Comm: khugepaged Tainted: P O 4.19.107-Unraid #1
     2020-07-07,21:25:56,Warning,barad-dur,kern,kernel,invalid opcode: 0000 [#1] SMP PTI
     2020-07-07,21:25:56,crit,barad-dur,kern,kernel,kernel BUG at mm/vmscan.c:1703!
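     The crashing thread in the trace above is khugepaged, and the BUG fires in the LRU page-isolation path, so transparent hugepage (THP) compaction is implicated. A common mitigation to test (an assumption on my part, not a confirmed fix for this trace) is disabling THP at runtime via sysfs:

     ```
     # Check the current THP mode; the bracketed value is active
     cat /sys/kernel/mm/transparent_hugepage/enabled

     # Disable THP until the next reboot (run as root)
     echo never > /sys/kernel/mm/transparent_hugepage/enabled
     ```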
  14. I'm passing through a GTX 1660 to an Ubuntu guest. It's the primary and only GPU on the unRAID server. I had to modify the XML for the VM, as the Nvidia USB and serial bus controllers weren't visible under "Other PCI devices." After doing so, I'm able to boot into the machine just fine. However, running `lspci` shows only the USB and serial bus controllers - no GPU or sound card. I tried installing Nvidia drivers in the guest; still, the card isn't detected. I've also tried the vbios available online, which I modified as per SpaceInvaderOne's guide. This is the config for my machine:

     <?xml version='1.0' encoding='UTF-8'?>
     <domain type='kvm'>
       <name>gondor</name>
       <uuid>ae3dc2c7-26b0-3a9e-2b95-24aa81e4888e</uuid>
       <metadata>
         <vmtemplate xmlns="unraid" name="Ubuntu" icon="ubuntu.png" os="ubuntu"/>
       </metadata>
       <memory unit='KiB'>12582912</memory>
       <currentMemory unit='KiB'>12582912</currentMemory>
       <memoryBacking>
         <nosharepages/>
       </memoryBacking>
       <vcpu placement='static'>10</vcpu>
       <cputune>
         <vcpupin vcpu='0' cpuset='1'/>
         <vcpupin vcpu='1' cpuset='7'/>
         <vcpupin vcpu='2' cpuset='2'/>
         <vcpupin vcpu='3' cpuset='8'/>
         <vcpupin vcpu='4' cpuset='3'/>
         <vcpupin vcpu='5' cpuset='9'/>
         <vcpupin vcpu='6' cpuset='4'/>
         <vcpupin vcpu='7' cpuset='10'/>
         <vcpupin vcpu='8' cpuset='5'/>
         <vcpupin vcpu='9' cpuset='11'/>
       </cputune>
       <os>
         <type arch='x86_64' machine='pc-q35-4.2'>hvm</type>
         <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
         <nvram>/etc/libvirt/qemu/nvram/ae3dc2c7-26b0-3a9e-2b95-24aa81e4888e_VARS-pure-efi.fd</nvram>
       </os>
       <features>
         <acpi/>
         <apic/>
       </features>
       <cpu mode='host-passthrough' check='none'>
         <topology sockets='1' cores='5' threads='2'/>
         <cache mode='passthrough'/>
       </cpu>
       <clock offset='utc'>
         <timer name='rtc' tickpolicy='catchup'/>
         <timer name='pit' tickpolicy='delay'/>
         <timer name='hpet' present='no'/>
       </clock>
       <on_poweroff>destroy</on_poweroff>
       <on_reboot>restart</on_reboot>
       <on_crash>restart</on_crash>
       <devices>
         <emulator>/usr/local/sbin/qemu</emulator>
         <disk type='file' device='disk'>
           <driver name='qemu' type='raw' cache='writeback'/>
           <source file='/mnt/user/domains/gondor/vdisk1.img'/>
           <target dev='hdc' bus='virtio'/>
           <boot order='1'/>
           <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
         </disk>
         <disk type='file' device='cdrom'>
           <driver name='qemu' type='raw'/>
           <source file='/mnt/user/isos/ubuntu-20.04-live-server-amd64.iso'/>
           <target dev='hda' bus='sata'/>
           <readonly/>
           <boot order='2'/>
           <address type='drive' controller='0' bus='0' target='0' unit='0'/>
         </disk>
         <controller type='usb' index='0' model='ich9-ehci1'>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
         </controller>
         <controller type='usb' index='0' model='ich9-uhci1'>
           <master startport='0'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
         </controller>
         <controller type='usb' index='0' model='ich9-uhci2'>
           <master startport='2'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
         </controller>
         <controller type='usb' index='0' model='ich9-uhci3'>
           <master startport='4'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
         </controller>
         <controller type='pci' index='0' model='pcie-root'/>
         <controller type='pci' index='1' model='pcie-root-port'>
           <model name='pcie-root-port'/>
           <target chassis='1' port='0x8'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
         </controller>
         <controller type='pci' index='2' model='pcie-root-port'>
           <model name='pcie-root-port'/>
           <target chassis='2' port='0x9'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
         </controller>
         <controller type='pci' index='3' model='pcie-root-port'>
           <model name='pcie-root-port'/>
           <target chassis='3' port='0xa'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
         </controller>
         <controller type='pci' index='4' model='pcie-root-port'>
           <model name='pcie-root-port'/>
           <target chassis='4' port='0x13'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
         </controller>
         <controller type='pci' index='5' model='pcie-root-port'>
           <model name='pcie-root-port'/>
           <target chassis='5' port='0x14'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
         </controller>
         <controller type='pci' index='6' model='pcie-root-port'>
           <model name='pcie-root-port'/>
           <target chassis='6' port='0xb'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
         </controller>
         <controller type='pci' index='7' model='pcie-root-port'>
           <model name='pcie-root-port'/>
           <target chassis='7' port='0xc'/>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
         </controller>
         <controller type='virtio-serial' index='0'>
           <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
         </controller>
         <controller type='sata' index='0'>
           <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
         </controller>
         <interface type='bridge'>
           <mac address='52:54:00:8a:04:53'/>
           <source bridge='br0'/>
           <model type='virtio'/>
           <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
         </interface>
         <serial type='pty'>
           <target type='isa-serial' port='0'>
             <model name='isa-serial'/>
           </target>
         </serial>
         <console type='pty'>
           <target type='serial' port='0'/>
         </console>
         <channel type='unix'>
           <target type='virtio' name='org.qemu.guest_agent.0'/>
           <address type='virtio-serial' controller='0' bus='0' port='1'/>
         </channel>
         <input type='tablet' bus='usb'>
           <address type='usb' bus='0' port='1'/>
         </input>
         <input type='mouse' bus='ps2'/>
         <input type='keyboard' bus='ps2'/>
         <hostdev mode='subsystem' type='pci' managed='yes'>
           <driver name='vfio'/>
           <source>
             <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
           </source>
           <rom file='/mnt/user/vbios/MSI.GTX1660.rom'/>
           <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
         </hostdev>
         <hostdev mode='subsystem' type='pci' managed='yes'>
           <driver name='vfio'/>
           <source>
             <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
           </source>
           <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
         </hostdev>
         <hostdev mode='subsystem' type='pci' managed='yes'>
           <driver name='vfio'/>
           <source>
             <address domain='0x0000' bus='0x01' slot='0x00' function='0x2'/>
           </source>
           <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
         </hostdev>
         <hostdev mode='subsystem' type='pci' managed='yes'>
           <driver name='vfio'/>
           <source>
             <address domain='0x0000' bus='0x01' slot='0x00' function='0x3'/>
           </source>
           <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
         </hostdev>
         <memballoon model='none'/>
       </devices>
     </domain>
  15. Happened again, this time after 6 hours of uptime. I was able to get to the server while it was down this time - I can confirm it had no network access, and the syslog does confirm the NIC is going down. Then it just came back after about 2 minutes. Here's the latest syslog. isengard-syslog-20200704-0522.zip
  16. tower-diagnostics-20200703-1539.zip tower-syslog-20200703-2011.zip After about 2 hours of uptime, my unRAID server becomes unresponsive over the network and can't be reached. The last time this happened, I went to the box itself and plugged in a keyboard. It was unresponsive for a few seconds, but recovered, and also came back on the network. I've attached the diagnostics zip. Specifically, I see a call trace near the end, with the eth0 network going down. Edit: The call trace: https://pastebin.com/0TGtuFDm