JesterEE

  1. With the Unraid 6.12 series on Linux kernel 6.1 natively, I decided to finally revisit this topic with my update to 6.12.8. After the OS update, I checked the lspci output to see if the OS was assigning the correct memory size allocation for my ASUS KO GeForce RTX 3070 V2 OC Edition 8GB. I was pleasantly surprised that, without doing anything, it was assigning the resource space to the maximum video memory allotment my card is able to provide (i.e. 8GB) (see full lspci output at bottom of this post).

         # lspci -vvvs 0c:00.0
         0c:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] (rev a1) (prog-if 00 [VGA controller])
             Subsystem: ASUSTeK Computer Inc. GA104 [GeForce RTX 3070 Lite Hash Rate]
             Capabilities: [bb0 v1] Physical Resizable BAR
                 BAR 0: current size: 16MB, supported: 16MB
                 BAR 1: current size: 8GB, supported: 64MB 128MB 256MB 512MB 1GB 2GB 4GB 8GB
                 BAR 3: current size: 32MB, supported: 32MB

     Note the BAR 1 size is set to 8GB. Before the kernel update (and with the kernel patch referenced in the earlier pages of this thread), it was set to a default of 256MB. All is looking good so far!

     I followed these baseline steps:

       ✅ Host BIOS UEFI w/o CSM
       ✅ Host BIOS Enable ReBAR support
       ✅ Host BIOS Enable 4G Decoding
       ⬛ Enable & Boot Custom Kernel syslinux configuration (near beginning of this thread): not needed anymore

     Before modifying my Windows 10 Pro VM configuration, I booted up the VM to see if anything was needed for the Guest OS to recognize ReBAR. I did make sure my guest BIOS was set to OVMF TPM (regular OVMF provided the same result as shown below though). Windows booted without issue and I ran both GPU-Z 2.57.0 and the NVIDIA Control Panel to check ReBAR support.

     This is what I saw: GPU-Z reported ReBAR as Enabled, but when I went into the Advanced settings, 4G Decode was shown as Disabled in BIOS.
     NVIDIA Control Panel shows ReBAR as Enabled and shows it's correctly allocating 8GB of dedicated video memory with an additional 16GB of shared memory for 24GB total. If I close the apps and relaunch them, GPU-Z reports differently, showing ReBAR as Disabled with the same advanced details (NVIDIA Control Panel keeps reporting ReBAR Enabled with the same details).

     I shut down the VM and tried the XML edits noted in this thread and other online spaces talking about VFIO ReBAR:

         <domain type='kvm'>
         ➡️
         <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>

         <qemu:commandline>
           <qemu:arg value='-fw_cfg'/>
           <qemu:arg value='opt/ovmf/X-PciMmio64Mb,string=65536'/>
         </qemu:commandline>

     After relaunching the VM, I found the results to be the same. So, this is interesting in that the XML may not be required for ReBAR anymore either. However, since I'm getting inconsistent reporting from GPU-Z and the NVIDIA Control Panel, I can't be sure. I think I trust NVIDIA Control Panel more than GPU-Z on this one, even though GPU-Z has never steered me wrong in the past. I figure the hardware vendor's own driver software probably knows better, and GPU-Z is looking at some inconsistent information and reporting incorrectly. But I think a synthetic benchmark to test Host BIOS setting differences (ReBAR and 4G Decoding On vs Off) is probably called for in this scenario. I'll report some of that in a follow-up post.

     Anyone else see something similar to what I'm seeing and have verified ReBAR functional in their VM?

     -JesterEE

         # lspci -vvvs 0c:00.0
         0c:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] (rev a1) (prog-if 00 [VGA controller])
             Subsystem: ASUSTeK Computer Inc. GA104 [GeForce RTX 3070 Lite Hash Rate]
             Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
             Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
             Latency: 0, Cache Line Size: 64 bytes
             Interrupt: pin A routed to IRQ 141
             IOMMU group: 30
             Region 0: Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
             Region 1: Memory at 7c00000000 (64-bit, prefetchable) [size=8G]
             Region 3: Memory at 7e00000000 (64-bit, prefetchable) [size=32M]
             Region 5: I/O ports at f000 [size=128]
             Expansion ROM at fc000000 [disabled] [size=512K]
             Capabilities: [60] Power Management version 3
                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                 Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
             Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
                 Address: 00000000fee00000 Data: 0000
             Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
                     ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
                 DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                     RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
                     MaxPayload 256 bytes, MaxReadReq 512 bytes
                 DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                 LnkCap: Port #0, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
                     ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                 LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                     ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                 LnkSta: Speed 2.5GT/s (downgraded), Width x8 (downgraded)
                     TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                 DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR-
                     10BitTagComp+ 10BitTagReq+ OBFF Via message, ExtFmt- EETLPPrefix-
                     EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                     FRS-
                     AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
                     AtomicOpsCtl: ReqEn-
                 LnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
                 LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
                     Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                     Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                 LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
                     EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                     Retimer- 2Retimers- CrosslinkRes: unsupported
             Capabilities: [b4] Vendor Specific Information: Len=14 <?>
             Capabilities: [100 v1] Virtual Channel
                 Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
                 Arb: Fixed- WRR32- WRR64- WRR128-
                 Ctrl: ArbSelect=Fixed
                 Status: InProgress-
                 VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                     Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                     Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                     Status: NegoPending- InProgress-
             Capabilities: [258 v1] L1 PM Substates
                 L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
                     PortCommonModeRestoreTime=255us PortTPowerOnTime=10us
                 L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                     T_CommonMode=0us LTR1.2_Threshold=0ns
                 L1SubCtl2: T_PwrOn=10us
             Capabilities: [128 v1] Power Budgeting <?>
             Capabilities: [420 v2] Advanced Error Reporting
                 UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                 CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                 CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                 AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
                     MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                 HeaderLog: 00000000 00000000 00000000 00000000
             Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
             Capabilities: [900 v1] Secondary PCI Express
                 LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                 LaneErrStat: 0
             Capabilities: [bb0 v1] Physical Resizable BAR
                 BAR 0: current size: 16MB, supported: 16MB
                 BAR 1: current size: 8GB, supported: 64MB 128MB 256MB 512MB 1GB 2GB 4GB 8GB
                 BAR 3: current size: 32MB, supported: 32MB
             Capabilities: [c1c v1] Physical Layer 16.0 GT/s <?>
             Capabilities: [d00 v1] Lane Margining at the Receiver <?>
             Capabilities: [e00 v1] Data Link Feature <?>
             Kernel driver in use: vfio-pci
             Kernel modules: nvidia_drm, nvidia
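The host-side check in the post above comes down to reading one line out of the lspci output. A minimal sketch of pulling out just that value, assuming the "BAR N: current size: ..." wording shown in the post; `rebar_current_size` is a hypothetical helper name, not part of lspci or Unraid:

```shell
#!/bin/sh
# Print the "current size" reported for one BAR in `lspci -vv` text read on stdin.
# Assumes the line format shown in the post ("BAR 1: current size: 8GB, ...").
rebar_current_size() {
    bar="$1"
    # Keep only the size token that follows "BAR <n>: current size:".
    sed -n "s/.*BAR ${bar}: current size: \([0-9]*[KMGT]B\).*/\1/p" | head -n 1
}

# Example (run as root so lspci can read the extended capabilities):
#   lspci -vvs 0c:00.0 | rebar_current_size 1
```

Comparing this value before and after a kernel or BIOS change gives a quick way to see whether the host is still assigning the full BAR, independent of what the guest tools report.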
  2. Has this plugin been delisted from CA? Fix Common Problems is showing this after upgrade to 6.12.8 from 6.11.
  3. Here is a link to the libtorrent bug tracker for this issue: https://github.com/arvidn/libtorrent/issues/6952
  4. Been running it continuously since my 11/18/2022 post. No issues.
  5. Yes.
       Above 4G Decoding: Enabled
       Resize BAR Support: Auto (other option is Disabled)
  6. Started using a new video card in Unraid this week and noticed that the card name and PCIe Gen columns on the first line overlap for my card, which has a long name. Can the card name be truncated depending on the width of the window (and subsequently the column)?
  7. Yup, messed that up in the copypasta while experimenting. Anyway, not a big deal... it works for me if I want to set the ReBAR to accepted values at or below the default 256MB (for my card [64MB, 128MB, 256MB]), but it will not set them higher (for my card [512MB, 1GB, 2GB, 4GB, 8GB]). If I try to set it to a value lower than 64MB or higher than 256MB, I get this error:

         # -bash: echo: write error: Device or resource busy

     Here is the memory allocation info for my card:

         # lspci -vvvs 0b:00.0
         0b:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] (rev a1) (prog-if 00 [VGA controller])
         ...
             Region 0: Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
             Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
             Region 3: Memory at c8000000 (64-bit, prefetchable) [size=32M]
         ...
             Physical Resizable BAR
                 BAR 0: current size: 16MB, supported: 16MB
                 BAR 1: current size: 256MB, supported: 64MB 128MB 256MB 512MB 1GB 2GB 4GB 8GB
                 BAR 3: current size: 32MB, supported: 32MB

     Thanks for publishing the patch and modified kernel, even though it didn't work for me completely. Hope others give it a shot too and report their mileage.
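To see which sizes the kernel itself thinks the BAR can take, the same sysfs file can be read instead of written. A minimal sketch, assuming the behaviour described in the kernel's sysfs PCI ABI documentation as I understand it (reading resourceN_resize returns a hex bitmap where bit 0 = 1MB, bit 1 = 2MB, and so on); `rebar_supported_mb` is a hypothetical helper name:

```shell
#!/bin/sh
# Decode a resourceN_resize bitmap (hex string, as read from sysfs) into
# a space-separated list of supported BAR sizes in MB.
# Assumption: bit n set means a size of 2^n MB is supported.
rebar_supported_mb() {
    bitmap=$((0x$1))
    bit=0
    out=""
    while [ "$bitmap" -gt 0 ]; do
        if [ $((bitmap & 1)) -eq 1 ]; then
            # Append the size for this bit, space-separated.
            out="$out${out:+ }$((1 << bit))"
        fi
        bitmap=$((bitmap >> 1))
        bit=$((bit + 1))
    done
    echo "$out"
}

# Example, using the device address from the post:
#   rebar_supported_mb "$(cat /sys/bus/pci/devices/0000:0b:00.0/resource1_resize)"
```

If the list printed here stops at 256 while lspci advertises up to 8GB, that would suggest the limit is on the host side (bridge window or firmware allocation) rather than in the card.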
  8. No, I changed the addresses on my side, but I posted the command so it could easily be referenced from your post. Here is my version:

         #!/bin/bash
         # Unbind the GPU from vfio-pci so the BAR can be resized
         echo -n "0000:0b:00.0" > /sys/bus/pci/drivers/vfio-pci/unbind
         # Request the new BAR 1 size
         echo 14 > /sys/bus/pci/devices/0000\:0b\:00.0/resource1_resize # <<<< Gets stuck here
         # Hand the GPU back to vfio-pci
         echo -n "10de 2488" > /sys/bus/pci/drivers/vfio-pci/new_id || echo -n "0000:0b:00.0" > /sys/bus/pci/drivers/vfio-pci/bind
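For reference, a minimal sketch of how the number written to resource1_resize relates to a BAR size. It assumes the encoding described in the kernel's sysfs PCI ABI documentation as I understand it (the value is a bit position, with bit 0 = 1MB, so the resulting size is 2^n MB); `rebar_index_for_mb` is a hypothetical helper name:

```shell
#!/bin/sh
# Convert a BAR size in MB (must be a power of two) into the index that
# would be written to resourceN_resize, assuming index n means 2^n MB.
rebar_index_for_mb() {
    mb="$1"
    idx=0
    [ "$mb" -ge 1 ] || { echo "size must be at least 1 MB: $1" >&2; return 1; }
    while [ "$mb" -gt 1 ]; do
        # Any odd intermediate value means the input was not a power of two.
        if [ $((mb % 2)) -ne 0 ]; then
            echo "not a power of two: $1" >&2
            return 1
        fi
        mb=$((mb / 2))
        idx=$((idx + 1))
    done
    echo "$idx"
}

# Example: rebar_index_for_mb 8192
```

If that encoding holds, 8GB (8192MB) corresponds to 13, and 14 would request 16GB, so on an 8GB card the value to try may be 13 rather than 14. This is untested on the hardware in this thread.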
  9. @Skitals So I tried the script commands you specified in your previous post, but got stuck when actually sizing the ReBAR with:

         # echo 14 > /sys/bus/pci/devices/0000\:0d\:00.0/resource1_resize
         # -bash: echo: write error: Device or resource busy

     Did some searching and I couldn't find a way to correct this. Not looking for tech support necessarily, just reporting my experience. On my system, the video card is bound to VFIO and the system is booting with a syslinux config including ... video=efifb:off ...
  10. Tried your patch today with my ASUS KO RTX 3070 and a Windows 10 VM. GPU-Z is still reporting Resizable BAR as Disabled. Was there any additional setup needed to set the initial state of the BAR, or should it be on by default with the patched kernel?
  11. I checked the repo again just now, this is still the latest LSIO release on libtorrent v1.

      I use the Gluetun container for my VPN and I've never seen this issue. Actually just the opposite. I intentionally test this from time to time to see if I'm leaking my IP: when the VPN is off, the clients do not revert to the default internet connection (essentially a built-in kill switch). I do not create a custom docker network as this write-up has shown. Instead, in the template for the container you want to use the VPN network, I set Network Type: None and add --network=container:VPN_CONTAINER_NAME on the extra parameters line. I'm pretty sure this is essentially doing the same thing except without naming the network, so I'm not sure why we have different experiences with dropped connections.

      What is important to note: doing it this way will require the client containers to rebuild when the VPN container is updated. This is because docker needs to point the clients (deluge, etc.) to the new endpoint, since the updated VPN container has a new hash associated with it. So when you update your VPN container via the WebUI, since Unraid 6.9 (I think), the OS has been smart enough to rebuild the attached containers automatically, and after a minute or so for rebuilding and restarting the client containers, all is well. However, if the VPN container gets updated automatically by the Auto Update Applications plugin, the rebuild will not be triggered (since this rebuild control is implemented in the Docker WebUI php code), and all clients will lose their network connection. This will still not leak my IP or revert to the default network; the client containers will just have no network connectivity. So, in the Auto Update Applications settings, I turn autoupdate off for the VPN container and update that one manually from time to time.

      Hope this helps!
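After an update of the VPN container, it can help to see which clients were attached to it. A minimal sketch, assuming docker records the attachment in each client's HostConfig.NetworkMode as "container:<VPN container ID>" (my understanding of how --network=container:NAME is stored); `clients_of_vpn` is a hypothetical helper name and VPN_CONTAINER_NAME is the placeholder from the post:

```shell
#!/bin/sh
# Read "name network_mode" lines on stdin and print the names of containers
# whose network mode is "container:<vpn_id>".
clients_of_vpn() {
    vpn_id="$1"
    awk -v mode="container:${vpn_id}" '$2 == mode { sub(/^\//, "", $1); print $1 }'
}

# Example (not run here; requires docker on the host):
#   vpn_id=$(docker inspect --format '{{.Id}}' VPN_CONTAINER_NAME)
#   docker ps -aq \
#     | xargs docker inspect --format '{{.Name}} {{.HostConfig.NetworkMode}}' \
#     | clients_of_vpn "$vpn_id"
```

Clients that were attached to the previous VPN container would still carry the old ID and so would not show up in this list, which is one way to spot the ones that need a rebuild.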
  12. 3+ days torrent uptime, no crashes in 6.11.5. I'm satisfied with the current resolution that libtorrent v2 is to blame. If you all want to get back on v2, I suggest following the open issue on the libtorrent tracker to see when they correctly support transparent hugepages.
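The transparent hugepage mode the host kernel is currently using can be read from sysfs. A minimal sketch, assuming the standard /sys/kernel/mm/transparent_hugepage/enabled file, where the active mode is the one shown in square brackets; `thp_mode` is a hypothetical helper name. It only reports the setting and says nothing about whether libtorrent v2 is safe with it:

```shell
#!/bin/sh
# Print the active transparent hugepage mode from text such as
# "always [madvise] never" read on stdin (the bracketed word is the active one).
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Example:
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
```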
  13. Cross linking to an Unraid support thread where libtorrent (used by deluge) is causing a kernel error on the 6.11 release series. If you are on 6.11.* and see your syslog contain the error shown in this thread accompanied by your deluge/Unraid webUI being unresponsive, you may also be experiencing the same issue. This is not unique to Unraid, but seemingly all distros utilizing newer versions of the Linux v5 kernel.
  14. Quick note to deluge users that are now using the older version I linked a couple days ago, I updated the post with a newer release since I had an issue with state corruption between restarts. See the updated comment here. I did a couple quick tests (starting/stopping/restarting the container) and I don't see the same corruption occurring on the newer version of deluge still with libtorrent v1.
  15. @binhex you run one of the more widely used qbittorrent images in the community. Have you seen many more reports on your support channels?