jaybee

Members
  • Posts

    242
  • Gender
    Undisclosed

  1. I just had a similar issue and thought I would post here for anyone else who encounters a black screen (or no output) from the GPU when everything else looks like it should work.

     My Nvidia 3060 Ti was also detected as an "NVIDIA Corporation Device 2489", but this is not an indicator of any problem in itself. Some Nvidia GPUs are detected under strange names, especially the newer RTX 30XX series.

     Symptoms I got: The VM, when started with the GPU passed through and correctly configured in unraid, goes green in the UI and everything looks as though it would work fine, but there is no output to a monitor on HDMI or DisplayPort. With plain VNC as the graphics card, the VM functions fine. When the server boots, display output can be seen during POST and in the BIOS, so you know the card works normally.

     What I changed which caused this issue to occur: I upgraded the GPU in my system from an Nvidia 3060 to an Nvidia 3060 Ti.

     Actual issue: The Nvidia drivers in the Windows VM itself. I believe that because I did not uninstall the old Nvidia driver before the upgrade, Windows kept trying to use the existing driver with the new card. The problem, I think, was that the driver I had installed was a specific one which only works with non-LHR 3060 cards, as it was the leaked dev driver which unlocks 3060s so they can be mined with. I should have uninstalled this driver from inside the VM while the old card was still in the server, and either used the generic driver Windows installs automatically or installed the latest Nvidia driver for the 3060 first. That driver would probably have been similar enough for the 3060 Ti to initialise and work. I don't think you can simply use a VNC connection to go in and remove the driver, because the GPU is then not listed in Device Manager to be uninstalled, if you see what I mean. Although...

     How you might be able to fix it:
     A. It may be possible to use something like the free DDU software to force removal of all references to the Nvidia drivers. It may then be possible to start the VM with the GPU passed through and have Windows detect it automatically and initialise it with a basic driver.
     B. Creating a new VM from scratch would probably solve the issue and initialise the GPU fresh.

     How I fixed it: In my case, since I already had the new GPU installed in the server and didn't want to put the old one back in or try either of the above two things yet, I decided to do the following. (If remote desktop is already possible on your VM then skip to step 6.)
     1. Configure the VM to use only VNC as the GPU.
     2. Start the VM in the unraid GUI.
     3. Using VNC, access the VM and go to the remote desktop settings inside Windows. Ensure remote desktop connections are allowed and that you can connect with a valid user.
     4. Exit remote desktop and VNC.
     5. Stop/shut down the VM from the unraid GUI.
     6. Configure the VM to use the passed-through GPU as per the instructions in various other threads/YouTube videos (out of scope for this explanation as too long).
     7. Start the VM from the unraid GUI.
     8. Remote desktop into the VM.
     9. In Device Manager you should now see the Nvidia GPU as a device which can be right-clicked and uninstalled. This should remove the problematic driver that cannot work with the currently installed, passed-through GPU. In my case the installed GPU is the new 3060 Ti, and the driver uninstalled was the old 3060 dev mining driver.
     10. Go to the Nvidia website and download and install the latest driver for your GPU as you normally would. I would recommend selecting the "CUSTOM" install and then doing a "CLEAN" install to properly clear out any remains of older drivers. The VM will reboot to complete the installation. Just remote desktop back in once done to verify that it completes, then exit the installer.
     11. Disconnect remote desktop and test direct GPU connectivity to a monitor. You should find you have output as normal on your screen.
  2. Did you ever get to the bottom of this? Testing with a larger file size in CrystalDiskMark, such as 4GB, sometimes gives a sequential write of only around 1000MB/s, which may be more realistic given that the bare metal performance should be around 3000MB/s. It definitely seems like a caching issue, which makes it hard to truly compare passed-through NVMe performance against vdisk. Vdisk has the nice portability and ease of backup, but from what I have read, NVMe supposedly has the advantage of supporting proper TRIM and wear levelling when passed through to Windows properly.
  3. I just encountered this problem. I went to add a cache drive to my existing single SSD pool, to make it a dual SSD redundant one with mirroring. When I added the second SSD it warned that data on it would be formatted. I have read other threads where the entire pool's contents were formatted, so I decided to back up everything onto the array first. I faced the same issues as above and used the unbalanced plugin to get past them. I seem to recall that some of the paths/files did in fact exist even though mover could not find them, but I can't be sure. However, when I moved everything back to the cache drive using mover again, I got the same errors. This time I checked a few of the paths/files it was complaining about to see if they truly existed, and they didn't. So why would mover try to move files or folders that genuinely do not exist? A strange situation. I confess that I did not have the time to properly troubleshoot this and grab all the logs, but thought I would put it out there that this may need more investigation. It definitely seems to be the plex appdata paths, which get very deep and have long names, sometimes with dots in them.
  4. Hey guys, I want to set up an NVMe drive as passthrough in my VM to get close to bare metal performance for gaming. Before doing this, I took some benchmarks of the existing vdisk for comparison. I noticed that the usual benchmarking tools report very unrealistic performance, with numbers that are simply too fast to be possible. I gather that this is due to the way vdisk works with caching? Is there a way to turn this off and test more easily to get a baseline? I tried IOzone but could not work out how to use it, as it is command-line based and does not give easy-to-read results. I tried Iometer, but the control panel for the software looks ancient and again I am not sure how to use it.

     The results I got on a vdisk running on a SATA cache disk, which would normally have a read speed of approx 500MB/s and a write speed of around 350MB/s, are below.

     CrystalDiskMark:
       Test          Read          Write
       Sequential    10,803MB/s    4,812MB/s
       4KiB Q8T8     192MB/s       112MB/s
       4KiB Q32T1    396MB/s       94MB/s
       4KiB Q1T1     46MB/s        36MB/s

     AS SSD:
       Test          Read          Write
       Sequential    6,582MB/s     2,142MB/s
       4K            47MB/s        33MB/s
       4K 64 Thrd    292MB/s       72MB/s
       Acc. Time     0.193ms       0.106ms
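One rough way to see which of the figures above are likely cache artefacts is to compare each result with what the underlying disk can physically deliver. The sketch below does only that arithmetic. The 500/350 MB/s limits are the approximate figures quoted in the post, the helper name `cache_suspects` is hypothetical, and only a subset of the results is included.

```python
# Approximate limits of the SATA cache disk, as quoted in the post (assumed, not measured).
SATA_READ_MBPS = 500
SATA_WRITE_MBPS = 350

# (read MB/s, write MB/s) figures copied from the benchmark results in the post.
results = {
    "CrystalDiskMark sequential": (10803, 4812),
    "CrystalDiskMark 4KiB Q1T1": (46, 36),
    "AS SSD sequential": (6582, 2142),
    "AS SSD 4K": (47, 33),
}


def cache_suspects(results, read_limit, write_limit):
    """Return the names of tests whose read or write figure exceeds the disk's limit."""
    return [
        name
        for name, (read, write) in results.items()
        if read > read_limit or write > write_limit
    ]


suspects = cache_suspects(results, SATA_READ_MBPS, SATA_WRITE_MBPS)
for name in suspects:
    read, write = results[name]
    print(
        f"{name}: read is {read / SATA_READ_MBPS:.1f}x and "
        f"write is {write / SATA_WRITE_MBPS:.1f}x the disk's quoted speed"
    )
```

Only the sequential tests exceed the disk's quoted speeds, which would be consistent with host-side caching inflating large sequential transfers, though this check alone does not prove where the caching happens.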
  5. What has changed then? Why are hacks occurring more now?
  6. Thanks for the support with this. I've had a play around with my 3070 and got it passed through to a VM, where it does indeed hash at 61mh/s with the memory overclocked. In this docker it sits at 51/52mh/s.

     I have been successful in adjusting the power limit using nvidia-smi as you proposed above. This worked perfectly and halved my power usage. I was also able to set the core GPU clock using nvidia-smi with no issues. I did have to set persistence mode on.

     For the memory, however, the only command I can see mentioned is this:

       nvidia-smi -i 0 -ac 2000,1000

     But when I do that I get the following error:

       Setting applications clocks is not supported for GPU 00000000:06:00.0.
       Treating as warning and moving on.
       All done.

     So it seems adjusting memory clocks is not possible? I googled and also saw that if you have "N/A" in a lot of places in the output of "nvidia-smi -q", this means that your GPU does not support that feature. I can see N/A against the applications clocks part in the output below. Maybe this is why the above error occurs.

       Clocks
           Graphics : 1005 MHz
           SM : 1005 MHz
           Memory : 6800 MHz
           Video : 900 MHz
       Applications Clocks
           Graphics : N/A
           Memory : N/A
       Default Applications Clocks
           Graphics : N/A
           Memory : N/A
       Max Clocks
           Graphics : 2100 MHz
           SM : 2100 MHz
           Memory : 7001 MHz
           Video : 1950 MHz
       Max Customer Boost Clocks

     I can see mention of using "nvidia-settings", xconfig or PowerMizer to adjust things like memory, but I think this is only possible on more feature-rich Linux distros like Ubuntu. Have you any other ideas how we can overclock memory somehow? This would be a killer feature to have set and running inside a docker as you say, since then the GPU is not bound to a VM and is accessible as a shared resource both for mining at its full potential and for plex transcoding duties.
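The "N/A means unsupported" check described above can be sketched as a small parser. This is only an illustration: the sample text is the clocks excerpt from the post, laid out on the assumption that each section header sits on its own line with "key : value" rows beneath it, and the function names `parse_sections` and `unsupported_sections` are hypothetical. Full `nvidia-smi -q` output contains much more and may need a more careful parser.

```python
# Clocks excerpt from the post; the layout (header line, then "key : value" rows) is assumed.
SAMPLE = """\
Clocks
    Graphics : 1005 MHz
    SM : 1005 MHz
    Memory : 6800 MHz
    Video : 900 MHz
Applications Clocks
    Graphics : N/A
    Memory : N/A
Default Applications Clocks
    Graphics : N/A
    Memory : N/A
Max Clocks
    Graphics : 2100 MHz
    SM : 2100 MHz
    Memory : 7001 MHz
    Video : 1950 MHz
"""


def parse_sections(text):
    """Group "key : value" rows under the most recent header line."""
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            if current is not None:
                sections[current][key.strip()] = value.strip()
        else:
            # A line without a colon is treated as a section header.
            current = line.strip()
            sections[current] = {}
    return sections


def unsupported_sections(sections):
    """Return sections in which every reported value is N/A."""
    return [
        name
        for name, fields in sections.items()
        if fields and all(value == "N/A" for value in fields.values())
    ]


sections = parse_sections(SAMPLE)
unsupported = unsupported_sections(sections)
print("Sections reporting only N/A:", unsupported)
```

On this excerpt the two applications clocks sections are the ones flagged, which matches the error seen when running `nvidia-smi -ac`.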
  7. The below shows the IOMMU groupings for the Asus B550-E ROG Strix Gaming motherboard, with IOMMU set to enabled in the BIOS and the ACS override setting in the unraid VM settings set to disabled. Each group starts on its own line.

     Out of the box, the below possibly presents the following issues:
     1: The disks all come under the same IOMMU group (group 16 below). I was expecting that the onboard SATA disks (in my case the Samsung SSDs below) would be separate from the HBA card's mechanical spinners. I assumed this meant that the VM could not have an individual SSD passed through to it as an entire drive.
     2: The USB 3.0 and 2.0 controllers both appear under the same IOMMU group (see the root hubs listed under each USB controller below). I assumed this meant that I could not separate the unraid flash drive from the USB ports I would want to pass through to the VM.

     IOMMU group 0: [1022:1482] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
     IOMMU group 1: [1022:1483] 00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
     IOMMU group 2: [1022:1483] 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
     IOMMU group 3: [1022:1482] 00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
     IOMMU group 4: [1022:1482] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
     IOMMU group 5: [1022:1483] 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
     IOMMU group 6: [1022:1483] 00:03.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
     IOMMU group 7: [1022:1482] 00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
     IOMMU group 8: [1022:1482] 00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
     IOMMU group 9: [1022:1482] 00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
     IOMMU group 10: [1022:1484] 00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
     IOMMU group 11: [1022:1482] 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
     IOMMU group 12: [1022:1484] 00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
     IOMMU group 13: [1022:790b] 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
         [1022:790e] 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
     IOMMU group 14: [1022:1440] 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0
         [1022:1441] 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1
         [1022:1442] 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2
         [1022:1443] 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3
         [1022:1444] 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4
         [1022:1445] 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5
         [1022:1446] 00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6
         [1022:1447] 00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7
     IOMMU group 15: [1987:5012] 01:00.0 Non-Volatile memory controller: Phison Electronics Corporation E12 NVMe Controller (rev 01)
             [N:0:1:1] disk Sabrent__1 /dev/nvme0n1 1.02TB
     IOMMU group 16: [1022:43ee] 02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ee
             Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
             Bus 001 Device 002: ID 8087:0029 Intel Corp.
             Bus 001 Device 003: ID 0b05:18f3 ASUSTek Computer, Inc. AURA LED Controller
             Bus 001 Device 004: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
             Bus 001 Device 005: ID 05e3:0610 Genesys Logic, Inc. 4-port hub
             Bus 001 Device 006: ID 0781:5580 SanDisk Corp. SDCZ80 Flash Drive
             Bus 001 Device 007: ID 0409:005a NEC Corp. HighSpeed Hub
             Bus 001 Device 008: ID 051d:0002 American Power Conversion Uninterruptible Power Supply
             Bus 001 Device 009: ID 1a40:0101 Terminus Technology Inc. Hub
             Bus 001 Device 010: ID 0557:8021 ATEN International Co., Ltd Hub
             Bus 001 Device 011: ID 04b4:0101 Cypress Semiconductor Corp. Keyboard/Hub
             Bus 001 Device 013: ID 045e:0040 Microsoft Corp. Wheel Mouse Optical
             Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
         [1022:43eb] 02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] Device 43eb
             [1:0:0:0] disk ATA SAMSUNG SSD 830 3B1Q /dev/sdb 128GB
             [2:0:0:0] disk ATA SAMSUNG SSD 830 3B1Q /dev/sdc 128GB
         [1022:43e9] 02:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43e9
         [1022:43ea] 03:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
         [1022:43ea] 03:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43ea
         [1000:0072] 04:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
             [7:0:0:0] disk ATA SAMSUNG HD203WI 0002 /dev/sdd 2.00TB
             [7:0:1:0] disk ATA Hitachi HDS5C302 A580 /dev/sde 2.00TB
             [7:0:2:0] disk ATA Hitachi HDS5C302 A580 /dev/sdf 2.00TB
             [7:0:3:0] disk ATA ST2000DL003-9VT1 CC32 /dev/sdg 2.00TB
             [7:0:4:0] disk ATA ST4000DM000-1F21 CC54 /dev/sdh 4.00TB
             [7:0:5:0] disk ATA WDC WD100EZAZ-11 0A83 /dev/sdi 10.0TB
             [7:0:6:0] disk ATA WDC WD100EZAZ-11 0A83 /dev/sdj 10.0TB
         [8086:15f3] 05:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 02)
     IOMMU group 17: [10de:2484] 06:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
         [10de:228b] 06:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)
     IOMMU group 18: [10de:1c82] 07:00.0 VGA compatible controller: NVIDIA Corporation GP107 [GeForce GTX 1050 Ti] (rev a1)
         [10de:0fb9] 07:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
     IOMMU group 19: [1022:148a] 08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
     IOMMU group 20: [1022:1485] 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
     IOMMU group 21: [1022:1486] 09:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
     IOMMU group 22: [1022:149c] 09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
             Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
             Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     IOMMU group 23: [1022:1487] 09:00.4 Audio device: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
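To make the grouping concern above concrete, here is a minimal sketch that parses a few lines of the listing and reports which other controllers share a group with a given PCI address. The excerpt is abridged from the listing in the post, the regular expressions assume that layout (group header followed by `[vendor:device] address class:`), and the function names `parse_groups` and `group_mates` are hypothetical.

```python
import re
from collections import defaultdict

# Abridged excerpt of the IOMMU listing from the post (groups 15 to 17 only).
LISTING = """\
IOMMU group 15:[1987:5012] 01:00.0 Non-Volatile memory controller: Phison Electronics Corporation E12 NVMe Controller (rev 01)
[N:0:1:1] disk Sabrent__1 /dev/nvme0n1 1.02TB
IOMMU group 16:[1022:43ee] 02:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ee
[1022:43eb] 02:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] Device 43eb
[1:0:0:0] disk ATA SAMSUNG SSD 830 3B1Q /dev/sdb 128GB
[1000:0072] 04:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
[8086:15f3] 05:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 02)
IOMMU group 17:[10de:2484] 06:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
[10de:228b] 06:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)
"""

GROUP_RE = re.compile(r"^IOMMU group (\d+):")
# Matches "[vendor:device] bus:slot.function Device class:" and skips disk and USB rows.
DEVICE_RE = re.compile(
    r"\[([0-9a-f]{4}:[0-9a-f]{4})\]\s+([0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])\s+([^:]+):"
)


def parse_groups(listing):
    """Map each IOMMU group number to a list of (pci_address, device_class)."""
    groups = defaultdict(list)
    current = None
    for line in listing.splitlines():
        header = GROUP_RE.match(line)
        if header:
            current = int(header.group(1))
        device = DEVICE_RE.search(line)
        if device and current is not None:
            groups[current].append((device.group(2), device.group(3).strip()))
    return dict(groups)


def group_mates(groups, address):
    """Return the classes of the other devices in the same group as `address`."""
    for devices in groups.values():
        if any(addr == address for addr, _ in devices):
            return [kind for addr, kind in devices if addr != address]
    return []


groups = parse_groups(LISTING)
print("Shares a group with the SATA controller:", group_mates(groups, "02:00.1"))
print("Shares a group with the NVMe controller:", group_mates(groups, "01:00.0"))
```

In this excerpt the onboard SATA controller shares group 16 with the USB controller, the HBA and the network card, while the NVMe controller sits alone in group 15.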
  8. When you say you are passing it through... how? If it appears in unassigned drives, does that not mean that it is NOT passed through? What happens to the drive in the unassigned drives list when you start the VM? Does it disappear because at that point it gets passed through?

     When I checked the IOMMU groups on my B550 Asus ROG Strix E board, they did not look good. I will explain more below.

     I checked my IOMMU groups and the two controllers I think you speak of are listed like so for me. This is with the latest BIOS from February and with the ACS patch under VM manager settings set to disabled:

     IOMMU group 22: [1022:149c] 09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
             Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
             Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub

     You can see that they both fall under the same IOMMU group. I have seen countless threads/posts/videos all saying that if things appear in the same IOMMU group, you have to pass the whole thing through. I don't really understand what people are saying here, but I think they mean that under VM settings, when you select something to pass through from that group, and the one thing you want is say a soundcard but the group also has say 2 x GPUs, then you must also select those GPUs under the GPU selection part. I saw one of spaceinvader's videos where he was passing through a GPU, and he said something like "you must pass through the HDMI sound as well for this GPU as you can see it is in the same IOMMU group". So he went to the soundcard part of the VM settings and passed through the sound from the GPU as well, then disabled it inside Device Manager in the Windows VM and left only the actual soundcard he wanted enabled. Are you suggesting that the above is not required and that you can selectively pick just what you want? Have I misunderstood?

     The top NVMe slot in my Asus ROG Strix E board is inside its own IOMMU group, so should not be an issue. I am still reading up on how to pass a GPU through. It seems complex, with many different settings and tweaks required.

     When you say you had ACS set to enabled... which setting? There are 4: PCIe ACS override: disabled / downstream / multifunction / both. When I tried each one it did separate things a little more, but even on "both" I think it was limited because IOMMU groups are still shared. I will post below with all the groupings.

     As above, do you not need to pass through the entire IOMMU group?

     Did you not just contradict yourself? You say not to bind them, but then do bind the entire controller? Which is it? What is FLR?

     On boot of what? The VM or the physical bare metal unraid server? I am slightly confused by the above statement. I think you mean that if you bind it to the VM as a passthrough device, it becomes unavailable to unraid once the VM has booted up. Well... yes, but why would you pass through anything to the VM that you may later want?

     How does one choose to do either one? I see no such method to differentiate in the VM manager settings or VM template settings. I just get a drop-down box listing different devices I can select, so does this mean I am only doing PCI-e passthrough?
  9. When you say it's running beautifully... have you been able to separate the IOMMU groups out so you can selectively pass through what you want? I.e. what are the chances of running a VM with USB, GPU 1, a SATA SSD off the main board and an M.2 off the main board all passed through, whilst Unraid uses GPU 2 and the SATA HDDs off the main board as well as off an HBA card in a PCIe slot?
  10. Interesting. On the contrary, I have an AMD RX 590, which is a toasty, loud beast, in another PC. When mining it gets up to 80-85c on stock clocks. I understand that the 590 is actually somewhat different to the 580, even though it seems they would be very similar. I believe there was an architecture change between them. It mines here in the UK at approximately £2.50 profit a day at the time of writing. Whenever I have monitored my 3070 in the trex webgui during mining, it never seems to go above 33% fan speed and a temp of 61c. It is currently running in my loft though, which is cold. Perhaps I may try plex streams at the same time to see what happens.
  11. Yes, I understand. I have not experimented with using plex on the same GPU at the same time as mining. I wonder if the 3070 would handle a couple of streams and still mine at the same time. I guess I will have to do some testing.
  12. Thanks! Tried that and all working now. I take it there is no option to adjust clocks or power anyway? I find that I get 52mh/s when mining ethereum on a 3070. Apparently my card should do 60mh/s. Is using it in this way via a docker pretty much the same as bare metal performance? I take it there would be no benefit to running this inside a Windows VM with the GPU passed through instead? I know this defeats the purpose of the docker, but I'm just hypothesizing purely about hash rate performance. Nice work.
  13. Well, I simply downloaded the latest trex version listed in CA, and the docker status claims it is running the "latest" version. See the logs below. Should I be trying to source a newer docker version of trex from elsewhere then?

     My unraid server is running version 6.9.0rc2 and nvidia-smi shows this:

       NVIDIA-SMI 455.45.01 Driver Version: 455.45.01 CUDA Version: 11.1

     But when I start trex it shows:

       20210218 23:58:10 T-Rex NVIDIA GPU miner v0.19.11 - [CUDA v10.0 | Linux]
  14. I tried this trex docker on a new Nvidia 3070 card and it gave the following error: Can't start miner, Geforce RTX 3070 (CC 8.6) is not supported by CUDA v10 build, use T-Rex compiled with CUDA v11.1 or newer
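The mismatch behind this error can be checked mechanically by comparing the CUDA version in the miner's start-up banner with the minimum the error message asks for. The sketch below is illustrative only: the banner is the one quoted in the earlier post, the 11.1 requirement is taken from the error text, and the function name `cuda_build_version` is hypothetical.

```python
import re

# Start-up banner quoted in the earlier post.
BANNER = "20210218 23:58:10 T-Rex NVIDIA GPU miner v0.19.11 - [CUDA v10.0 | Linux]"

# Minimum CUDA build named in the error message for the RTX 3070 (CC 8.6).
REQUIRED_CUDA = (11, 1)


def cuda_build_version(banner):
    """Extract (major, minor) from a 'CUDA vX.Y' marker, or None if absent."""
    match = re.search(r"CUDA v(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


build = cuda_build_version(BANNER)
supported = build is not None and build >= REQUIRED_CUDA

if supported:
    print(f"Miner build CUDA {build[0]}.{build[1]} meets the requirement")
else:
    print(f"Miner build {build} is older than the required CUDA {REQUIRED_CUDA}")
```

Comparing the versions as tuples avoids the pitfalls of string comparison, where "10.0" would sort after "9.2" incorrectly in some schemes.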
  15. What did you find happened when plex tried to use the card for hardware transcoding while it was already in use for mining? Did it refuse to use the GPU at all, or did it crash or lag? What about multiple stream attempts?