Sparkie

Members
  • Posts: 19
  • Joined
  • Last visited
Everything posted by Sparkie

  1. OK, I will do that and report back. Many thanks, Sparkie
  2. Help is requested: I cannot add or delete Docker images. 'Fix Common Problems' shows two errors: 1. Unable to write to cache. 2. Unable to write to Docker image. The cache is a pool of two 1TB NVMe drives. SMART status shows healthy and utilization is 14%, so on the surface the cache drives seem OK, but obviously something is wrong. I have posted diagnostics and included an error file from the cache drive test. Help would be greatly appreciated. Sparkie chapelsvr-diagnostics-20240325-1155.zip cache Errors.zip
  3. I have been getting some machine check events showing up in the syslog. Below is an extract from that log:
Nov 25 17:36:24 Tower kernel: mce: [Hardware Error]: Machine check events logged
Nov 25 17:36:24 Tower kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 22: d82000000002080b
Nov 25 17:36:24 Tower kernel: mce: [Hardware Error]: TSC 0 MISC d012000200000000 SYND 5a000005 IPID 1002e00000002
Nov 25 17:36:24 Tower kernel: mce: [Hardware Error]: PROCESSOR 2:800f82 TIME 1669410344 SOCKET 0 APIC 0 microcode 800820d
Nov 25 17:48:11 Tower root: Fix Common Problems: Error: Machine Check Events detected on your server
I have attached diagnostics in case they are needed. Any ideas on what might be causing this? tower-diagnostics-20221125-1825.zip
  4. @ghost82 Thanks for the help again. The server at issue is at a remote site and I am managing it remotely, hence the delay in getting back with the current situation.
1. I updated the XML as you suggested (and as SpaceInvaderOne originally noted) and can now boot the VM again. Thanks for that; partial success.
2. I installed Windows bare-metal as you suggested, with the video card in question, and Windows now has multi-monitor (3 monitors) support again.
3. Going back to the VM, it still only has single-monitor support, so the bare-metal experiment verifies that the video card (GeForce 1050 Ti, 4GB RAM) is working correctly.
4. In the VM in 3) above I was working without passing through the vbios; I wanted to keep the variables to a minimum and then add things back in once 3-monitor support was working. I will try it again with the vbios passed through (attached below; a sketch of how the ROM file is referenced in the hostdev entry follows this post).
5. Regarding diagnostics: I did not attach any because, after performing the XML edits you recommended, the VM starts OK, just with only one monitor.
6. I am thinking about installing a Linux VM and seeing whether that works with 3-monitor support. I don't know if Linux (say Ubuntu) will auto-detect the card or revert to a basic resolution if no Nvidia GeForce 1050 drivers are present when Ubuntu starts; Windows automatically detects the video card being added and loads the appropriate drivers. I have checked Nvidia's website and they do have drivers available for Linux. I would set up Linux via VNC access to the VM, then once it is set up and working OK, add the 1050 with its sound, edit the XML, restart and see what happens.
Will report when complete. Thanks again... Asus.EditChapel.GTX1050Ti.4096.171212.rom
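A note on point 4: this is roughly how the attached vbios file gets referenced once the card is passed through again; it is just the rom element inside the card's hostdev entry. The addresses shown are the ones from the fuller hostdev XML I have posted for this card, so treat them as illustrative for my particular system rather than something you must copy:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <!-- host address of the GTX 1050 Ti video function on my server -->
    <address domain='0x0000' bus='0x43' slot='0x00' function='0x0'/>
  </source>
  <!-- the edited vbios dump attached to this post -->
  <rom file='/mnt/user/domains/Windows 10/Asus.EditChapel.GTX1050Ti.4096.171212.rom'/>
  <!-- guest-side address assigned by the VM template -->
  <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</hostdev>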
  5. @ghost82 OK, I have done some testing. I tried all the combinations you noted; same result. I swapped out the video card, this time an Asus Phoenix GTX 1050 Ti with 4GB RAM instead of the Gigabyte GTX 1050 with 2GB RAM. Again the same result: only one monitor out of three can be enabled at a time. I disconnected the projector; same result, only one monitor out of two can be enabled at a time. I double-checked the template for the VM and everything looks OK (I did not check the XML).
So I started to suspect the Win10 image was damaged somehow. I created a new Windows 11 VM and fired that up, set it all up with the virtual display, and Win 11 worked fine with all the virtio drivers installed. I installed the TeamViewer client for remote access, then selected the Asus video card and its associated sound card, updated the template and started the VM. Working remotely via TeamViewer (with all three monitors reconnected beforehand), I could log in OK and "display settings" showed three monitors, so video via the Asus card was working, although I could only see one screen remotely and could not verify whether all three monitors were in fact displaying. I shut the VM down after Win 11 wanted to do some updates and have not been able to restart it since; the VM just hangs. I blew away that VM and created a new Win 11 VM again. The virtual display works just fine, but as soon as I select the video card and associated sound, the VM hangs. I have not been able to get video working with the video card since. I rebooted the server several times with the same result, and even powered the server down in case the video card needed a full power-off; no joy.
Now I don't know if changes have been made to the XML for VMs with graphics card passthrough in Unraid 6.11.1 or 6.10.x (I was running 6.9.x previously). This is the only video card in the Unraid server. In 6.9.x you had to hand-edit the XML to add the multifunction parameter to the video device (43:00.0) and edit the slot and function for the sound device (43:00.1). Do you know if the need to hand-edit the XML has been removed in 6.10.x or 6.11.1 for a passed-through video card with sound? Below is a snippet from the current XML for video and associated sound. Note there is no multifunction parameter, which I previously had to add as per SpaceInvaderOne's video instructions for a Windows 10 VM. I note that his recent video for Windows 11 makes no mention of hand-editing the XML; he just selects the video card and associated sound, clicks update and starts the VM, no problem. Hence my question: is this no longer a problem, or is there some workaround 'under the hood' we are not seeing?
Here is the XML snippet for video and associated sound (a sketch of the hand-edited form I am referring to follows at the end of this post):
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x43' slot='0x00' function='0x0'/>   MY EDIT: VIDEO LINE, NO MULTIFUNCTION PARAMETER
  </source>
  <alias name='hostdev0'/>
  <rom file='/mnt/user/domains/Windows 10/Asus.EditChapel.GTX1050Ti.4096.171212.rom'/>
  <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x43' slot='0x00' function='0x1'/>   MY EDIT: THIS IS THE SOUND
  </source>
Is the above correct for only one video card in the system being passed through to the VM? Sorry for being long-winded, but any suggestions would be greatly appreciated.
At this point I am just trying to get the VM up and running again on the video card, which would be fantastic, and then I can go to the site to verify everything. Cheers
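P.S. For reference, here is the hand-edited form I mean, assuming the old 6.9.x-style edit were still applied to the snippet above: multifunction='on' added to the video function's guest address, and the sound function moved onto the same guest bus/slot as function 0x1. The guest address for the sound is my assumption here, since my current XML cuts off before it; whether 6.10.x/6.11.1 still needs any of this is exactly my question, so treat it purely as an illustration.

<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <!-- host address of the video function -->
    <address domain='0x0000' bus='0x43' slot='0x00' function='0x0'/>
  </source>
  <rom file='/mnt/user/domains/Windows 10/Asus.EditChapel.GTX1050Ti.4096.171212.rom'/>
  <!-- guest address, with multifunction added by hand -->
  <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <!-- host address of the sound function -->
    <address domain='0x0000' bus='0x43' slot='0x00' function='0x1'/>
  </source>
  <!-- guest address: same bus/slot as the video, function 0x1 -->
  <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
</hostdev>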
  6. Thanks for your response; I will try your recommendations and report back on the results. Thanks again
  7. It looks like I have lost multiple-monitor functionality on my Windows 10 VM. I was running Unraid 6.9 and all was working perfectly: two 1080p monitors and an Epson projector. While I was away a power failure occurred that lasted over 20 minutes, at which point my UPS shut the server down. Upon restart Unraid started up OK, but my Win10 VM was no longer functional. On logging into the GUI, Unraid flagged that I was running an unsupported version and recommended upgrading to the latest version, which I did. After a reboot everything looked OK, but the VM still would not start. Realizing my VirtIO drivers were probably out of date, I upgraded to virtio-win-0.1.225-1.iso. I tried restarting the VM; no joy. I edited the VM and set the BIOS to OVMF TPM (could not get plain OVMF to work). The Machine parameter is set to i440fx-7.1 (the latest; previously it was 6.1, but I am not absolutely certain). The video card is passed through to the VM along with its sound. I tried rebooting and it worked: Windows 10 started OK, but with only one monitor. I installed the latest Nvidia drivers for the 1050 graphics card and still no joy; only one of the three monitors works.
I opened the Nvidia control panel and clicked on multiple monitors, and it shows the three connected monitors with an option to select the other two besides the working one. Now the problem: when clicking on one of the other monitors I get the message "This GPU supports 1 display". This is not true, because I ran it with three monitors previously (the card has a DVI port, an HDMI port and a DisplayPort). If I select another monitor and click Apply, that monitor works but the other two do not, even though the Nvidia driver sees all three. BTW, I updated all the drivers from the new VirtIO ISO EXCEPT the balloon driver; I could not find how to do that one. I did see somewhere on the forums that you should see an entry under "Other devices" in Device Manager, but I do not see that. How do I update the balloon driver? Also, I have passed through the graphics card to the VM with an edited ROM BIOS with the header section removed, so I know that is good; it worked before with no problems.
So to summarize: 1. How do I reinstate multiple-monitor support? 2. How do I update the balloon driver from the VirtIO ISO (it is mounted as drive E: in Windows 10)? Any help on this issue would be greatly appreciated. Thanks
  8. OK, thanks. I will proceed with the upgrade using the Parity Swap Procedure. Thanks for all the help. Cheers, Sparkie
  9. Thanks again for the excellent help. I ran the extended SMART test; diagnostics attached. Unraid reported: "Errors occurred - Check SMART report". It looks like the drive needs replacement. On another but related subject: the failed disk is now being properly emulated. I have two parity disks, both 6TB Seagate IronWolf 7200 RPM. My replacement for the failed disk is an 8TB WD Red Plus 7200 RPM. This will be a problem, as a replacement data disk cannot be bigger than the parity drives. SpaceInvader One covered my situation almost exactly, but his scenario had only one parity drive. His procedure involved a parity copy: remove the parity drive, replace it with the larger drive, assign it to parity, then copy the removed parity disk to the new, larger parity disk. In my case I have two 6TB parity drives and would replace one of them with the 8TB, then copy the removed parity drive to it. I assume in that case the maximum usable size at the end of the parity copy is still 6TB, until I get around to replacing the last remaining 6TB parity with an 8TB? And with a failed disk in the array, will that impact the data rebuild after the parity drive is replaced, assuming I can do this? Cheers, Sparkie tower-smart-20220528-1635.zip
  10. Status update: stopped the array, shut down (power-cycle), restarted, started the array. Disk 13 still shows a red X, but checking the data shows everything is still properly emulated. Here are the diagnostics after the restart. Thanks, Sparkie tower-diagnostics-20220528-1553.zip
  11. Thanks for the information, that's very helpful. On a whim I checked the local Best Buy and they had an 8TB WD Red SAS drive, which I purchased. From what I have read, given my parity drives are 6TB, the max usable size of this new drive in the array will be 6TB. In the meantime I will shut down and restart the server and see if the 'failed' drive comes back online. If so, I will grab new diagnostics and post them. Again, thanks very much for the quick response, very much appreciated. Cheers, Sparkie
  12. Hello guys: I need a little help on this one. I have a Seagate IronWolf drive in my array, a 6TB 7200 RPM data drive, which is failing (I think). I have two 6TB parity drives (Seagate IronWolf, as above) and another 6TB Seagate data drive (again, the same model). No errors from those drives. All the other drives are WD and Hitachi; I have got to get around to replacing the smaller drives with bigger ones eventually to free up space in the array cabinet, LOL. I am going to RMA the drive back to Seagate after checking the warranty status, which expires in 7 days. Yikes! Just in time. I have already received an RMA authorization from Seagate, BUT they require me to ship the drive to them before they will send a replacement, which will take about 7-10 days round trip, hopefully sooner. I have not rebooted Unraid until I hear from you guys, nor have I stopped and restarted the array. I have been getting sector errors and I/O errors on this 'failed' drive, as seen in the logs. So what is the best way to remove the 'failed' drive without causing further issues with the array? And can I run without the failed drive in the array, then pre-clear and so on when the replacement arrives? I have attached diagnostics. Any help would be greatly appreciated, Sparkie tower-diagnostics-20220528-1305.zip
  13. Thanks Hoopster: I reverted back to 6.9.1, unbound the LSI controller from the VMs via Tools > System Devices, and deleted vfio-pci.cfg and the .bak file from the flash drive, as I have nothing bound to the VMs I am using (I did at one time, see below). I think what happened is that I had another video card at 08:00.0 and a third-party USB controller in the original system; I thought these might have been my issue when the server was frozen, so I powered down and removed them. It looks like the LSI controller possibly grabbed that address. Anyway, all is now working: the array is back online and the cache drives are connected. All is well, much thanks for all the help.
  14. Hi again guys: I am a bit of a noob with respect to Linux/Unraid. With six missing disks I will not be able to start the array and therefore cannot access the VMs. How can I unbind the HBA? Is there a setting in Tools or Settings, or must this be done via the command line, and if so, what command would I use? Cheers, S.
  15. Hi guys: Thanks for the quick response. When I upgraded to 6.9.1 I seem to remember "Fix Common Problems" stating that something to do with VFIO was now part of 6.9 and no longer needed as a separate add-on, and it suggested removing it, so I removed it. I wonder if doing that remapped the LSI controller to the VM. I will investigate and post an update. Thanks again for all the help. S.
  16. Hi guys: After upgrading from 6.8.3 to 6.9.1 I started getting parity errors (168 total). I checked SMART status on all drives and all seems OK. The next day the server locked up. Upon reboot, all drives connected to the LSI controller were shown as missing. Rebooting the server and looking at the LSI boot-up text before Unraid loads, all the missing drives are shown with their correct sizes; continuing the boot into Unraid, the drives are still shown as missing. Note that all eight drives connected to the motherboard SATA ports are showing up OK. All this started happening a few days after upgrading to 6.9.1, so I am not sure if it is just a coincidence or something else. As noted above, the first indicator was the parity check, and I have been running successful parity checks on this server for well over a year with no errors. I reverted back to 6.8.3 and rebooted, hoping the version might be connected to this issue, but no joy: drives connected to the LSI controller are still missing. BTW the controller was flashed to IT mode at the original install well over a year ago with no problems since. I do note that upon boot-up the missing drives connected to the LSI controller briefly flash their drive indicators, suggesting they are being scanned (or so it seems), but no data shows up in Unraid on the Main page. The LSI pre-Unraid boot screen shows all drives connected with the correct sizes (7 drives). I also noted that after boot-up and logging into Unraid, both my cache drives (Western Digital NVMe 1TB) were not shown in the cache pool and both appeared in Unassigned Devices. Re-assigning them to the cache and rebooting, they again appear in Unassigned Devices and not in the actual cache. I have attached diagnostics; any help or advice would be greatly appreciated.
System specs:
Fatal1ty X399 Gaming motherboard
AMD Threadripper 2950 (16C/32T) CPU
32GB HyperX memory
3 x Supermicro 5-drive SATA drive cages
LSI 9702-8i drive controller
Zotac GeForce GTX 1050 OC 2GB video
1000W EVGA 1000GQ power supply
Rosewill 4U rackmount case
APC Back-UPS XS 1300
tower-diagnostics-20210326-1305.zip
  17. Sorry for the late reply. I would second Scoopsy13: snapshot the BIOS of your particular card and edit it as per SpaceInvaderOne's instructions. He also has a video on how to download a program from TechPowerUp's website to snapshot your BIOS and edit it to remove the header info. Further note that if you edit the form view of the VM, the XML will revert back to default and you will have to make all the changes again. Good luck. Since I got mine working it has been rock solid, even through a number of Windows updates and upgrades; flawless.
  18. Further to my post above, a clarification: please check the bus/slot combination in the edited version and make sure nothing else is occupying that bus/slot. I did not check this when I did the edit. In other words, you cannot have two "cards" inserted into the same bus/slot combination in the VM. See the illustrative snippet below.
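To illustrate what I mean (the guest addresses here are just the ones from the example in my longer post below, not something you must copy): the edited GPU entry claims one guest bus/slot, only additional functions of that same card may share it, and any other device in the XML has to sit in a different slot.

<!-- GPU video function: claims guest address 00:03.0 -->
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>
<!-- GPU audio function: may share the slot as 00:03.1 -->
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>
<!-- any unrelated device must not also claim bus 0x00 / slot 0x03; give it its own slot -->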
  19. I have been working to get my ASUS GTX 1050 Ti working for several weeks. It has been an extremely frustrating experience, always a black screen after starting the VM. To note, I did get the VM to successfully boot into Windows 10 using VNC as the video driver. I updated Windows and enabled Remote Desktop; this was to ensure I could still get to the VM if I got a black screen when I passed through the GTX 1050. So under VNC graphics everything was working correctly, and I tested Remote Desktop and could get to the VM no problem. After all that was working OK I passed through my GTX 1050 Ti video card and started the VM. Black screen! I remoted into the machine using RDP, logged in and saw in Device Manager that Windows only saw the RDP display adapter and not the Nvidia graphics card, so I surmised that Windows was not seeing the hardware. BTW, check your IOMMU groups to ensure that the video card is shown in its own IOMMU group along with the audio and, optionally, USB if your card has that. I closely followed SpaceInvaderOne's videos to no avail (or at least I thought). Going back to recheck everything after several weeks of frustrating effort, I finally got it working. There was a lot on the forums, and from SpaceInvaderOne as well, about the motherboard BIOS maybe breaking IOMMU. I tried all valid BIOSes, from the oldest supporting my CPU (Threadripper 2950X) to the latest, with no positive result. My mobo is a ROG STRIX X399 Gaming-E and I used BIOS 808, which was on the board when I received it.
Anyway, I edited the XML per SpaceInvaderOne's instructions to ensure the video card was on the same bus/slot. My comments in the XML below are in curly braces.
An extract of your XML (ORIGINAL):
  <source>
    <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>   {bare-metal machine}
  </source>
  <alias name='hostdev0'/>
  <rom file='/mnt/user/iso/EVGA.RTX2080Ti.11264.181023.rom'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>   {virtual machine}
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>   {bare-metal machine}
  </source>
  <alias name='hostdev1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>   {invalid: shown in RED, see the edited XML below} {VM}
As per SpaceInvaderOne's instructions this XML extract is invalid, and he says (if I understand him correctly) there is a bug here. Your video card is a multifunction device: it has video, audio and, I think, USB, so it is performing three functions. The card in the bare-metal machine is inserted into one slot, therefore the card must reside in ONE SLOT in the VM. So your XML above should be edited to the following:
  <source>
    <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>   {bare metal}
  </source>
  <alias name='hostdev0'/>
  <rom file='/mnt/user/iso/EVGA.RTX2080Ti.11264.181023.rom'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0' multifunction='on'/>   {video card is a multifunction device}
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x02' slot='0x00' function='0x1'/>   {bare metal}
  </source>
  <alias name='hostdev1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x1'/>   {same slot as the video above, and function=0x1 for audio}
<<<It looks like your card is missing the USB port; did you select it in form view?>>> Does your card have a USB port? If not, disregard.
Note that in the VM you have the 2080 sitting in two slots (3 and 6); they must be in one slot. Note the multifunction parameter added on the VM's bus/slot as well (make sure a space is included before the "multifunction='on'/>" text). Also make one additional change as recommended by SpaceInvaderOne: make sure your Unraid server boots in Legacy mode and not UEFI mode. After making the above changes to the XML it did not work for me, but when I booted in Legacy mode it did; at least it worked for me, others here in the forums might have had a different experience.
One further point: I see that the 2080 Ti has a USB 3 port. As you selected the audio device in form view, you also need to check form view for the USB device associated with your 2080 Ti and select it. As the video in the VM has function='0x0' and the audio has function='0x1', I assume the USB will have function='0x2' in the same slot as the previous two. If you don't have this set up properly with all three components, then when you or Windows install the Nvidia drivers the USB will be missing and it will most likely fail. The same will happen if the card is not shown in the same bus/slot for all three functions; it will most likely fail. SpaceInvaderOne clearly states this. Anyway, it worked for me.
One final point: you selected an i440fx machine, which is "correct" for Windows 10. Some have said in the forums that the Q35 machine type can give better performance, but I could never get it to work in all the troubleshooting I did, until... I went into form view on the VM with Q35 selected and reselected everything (or at minimum made a change to reset the XML). Started the VM: black screen. After much tearing out of hair (I don't have much left) I saw that the XML was showing bus=4, slot=0, function=0 for video on the VM and bus=5, slot=0, function=0 for audio on the VM. On a whim I changed these to bus=0, slot=5, function=0 for video and bus=0, slot=5, function=1 for audio, and it worked. Don't ask me why, but it worked for me; maybe someone can elaborate. I tried this with multiple VMs using the Q35 machine type, and with these addresses I was able to get all of them to work when none of them did previously. (A sketch of those guest addresses is at the end of this post.)
Regarding booting in Legacy mode in Unraid: go to Main > Flash, scroll to the bottom and uncheck "Permit UEFI boot mode", after setting your motherboard BIOS to boot in Legacy mode. Note that I have a single graphics card in slot 1 of the motherboard with a passed-through ROM BIOS; I have worked with that as the simplest configuration to eliminate as many variables as possible. If you have one graphics card you MUST pass through the ROM BIOS; check with TechPowerUp, as SpaceInvaderOne recommends. If your card still gives a black screen, try multiple ROM BIOSes from TechPowerUp. Also edit the ROM BIOS to remove the header info as per SpaceInvaderOne's instructions, otherwise it will fail from the get-go. Or you can dump the ROM BIOS from your own card, a little more complicated but very doable. Anyway, the above worked for me after many weeks of trial and error. Very sorry for the long post; I hope this helps someone.
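To make that Q35 change concrete, here is a sketch of just the guest-side address lines that ended up working for me. The host-side source addresses stay whatever your card actually sits at, and I am assuming the multifunction flag stays on the video function as in the i440fx case, so treat this as an illustration rather than a guaranteed recipe.

<!-- video on the VM: bus 0, slot 5, function 0 -->
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0' multifunction='on'/>
<!-- audio on the VM: same bus/slot, function 1 -->
<address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x1'/>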