Posts posted by Holmesware

  1. Win 10 update (KB5028166) - uninstall and re-apply - Fixed all my issues

    Note: uninstalling the update then rebooting the system triggered installing the update before the login screen.

    FALSE - This did not happen. The update was removed and stayed removed, but it looks like it will reinstall on the next Windows Update.

     

    This has to do with a SAMBA bug in the Zentyal Domain Controller world. https://forum.zentyal.org/index.php/topic,35598.0.html

     

    - Unraid mounting an SMB share on a Windows 10 workstation

    - Remote Assistance now works when initiated remotely

    - Shared bi-directional USB laser printer now works from remote workstations

     

    The Actual Samba Bug - https://bugzilla.samba.org/show_bug.cgi?id=15418

     

    Solved-ish - Until there is a Samba fix for my domain controller I will not be able to reapply KB5028166 to my workstations.

    SOLVED - Patch for Ubuntu 18.04 LTS Bionic - https://launchpad.net/~ahasenack/+archive/ubuntu/samba-kb5028166/

  2. This happens from 2 different Unraid servers (6.11.5 and 6.12.2) to any Windows 10 Domain connected computer.

     

    Mounting a Windows Share <REMOTE COMPUTER> from an Unraid <SERVER> no longer works.
    It had been working for close to 2 years until now.

     

    <SERVER> kernel: CIFS: Attempting to mount \\<REMOTE COMPUTER>\ServerData
    <SERVER> kernel: CIFS: Status code returned 0xc000018d STATUS_TRUSTED_RELATIONSHIP_FAILURE
    <SERVER> kernel: CIFS: VFS: \\<REMOTE COMPUTER> Send error in SessSetup = -5
    <SERVER> kernel: CIFS: VFS: cifs_mount failed w/return code = -5
    <SERVER> unassigned.devices: SMB 3.1.1 mount failed: 'mount error(5): Input/output error

     

    The mounting script goes through SMB 3.0, 2.0 and 1.0 with the same error.

     

    Looking up this error:
    0xc000018d STATUS_TRUSTED_RELATIONSHIP_FAILURE 

     

    It comes up with this description:
    The logon request failed because the trust relationship between this workstation and the primary domain failed.
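
When there are a lot of mount attempts in the syslog, the status lines can be pulled out with a short script. This is only a sketch: the regular expression is written against the log lines quoted above, and the lookup table holds just the one description given in this post.

```python
import re

# Matches the hex NTSTATUS value and its symbolic name in a CIFS kernel log line.
STATUS_RE = re.compile(r"Status code returned (0x[0-9a-fA-F]{8}) (STATUS_\w+)")

# Only the description quoted in this post; other codes would have to be added by hand.
DESCRIPTIONS = {
    0xC000018D: "The logon request failed because the trust relationship "
                "between this workstation and the primary domain failed.",
}

def extract_status(lines):
    """Return a (code, name, description) tuple for every status line found."""
    results = []
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            code = int(match.group(1), 16)
            results.append((code, match.group(2), DESCRIPTIONS.get(code, "unknown")))
    return results

sample = [
    r"<SERVER> kernel: CIFS: Attempting to mount \\<REMOTE COMPUTER>\ServerData",
    "<SERVER> kernel: CIFS: Status code returned 0xc000018d STATUS_TRUSTED_RELATIONSHIP_FAILURE",
]
found = extract_status(sample)
for code, name, desc in found:
    print(f"{code:#010x} {name}: {desc}")
```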

     

    Removing any of these computers from the domain and rejoining it doesn't fix this.
    The same <REMOTE COMPUTER> (Windows 10) can connect to an Unraid SMB share with no issues.

    Trying to manually make the connection from the command line generates the same error.

     

    Domain Controller is Zentyal 6.1.6 on Ubuntu 18.04.6 LTS. I have posted this issue in their forums as well.

     

    Other things that have stopped working at roughly the same time.

    - Windows Remote Assistance - unless it's initiated from the end user with a file or email link.

    - Shared USB printers that require bi-directional communication.

      - Basic Kyocera Laser printer - no longer working. 

      - Zebra Label printer - Works fine.

  3. The Whole Picture:

     

    On Server

    ZFS Data set = pool/data

    Mount Point = /pool/data

     

    1. Make sure sharenfs=off (zfs set sharenfs=off pool/data)

    2. Edit /etc/exports and add the line: "/pool/data" -async,no_subtree_check,fsid=111 *(sec=sys,rw,no_root_squash,no_all_squash)

    Note: fsid has to be a unique number for each share. Unraid starts them at 100

    3. restart nfsd (/etc/rc.d/rc.nfsd restart)

     

    Make a script that inserts the line from step 2 into the /etc/exports file on boot.
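
A minimal sketch of such a script, assuming it is fine to append the line only when an identical one isn't already there. The helper name `ensure_export` and the pre-existing `/mnt/user/example` line in the demo are made up for illustration, and the demo runs against a scratch file rather than the real /etc/exports.

```python
import tempfile
from pathlib import Path

# The export line from step 2.
EXPORT_LINE = '"/pool/data" -async,no_subtree_check,fsid=111 *(sec=sys,rw,no_root_squash,no_all_squash)'

def ensure_export(exports_path, line):
    """Append line to the exports file unless an identical line already exists.

    Returns True if the file was changed, False if nothing had to be done.
    """
    path = Path(exports_path)
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True

# Demo on a scratch file; the existing entry below is a hypothetical example.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "exports"
    demo.write_text('"/mnt/user/example" -async,no_subtree_check,fsid=100 *(sec=sys,rw)\n')
    first = ensure_export(demo, EXPORT_LINE)    # adds the line
    second = ensure_export(demo, EXPORT_LINE)   # no change the second time
    final_lines = demo.read_text().splitlines()
    print(first, second, len(final_lines))
```

On the server it would be pointed at /etc/exports and followed by the nfsd restart from step 3. How it gets run at boot is left open here.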

     

    On Client

    Use the Unraid GUI on the Main tab to mount an NFS share

     

    Yes it connects

     

    My issue: Permissions of files and folders on the Client (all files 777 nobody:users) do not reflect those on the Server (770 <user name>:<domain users>).

    Is this something NFSv4 can solve?

     

    Making it an SMB share has the same issue with permissions.

    Both Client and Server connected to the same domain.

     

  4. I'll give this a try on Thursday when I stay after work to do server updates. It may solve an issue with VMs in Unraid not responding to the Logitech wireless keyboard and mouse when the monitor goes to sleep. The current solution is to move the USB receiver to a different port on the USB hub, and the system wakes up and responds. This also happens intermittently, so it's been hard to troubleshoot.

  5. It's a 5-desktop system for use in a helicopter hangar. We needed something mobile with a small footprint and as few wires as possible. WiFi sucks around large metal objects. So I came up with this. It has 1 power cord and 1 ethernet cord. The techs working on the helicopter can have their workstation with manuals, drawings, our management software, email, etc. all right next to whichever aircraft they are working on. The UPS allows them to unplug and move without powering down, and the printer is right there too. I've got a second one running with the USB Manager plugin, and now the guys can plug in their cameras and phones to get pics off them as well. It's really working well.

  6. It would definitely be used because I've got 5 VMs with a 4-port USB 3.0 hub each, and 2 mappings per port = 40 mapped ports.

    I also found that if you have more than, say, 8 ports mapped, the unRAID GUI button for going to the top of the page interferes with the X to remove the mapping on the last mapping entry. It's still able to be selected; you just need very fine mouse control. See if you can replicate this. Thanks.

  7. I have the Hydra2 machine in my office and plugged a USB 3.0 card into the open PCIe slot. It came up in its own IOMMU group.


     

    I can now confirm that all PCIe slots come up in their own IOMMU group.

    Testing the Video Card issue in this slot that I was having before. Trying to disable the auto SLI bridging in the BIOS.

     

    UPDATE: Cannot get passthrough to work with video cards in both PCIe x16 slots. The motherboard tries to run both cards in SLI mode. Solution: run anything but a video card in the second PCIe x16 slot.

  8. Here is my output from the Tools > PCI Devices and IOMMU Groups

    I'm pretty sure each PCIe slot is in its own IOMMU group and I'm NOT using the ACS patch.

     

    Note that due to something with the motherboard BIOS, I'm unable to pass through the video cards in the PCIe x16 slots to different VMs. Something to do with an auto SLI system in the BIOS.

     

    I have another unit with the same motherboard that I'm trying to repeat the process with. If I figure out how to disable the auto SLI feature I'll let you know.

    I was having success with an AMD video card and an NVIDIA card in these slots because they shouldn't be SLI compatible, but it caused more errors than just having another NVIDIA card in the second PCIe x16 slot and not using it.

     

    My current setup has been so stable that I forget about it most of the time.

     

    IOMMU group 0:    [1022:1482] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
    IOMMU group 1:    [1022:1483] 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
    IOMMU group 2:    [1022:1482] 00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
    IOMMU group 3:    [1022:1482] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
    IOMMU group 4:    [1022:1483] 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
    IOMMU group 5:    [1022:1483] 00:03.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge
    IOMMU group 6:    [1022:1482] 00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
    IOMMU group 7:    [1022:1482] 00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
    IOMMU group 8:    [1022:1482] 00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
    IOMMU group 9:    [1022:1484] 00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
    IOMMU group 10:    [1022:1482] 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
    IOMMU group 11:    [1022:1484] 00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]
    IOMMU group 12:    [1022:790b] 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)
    [1022:790e] 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
    IOMMU group 13:    [1022:1440] 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0
    [1022:1441] 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1
    [1022:1442] 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2
    [1022:1443] 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3
    [1022:1444] 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4
    [1022:1445] 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5
    [1022:1446] 00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6
    [1022:1447] 00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7
    IOMMU group 14:    [1022:57ad] 01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse Switch Upstream
    IOMMU group 15:    [1022:57a3] 02:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    IOMMU group 16:    [1022:57a3] 02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    IOMMU group 17:    [1022:57a3] 02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    IOMMU group 18:    [1022:57a3] 02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    IOMMU group 19:    [1022:57a3] 02:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    IOMMU group 20:    [1022:57a3] 02:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    IOMMU group 21:    [1022:57a4] 02:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    [1022:1485] 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
    [1022:149c] 09:00.1 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
    [1022:149c] 09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
    IOMMU group 22:    [1022:57a4] 02:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    [1022:7901] 0a:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
    IOMMU group 23:    [1022:57a4] 02:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge
    [1022:7901] 0b:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
    IOMMU group 24:    [144d:a808] 03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
    IOMMU group 25:    [10de:128b] 04:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
    [10de:0e0f] 04:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
    IOMMU group 26:    [10de:128b] 05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
    [10de:0e0f] 05:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
    IOMMU group 27:    [10de:128b] 06:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
    [10de:0e0f] 06:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
    IOMMU group 28:    [8086:1539] 07:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
    IOMMU group 29:    [10de:128b] 08:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
    [10de:0e0f] 08:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
    IOMMU group 30:    [10de:128b] 0c:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
    [10de:0e0f] 0c:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
    IOMMU group 31:    [1002:6779] 0d:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos [Radeon HD 6450/7450/8450 / R5 230 OEM]
    [1002:aa98] 0d:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Caicos HDMI Audio [Radeon HD 6450 / 7450/8450/8490 OEM / R5 230/235/235X OEM]
    IOMMU group 32:    [1022:148a] 0e:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
    IOMMU group 33:    [1022:1485] 0f:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
    IOMMU group 34:    [1022:1486] 0f:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
    IOMMU group 35:    [1022:149c] 0f:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
    IOMMU group 36:    [1022:1487] 0f:00.4 Audio device: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
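
Devices that share an IOMMU group generally have to be passed through together, so it can help to check the grouping with a script instead of by eye. The following is a rough sketch that parses text in the layout pasted above. The function name `parse_groups` is made up, and the sample is just two groups copied from the listing.

```python
import re

# "IOMMU group N:" header, optionally followed by the first device on the same line.
GROUP_RE = re.compile(r"^IOMMU group (\d+):\s*(.*)$")
# "[vendor:device] bus:slot.func description"
DEVICE_RE = re.compile(
    r"^\[([0-9a-f]{4}:[0-9a-f]{4})\]\s+([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$"
)

def parse_groups(text):
    """Map each IOMMU group number to a list of (pci_address, description)."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = GROUP_RE.match(line)
        if header:
            current = int(header.group(1))
            groups[current] = []
            line = header.group(2).strip()
        device = DEVICE_RE.match(line)
        if device and current is not None:
            groups[current].append((device.group(2), device.group(3)))
    return groups

SAMPLE = """
IOMMU group 25:    [10de:128b] 04:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)
[10de:0e0f] 04:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)
IOMMU group 28:    [8086:1539] 07:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)
"""

groups = parse_groups(SAMPLE)
# Groups with more than one device are the ones that travel together.
shared = {g: devs for g, devs in groups.items() if len(devs) > 1}
for g, devs in shared.items():
    print(g, [addr for addr, _ in devs])
```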

     

  9. Finally found this, looks like I got a bad stick of RAM.

    I'm running ECC RAM and reseated the RAM during the first server crash.

    memtest didn't show anything after a quick run, and I didn't have time to do a full test.

    Heat is not an issue with my setup. I have a good quality 750W PSU.

    Going to swap the DIMM on channel 0 with the one in channel 3 and see if this shows up again.

     

    Mar  4 07:03:17 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:04:56 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:15:59 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:19:42 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:22:31 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:23:28 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:25:47 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:26:18 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:26:38 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:55:16 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:58:00 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 07:59:02 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 08:01:05 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 08:09:12 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 08:09:58 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 08:10:03 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 08:18:00 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 08:21:35 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)
    Mar  4 08:21:56 kernel: EDAC MC0: 1 CE ie31200 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x0 offset:0x0 grain:8 syndrome:0x52)

     

    Edit: To help find what DIMM is having the error:

    root@system~: grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count

    /sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow1/ch0_ce_count:34 <- ERROR COUNT /mc0/csrow1/ch0
    /sys/devices/system/edac/mc/mc0/csrow1/ch1_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow2/ch1_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow3/ch0_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow3/ch1_ce_count:0
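
If the counters need checking more than once, the grep output can be filtered down to just the non-zero entries. A small sketch, assuming the output has the path:count layout shown above. The function name `nonzero_counters` is made up for illustration.

```python
import re

# Pulls the memory controller, csrow, channel and count out of one grep output line.
CE_RE = re.compile(r"/mc/mc(\d+)/csrow(\d+)/ch(\d+)_ce_count:(\d+)")

def nonzero_counters(lines):
    """Return (mc, csrow, channel, count) for every counter above zero."""
    hits = []
    for line in lines:
        match = CE_RE.search(line)
        if match:
            mc, csrow, channel, count = map(int, match.groups())
            if count > 0:
                hits.append((mc, csrow, channel, count))
    return hits

sample = [
    "/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0",
    "/sys/devices/system/edac/mc/mc0/csrow1/ch0_ce_count:34",
    "/sys/devices/system/edac/mc/mc0/csrow1/ch1_ce_count:0",
]
hits = nonzero_counters(sample)
for mc, csrow, channel, count in hits:
    print(f"mc{mc} csrow{csrow} ch{channel}: {count} corrected errors")
```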

     

    mcX = Memory Controller (single, dual CPU)

    chX = Channel 0, Channel 1, Channel 3 (single, dual, triple Channel RAM)

    csrowX = see chart

     

                    Channel 0     Channel 1     Channel 3
    ============================================
    csrow0  |  DIMM_A0   |   DIMM_B0   |   DIMM_C0   |
    csrow1   |  DIMM_A0   |   DIMM_B0   |   DIMM_C0   |
    ============================================
    ============================================
    csrow2   |  DIMM_A1   |   DIMM_B1   |   DIMM_C0   |
    csrow3   |  DIMM_A1   |   DIMM_B1   |   DIMM_C0   |
    ============================================

    ============================================
    csrow4   |  DIMM_A1   |   DIMM_B1   |   DIMM_C0   |
    csrow5   |  DIMM_A1   |   DIMM_B1   |   DIMM_C0   |
    ============================================

    ============================================
    csrow6   |  DIMM_A1   |   DIMM_B1   |   DIMM_C0   |
    csrow7   |  DIMM_A1   |   DIMM_B1   |   DIMM_C0   |
    ============================================

     

    root@system~: dmidecode -t memory | grep 'Locator'

    Locator: DIMMA1 <- THIS ONE - DIMM_A0
    Bank Locator: P0_Node0_Channel0_Dimm0
    Locator: DIMMA2
    Bank Locator: P0_Node0_Channel0_Dimm1
    Locator: DIMMB1
    Bank Locator: P0_Node0_Channel1_Dimm0
    Locator: DIMMB2
    Bank Locator: P0_Node0_Channel1_Dimm1

     

    EDIT: Moved the stick of RAM and got an error in another slot. Ordering new stick of RAM.

    /sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow1/ch0_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow1/ch1_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow2/ch0_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow2/ch1_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow3/ch0_ce_count:0
    /sys/devices/system/edac/mc/mc0/csrow3/ch1_ce_count:1 <- NEW ERROR

     

    EDIT2: 

    /sys/devices/system/edac/mc/mc0/csrow3/ch1_ce_count:2 <- NEW ERROR

    Errors have slowed but at least I have a second error now. RAM incoming.

     

    EDIT3:
    Replaced the defective DIMM, 4 days with no errors. Turned on WSD. Watching logs and CPU usage. WSD script still running.

     

    EDIT4:

    No more memory errors and WSD is running without eating a full CPU. Calling this solved.

     

     

  10. This is the issue I'm having: 1 CPU pegged at 100%, wsdd is the process, and this goes on until the kernel starts panicking or running out of memory.

    Trying the restart script at the end of the thread and will report back.

     

    https://forums.unraid.net/topic/85073-wsdd-100-using-1-core/page/2/

     

    EDIT: Script did not reset the 100% CPU usage. Disabled WSD. Kept script running for now.

  11. I'm not good with decoding syslog kernel panics and need some help please.

     

    System Specs:

    Supermicro X11SSH-LN4F - C-states disabled in BIOS

    Xeon E3-1240 v6

    64 GB DDR4 ECC

     

    2 ZFS pools, one Nytro SSD pool and one HDD pool

    UNRAID disk is on an nvme, using a Samsung Fit flash drive.

    No longer running VMs or Dockers but ran stable with them.

     

    Got some remote replication to another machine.

    Enabled saving syslog to the flash drive and waited for another failure.

     

    Found a kernel panic on Feb 23rd at 1412, and it repeats going forward. The system worked but was cranky.

    System gets worse around Feb 24th 1400.

    Out of memory errors at Feb 24th 1408 and repeats. kernel: Out of memory: Kill process 12302 (monitor) score 0 or sacrifice child 

    System required hard reboot Feb 24th 1443.

     

    Been digging through forums and google and come up with possible bad RAM issues but it's ECC RAM.

    Possible leftover routes from VMs and Dockers. Easy to remove, but I can't see that doing it.

    Takes about 6 days to blow up. Currently rebooting on day 4.

     

    There is a record of a boot where the two pools tried mounting to the same mount point. Ignore that, it got fixed.

    Another Kernel panic at Feb 24th 17:24 but it doesn't repeat. System stable from this point on.

     

    Thanks in advance.

     

    syslog

  12. What I'm trying to do, and am mostly successful at so far, is create an UNRAID box with 5 GPUs, monitors, keyboards and mice so I can have multiple physical Windows 10 VMs running at once.

    I've got an ASUS PRIME X-570-PRO board because it has 6 PCIe slots, with a Ryzen 9 3900X CPU, 32 GB of RAM and 6 nVidia GT 710 PCIe 1x video cards with good HDMI cables.

    Storage is a single Samsung 512 NVMe. PSU is a Corsair RM850x. Case is a rack mount 4U with plenty of fans. BIOS on the Motherboard is latest stable release.

    I've read LOTS of forums to get to this point, so let me list the successes so far:

     

    Successes:

    - All Video cards are in their own IOMMU groups with their respective sound devices

    EDIT: See post below about USB Manager - skips this whole USB XML mess

    - Using Logitech wireless keyboards and mice with a USB 3.0 7-port hub, I was able to map the Logitech Unifier based on its address bus and device number.

    The sucky thing is that if you make a change to the VM, you have to uncheck all the Logitech devices, save it, then add the XML code manually afterwards.

    XML example:

    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x046d'/> <!-- This number is the same for every Logitech Unifier -->
        <product id='0xc52b'/> <!-- This number as well -->
        <address bus='5' device='4'/> <!-- This is where the magic is. -->
      </source>
      <address type='usb' bus='0' port='1'/> <!-- This, I believe, relates to the USB hub. -->
    </hostdev>
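
Since the block has to be re-added by hand after every VM change, generating it from the bus and device numbers saves some typing. A minimal sketch using Python's standard xml.etree.ElementTree. The function name `logitech_hostdev` is made up, and the guest-side `<address type='usb' .../>` line is left out on the assumption that it does not need to be set by hand, which has not been verified here.

```python
import xml.etree.ElementTree as ET

def logitech_hostdev(bus, device, vendor="0x046d", product="0xc52b"):
    """Build a <hostdev> element for one receiver at a given host bus/device."""
    hostdev = ET.Element("hostdev", mode="subsystem", type="usb", managed="no")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(source, "vendor", id=vendor)      # same for every receiver
    ET.SubElement(source, "product", id=product)    # same for every receiver
    # The host bus and device number are what tell the receivers apart.
    ET.SubElement(source, "address", bus=str(bus), device=str(device))
    return hostdev

elem = logitech_hostdev(bus=5, device=4)
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)
```

The printed element can then be pasted into the `<devices>` section of the VM's XML.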

    - Dumping the BIOS of all the video cards. Even though they are all the same, you need a separate file for each VM

    - Creating a Windows 10 version 1909 VM with hardware GPU passthrough and a specific Logitech keyboard and mouse works just fine.

    - I can even use the primary video card as a passthrough, so I don't need 6 video cards, I only need 5.

     

    Where It Fails.

    - The 5th VM just doesn't want to accept the driver for the video card. It'll run at the standard resolution of a basic video adapter (800x600) and the device manager has an exclamation mark on the GT 710 saying there was an issue and shut it down. In the Device Listing below it shows cards at addresses:

    04:00.0 - Works

    05:00.0 - Works

    06:00.0 - Works (this one usually flakes out)

    08:00.0 - Works

    0c:00.0 - Works (Primary Video Card)

    Edit: This is the first time I've had it boot with this hardware addressing scheme and I think it's working.

     

    What I have tried.

    - Swapping all the video cards around. (possible bad card)

    - Dumping the specific BIOS out of each card and matching them up with their serial numbers. (the serial numbers are almost sequential)

    - Going from 6 cards to 5 cards (using the primary card as passthrough)

    - Put in a GT 730 video card (which works but still number 5 is not alive) 

     

    What I have noticed.

    - The problem gets worse if you add the sound devices to the VMs. (I only get 2 VMs to work at proper resolution)

     

    What I have yet to try.

    - Now that I have 5 video cards I'm going to cycle through the empty slot. (Change which slot is empty)

     

    The Ask: HELP! Edit: I helped myself by moving the video cards around.

    I've posted the System Information and one of the VMs XML files. I hope I'm missing something.

     

    The Purpose: Will be revealed upon success. It's 2 parts Fun and 1 part Work.

    Edit: It's a mobile 5-workstation machine for light-duty work on a hangar floor. 1 power cord, 1 ethernet cable. The fun part is where I load up a random 4-5 player retro arcade game for Fridays (Bomberman, Gauntlet, etc.)

     

    SOLUTION: Move cards around and try different address configurations until one works.

    SOLVED: Not until this thing is stable.

    SECOND ASK: ANY IDEA WHY?

     

    Thank you,

    Holmesware

     

    IOMMU group 0:[1022:1482] 00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    IOMMU group 1:[1022:1483] 00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge

    IOMMU group 2:[1022:1482] 00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    IOMMU group 3:[1022:1482] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    IOMMU group 4:[1022:1483] 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge

    IOMMU group 5:[1022:1482] 00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    IOMMU group 6:[1022:1482] 00:05.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    IOMMU group 7:[1022:1482] 00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    IOMMU group 8:[1022:1484] 00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]

    IOMMU group 9:[1022:1482] 00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge

    IOMMU group 10:[1022:1484] 00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Internal PCIe GPP Bridge 0 to bus[E:B]

    IOMMU group 11:[1022:790b] 00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 61)

    [1022:790e] 00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)

    IOMMU group 12:[1022:1440] 00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0

    [1022:1441] 00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1

    [1022:1442] 00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2

    [1022:1443] 00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3

    [1022:1444] 00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4

    [1022:1445] 00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5

    [1022:1446] 00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6

    [1022:1447] 00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7

    IOMMU group 13:[1022:57ad] 01:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse Switch Upstream

    IOMMU group 14:[1022:57a3] 02:01.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    IOMMU group 15:[1022:57a3] 02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    IOMMU group 16:[1022:57a3] 02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    IOMMU group 17:[1022:57a3] 02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    IOMMU group 18:[1022:57a3] 02:05.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    IOMMU group 19:[1022:57a3] 02:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    IOMMU group 20:[1022:57a4] 02:08.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:1485] 09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP

    [1022:149c] 09:00.1 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller

    [1022:149c] 09:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller

    IOMMU group 21:[1022:57a4] 02:09.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:7901] 0a:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)

    IOMMU group 22:[1022:57a4] 02:0a.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Matisse PCIe GPP Bridge

    [1022:7901] 0b:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)

    IOMMU group 23:[144d:a808] 03:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983

    IOMMU group 24:[10de:128b] 04:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)

    [10de:0e0f] 04:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)

    IOMMU group 25:[10de:128b] 05:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)

    [10de:0e0f] 05:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)

    IOMMU group 26:[10de:128b] 06:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)

    [10de:0e0f] 06:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)

    IOMMU group 27:[8086:1539] 07:00.0 Ethernet controller: Intel Corporation I211 Gigabit Network Connection (rev 03)

    IOMMU group 28:[10de:128b] 08:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)

    [10de:0e0f] 08:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)

    IOMMU group 29:[10de:128b] 0c:00.0 VGA compatible controller: NVIDIA Corporation GK208B [GeForce GT 710] (rev a1)

    [10de:0e0f] 0c:00.1 Audio device: NVIDIA Corporation GK208 HDMI/DP Audio Controller (rev a1)

    IOMMU group 30:[1022:148a] 0d:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function

    IOMMU group 31:[1022:1485] 0e:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP

    IOMMU group 32:[1022:1486] 0e:00.1 Encryption controller: Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP

    IOMMU group 33:[1022:149c] 0e:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller

    IOMMU group 34:[1022:1487] 0e:00.4 Audio device: Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller

     

    <?xml version='1.0' encoding='UTF-8'?>
    <domain type='kvm'>
      <name>Windows10 2</name>
      <uuid>0f65f96e-2665-8b69-9f1e-25af41d896cf</uuid>
      <description>Hydra2</description>
      <metadata>
        <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
      </metadata>
      <memory unit='KiB'>8388608</memory>
      <currentMemory unit='KiB'>8388608</currentMemory>
      <memoryBacking>
        <nosharepages/>
      </memoryBacking>
      <vcpu placement='static'>4</vcpu>
      <cputune>
        <vcpupin vcpu='0' cpuset='2'/>
        <vcpupin vcpu='1' cpuset='14'/>
        <vcpupin vcpu='2' cpuset='3'/>
        <vcpupin vcpu='3' cpuset='15'/>
      </cputune>
      <os>
        <type arch='x86_64' machine='pc-q35-4.2'>hvm</type>
        <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
        <nvram>/etc/libvirt/qemu/nvram/0f65f96e-2665-8b69-9f1e-25af41d896cf_VARS-pure-efi.fd</nvram>
      </os>
      <features>
        <acpi/>
        <apic/>
        <hyperv>
          <relaxed state='on'/>
          <vapic state='on'/>
          <spinlocks state='on' retries='8191'/>
          <vendor_id state='on' value='none'/>
        </hyperv>
      </features>
      <cpu mode='host-passthrough' check='none'>
        <topology sockets='1' cores='2' threads='2'/>
        <cache mode='passthrough'/>
        <feature policy='require' name='topoext'/>
      </cpu>
      <clock offset='localtime'>
        <timer name='hypervclock' present='yes'/>
        <timer name='hpet' present='no'/>
      </clock>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>restart</on_crash>
      <devices>
        <emulator>/usr/local/sbin/qemu</emulator>
        <disk type='file' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source file='/mnt/disk1/domains/Windows10-gfx08_2/vdisk1.img'/>
          <target dev='hdc' bus='virtio'/>
          <boot order='1'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <source file='/mnt/user/isos/virtio-win-0.1.173-2.iso'/>
          <target dev='hdb' bus='sata'/>
          <readonly/>
          <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>
        <controller type='usb' index='0' model='qemu-xhci' ports='15'>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
        </controller>
        <controller type='pci' index='0' model='pcie-root'/>
        <controller type='pci' index='1' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='1' port='0x8'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
        </controller>
        <controller type='pci' index='2' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='2' port='0x9'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
        </controller>
        <controller type='pci' index='3' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='3' port='0xa'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
        </controller>
        <controller type='pci' index='4' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='4' port='0xb'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
        </controller>
        <controller type='pci' index='5' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='5' port='0xc'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
        </controller>
        <controller type='pci' index='6' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='6' port='0xd'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
        </controller>
        <controller type='pci' index='7' model='pcie-to-pci-bridge'>
          <model name='pcie-pci-bridge'/>
          <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
        </controller>
        <controller type='virtio-serial' index='0'>
          <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
        </controller>
        <controller type='sata' index='0'>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
        </controller>
        <interface type='bridge'>
          <mac address='52:54:00:3b:f9:82'/>
          <source bridge='br0'/>
          <model type='virtio'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
        </interface>
        <serial type='pty'>
          <target type='isa-serial' port='0'>
            <model name='isa-serial'/>
          </target>
        </serial>
        <console type='pty'>
          <target type='serial' port='0'/>
        </console>
        <channel type='unix'>
          <target type='virtio' name='org.qemu.guest_agent.0'/>
          <address type='virtio-serial' controller='0' bus='0' port='1'/>
        </channel>
        <input type='mouse' bus='ps2'/>
        <input type='keyboard' bus='ps2'/>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0b' slot='0x00' function='0x0'/>
          </source>
          <rom file='/mnt/disk1/isos/Hydra2-710.rom'/>
          <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
        </hostdev>
        <hostdev mode='subsystem' type='usb' managed='no'>
          <source>
            <vendor id='0x046d'/>
            <product id='0xc52b'/>
            <address bus='5' device='3'/>
          </source>
          <address type='usb' bus='0' port='1'/>
        </hostdev>
        <memballoon model='none'/>
      </devices>
    </domain>
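For anyone comparing the XML above against the IOMMU list, here is a rough sketch (not part of my original setup) that pulls the host-side PCI address out of each `<hostdev type='pci'>` entry and prints it in the same `bus:slot.function` form the IOMMU listing uses. The function name `hostdev_pci_addresses` is made up for this example, and it assumes the XML was saved to a file or pasted in as a string. It only reads the `<source><address>` element, which is the host device, not the guest-side `<address type='pci'>`.

```python
import xml.etree.ElementTree as ET


def hostdev_pci_addresses(domain_xml: str) -> list[str]:
    """Return host PCI addresses (e.g. '0b:00.0') passed through in a libvirt domain XML.

    Hypothetical helper for illustration; only <hostdev type='pci'> entries are considered.
    """
    root = ET.fromstring(domain_xml)
    found = []
    for hostdev in root.iter("hostdev"):
        # Skip USB and other non-PCI passthrough entries.
        if hostdev.get("type") != "pci":
            continue
        # The address under <source> is the host device; the sibling
        # <address type='pci'> is where it appears inside the guest.
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        bus = int(addr.get("bus"), 16)
        slot = int(addr.get("slot"), 16)
        func = int(addr.get("function"), 16)
        found.append(f"{bus:02x}:{slot:02x}.{func:x}")
    return found
```

Run against the XML above, the single PCI hostdev has source bus `0x0b`, slot `0x00`, function `0x0`, so it should come out as `0b:00.0`, which can then be looked up in the IOMMU group list.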

     
