
Posts posted by eribob

  1. 37 minutes ago, JorgeB said:

    Diags are after rebooting so we can't see what happened, but CRC errors are a connection problem, usually a bad SATA cable.

    Since replacing the SATA cable did not help, is there anything else in the SMART report that indicates why the drive failed? The drive is a Seagate IronWolf 4TB.

     

    #	ATTRIBUTE NAME	FLAG	VALUE	WORST	THRESHOLD	TYPE	UPDATED	FAILED	RAW VALUE
    1	Raw read error rate	0x000f	077	064	044	Pre-fail	Always	Never	55465484
    3	Spin up time	0x0003	095	093	000	Pre-fail	Always	Never	0
    4	Start stop count	0x0032	100	100	020	Old age	Always	Never	132
    5	Reallocated sector count	0x0033	100	100	010	Pre-fail	Always	Never	0
    7	Seek error rate	0x000f	089	060	045	Pre-fail	Always	Never	803115697
    9	Power on hours	0x0032	088	088	000	Old age	Always	Never	11183 (164 86 0)
    10	Spin retry count	0x0013	100	100	097	Pre-fail	Always	Never	0
    12	Power cycle count	0x0032	100	100	020	Old age	Always	Never	132
    184	End-to-end error	0x0032	100	100	099	Old age	Always	Never	0
    187	Reported uncorrect	0x0032	100	100	000	Old age	Always	Never	0
    188	Command timeout	0x0032	100	099	000	Old age	Always	Never	1
    189	High fly writes	0x003a	100	100	000	Old age	Always	Never	0
    190	Airflow temperature cel	0x0022	071	062	040	Old age	Always	Never	29 (min/max 29/30)
    191	G-sense error rate	0x0032	100	100	000	Old age	Always	Never	0
    192	Power-off retract count	0x0032	100	100	000	Old age	Always	Never	3
    193	Load cycle count	0x0032	071	071	000	Old age	Always	Never	59695
    194	Temperature celsius	0x0022	029	040	000	Old age	Always	Never	29 (0 17 0 0 0)
    197	Current pending sector	0x0012	100	100	000	Old age	Always	Never	0
    198	Offline uncorrectable	0x0010	100	100	000	Old age	Offline	Never	0
    199	UDMA CRC error count	0x003e	200	199	000	Old age	Always	Never	42
    240	Head flying hours	0x0000	100	253	000	Old age	Offline	Never	3737 (41 241 0)
    241	Total lbas written	0x0000	100	253	000	Old age	Offline	Never	30794778888
    242	Total lbas read	0x0000	100	253	000	Old age	Offline	Never	237940858011

     

    /Erik
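For reference, attribute 199 can be read straight from `smartctl` output. A minimal sketch that parses it from saved output; the here-doc line below is illustrative (in smartctl's usual column format), not taken from this report, and on a live system you would pipe `smartctl -A /dev/sdX` in instead:

```shell
# Extract the UDMA CRC error count from `smartctl -A`-style output.
# The sample line is illustrative; on a real system replace the here-doc with:
#   smartctl -A /dev/sdX | awk '/UDMA_CRC_Error_Count/ { print "CRC errors:", $NF }'
awk '/UDMA_CRC_Error_Count/ { print "CRC errors:", $NF }' <<'EOF'
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       42
EOF
```

A rising raw value here points at the link (cable, backplane, or controller port) rather than the platters, which is why the attribute alone rarely fails a drive.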

  2. Thank you for the suggestion.

    Unfortunately, replacing the SATA cable did not help (the drive is connected to an HBA card, so I switched it to another SATA connector on that card). The drive is still disabled by Unraid. All other drives connected to that HBA card seem to be working normally.

     

    The UDMA CRC error count has increased from about 5 to 42 over the last couple of months.

     

    /Erik

  3. Hi! 

    One of my drives failed today. I have attached the diagnostics. The SMART error I think is the culprit is "UDMA CRC Error Count". It does not always seem to be so serious, though; do you have any thoughts on it?

     

    I have more space than I need on the array and would like to remove the drive from the array but keep the data that was on it. The drive is 4TB and I have more than 4TB free on the array. The guide in the Unraid wiki is a bit confusing; I would be very happy if someone could walk me through how to achieve this.

     

    Thank you!

    Erik

    monsterservern-diagnostics-20210419-1028.zip

  4. Same problem here. 

     

    System log is suddenly full of these messages: 

    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [alert] 7525#7525: worker process 21758 exited on signal 11
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [crit] 21883#21883: ngx_slab_alloc() failed: no memory
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21883#21883: shpool alloc failed
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21883#21883: nchan: Out of shared memory while allocating channel /var. Increase nchan_max_reserved_memory.
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [alert] 21883#21883: *483824 header already sent while keepalive, client: 192.168.2.165, server: 0.0.0.0:80
    Nov 29 10:05:45 Monsterservern kernel: nginx[21883]: segfault at 0 ip 0000000000000000 sp 00007fff9d8f5f58 error 14 in nginx[400000+22000]
    Nov 29 10:05:45 Monsterservern kernel: Code: Bad RIP value.
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [alert] 7525#7525: worker process 21883 exited on signal 11
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [crit] 21884#21884: ngx_slab_alloc() failed: no memory
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: shpool alloc failed
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: nchan: Out of shared memory while allocating channel /disks. Increase nchan_max_reserved_memory.
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: *483826 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/disks?buffer_length=1 HTTP/1.1", host: "localhost"
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [crit] 21884#21884: ngx_slab_alloc() failed: no memory
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: shpool alloc failed
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: nchan: Out of shared memory while allocating channel /cpuload. Increase nchan_max_reserved_memory.
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: *483827 nchan: error publishing message (HTTP status code 507), client: unix:, server: , request: "POST /pub/cpuload?buffer_length=1 HTTP/1.1", host: "localhost"
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [crit] 21884#21884: ngx_slab_alloc() failed: no memory
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: shpool alloc failed
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [error] 21884#21884: nchan: Out of shared memory while allocating channel /dockerload. Increase nchan_max_reserved_memory.
    Nov 29 10:05:45 Monsterservern nginx: 2020/11/29 10:05:45 [alert] 21884#21884: *483828 header already sent while keepalive, client: 192.168.2.165, server: 0.0.0.0:80
    Nov 29 10:05:45 Monsterservern kernel: nginx[21884]: segfault at 0 ip 0000000000000000 sp 00007fff9d8f5f58 error 14 in nginx[400000+22000]

    On the dashboard the memory for the log is 100% full. 

    monsterservern-diagnostics-20201129-1235.zip
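Since the nchan "out of shared memory" failures above are a symptom of the log filesystem filling up, a quick way to confirm it and find the offender is the sketch below (generic commands; the path assumes Unraid's standard small `/var/log` tmpfs):

```shell
# Check how full the log filesystem is, then list its largest entries.
# On Unraid, /var/log is a small tmpfs; once it reaches 100%, nginx/nchan
# starts failing in exactly the way the log excerpt above shows.
df -h /var/log
du -ah /var/log 2>/dev/null | sort -rh | head -n 5
```

Whatever is spamming the log (often one chatty service) shows up at the top of the `du` listing.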

  5. Update! 

    I ran a Memtest and after about 15 minutes I got a lot of errors. So I removed my two oldest RAM sticks and re-ran the test for about 25 minutes without errors. I know that is a bit short (not even one pass, hehe), but I figured that since I got the errors so soon the first time, I would get them again if the remaining RAM sticks were the faulty ones.

     

    So it was probably a memory issue? I just hope that I will not get any more corruption in my BTRFS now... fingers crossed. 

     

    I also ran "btrfs check --readonly /dev/nvme0n1p1" and "btrfs check --readonly /dev/nvme1n1p1" (the two disks that are part of the BTRFS pool in question) and got no errors. Can I then assume that my BTRFS filesystem is intact for that pool?

     

    BIG thanks!

     

    /Erik

  6. Hi,

    The solution worked for a couple of days, but just now one of my BTRFS pools went into read-only mode again. I changed my RAM to 2133 MHz (the "auto" setting in the BIOS).

     

    The system log says the following: 

    Nov 22 19:44:56 Monsterservern kernel: BTRFS error (device nvme0n1p1): block=1141445836800 write time tree block corruption detected
    Nov 22 19:44:56 Monsterservern kernel: BTRFS: error (device nvme0n1p1) in btrfs_commit_transaction:2323: errno=-5 IO failure (Error while writing out transaction)
    Nov 22 19:44:56 Monsterservern kernel: BTRFS info (device nvme0n1p1): forced readonly
    Nov 22 19:44:56 Monsterservern kernel: BTRFS warning (device nvme0n1p1): Skipping commit of aborted transaction.
    Nov 22 19:44:56 Monsterservern kernel: BTRFS: error (device nvme0n1p1) in cleanup_transaction:1894: errno=-5 IO failure

     

    Diagnostics are attached.

     

    What is the problem? It is really annoying now...

     

    /Erik

    monsterservern-diagnostics-20201122-1953.zip

  7. Wow, that is great help! I was wondering why these issues were building up. So since I run 4 RAM sticks I should limit them to 2667 MHz? I guess both pools need to be reformatted then. Is it worth trying a "btrfs check --repair" first? It seems that it can corrupt your pool, but I have nothing to lose if I am about to wipe it anyway? In that case, can you give me an example of how to run such a command?

     

    Also, what is the easiest way to format the cache pool? 

     

    Thanks! 

    Erik
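In case it helps a future reader, the repair command asked about above would be run roughly as below. This is only a sketch shown as a dry run (it echoes the commands instead of executing them), with an assumed device name; `--repair` is widely documented as a last resort, which is why the read-only check comes first:

```shell
# Dry-run sketch of a btrfs repair sequence. DEV is a placeholder device;
# drop the `run` wrapper to actually execute, and only do so on an unmounted
# pool, after a --readonly check, since --repair can make corruption worse.
DEV=/dev/nvme0n1p1
run() { printf '+ %s\n' "$*"; }   # echo instead of execute
run umount "$DEV"
run btrfs check --readonly "$DEV"
run btrfs check --repair "$DEV"
```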

  8. My cache array went to "read only file system" today, and the following message is repeated many times in the system log:

    Nov 14 21:18:16 Monsterservern kernel: blk_update_request: I/O error, dev loop2, sector 6665664 op 0x1:(WRITE) flags 0x100000 phys_seg 4 prio class 0
    Nov 14 21:18:16 Monsterservern kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 136, rd 0, flush 0, corrupt 0, gen 0

    Is the BTRFS corrupted? Why did it happen? How can I fix it?

     

    The exact same thing actually happened to my other BTRFS pool today as well. I restarted the server and it works normally again, but after running a BTRFS scrub I get "uncorrectable errors" in that pool: 

    UUID:             109edb7d-32a7-4c8c-9dfd-d8901216e5e1
    Scrub started:    Sat Nov 14 09:47:37 2020
    Status:           finished
    Duration:         0:03:25
    Total to scrub:   786.37GiB
    Rate:             3.83GiB/s
    Error summary:    csum=6
      Corrected:      0
      Uncorrectable:  6
      Unverified:     0

    Attached diagnostics.

    cache_filesystem_corrupted_201114.zip
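When a scrub reports uncorrectable csum errors like the summary above, the kernel log normally names the affected files, which makes it possible to restore just those from backup. A sketch of pulling the paths out; the here-doc is a fabricated sample line in the kernel's usual format, and on a real system you would grep the syslog or `dmesg` output instead:

```shell
# Pull file paths out of btrfs checksum-error kernel log lines.
# The sample input is fabricated; grep /var/log/syslog on a real system.
grep -o 'path: [^)]*' <<'EOF'
BTRFS warning (device loop2): checksum error at logical 5619712 on dev /dev/loop2, root 5, inode 257, offset 0, length 4096, links 1 (path: domains/vm1.img)
EOF
```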

  9. 10 hours ago, Decto said:

    Have a look here, poster seems to have the same board working, it may help with the USB.

     

    Integrated USB can be a challenge to pass through fully; often people fall back to an add-in card.

     

     

     

    Edit

    Also seems to be an AMD bug with passthrough

    Thanks! I have seen the Taichi guide and followed it. My BIOS is updated (a couple of months ago when I bought the board) and I applied the settings and XML he recommended.
     

    The other thread about the 5.8 kernel was very interesting, though! I have also tried passing through onboard audio before, and that resulted in the server hanging completely as well.
     

    I would like to try the upgrade, but I saw that some people have trouble passing through hard drives by id (/dev/disk/by-id/XX) after upgrading, so I will wait, since I use that for my VM. Also, I have an LSI 9211-8i SAS controller, and some SAS controllers stop working with beta 29 (not sure about mine, though).

  10. 49 minutes ago, Decto said:

    To the best of my knowledge.

     

    If you want to install drivers for the mouse then you need to isolate and pass through a USB controller, which you can then connect a mouse to and install device drivers for. Same for the WiFi, though that is likely to be difficult to isolate depending on your IOMMU groups.

     

    Unless the devices are isolated, Linux drivers are in use, so while the VM will translate the function, you cannot install a second layer of device drivers inside the VM.

     


    Thank you. Both the usb and wifi controllers are isolated and passed through to the VM. The mouse is connected to the passed through usb ports.

  11. I have a Windows 10 VM with a Logitech mouse attached to it via USB passthrough. It works fine, but when I try to install the drivers the VM hangs with a blue screen. The same problem occurred earlier when I tried to install drivers for the WiFi/Bluetooth that is passed through from the motherboard.

     

    Attached diagnostics

     

    Here is my VM XML

    <?xml version='1.0' encoding='UTF-8'?>
    <domain type='kvm' id='5' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
      <name>Windows</name>
      <uuid>8dcf9d3b-6010-947b-543a-6216bf778f0b</uuid>
      <description>Main computer</description>
      <metadata>
        <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
      </metadata>
      <memory unit='KiB'>33554432</memory>
      <currentMemory unit='KiB'>33554432</currentMemory>
      <memoryBacking>
        <nosharepages/>
      </memoryBacking>
      <vcpu placement='static'>16</vcpu>
      <cputune>
        <vcpupin vcpu='0' cpuset='8'/>
        <vcpupin vcpu='1' cpuset='24'/>
        <vcpupin vcpu='2' cpuset='9'/>
        <vcpupin vcpu='3' cpuset='25'/>
        <vcpupin vcpu='4' cpuset='10'/>
        <vcpupin vcpu='5' cpuset='26'/>
        <vcpupin vcpu='6' cpuset='11'/>
        <vcpupin vcpu='7' cpuset='27'/>
        <vcpupin vcpu='8' cpuset='12'/>
        <vcpupin vcpu='9' cpuset='28'/>
        <vcpupin vcpu='10' cpuset='13'/>
        <vcpupin vcpu='11' cpuset='29'/>
        <vcpupin vcpu='12' cpuset='14'/>
        <vcpupin vcpu='13' cpuset='30'/>
        <vcpupin vcpu='14' cpuset='15'/>
        <vcpupin vcpu='15' cpuset='31'/>
      </cputune>
      <resource>
        <partition>/machine</partition>
      </resource>
      <os>
        <type arch='x86_64' machine='pc-q35-5.0'>hvm</type>
        <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
        <nvram>/etc/libvirt/qemu/nvram/8dcf9d3b-6010-947b-543a-6216bf778f0b_VARS-pure-efi.fd</nvram>
      </os>
      <features>
        <acpi/>
        <apic/>
        <hyperv>
          <relaxed state='on'/>
          <vapic state='on'/>
          <spinlocks state='on' retries='8191'/>
          <vendor_id state='on' value='none'/>
        </hyperv>
      </features>
      <cpu mode='host-passthrough' check='none'>
        <topology sockets='1' dies='1' cores='8' threads='2'/>
        <cache mode='passthrough'/>
        <feature policy='require' name='topoext'/>
      </cpu>
      <clock offset='localtime'>
        <timer name='hypervclock' present='yes'/>
        <timer name='hpet' present='no'/>
      </clock>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>restart</on_crash>
      <devices>
        <emulator>/usr/local/sbin/qemu</emulator>
        <disk type='file' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source file='/mnt/user/VM_disks/Windows/windows_disk.img' index='5'/>
          <backingStore/>
          <target dev='hdc' bus='virtio'/>
          <boot order='1'/>
          <alias name='virtio-disk2'/>
          <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <source file='/mnt/user/isos/Win10_1809Oct_v2_Swedish_x64.iso' index='4'/>
          <backingStore/>
          <target dev='hda' bus='sata'/>
          <readonly/>
          <boot order='2'/>
          <alias name='sata0-0-0'/>
          <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <source file='/mnt/user/isos/virtio-win-0.1.173-2.iso' index='3'/>
          <backingStore/>
          <target dev='hdb' bus='sata'/>
          <readonly/>
          <alias name='sata0-0-1'/>
          <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>
        <disk type='block' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source dev='/dev/disk/by-id/ata-Samsung_SSD_840_EVO_250GB_S1DBNSBF464063N' index='2'/>
          <backingStore/>
          <target dev='hdd' bus='sata'/>
          <alias name='sata0-0-3'/>
          <address type='drive' controller='0' bus='0' target='0' unit='3'/>
        </disk>
        <disk type='block' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source dev='/dev/disk/by-id/ata-INTEL_SSDSC2CT180A3_CVMP2224015B180CGN' index='1'/>
          <backingStore/>
          <target dev='hde' bus='sata'/>
          <alias name='sata0-0-4'/>
          <address type='drive' controller='0' bus='0' target='0' unit='4'/>
        </disk>
        <controller type='usb' index='0' model='qemu-xhci' ports='15'>
          <alias name='usb'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
        </controller>
        <controller type='pci' index='0' model='pcie-root'>
          <alias name='pcie.0'/>
        </controller>
        <controller type='pci' index='1' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='1' port='0x8'/>
          <alias name='pci.1'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
        </controller>
        <controller type='pci' index='2' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='2' port='0x9'/>
          <alias name='pci.2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
        </controller>
        <controller type='pci' index='3' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='3' port='0xa'/>
          <alias name='pci.3'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
        </controller>
        <controller type='pci' index='4' model='pcie-to-pci-bridge'>
          <model name='pcie-pci-bridge'/>
          <alias name='pci.4'/>
          <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
        </controller>
        <controller type='pci' index='5' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='5'/>
          <alias name='pci.5'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x01' function='0x0'/>
        </controller>
        <controller type='pci' index='6' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='6'/>
          <alias name='pci.6'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x02' function='0x0'/>
        </controller>
        <controller type='pci' index='7' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='7'/>
          <alias name='pci.7'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x03' function='0x0'/>
        </controller>
        <controller type='pci' index='8' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='8'/>
          <alias name='pci.8'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x04' function='0x0'/>
        </controller>
        <controller type='pci' index='9' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='9'/>
          <alias name='pci.9'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x05' function='0x0'/>
        </controller>
        <controller type='pci' index='10' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='10' port='0xb'/>
          <alias name='pci.10'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
        </controller>
        <controller type='pci' index='11' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='11' port='0xc'/>
          <alias name='pci.11'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
        </controller>
        <controller type='pci' index='12' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='12' port='0xd'/>
          <alias name='pci.12'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
        </controller>
        <controller type='pci' index='13' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='13' port='0xe'/>
          <alias name='pci.13'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x6'/>
        </controller>
        <controller type='pci' index='14' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='14' port='0xf'/>
          <alias name='pci.14'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x7'/>
        </controller>
        <controller type='virtio-serial' index='0'>
          <alias name='virtio-serial0'/>
          <address type='pci' domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
        </controller>
        <controller type='sata' index='0'>
          <alias name='ide'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
        </controller>
        <interface type='bridge'>
          <mac address='52:54:00:99:fc:d9'/>
          <source bridge='br0'/>
          <target dev='vnet4'/>
          <model type='virtio-net'/>
          <alias name='net0'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x07' function='0x0'/>
        </interface>
        <serial type='pty'>
          <source path='/dev/pts/4'/>
          <target type='isa-serial' port='0'>
            <model name='isa-serial'/>
          </target>
          <alias name='serial0'/>
        </serial>
        <console type='pty' tty='/dev/pts/4'>
          <source path='/dev/pts/4'/>
          <target type='serial' port='0'/>
          <alias name='serial0'/>
        </console>
        <channel type='unix'>
          <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-5-Windows/org.qemu.guest_agent.0'/>
          <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
          <alias name='channel0'/>
          <address type='virtio-serial' controller='0' bus='0' port='1'/>
        </channel>
        <input type='tablet' bus='usb'>
          <alias name='input0'/>
          <address type='usb' bus='0' port='1'/>
        </input>
        <input type='mouse' bus='ps2'>
          <alias name='input1'/>
        </input>
        <input type='keyboard' bus='ps2'>
          <alias name='input2'/>
        </input>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0f' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev0'/>
          <rom file='/mnt/user/isos/RX580.rom'/>
          <address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x0' multifunction='on'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0f' slot='0x00' function='0x1'/>
          </source>
          <alias name='hostdev1'/>
          <address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x1'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev2'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev3'/>
          <rom file='/mnt/user/isos/RX580.rom'/>
          <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x0' multifunction='on'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
          </source>
          <alias name='hostdev4'/>
          <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0c' slot='0x00' function='0x3'/>
          </source>
          <alias name='hostdev5'/>
          <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x2'/>
        </hostdev>
        <memballoon model='none'/>
      </devices>
      <seclabel type='dynamic' model='dac' relabel='yes'>
        <label>+0:+100</label>
        <imagelabel>+0:+100</imagelabel>
      </seclabel>
      <qemu:commandline>
        <qemu:arg value='-cpu'/>
        <qemu:arg value='host,topoext=on,invtsc=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-synic,hv-stimer,hv-reset,hv-frequencies,host-cache-info=on,l3-cache=off,-amd-stibp'/>
      </qemu:commandline>
    </domain>

     

    monsterservern-diagnostics-20201001-1725-hanged-USB-driver-install.zip
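Because passing those USB and WiFi controllers through cleanly depends on how the board groups them, it can help to dump the IOMMU groups and check what shares a group with each controller. A generic sketch (it prints nothing if IOMMU is disabled or the sysfs tree is absent):

```shell
# List every device per IOMMU group from sysfs.
for d in /sys/kernel/iommu_groups/*/devices/*; do
  [ -e "$d" ] || continue            # glob unmatched -> nothing to list
  g=${d#/sys/kernel/iommu_groups/}   # strip the sysfs prefix...
  g=${g%%/*}                         # ...leaving only the group number
  echo "group $g: $(basename "$d")"
done
```

If a controller shares a group with something the host still needs, it cannot be isolated without an ACS workaround or a different slot.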

  12. Hi! Yes I have tried it. It works; however, after a few resets I suddenly got a read error on one of my disks. I suspect it was some kind of bug, because the SMART report for the disk was without errors. So I rebuilt the disk's data and it works fine again. My suspicion is that putting the server in a suspended state and then restarting it may have triggered this bug.

     

    So I stopped using the reset script. 

     

    What I am curious about is the fact that my monitor turns on and I see the boot logo from the VM BIOS. Then it turns off again and I have to force stop the VM. It sounds more like a driver/software issue that might be solvable...

     

    /Erik

  13. Hi! 

    I am passing through an RX 580 (Sapphire Nitro+ 4GB) to my Windows 10 VM. It works fine until the VM has to restart for any reason (even a "normal" restart). When it boots up again the display turns on and I see the "TianoCore" BIOS logo and the circling dots indicating that Windows 10 is starting. However, when this phase of the boot process is finished and Windows 10 is about to start, the screen goes black and turns off. I then have to force stop the VM and reboot the server in order to get it working again. If I try to start the VM again without rebooting the server, the same thing happens.

     

    I find it strange that the GPU turns on at first but then seems to turn off again just as Windows is about to load. Could it have something to do with the AMD drivers that are probably being loaded as Windows starts?

     

    I have the vBIOS (downloaded using GPU-Z) passed through to the VM. I tried removing that and it did not help. The GPU is not bound to VFIO at boot. I tried binding it and it did not help. I am running Unraid 6.9 beta 25. 

     

    Windows 10 is version 2004.

    Radeon drivers are 19.12.2 (not the latest, actually; I can try to update that).

     

    Hardware: X570 Taichi, Ryzen 9 3950X, 80GB DDR4, RX 580 and an R7 370 (used for other VMs).

     

    Here is my VM XML: 

    <?xml version='1.0' encoding='UTF-8'?>
    <domain type='kvm' id='5' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
      <name>Windows</name>
      <uuid>8dcf9d3b-6010-947b-543a-6216bf778f0b</uuid>
      <description>Main computer</description>
      <metadata>
        <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
      </metadata>
      <memory unit='KiB'>33554432</memory>
      <currentMemory unit='KiB'>33554432</currentMemory>
      <memoryBacking>
        <nosharepages/>
      </memoryBacking>
      <vcpu placement='static'>16</vcpu>
      <cputune>
        <vcpupin vcpu='0' cpuset='8'/>
        <vcpupin vcpu='1' cpuset='24'/>
        <vcpupin vcpu='2' cpuset='9'/>
        <vcpupin vcpu='3' cpuset='25'/>
        <vcpupin vcpu='4' cpuset='10'/>
        <vcpupin vcpu='5' cpuset='26'/>
        <vcpupin vcpu='6' cpuset='11'/>
        <vcpupin vcpu='7' cpuset='27'/>
        <vcpupin vcpu='8' cpuset='12'/>
        <vcpupin vcpu='9' cpuset='28'/>
        <vcpupin vcpu='10' cpuset='13'/>
        <vcpupin vcpu='11' cpuset='29'/>
        <vcpupin vcpu='12' cpuset='14'/>
        <vcpupin vcpu='13' cpuset='30'/>
        <vcpupin vcpu='14' cpuset='15'/>
        <vcpupin vcpu='15' cpuset='31'/>
      </cputune>
      <resource>
        <partition>/machine</partition>
      </resource>
      <os>
        <type arch='x86_64' machine='pc-q35-5.0'>hvm</type>
        <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
        <nvram>/etc/libvirt/qemu/nvram/8dcf9d3b-6010-947b-543a-6216bf778f0b_VARS-pure-efi.fd</nvram>
      </os>
      <features>
        <acpi/>
        <apic/>
        <hyperv>
          <relaxed state='on'/>
          <vapic state='on'/>
          <spinlocks state='on' retries='8191'/>
          <vendor_id state='on' value='none'/>
        </hyperv>
      </features>
      <cpu mode='host-passthrough' check='none'>
        <topology sockets='1' dies='1' cores='8' threads='2'/>
        <cache mode='passthrough'/>
        <feature policy='require' name='topoext'/>
      </cpu>
      <clock offset='localtime'>
        <timer name='hypervclock' present='yes'/>
        <timer name='hpet' present='no'/>
      </clock>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>restart</on_crash>
      <devices>
        <emulator>/usr/local/sbin/qemu</emulator>
        <disk type='file' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source file='/mnt/user/VM_disks/Windows/windows_disk.img' index='5'/>
          <backingStore/>
          <target dev='hdc' bus='virtio'/>
          <boot order='1'/>
          <alias name='virtio-disk2'/>
          <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <source file='/mnt/user/isos/Win10_1809Oct_v2_Swedish_x64.iso' index='4'/>
          <backingStore/>
          <target dev='hda' bus='sata'/>
          <readonly/>
          <boot order='2'/>
          <alias name='sata0-0-0'/>
          <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <source file='/mnt/user/isos/virtio-win-0.1.173-2.iso' index='3'/>
          <backingStore/>
          <target dev='hdb' bus='sata'/>
          <readonly/>
          <alias name='sata0-0-1'/>
          <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>
        <disk type='block' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source dev='/dev/disk/by-id/ata-Samsung_SSD_840_EVO_250GB_S1DBNSBF464063N' index='2'/>
          <backingStore/>
          <target dev='hdd' bus='sata'/>
          <alias name='sata0-0-3'/>
          <address type='drive' controller='0' bus='0' target='0' unit='3'/>
        </disk>
        <disk type='block' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source dev='/dev/disk/by-id/ata-INTEL_SSDSC2CT180A3_CVMP2224015B180CGN' index='1'/>
          <backingStore/>
          <target dev='hde' bus='sata'/>
          <alias name='sata0-0-4'/>
          <address type='drive' controller='0' bus='0' target='0' unit='4'/>
        </disk>
        <controller type='usb' index='0' model='qemu-xhci' ports='15'>
          <alias name='usb'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
        </controller>
        <controller type='pci' index='0' model='pcie-root'>
          <alias name='pcie.0'/>
        </controller>
        <controller type='pci' index='1' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='1' port='0x8'/>
          <alias name='pci.1'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/>
        </controller>
        <controller type='pci' index='2' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='2' port='0x9'/>
          <alias name='pci.2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
        </controller>
        <controller type='pci' index='3' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='3' port='0xa'/>
          <alias name='pci.3'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
        </controller>
        <controller type='pci' index='4' model='pcie-to-pci-bridge'>
          <model name='pcie-pci-bridge'/>
          <alias name='pci.4'/>
          <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
        </controller>
        <controller type='pci' index='5' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='5'/>
          <alias name='pci.5'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x01' function='0x0'/>
        </controller>
        <controller type='pci' index='6' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='6'/>
          <alias name='pci.6'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x02' function='0x0'/>
        </controller>
        <controller type='pci' index='7' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='7'/>
          <alias name='pci.7'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x03' function='0x0'/>
        </controller>
        <controller type='pci' index='8' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='8'/>
          <alias name='pci.8'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x04' function='0x0'/>
        </controller>
        <controller type='pci' index='9' model='pci-bridge'>
          <model name='pci-bridge'/>
          <target chassisNr='9'/>
          <alias name='pci.9'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x05' function='0x0'/>
        </controller>
        <controller type='pci' index='10' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='10' port='0xb'/>
          <alias name='pci.10'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/>
        </controller>
        <controller type='pci' index='11' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='11' port='0xc'/>
          <alias name='pci.11'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x4'/>
        </controller>
        <controller type='pci' index='12' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='12' port='0xd'/>
          <alias name='pci.12'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x5'/>
        </controller>
        <controller type='pci' index='13' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='13' port='0xe'/>
          <alias name='pci.13'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x6'/>
        </controller>
        <controller type='pci' index='14' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='14' port='0xf'/>
          <alias name='pci.14'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x7'/>
        </controller>
        <controller type='virtio-serial' index='0'>
          <alias name='virtio-serial0'/>
          <address type='pci' domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
        </controller>
        <controller type='sata' index='0'>
          <alias name='ide'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
        </controller>
        <interface type='bridge'>
          <mac address='52:54:00:99:fc:d9'/>
          <source bridge='br0'/>
          <target dev='vnet4'/>
          <model type='virtio-net'/>
          <alias name='net0'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x07' function='0x0'/>
        </interface>
        <serial type='pty'>
          <source path='/dev/pts/4'/>
          <target type='isa-serial' port='0'>
            <model name='isa-serial'/>
          </target>
          <alias name='serial0'/>
        </serial>
        <console type='pty' tty='/dev/pts/4'>
          <source path='/dev/pts/4'/>
          <target type='serial' port='0'/>
          <alias name='serial0'/>
        </console>
        <channel type='unix'>
          <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-5-Windows/org.qemu.guest_agent.0'/>
          <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
          <alias name='channel0'/>
          <address type='virtio-serial' controller='0' bus='0' port='1'/>
        </channel>
        <input type='tablet' bus='usb'>
          <alias name='input0'/>
          <address type='usb' bus='0' port='1'/>
        </input>
        <input type='mouse' bus='ps2'>
          <alias name='input1'/>
        </input>
        <input type='keyboard' bus='ps2'>
          <alias name='input2'/>
        </input>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0f' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev0'/>
          <rom file='/mnt/user/isos/RX580.rom'/>
          <address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x0' multifunction='on'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0f' slot='0x00' function='0x1'/>
          </source>
          <alias name='hostdev1'/>
          <address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x1'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev2'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev3'/>
          <rom file='/mnt/user/isos/RX580.rom'/>
          <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x0' multifunction='on'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
          </source>
          <alias name='hostdev4'/>
          <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x0c' slot='0x00' function='0x3'/>
          </source>
          <alias name='hostdev5'/>
          <address type='pci' domain='0x0000' bus='0x0c' slot='0x00' function='0x2'/>
        </hostdev>
        <memballoon model='none'/>
      </devices>
      <seclabel type='dynamic' model='dac' relabel='yes'>
        <label>+0:+100</label>
        <imagelabel>+0:+100</imagelabel>
      </seclabel>
      <qemu:commandline>
        <qemu:arg value='-cpu'/>
        <qemu:arg value='host,topoext=on,invtsc=on,hv-time,hv-relaxed,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-synic,hv-stimer,hv-reset,hv-frequencies,host-cache-info=on,l3-cache=off,-amd-stibp'/>
      </qemu:commandline>
    </domain>

    I also attach diagnostics collected after the error occurred. In the VM error log I get the following messages (0000:0f:00.1 is the address of my GPU):

    2020-09-05T11:49:00.017364Z qemu-system-x86_64: vfio_err_notifier_handler(0000:0f:00.1) Unrecoverable error detected. Please collect any data possible and then kill the guest
    2020-09-05T11:49:00.017488Z qemu-system-x86_64: vfio_err_notifier_handler(0000:0f:00.0) Unrecoverable error detected. Please collect any data possible and then kill the guest
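
    For reference, a hedged sketch of host-side triage one might run after such an error (not part of the original post): the PCI address 0f:00 is taken from the qemu messages above, but the exact log contents will vary by system, and some lspci fields require root.

    ```shell
    # Hypothetical triage on the Unraid host, assuming the GPU is at
    # PCI address 0000:0f:00.x (as in the qemu error messages above).

    # Look for matching vfio / PCIe AER messages in the host kernel log:
    dmesg | grep -iE 'vfio|aer|0f:00' | tail -n 20

    # Inspect the device's PCIe link status and error-status registers:
    lspci -vvs 0f:00.0 | grep -iE 'LnkSta|UESta|CESta'
    ```

    If the kernel log shows AER "Uncorrected (Fatal)" entries for the same address, that points at a hardware/link-level fault rather than a qemu configuration issue.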

    Any help would be very much appreciated!

    /Erik

    monsterservern-diagnostics-20200905-1354-GPU-VM-error.zip

  14. 16 hours ago, hrubak said:

    @eribob Thank you for your input :) Is that multiplayer or single?

    That sounds very promising for my new build; I am aware that the single-threaded performance of my old Xeon is bad :D

    Would you also have a possibility to test Battlefield 1 or V to see how that would run?

    Please let me know

    Thanks again :)

    No problem. That was single-player GTA V. Unfortunately I do not have Battlefield...
