unRAID Server Release 6.2.0-beta23 Available



I figured out why I was getting the following errors after upgrading from 6.1.9

Warning: libvirt_domain_xml_xpath(): namespace warning : xmlns: URI unraid is not absolute in /usr/local/emhttp/plugins/dynamix.vm.manager/classes/libvirt.php on line 936 Warning: libvirt_domain_xml_xpath():

It was caused by the lines in the config that set the VM icon:

  <metadata>
    <vmtemplate xmlns="unraid" name="OSX" icon="OSX-10.10.png" os="OSX"/>
  </metadata>

 

I have a second test server that I upgraded from 6.1.9. If I remove the metadata lines above, the error goes away, but if I try to use the unRAID VM editor to add an icon back, the same warning appears no matter which icon I choose. For now my second server just has the metadata lines removed, as I do not want to set that server up from scratch like I did my first one. Could it be because I was using custom icons when I upgraded, and once I got to 6.2 it didn't know how to handle that situation?
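
For anyone else hitting this, the metadata block can be checked and stripped without rebuilding the VM; this is just a sketch using stock virsh commands (the domain name comes from the snippet above):

virsh dumpxml OSX | grep -A2 '<metadata>'   # show the current <metadata> block
virsh edit OSX                              # then delete the <metadata>...</metadata> lines by hand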

 

@archedraft, have you read the Virtual Machines section here? http://lime-technology.com/forum/index.php?topic=47408.0

Link to comment

Not sure, but I think the location might have changed in 6.2: .../dynamix.vm.manager/templates/images

 

I'll double check this afternoon, but I'm pretty sure my S01.sh script that copies the icons on boot was pointing at that same location on my first server. Either way, it sounds like something I am messing up.
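
For reference, the kind of S01.sh copy script I mean looks roughly like this; the paths are only illustrative (the destination is the 6.2 location mentioned above, and the source folder on the flash drive is just an example):

#!/bin/bash
# Copy custom VM icons into the dynamix.vm.manager images folder at boot
SRC=/boot/config/vm-icons
DST=/usr/local/emhttp/plugins/dynamix.vm.manager/templates/images
mkdir -p "$DST"
cp -f "$SRC"/*.png "$DST"/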

Link to comment

I have had an issue with the unRAID 6.2 beta since its release that was not present in 6.1.9. It's not a major issue and it's extremely intermittent. I have multiple VMs, from Windows 8 and 7 to Fedora 23, with no problems at all. However, one VM, my Windows 10 VM, passes through my 980 Ti GPU and Sound Blaster Z sound card, and it is the one I use for gaming.

 

The problem I'm having is that when starting the Windows 10 VM after rebooting the server, or after rebooting just the VM for updates etc., it intermittently locks up at the blue Windows 10 boot logo and does not boot. To rectify this I simply force stop the VM, at which point the blue Windows 10 boot logo stays on the screen even though the VM is now off. I then start it up again, repeating the process until it finally boots. It's more of an annoyance than anything else. I have attached diagnostics for your review, and I know this is probably a difficult one to solve as it's intermittent.

 

If you line up the libvirtd.log with the syslog, and adjust times by one hour (libvirtd + 1H = syslog times), you can see the VM start activity. The non-Win10 starts are uneventful; the Win10 starts are accompanied by considerable USB instability, lots of USB port resets and often disabled inputs ('Dell USB Keyboard' and 'ROCCAT Kone XTD'). I can't say whether the USB troubles are the cause or just another casualty of something else that's wrong, but this *may* be an avenue for troubleshooting. One known troublemaker is USB 3.0, both ports and drivers. If you can, try testing without using any USB 3 ports.
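
If you want to see this for yourself in the diagnostics, something along these lines pulls the USB resets and the VM start entries out of the logs (file names depend on how the diagnostics zip is laid out):

grep -iE 'usb.*(reset|disable)' syslog.txt    # USB resets / disabled inputs
grep -iE 'qemu|start' libvirtd.log            # VM start activity (remember the ~1 hour offset)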

 

The fact that it works some of the time and not others, while all other functions work correctly, implies a race condition somewhere. It could be a hardware component that's marginal and responds correctly or fast enough only part of the time. This would most probably (but not necessarily) be a component used only by the Win10 VM. Or it could be a load order issue, where it's almost random which component loads first, and in the wrong order something gets blocked. If it's specific to the Win10 VM, then you'll have to play with its config, make sure its drivers are up to date, etc. Less likely, but if it involves something more basic to the unRAID system, you may have to play with the unRAID config and underlying hardware - check for a newer BIOS, etc. You have a lot of plugins and packages, so you might try Safe Mode, just to eliminate that possibility.

 

Excellent, thanks for the feedback. I will do some troubleshooting based on your recommendations and see if I can resolve it... :)

Link to comment

My system became unstable yesterday after a long run (almost 30 days) of being stable with 6.2.0-beta21.

Web services and VMs became unresponsive. Not sure if the Shares did as well, but I assume they did - though in retrospect I regret not checking. I could SSH in (and did so) and ran powerdown -r. Diagnostics were saved, but the system COULD NOT shut itself down as devices remained busy.
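
Next time it hangs like that, it should be possible to see what is keeping the devices busy before forcing the reboot; a rough example, assuming fuser/lsof are available (the mount point is just an example):

fuser -vm /mnt/user                           # processes holding the mount open
lsof +D /mnt/user 2>/dev/null | head -n 20    # open files under it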

 

Here is my diagnostics file:

unraid-diagnostics-20160613-2116.zip

Link to comment

Wow so quiet. Is no one using this beta or has it finally become bug-free and stable enough that there is just so little to say? Should the rest of us start doing a happy dance for an -RC?

Hellooooo! *echo* hello...hello...hello...hello

8);D

 

*happy dance*

Link to comment

Wow so quiet. Is no one using this beta or has it finally become bug-free and stable enough that there is just so little to say? Should the rest of us start doing a happy dance for an -RC?

Very little to report with this beta :D

Only some people with small problems, and the docker update bug.

Link to comment

Is anyone running any of the betas as a VM on ESXi? I know it's not officially supported, but that's how I have my all-in-one box set up. 6.1.9 has been great, but I'm looking to upgrade to 6.2 (or a 6.2 beta before the RC/GA is ready).

 

I am. The one thing you need to do is add a line to the syslinux.cfg file to allow hardware passthrough of an HBA to work. I'll link the fix here when I find it.

 

The only issue I've been having with 6.2b23 is that on startup, even though my array starts and all my shares are accessible, I can't access the WebGui for a good 60-90 seconds.

 

EDIT: Passthrough fix is here.
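
For anyone wondering where that line goes, the label block in syslinux.cfg on the flash drive looks roughly like this; the actual parameter is in the linked fix, so only a placeholder is shown here:

label unRAID OS
  menu default
  kernel /bzimage
  append initrd=/bzroot <parameter-from-the-linked-fix>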

Link to comment

My system became unstable yesterday after a long run (almost 30 days) of being stable with 6.2.0-beta21.

Web services and VMs became unresponsive. Not sure if the Shares did as well, but I assume they did - though in retrospect I regret not checking. I could SSH in (and did so) and ran powerdown -r. Diagnostics were saved, but the system COULD NOT shut itself down as devices remained busy.

 

I'm not VM experienced, so others might be better, but here are a few comments -

 

* Several MCEs (Machine Check Events) were noted, with no apparent cause. You may want to try a Memtest, etc. As hardware events, it's hard to relate them to any specific software symptom, but they may be a source of trouble.

 

* At the time of diagnostics collection, there are 2 defunct VM processes, using 14% and 25% CPU, much more than anything else, so possibly hung. Both Dev and Theater had been shut down shortly before this.

root      7119     1 14 Jun12 ?        04:52:54 [qemu-system-x86] <defunct>
root      7320     1 25 Jun12 ?        08:39:18 [qemu-system-x86] <defunct>

 

* On June 13, a series of CPU stalls were logged (you should check which VM had these assigned) -

  - 11:37:41 CPU 6

  - 15:32:14 primarily CPU 6, but also mentions CPU 8 and 9

  - 17:06:49 CPU 6

  - 17:41:30 primarily CPU 6, but also mentions CPU 9

 

* At 21:15, you attempted to power down, but devices remained busy (they look VM related), blocking a clean shutdown. libvirtd.log has a lot of errors when trying to stop the Dev VM, and as seen above, both VMs remained as defunct processes, still using CPU resources instead of quitting.

 

The fact there were MCEs makes it hard to say whether the issues were local hardware related, a normal support issue, or actually due to a defect in the beta.
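
A couple of quick console checks for the points above, as a rough sketch (VM names are whatever virsh reports):

# List qemu processes with their CPU usage and state; defunct ones show a 'Z' in STAT
ps -eo pid,ppid,pcpu,stat,etime,comm | grep -E 'qemu|Z'

# See which host CPUs each VM has pinned, to match against the stalled CPUs
for vm in $(virsh list --all --name); do
  echo "== $vm"
  virsh dumpxml "$vm" | grep cpuset
done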

Link to comment

 

* Several MCEs (Machine Check Events) were noted, no apparent cause..... The fact there were MCEs makes it hard to say whether the issues were local hardware related, a normal support issue, or actually due to a defect in the beta.

 

 

Since I know myself and others have had MCE issues in the past (with memtest usually not finding an issue), I was curious whether LT might consider adding mcelog from http://mcelog.org/index.html to the unRAID betas? I may be mistaken, but from what I've read it seems to be the only way to ascertain what exactly caused an MCE event (even if it was ultimately benign). If not, maybe instructions on how to install it yourself for those who do have MCEs?

 

 

On a side note, I'm also keeping a close eye on this thread to see if I should move my personal and production servers over from beta21. I'll probably give it a few more days and see if anyone has any major blowouts before stepping into the b23 ring, lol. I'm also curious to see how those who changed their num_stripes back to default are doing with large SMB transfers.

Link to comment

I have an issue when I pass through a VGA card AND a physical hard drive as a secondary drive (the OS runs off an image).

 

The system will boot with PCI passthrough for the GPU if I don't have the physical drive assigned, and it will boot if I have the physical drive assigned and use the VNC display instead, but it won't boot if I'm passing through both the VGA card and the physical drive.
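
When it refuses to boot with both attached, starting the VM from the console and tailing its per-domain log usually shows what QEMU objects to; a rough sketch (the log path assumes the stock libvirt layout, and the domain name matches the XML below):

virsh start "Gaming Rig"
tail -n 50 "/var/log/libvirt/qemu/Gaming Rig.log"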

 

XML for the offending machine:

<domain type='kvm'>
  <name>Gaming Rig</name>
  <uuid>de03a158-d39a-8384-55c7-7aa33d1150ad</uuid>
  <description>Gaming Rig</description>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>16777216</memory>
  <currentMemory unit='KiB'>16777216</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>6</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='3'/>
    <vcpupin vcpu='4' cpuset='4'/>
    <vcpupin vcpu='5' cpuset='5'/>
  </cputune>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/de03a158-d39a-8384-55c7-7aa33d1150ad_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='6' threads='1'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/mnt/user/domains/Gaming Rig/vdisk1.img'/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source dev='/dev/sdc'/>
      <target dev='hdd' bus='sata'/>
      <address type='drive' controller='0' bus='0' target='0' unit='3'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/ISOS/Win10_1511_1_EnglishInternational_x64.iso'/>
      <target dev='hda' bus='ide'/>
      <readonly/>
      <boot order='2'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/ISOS/virtio-win-0.1.118-1.iso'/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='ich9-ehci1'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci1'>
      <master startport='0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci2'>
      <master startport='2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
    </controller>
    <controller type='usb' index='0' model='ich9-uhci3'>
      <master startport='4'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:c0:37:e3'/>
      <source bridge='br0'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <channel type='unix'>
      <source mode='connect'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x046d'/>
        <product id='0x0a01'/>
      </source>
    </hostdev>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>

 

Diagnostics are attached.

uss-enterprise-diagnostics-20160618-0008.zip

Link to comment

I am running beta 22 and yesterday replaced my parity drive with a shiny new 8TB one. Just wanted to report that all is running fine; the parity sync is almost done.

 

(I did this on beta 22 because I had a preclear running for a week that I did not want to interrupt.)
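
For anyone else watching a sync from the console, something like this shows the raw progress counters (mdcmd is the stock unRAID md driver interface; the field names may vary by release):

mdcmd status | grep -i resync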

Link to comment

Small issue: when bonding NICs, the bond IP should be the one reported on the main page.

 

That's an awfully strange IP to see on the main page!

The 169.x.x.x range is generally a self-assigned (APIPA) address handed out when DHCP fails :P

 

That's why he's reporting it.  You'll see the correct one down in the ifconfig report.
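
For anyone checking what the bond actually got, these show the address and the bond state from the console (the interface name is assumed to be bond0):

ip addr show bond0
cat /proc/net/bonding/bond0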

Link to comment

* Several MCEs (Machine Check Events) were noted, no apparent cause..... The fact there were MCEs makes it hard to say whether the issues were local hardware related, a normal support issue, or actually due to a defect in the beta.

 

Since I know myself and others have had MCE issues in the past (with memtest usually not finding an issue), I was curious whether LT might consider adding mcelog from http://mcelog.org/index.html to the unRAID betas? I may be mistaken, but from what I've read it seems to be the only way to ascertain what exactly caused an MCE event (even if it was ultimately benign).

 

That's a great idea, and I agree. If it's not too large, I hope LimeTech will consider adding mcelog and running it in the recommended daemon mode. I'm not sure it's the best way, but you might also use the --logfile option for persistence and force the logging to /boot (I don't know how chatty it is, though).
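
Until something like that ships, a manual setup might look roughly like this; purely a sketch, assuming you have obtained an mcelog binary (e.g. from a Slackware package) and copied it to the flash drive, with illustrative paths:

cp /boot/extra/mcelog /usr/sbin/mcelog      # binary location on the flash is just an example
chmod +x /usr/sbin/mcelog
mkdir -p /boot/logs
/usr/sbin/mcelog --daemon --logfile /boot/logs/mcelog.log   # keep the log on the flash so it survives reboots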

 

Without this, we really don't have any tools for solving users' MCE issues. Plus, in some cases mcelog can actually sideline faulty memory pages and processes, and possibly apply other live fixes, allowing continued operation and better troubleshooting.

Link to comment

My system became unstable yesterday after a long run (almost 30 days) of being stable with 6.2.0-beta21.

Web services and VMs became unresponsive. Not sure if the Shares did as well, but I assume they did - though in retrospect I regret not checking. I could SSH in (and did so) and ran powerdown -r. Diagnostics were saved, but the system COULD NOT shut itself down as devices remained busy.

 

I'm not VM experienced, so others might be better, but here are a few comments -

 

* Several MCEs (Machine Check Events) were noted, with no apparent cause. You may want to try a Memtest, etc. As hardware events, it's hard to relate them to any specific software symptom, but they may be a source of trouble.

 

Yeah, I've seen those. Not sure what is causing them. I ran memtest for 36 hours and it came up with nothing, so I don't think it's RAM. Not quite sure how to proceed on that front.

 

* At the time of diagnostics collection, there are 2 defunct VM processes, using 14% and 25% CPU, much more than anything else, so possibly hung. Both Dev and Theater had been shut down shortly before this.

root      7119     1 14 Jun12 ?        04:52:54 [qemu-system-x86] <defunct>
root      7320     1 25 Jun12 ?        08:39:18 [qemu-system-x86] <defunct>

 

* On June 13, a series of CPU stalls were logged (you should check which VM had these assigned) -

  - 11:37:41 CPU 6

  - 15:32:14 primarily CPU 6, but also mentions CPU 8 and 9

  - 17:06:49 CPU 6

  - 17:41:30 primarily CPU 6, but also mentions CPU 9

 

* At 21:15, you attempted to power down, but devices remained busy (they look VM related), blocking a clean shutdown. libvirtd.log has a lot of errors when trying to stop the Dev VM, and as seen above, both VMs remained as defunct processes, still using CPU resources instead of quitting.

 

The fact there were MCEs makes it hard to say whether the issues were local hardware related, a normal support issue, or actually due to a defect in the beta.

 

What causes CPU stalls? 

It's odd, because CPU 6 IS pinned to my Dev VM, but that has been an extremely light-duty VM. CPUs 8 and 9 aren't pinned to anything, however!

 

After the reboot it's been 3 days 13 hours of uptime with normal usage patterns, so we'll have to see if this happens again....

 

Link to comment

My system became unstable yesterday after a long run (almost 30 days) of being stable with 6.2.0-beta21.

Web services and VMs became unresponsive. Not sure if the Shares did as well, but I assume they did - though in retrospect I regret not checking. I could SSH in (and did so) and ran powerdown -r. Diagnostics were saved, but the system COULD NOT shut itself down as devices remained busy.

 

I'm not VM experienced, so others might be better, but here are a few comments -

 

* Several MCEs (Machine Check Events) were noted, with no apparent cause. You may want to try a Memtest, etc. As hardware events, it's hard to relate them to any specific software symptom, but they may be a source of trouble.

 

Yeah, I've seen those. Not sure what is causing them. I ran memtest for 36 hours and it came up with nothing, so I don't think it's RAM. Not quite sure how to proceed on that front.

I believe this would be a good time to run the more advanced memtest, PassMark's MemTest86. You'll have to download it and create the boot media (CD or USB drive) yourself, but it's free. It may or may not find anything either, but it's updated for the latest systems and technologies.

 

If it's not RAM, it will be hard to figure out. The next thing you can check for is overheating of the CPU or motherboard chipsets. However, the initial MCE was at boot, which would seem to make overheating a very unlikely cause.
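
If you want to rule temperatures out, a couple of quick looks from the console (this assumes the lm-sensors tools are present, and not every board exposes ACPI thermal zones):

sensors                                        # CPU / chipset temperatures
cat /sys/class/thermal/thermal_zone*/temp      # raw ACPI readings, in millidegrees C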

 

No ideas on the stalls. Hopefully Tom or someone else will have some.

Link to comment
This topic is now closed to further replies.