clowrym Posted June 20, 2014
Quote: the arch vm i have is NOT based on Ironic Badger's build, so i don't think it's related to that. my theory right now is that it may be related to either having cores pinned for the vm and/or having STP turned off; those are the only "odd" options i have set that i can think of. i will post my arch cfg file when i get home. edit - sorry, forgot to mention, i am using autofs for nfs only, no smb for me, so i don't believe it's related to smb.
Quote: I have not pinned any CPUs to anything, and I switched from SMB to NFS as a test, and mine still dies after a few minutes. Hopefully this can be resolved soon. I'll probably end up just moving to docker for the things I'm currently using Arch for, but still, I'd like to get this working.
Quote: serious grasp at straws here, don't suppose you have STP turned off?
Tried turning off STP on my test server.... killed my whole network!!
jonp Posted June 20, 2014
Quote: tried turning off STP on my test server.... killed my whole network!!
Really? That's weird. All our setups have that turned off. Got to be honest, I'm not a networking expert. What kind of setup do you have?
jumperalex Posted June 20, 2014
Quote: tried turning off STP on my test server.... killed my whole network!!
Are you running two NICs? Bonding? Virtually multiple NICs? Pretty sure turning off STP causes switch havoc in those cases. It was discussed a tad when bonding first arrived and there were issues with slow network start with STP turned on. The delay was lowered, but the discussion explained why STP was needed. Those were posts by Tom himself.
JustinChase Posted June 20, 2014
I actually thought it was the opposite. I posted several weeks ago about HORRIBLE network performance, and I remember the STP discussion, but only vaguely, and not enough of the specifics to talk intelligently about it right now. I'm fighting with docker stuff at the moment, so I can't look it up. But I have STP off, and am using network bonding, and am not currently suffering any latency/issues, so I suspect this is the 'correct' setting for use with bonding. But I could certainly be wrong.
jumperalex Posted June 20, 2014
STP prevents multiple NICs from sending packets in a circle. I don't know if bonding prevents that by looking like a single pipe vs. having two independent NIC paths creating a loop.
http://en.wikipedia.org/wiki/Spanning_Tree_Protocol
clowrym Posted June 20, 2014
I have bonding turned off, actually. I couldn't get the bridge to work with bonding on, so I didn't see any reason why it should be turned on. STP is also turned off on my managed switch. I dropped back down to 6b5a, and I'm going to turn off STP to see if it still happens!
# Generated settings:^M
USE_DHCP="yes"^M
IPADDR="192.168.1.86"^M
NETMASK="255.255.255.0"^M
GATEWAY="192.168.1.254"^M
DHCP_KEEPRESOLV="no"^M
DNS_SERVER1="192.168.1.254"^M
DNS_SERVER2=""^M
DNS_SERVER3=""^M
BONDING="no"^M
BONDING_MODE="0"^M
BRIDGING="yes"^M
BRNAME="br0"^M
BRSTP="yes"^M
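
The ^M markers in the settings above suggest the file has carriage-return line endings, which can trip up simple shell parsing. As a rough sketch only (the helper name cfg_get is made up for illustration, and the demo writes a throwaway sample file rather than touching the real network config), one way to read a single setting such as BRSTP regardless of the line endings:

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from a KEY="value" style file,
# stripping carriage returns first so CRLF files parse the same as LF files.
# Usage: cfg_get FILE KEY
cfg_get() {
    tr -d '\r' < "$1" | sed -n "s/^$2=\"\\(.*\\)\"\$/\\1/p"
}

# Demo on a temporary file that mimics the CRLF settings quoted in the post.
sample=$(mktemp)
printf 'BONDING="no"\r\nBRIDGING="yes"\r\nBRNAME="br0"\r\nBRSTP="yes"\r\n' > "$sample"

cfg_get "$sample" BRSTP    # prints: yes
cfg_get "$sample" BRNAME   # prints: br0

rm -f "$sample"
```

This only reports what the file says; whether the bridge is actually running STP is a separate question that the file alone does not answer.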
pyrater Posted June 20, 2014
FYI, two issues with beta6 (no logs); rolling back to beta4.
1. Both my VMs would boot fine, then after about 5 minutes they would drop inet connections.
2. When doing a shutdown on my VMs, they hang at a black screen in TightVNC, and xl list outputs:
root@Icarus:~# xl list
Name        ID   Mem VCPUs  State   Time(s)
Domain-0     0  1921     1  r-----    739.8
(null)       1     0     1  --ps-d     26.7
(null)       2     0     3  --p--d   1464.7
(null)       3     0     7  --p--d    488.8
root@Icarus:~# xl destroy 1
libxl: error: libxl_dm.c:1467:kill_device_model: unable to find device model pid in /local/domain/1/image/device-model-pid
libxl: error: libxl.c:1421:libxl__destroy_domid: libxl__destroy_device_model failed for 1
root@Icarus:~# xl destroy 3
libxl: error: libxl_device.c:934:device_backend_callback: unable to remove device with path /local/domain/0/backend/vif/3/0
libxl: error: libxl.c:1457:devices_destroy_cb: libxl__devices_destroy failed for 3
During server unmount the console outputs a ton of "vif vif-1-0-vif1.0: draining TX queue" and hangs.
Not looking for a "fix", just a heads up to others if they have a problem like mine.
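
For anyone wanting to check whether they are in the same state, the leftover domains show up with the name (null) in the xl list output. A small sketch, assuming the column layout shown in the post (the function name null_domains is made up here), that pulls out the IDs of those entries from text on standard input:

```shell
#!/bin/sh
# Hypothetical helper: read `xl list` style output on stdin and print the ID
# column of every domain whose Name column is "(null)". The header is skipped.
null_domains() {
    awk 'NR > 1 && $1 == "(null)" { print $2 }'
}

# Demo using the listing from the post above.
null_domains <<'EOF'
Name ID Mem VCPUs State Time(s)
Domain-0 0 1921 1 r----- 739.8
(null) 1 0 1 --ps-d 26.7
(null) 2 0 3 --p--d 1464.7
(null) 3 0 7 --p--d 488.8
EOF
# prints 1, 2 and 3, one per line
```

On a live host the input would presumably come from `xl list | null_domains`. This only identifies the stuck domains; as the errors in the post show, xl destroy did not clear them in this case.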
jonp Posted June 20, 2014
It seems that all issues relating to beta 6 usage are specific to Xen. I have yet to see any issues reported with respect to docker or KVM. We are looking into the Xen issues, but these aren't issues we are able to solve ourselves; the Xen team needs to.
pyrater Posted June 20, 2014
Roger, no issues. Beta 4 works fine for me ATM, so I'll use that and try the next release =)
Thornwood Posted June 21, 2014
Will the main array move to btrfs in the future as the main format?
Thornwood
gwl Posted June 21, 2014
This post is more for LT's information and beta testing problem report purposes than myself needing any response.
My upgrade experience from beta5a to beta6 is fine now, but to get it running, I did run into a problem when booting the first time with beta6 into Xen. Below are the steps I took that produced the problem. I haven't been able to reproduce it. I don't know if anything was at all coincidental or not, but better to report it than not, as this is beta testing...
Before upgrading, I stopped all Arch VMs and took a backup copy of the flash share. I copied across 4 beta6 files (xen, bzimage, bzroot and readme.txt) from my Windows box (via Explorer) to the unRaid USB (which is internally connected to my SuperMicro X10-SL7 m/b). Note: this was how I had upgraded in the past from beta3 onwards. I stopped the array, shut down unRaid and restarted via IPMI minutes later.
In IPMI's KVM console window I saw the boot process running along and did not see anything noticeably wrong through to the login prompt. Initially, I was not able to get access into the WebGUI for unRaid, even after a minute or two. At the console login, I logged in as root and it went immediately to a shell prompt instead of asking for root's password, which was odd because I had set a password for root. I accepted that the password could have been reset, given that I was upgrading, but still took note of it.
I checked /mnt and did not have any disks or shares present. And even my /boot was empty! Weird! I know, I should have captured more information at the time, but I was thinking: let's get back into 5a first and confirm all drives/shares/everything is as it should be.
I shut down the server (and powered off at the wall) and this time I pulled the internal USB from the m/b, inserted it into the Windows PC's USB slot and copied back beta5a's 4 files I had backed up. However, I recopied the beta6 files back again to the USB, overwriting the beta5a backup copies I had just restored.
With the USB back in the same port on the SM board, I booted up via IPMI again and was soon at unRaid's console login prompt again. This time, when I logged in as root, I was asked for the password. And then, once logged in, /mnt had the disks mounted and the shares were all present. The WebGUI also loaded successfully, showing beta6 status.
Anyway, all is back on track now and there is more playing and testing to be had. It was an experience I thought best to share in the hope it is helpful if others encounter something similar. Or perhaps something went wrong when initially copying the files to \\tower\flash?
cheers, gwl
gwl Posted June 21, 2014
Quote: Safe to use?
Was this in reference to my prior post? If so, I am happily plodding away using beta6, and right now just learning more about btrfs.
cheers, gwl
pras1011 Posted June 21, 2014
Nope. I was just wondering in general if this new beta is safe to use.
itimpi Posted June 21, 2014
Quote: Nope. I was just wondering in general if this new beta is safe to use.
Safe to use for what? It appears that SOME people are experiencing Xen related issues. So far no other significant issues appear to have arisen.
Packalacky Posted June 21, 2014
Haven't seen anyone mention this on the thread, so here goes. Updating from Beta4.
1. Copied the 4 files as mentioned in the readme.txt file (bzimage, bzroot, xen, readme.txt)
2. Start Unraid + Xen: everything works fine, no issues whatsoever, everything as expected, including an ArchVM
3. Reboot
4. Start Unraid alone (no Xen)
5. Get error messages upon starting, and then the system reboots or shuts down.
Here are a few pictures of where it hangs up, shuts down or reboots. These are the most common errors, but sometimes it just reboots at random or hangs at a certain line; it's not always the same.
On these 2 errors, it simply hangs: [screenshots not included]
On this one it reboots: [screenshot not included]
After getting these errors, I got a different USB stick and used it to test a clean install of Unraid Beta6, and got exactly the same results. Unraid+Xen works perfectly fine, but Unraid by itself doesn't. Everything worked fine under Unraid beta 4, including Unraid by itself. There were no hardware changes between the 2 betas.
jonp Posted June 21, 2014
Thank you for reporting this, and with this level of detail. This will be reviewed next week by the Lime Tech team.
balloob Posted June 22, 2014
Since I upgraded to this beta from beta5a I have seen the following kernel bug happening in UnraidOS:
------------[ cut here ]------------
kernel BUG at drivers/net/xen-netback/netback.c:629!
invalid opcode: 0000 [#1] SMP
Modules linked in: ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_nat_ipv4 nf_nat iptable_filter ip_tables md_mod vhost_net vhost tun i2c_i801 e1000e ahci ptp libahci pps_core
CPU: 0 PID: 1841 Comm: vif1.0-guest-rx Not tainted 3.15.0-unRAID #4
Hardware name: ASUS All Series/H87I-PLUS, BIOS 1005 01/06/2014
task: ffff8801fec55540 ti: ffff88009a814000 task.ti: ffff88009a814000
RIP: e030:[<ffffffff81404fba>] [<ffffffff81404fba>] xenvif_rx_action+0x484/0x7ff
RSP: e02b:ffff88009a817da0 EFLAGS: 00010202
RAX: 0000000000000013 RBX: 0000000000000012 RCX: ffffea0004e1e300
RDX: ffff88009b77b3c8 RSI: ffff8800725676e0 RDI: 00000000000b99ac
RBP: ffff88009a817e70 R08: 0000000000000000 R09: 0000000000000001
R10: 0000160000000000 R11: ffff8801b878c000 R12: ffff8800725676e0
R13: 0000000000000011 R14: 0000000000000000 R15: ffff88009b770800
FS: 0000000000000000(0000) GS:ffff88020f800000(0000) knlGS:ffff88020f800000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002af2b606e000 CR3: 00000001fc1ee000 CR4: 0000000000042660
Stack:
ffff88009a817df4 0000000000000000 ffff8802ffffffff ffff88009a817dd8
ffff88009a817e08 00000011ff18a6c0 ffff88009b77b148 00000011000889c2
ffff88009b770800 0000000000000000 0000000000000001 ffff88009a817df8
Call Trace:
[<ffffffff814068ab>] xenvif_kthread_guest_rx+0x108/0x1db
[<ffffffff8106e362>] ? __wake_up_sync+0xd/0xd
[<ffffffff814067a3>] ? xenvif_stop_queue+0x53/0x53
[<ffffffff81057fbb>] kthread+0xd6/0xde
[<ffffffff81057ee5>] ? kthread_create_on_node+0x162/0x162
[<ffffffff8157e94c>] ret_from_fork+0x7c/0xb0
[<ffffffff81057ee5>] ? kthread_create_on_node+0x162/0x162
Code: 8b 09 e8 25 f5 ff ff e9 0f ff ff ff 8b 45 b8 2b 85 6c ff ff ff 41 89 44 24 28 41 8b 87 34 a9 00 00 2b 85 68 ff ff ff 39 d8 76 02 <0f> 0b 48 8b 45 a0 48 8b 9d 50 ff ff ff 49 89 44 24 08 49 89 1c
RIP [<ffffffff81404fba>] xenvif_rx_action+0x484/0x7ff
RSP <ffff88009a817da0>
---[ end trace 49676c959adf5538 ]---
I run on a Core i5 in Xen mode, and I have 1 virtual machine with 2 cores that runs Plex. It happens whenever I'm streaming/transcoding movies from Plex.
suleimant Posted June 22, 2014
Not sure if anyone else is having similar issues. I have my cache drive connected to a PCI SATA card which has a Marvell 88SE91xx controller.
In beta5a, using normal boot, the PC would detect my cache drive. Using Xen boot it would not see the drive. I used the pci-phantom option in syslinux.cfg to resolve it (see http://lime-technology.com/forum/index.php?topic=33511.0).
In beta6, the Xen boot is working fine, but now when I try the normal boot (which is also KVM), it doesn't see the drive again.
Extract from syslog:
Jun 22 12:51:28 suleimant kernel: ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jun 22 12:51:28 suleimant kernel: dmar: DRHD: handling fault status reg 3
Jun 22 12:51:28 suleimant kernel: dmar: DMAR:[DMA Read] Request device [01:00.1] fault addr fffe0000
Jun 22 12:51:28 suleimant kernel: DMAR:[fault reason 02] Present bit in context entry is clear
Not sure if this is easy to resolve in KVM or not? Google seems less helpful than when I had the issue with Xen. I am waiting for delivery of a new PCI SATA card with a different controller, but figured this could affect a few other users when in production.
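
For others checking whether they are hitting the same thing, the DMAR fault lines name the PCI device that made the faulting request. A minimal sketch, assuming syslog lines in the same shape as the extract above (the function name dmar_fault_devices is made up for illustration), that lists the distinct device addresses:

```shell
#!/bin/sh
# Hypothetical helper: read syslog text on stdin and print the unique PCI
# addresses that appear in "DMAR:[DMA ...] Request device [..]" fault lines.
dmar_fault_devices() {
    sed -n 's/.*DMAR:\[DMA [A-Za-z]*] Request device \[\([0-9a-f:.]*\)].*/\1/p' | sort -u
}

# Demo using the lines quoted in the post above.
dmar_fault_devices <<'EOF'
Jun 22 12:51:28 suleimant kernel: dmar: DRHD: handling fault status reg 3
Jun 22 12:51:28 suleimant kernel: dmar: DMAR:[DMA Read] Request device [01:00.1] fault addr fffe0000
Jun 22 12:51:28 suleimant kernel: DMAR:[fault reason 02] Present bit in context entry is clear
EOF
# prints: 01:00.1
```

The printed address can then be compared against `lspci -s 01:00.1` to see which card function it belongs to. This only helps confirm the symptom; it does not change the boot behaviour.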
dmacias Posted June 22, 2014
Has anyone having rocket ride problems with Xen VMs tried turning off CPU offloading within the VM?
ethtool -K eth0 tso off gso off
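
Before and after running that command, it can be handy to confirm what the guest currently has enabled; `ethtool -k eth0` (lowercase k) prints the feature list, in which TSO and GSO appear as tcp-segmentation-offload and generic-segmentation-offload. A rough sketch (the function name offloads_on is made up, and the demo input is an illustrative sample, not output from anyone's system in this thread) that reports which of those two are still on:

```shell
#!/bin/sh
# Hypothetical helper: read `ethtool -k IFACE` style output on stdin and print
# the names of the TSO/GSO features whose state starts with "on".
offloads_on() {
    awk -F': *' '
        ($1 == "tcp-segmentation-offload" || $1 == "generic-segmentation-offload") &&
        $2 ~ /^on/ { print $1 }
    '
}

# Demo on made-up sample text in the usual "feature: state" layout.
printf '%s\n' \
    'Features for eth0:' \
    'tcp-segmentation-offload: on' \
    'generic-segmentation-offload: off' | offloads_on
# prints: tcp-segmentation-offload
```

Inside the VM this would presumably be used as `ethtool -k eth0 | offloads_on`, with eth0 replaced by whatever the guest's interface is called. An empty result means both offloads are already off.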
dlandon Posted June 22, 2014
I had the same issue. I'm running a Debian VM with Owncloud. Could this shed some light on the fix for this problem?
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/drivers/net/xen-netback?id=59ae9fc67007da8b5aea7b0a31c3607745cfbfee
dmacias Posted June 22, 2014
A couple of times the webgui has frozen up. The first time I was doing a lot of testing with btrfs cache pools, VMs and docker. I used the powerdown script to shut down. I removed dom0 mem from syslinux and cpus from the VM. I don't have CPUs pinned to dom0 because that has always caused a hard crash, even when CPUs are reserved in VMs.
The second time it froze was after I shut down to upgrade a drive. I removed an old 2TB and replaced it with a precleared 4TB. I clicked on start the array to rebuild and expand, and at or after loading disk 2 (the one to be rebuilt) the log and webgui stopped. I used powerdown again, and upon reboot the drive was rebuilt successfully. The log wasn't helpful because it stopped at loading md2.
I wanted to post some positive feedback. I run just one Xen VM, Ubuntu 14.04 LTS, mainly for a Mythtv backend. I had sab, sick, etc. in the VM but have stopped them all and moved to docker containers. So I have one Mythtv VM with LAMP and the Mythtv backend only, and all other apps in dockers. I have a cache SSD and a domain/docker SSD, both btrfs. The VM is on its own controller. In the past, during parity checks, syncs or high disk activity, my Mythtv streams would buffer or start getting some glitches. On 6b6, during the rebuild and parity check, Mythtv performed perfectly. So either the new kernel or the new Xen has fixed some of my issues. Later I'll try to port to KVM.
jbartlett Posted June 23, 2014
Unable to create a share via the web GUI that contains the plus (+) sign. On submitting, a message is displayed that the share was deleted.
Editing a share that contains a plus sign and was created via some other means also displays that the share was deleted, but it wasn't. Changes stick, however.
I suspect that the plus sign is being translated into a space, so the script looking for "lost+found" sees "lost found" instead and reports that the share was deleted.
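
If that suspicion is right, the mechanism would be ordinary form encoding: in URL-encoded form data a literal + stands for a space, so a real plus sign has to be sent as %2B. The sketch below only illustrates that rule with two made-up helper functions; it is not the webGui's code, and it deliberately handles nothing except the plus sign.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the form-encoding rule for "+" only.

# What a form decoder does to an unescaped plus: it becomes a space.
form_decode_plus() {
    printf '%s\n' "$1" | tr '+' ' '
}

# What the sender would need to do to keep a literal plus: escape it as %2B.
form_encode_plus() {
    printf '%s\n' "$1" | sed 's/+/%2B/g'
}

form_decode_plus 'lost+found'   # prints: lost found
form_encode_plus 'lost+found'   # prints: lost%2Bfound
```

A share name that goes out unescaped as lost+found would therefore come back as "lost found", which matches the symptom described in the post.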
bubbaQ Posted June 23, 2014
PHP extensions are still omitted from this build: /usr/lib64/php/extensions/ is empty.
JarDo Posted June 23, 2014
I'm not sure if these libraries were cooked into v5 or if one or more of my unmenu packages installed them, but I noticed when trying to get Subsonic running that the glib2 and libffi libraries are missing from v6 Beta 6. I had to install the following packages to get these libraries back:
glib2-2.36.4-x86_64-1.txz
libffi-3.0.13-x86_64-2.txz