RobJ Posted July 30, 2016

> Drive failure on a massive scale.... Some of this is related to 6.2rc2. Twice when I tried to shut the server down in RC2, it took a very long time, and when it did stop the array, it showed a large number of missing disks. Since this is the second time, I got diagnostics, and I think the relevant part is an mpt2sas_cm1 failure. The server has 2 IBM1015 cards installed and an Areca 1231ML. There was no indication that anything was wrong while the server was running, but just pressing the stop array button resulted in this massive failure both times.
>
> The first time I restarted, the server came up with all the drives again, except parity1 and disk2. Since this is running dual parity, that was fine, and it did a rebuild of parity and disk2 without incident. This time, we had disks 1-6 fail as well as parity1, but after a reboot, they all came back except parity and disk2 again. This time I was just in the process of restarting after installing RC3, so RC3 is running now. I immediately stopped the array. This is a backup server so I can explore a few things if anyone has ideas....

You have only one drive on the motherboard (sdc, Cache; it has 6 ports, 5 unused), only one drive on the Areca (sdb, Parity2; it claims 24 ports but probably has 16 physical ports), 3 drives on the first SAS card, all fine, and 7 drives on the second SAS card, which crashed hard.

The system appears to be running fine until the array is stopped at Jul 29 12:25:20. Drives are told to spin up, but after 32 seconds, Disk 4 is not working. Disk 4 was the first to act up but appeared to recover at first, so the stopping continues, but about 30 seconds later Disk 4 is still not working correctly. The kernel then begins a series of resets, but they appear only partly successful, and the driver steps in and tries to reset everything and re-enable the port, but that appears to fail badly.

At that point, NOTHING works on the card, and it appears the card has crashed. All attempts to read and write to any of the attached drives fail. Disk 2 needs a write, which fails, and because a write to a data drive necessitates a write to parity, that fails too, as both drives are on the crashed card. It appears to try to read from the rest of the drives to reconstruct the now-emulated Disk 2, but those reads of course fail too, except for the ones to drives not on the card.

I can't see how this could be related to any particular release. Did it seem to work before the 6.2 betas? The interaction that fails here looks like it is between the SAS card driver (mpt2sas/mpt3sas) and the firmware on the card. I'd check for updated firmware for sure. It's always possible that a new driver will have a new interaction, and if that's true here, you'd either want to go back in time to one that works with yours, or wait for the driver authors to update it again. But even if there's an improved driver, the card firmware should not have crashed! If you can't get the second SAS card to be stable and reliable, you do have lots of ports elsewhere for its 7 drives.

Your BIOS is from 2011. I strongly advise anyone attempting to use virtualization (and having issues) to try updating their BIOS; too many important technologies involve the BIOS. There's a line in your syslog about HT being turned off because of bugs.

Minor: your flash drive does not appear to have been prepared for v6 as a clean install; it still has vestiges of the past.

Minor too: the Preclear plugin is replacing one loaded library (libevent) with an older version. It is unknown whether that is causing any trouble, but it's a possibility. I suspect the Preclear dependencies need to be updated for 6.2.

* libevent-2.0.22-x86_64-1 upgraded with new package /boot/config/plugins/preclear.disk/libevent-2.0.21-x86_64-1.txz
tr0910 Posted July 30, 2016

> I can't see how this could be related to any particular release. Did it seem to work before the 6.2 betas? The interaction that fails here looks like it is between the SAS card driver (mpt2sas/mpt3sas) and the firmware on the card. I'd check for updated firmware for sure. It's always possible that a new driver will have a new interaction, and if that's true here, you'd either want to go back in time to one that works with yours, or wait for the driver authors to update it again. But even if there's an improved driver, the card firmware should not have crashed! If you can't get the second SAS card to be stable and reliable, you do have lots of ports elsewhere for its 7 drives.
>
> Your BIOS is from 2011. I strongly advise anyone attempting to use virtualization (and having issues) to try updating their BIOS; too many important technologies involve the BIOS. There's a line in your syslog about HT being turned off because of bugs.

Thanks RobJ, yes, all are on that one card. Looks like a hardware issue. Funny that it can do a rebuild without issue.

Are you suggesting a motherboard BIOS update? It's the very familiar SuperMicro X9SCM.

The BIOS on the 1015 controller is not the latest, but it should be the same as on the other 1015 installed. I can easily move to other ports and chuck that 1015 if it's garbage.
RobJ Posted July 30, 2016

> I can't see how this could be related to any particular release. Did it seem to work before the 6.2 betas? The interaction that fails here looks like it is between the SAS card driver (mpt2sas/mpt3sas) and the firmware on the card. I'd check for updated firmware for sure. It's always possible that a new driver will have a new interaction, and if that's true here, you'd either want to go back in time to one that works with yours, or wait for the driver authors to update it again. But even if there's an improved driver, the card firmware should not have crashed! If you can't get the second SAS card to be stable and reliable, you do have lots of ports elsewhere for its 7 drives.
>
> Your BIOS is from 2011. I strongly advise anyone attempting to use virtualization (and having issues) to try updating their BIOS; too many important technologies involve the BIOS. There's a line in your syslog about HT being turned off because of bugs.

> Thanks RobJ, yes, all are on that one card. Looks like a hardware issue. Funny that it can do a rebuild without issue.

The real cause of the issue isn't reported. The event that appears coincident is the spinning up of all 7 drives on the card. One thing you may want to check is whether there is sufficient power. Have you added hardware to the system lately?

> Are you suggesting a motherboard BIOS update? It's the very familiar SuperMicro X9SCM.

Yes, if available.

> The BIOS on the 1015 controller is not the latest, but it should be the same as on the other 1015 installed. I can easily move to other ports and chuck that 1015 if it's garbage.

I'd update both, if newer firmware is available.
eschultz Posted July 31, 2016

> After update, eth1 seems to be renamed to eth119 and is no longer connected (I did a remote update).
>
> eth0: 1000 Mb/s, full duplex, mtu 1500
> eth119: not connected
>
> Gonna check cables anyhow once I'm on site.

> Same issue here. Eth1 doesn't show up in ifconfig.
>
> Network
> bond0 IEEE 802.3ad Dynamic link aggregation, mtu 1500
> eth0 1000 Mb/s, full duplex, mtu 1500
> eth118 not connected
> eth2 1000 Mb/s, full duplex, mtu 1500
> lo loopback

Are your MAC addresses unique on all your network interfaces? (Examine them via 'ifconfig -a'.)
johnodon Posted July 31, 2016

Finally decided to bite the bullet and move to v6.2, and I am following the guide from the first beta announcement that explains the migration process. Since I knew that Docker and KVM would need some extra attention, I disabled both prior to shutting down the server. I formatted my flash drive since I had some corruption, installed 6.2rc3 on it, and restored the following:

/config/disk.cfg
/config/docker.cfg
/config/domain.cfg
/config/network.cfg
/config/ident.cfg
/config/Pro.key
/config/super.dat
/config/shares/*

Booted unRAID without issue and I can access my shares. I continued to tweak a few settings and follow the guide when I ran into my first issue...

> When starting the array with at least one cache device, a share called "system" will be automatically created. Inside, two subfolders will be created (docker and libvirt respectively). Each of these folders will contain a loopback image file for each service.

This did not happen for me. The 'system' share was not created even though I have a cache drive. What should be my course of action? Do I need to manually create the 'system' share?

Diagnostics are attached.

John

unraid-diagnostics-20160731-0937.zip
trurl Posted July 31, 2016

> Finally decided to bite the bullet and move to v6.2, and I am following the guide from the first beta announcement that explains the migration process. Since I knew that Docker and KVM would need some extra attention, I disabled both prior to shutting down the server. I formatted my flash drive since I had some corruption, installed 6.2rc3 on it, and restored the following:
>
> /config/disk.cfg
> /config/docker.cfg
> /config/domain.cfg
> /config/network.cfg
> /config/ident.cfg
> /config/Pro.key
> /config/super.dat
> /config/shares/*
>
> Booted unRAID without issue and I can access my shares. I continued to tweak a few settings and follow the guide when I ran into my first issue...
>
> "When starting the array with at least one cache device, a share called "system" will be automatically created. Inside, two subfolders will be created (docker and libvirt respectively). Each of these folders will contain a loopback image file for each service."
>
> This did not happen for me. The 'system' share was not created even though I have a cache drive. What should be my course of action? Do I need to manually create the 'system' share?
>
> Diagnostics are attached.
>
> John

See reply #22 from Limetech on this thread. The .cfg files you restored would have kept your old defaults for these paths.
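To see which restored settings still carry old locations, it can help to list every path-valued entry in a restored .cfg file. The following is a minimal sketch that assumes the files are plain KEY="value" lines; the key names in the sample are hypothetical placeholders, not the real keys from docker.cfg or domain.cfg.

```python
def read_cfg(text):
    """Parse KEY="value" lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def path_settings(text, prefix="/mnt/"):
    """Return only the settings whose value looks like a path under prefix."""
    return {k: v for k, v in read_cfg(text).items() if v.startswith(prefix)}


# Hypothetical sample; the key names are made up for illustration.
sample_cfg = '''
EXAMPLE_ENABLED="yes"
EXAMPLE_IMAGE_FILE="/mnt/cache/docker.img"
'''

print(path_settings(sample_cfg))
# {'EXAMPLE_IMAGE_FILE': '/mnt/cache/docker.img'}
```

Any entry that still points at an old location is a candidate for review in the web GUI before re-enabling the service.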
johnodon Posted July 31, 2016

> See reply #22 from Limetech on this thread. The .cfg files you restored would have kept your old defaults for these paths.

TY. I think I am just going to start with a clean slate.

John
Squid Posted July 31, 2016

> Can someone clarify for me what I should be doing for Docker and VMs in reference to the shares? It was my understanding that I should map docker .img and appdata to a disk directly and not use a user share, as it can cause issues. Is this correct? Or should I use a user share and create hard links? I'm confused now ha. I'm not using a cache drive btw.

> While LT states that hardlinks do now work on user shares, it is still a new feature, without much evidence yet as to its stability or usability. For the immediate future you'll be safest putting docker appdata directly onto a disk share.

> Although if it's not used then any remaining issues won't get found.

Using user shares as both a source and destination (useCache=no) within CA's Backup / Restore appears to work properly with hardlinks, so I'll pump out an update to CA to support them on 6.2rc3+.

As a side note, during one test with my destination set to a useCache=yes share, everything worked correctly, but mover took ~8 hours to move the 200,000+ files to the array, when a direct rsync from the cache drive to the array only took 1 hour.

Feature req: Can mover log when it starts and stops, even when logging in the script is disabled? Additionally, a further option on mover to log errors only (as it doesn't appear to log them when logging is disabled).
eschultz Posted July 31, 2016

> As a side note, during one test with my destination set to a useCache=yes share, everything worked correctly, but mover took ~8 hours to move the 200,000+ files to the array, when a direct rsync from the cache drive to the array only took 1 hour.

There is extra overhead in the mover because, for each file it's about to copy, it checks whether the file is 'in use', either by another process or as a loopback image, to determine whether it's safe to move. The 'in-use' check is crucial to preventing possible data loss when a file is still being written to by another process. The sheer number of files (200K+) in your case amplified this overhead.

> Feature req: Can mover log when it starts and stops, even when logging in the script is disabled? Additionally, a further option on mover to log errors only (as it doesn't appear to log them when logging is disabled).

Good ideas, will have to discuss with Tom.
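To illustrate why a per-file 'in-use' check adds up over 200K+ files, here is a minimal sketch of one way such a check can be done on Linux: scanning the open file descriptors listed under /proc. This is not the mover's actual code, it does not cover the loopback-image case, and the function name and `proc_root` parameter are assumptions made for illustration.

```python
import os


def is_open_by_any_process(path, proc_root="/proc"):
    """Return True if any process visible under proc_root has path open.

    Every call walks all processes and all of their descriptors, so the
    cost is paid again for each file that is checked.
    """
    target = os.path.realpath(path)
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    return True
            except OSError:
                continue  # descriptor closed while we were looking
    return False
```

Because the whole scan repeats for every candidate file, the total work grows with the number of files times the number of open descriptors on the system, which is consistent with the slowdown described above.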
Dephcon Posted August 1, 2016

Upgraded from rc2 before bed, and it kernel panicked this morning. First KP with unRAID in 5 years. I didn't think to take a picture of the console output, but I will if it does it again.

KP'd again while I was away for the weekend. If it happens again I'm going to have to downgrade to rc2.
Fuggin Posted August 1, 2016

Updated to rc3....no issues. Thank you!
kamhighway Posted August 1, 2016

Upgraded from 6.1.9 to rc3. The first problem is that the Docker tab has disappeared. I have disabled and then re-enabled Docker, but the status continues to say "stopped."
Squid Posted August 1, 2016

> Upgraded from 6.1.9 to rc3. The first problem is that the Docker tab has disappeared. I have disabled and then re-enabled Docker, but the status continues to say "stopped."

Delete the docker.img, recreate it, then re-add your apps via CA's previous apps.
kamhighway Posted August 1, 2016

@squid, I've deleted the old docker.img, but do not see where to go to create a new one. I've looked at the advanced Docker settings and filled out both fields, but a new docker.img is not created.

Update: I had entered /mnt/disks/ssd960 as the location for docker.img. This location is on an SSD that is mounted by Unassigned Devices. When I changed the location to /mnt/cache/docker it worked. It looks like you can no longer have docker.img on a disk that is mounted with Unassigned Devices.
RobJ Posted August 1, 2016

> Upgraded from rc2 before bed, and it kernel panicked this morning. First KP with unRAID in 5 years. I didn't think to take a picture of the console output, but I will if it does it again.
>
> KP'd again while I was away for the weekend. If it happens again I'm going to have to downgrade to rc2.

There's very little helpful in that pic, except that the fault occurred VERY low level, deep in the kernel. We need more info. When it happens again, try paging back (it's Shift-PgUp, not PgUp). Also try getting in from a different machine with Telnet or SSH. If you do get a command prompt, type 'diagnostics' and attach the result (it will be on the flash drive in /logs).

Other basic troubleshooting: run Memtest (several passes), check for a newer motherboard BIOS, and make sure you aren't overclocking. Has there been any other change to the system recently?

It's hard to blame this on rc3 yet, when you're the only one having this, but I don't have any alternative ideas so far.
afoard Posted August 2, 2016

> After update, eth1 seems to be renamed to eth119 and is no longer connected (I did a remote update).
>
> eth0: 1000 Mb/s, full duplex, mtu 1500
> eth119: not connected
>
> Gonna check cables anyhow once I'm on site.

> Same issue here. Eth1 doesn't show up in ifconfig.
>
> Network
> bond0 IEEE 802.3ad Dynamic link aggregation, mtu 1500
> eth0 1000 Mb/s, full duplex, mtu 1500
> eth118 not connected
> eth2 1000 Mb/s, full duplex, mtu 1500
> lo loopback

> Are your MAC addresses unique on all your network interfaces? (Examine them via 'ifconfig -a'.)

The MAC addresses are not unique. I included my entire ifconfig -a output.

root@MiirUnraid:/# ifconfig -a
bond0: flags=5443<UP,BROADCAST,RUNNING,PROMISC,MASTER,MULTICAST> mtu 1500
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
        RX packets 47992065 bytes 52446775980 (48.8 GiB)
        RX errors 0 dropped 20311 overruns 0 frame 0
        TX packets 56404542 bytes 41142481733 (38.3 GiB)
        TX errors 0 dropped 2 overruns 0 carrier 0 collisions 0

br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.33.0.33 netmask 255.255.255.0 broadcast 0.0.0.0
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
        RX packets 38355406 bytes 51323574860 (47.7 GiB)
        RX errors 0 dropped 910 overruns 0 frame 0
        TX packets 35703751 bytes 40099961660 (37.3 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 172.17.0.1 netmask 255.255.0.0 broadcast 0.0.0.0
        ether 02:42:86:76:2c:e7 txqueuelen 0 (Ethernet)
        RX packets 3903187 bytes 221449380 (211.1 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 4100180 bytes 6917263713 (6.4 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
        RX packets 23993416 bytes 26170789626 (24.3 GiB)
        RX errors 0 dropped 8 overruns 0 frame 0
        TX packets 49598758 bytes 32150631625 (29.9 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

eth2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
        RX packets 23998649 bytes 26275986354 (24.4 GiB)
        RX errors 0 dropped 1 overruns 0 frame 0
        TX packets 6805784 bytes 8991850108 (8.3 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device interrupt 17 memory 0xf79c0000-f79e0000

eth118: flags=4098<BROADCAST,MULTICAST> mtu 1500
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
        device interrupt 18

gre0: flags=128<NOARP> mtu 1476
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 1 (UNSPEC)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

gretap0: flags=4098<BROADCAST,MULTICAST> mtu 1462
        ether 00:00:00:00:00:00 txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ip_vti0: flags=128<NOARP> mtu 1364
        tunnel txqueuelen 1 (IPIP Tunnel)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
        inet 127.0.0.1 netmask 255.255.255.255
        loop txqueuelen 1 (Local Loopback)
        RX packets 1730080 bytes 1054932866 (1006.0 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 1730080 bytes 1054932866 (1006.0 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

tun0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST> mtu 1500
        inet 10.99.0.1 netmask 255.255.255.0 destination 10.99.0.1
        unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 100 (UNSPEC)
        RX packets 1311959 bytes 1541507475 (1.4 GiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 919912 bytes 75710935 (72.2 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

tunl0: flags=128<NOARP> mtu 1480
        tunnel txqueuelen 1 (IPIP Tunnel)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth6188425: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 06:f5:05:ca:17:f0 txqueuelen 0 (Ethernet)
        RX packets 7 bytes 788 (788.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 12959 bytes 921845 (900.2 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth018432d: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether b2:4c:cf:4e:4f:3b txqueuelen 0 (Ethernet)
        RX packets 2690 bytes 196835 (192.2 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 15272 bytes 3833686 (3.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth078279d: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 66:c2:37:5f:0c:77 txqueuelen 0 (Ethernet)
        RX packets 2640 bytes 193966 (189.4 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 15218 bytes 3818990 (3.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth24ee134: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 86:4a:9d:0a:36:b3 txqueuelen 0 (Ethernet)
        RX packets 2724 bytes 201251 (196.5 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 15280 bytes 3883398 (3.7 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth2a0610e: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 5a:36:ba:1d:36:e1 txqueuelen 0 (Ethernet)
        RX packets 14212 bytes 3043726 (2.9 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 26763 bytes 11746199 (11.2 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth3a88bf7: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 9a:69:ab:ad:ca:eb txqueuelen 0 (Ethernet)
        RX packets 2682 bytes 196462 (191.8 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 15247 bytes 3829363 (3.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth4d08f4e: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 66:c0:7e:ed:a1:aa txqueuelen 0 (Ethernet)
        RX packets 546 bytes 68342 (66.7 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 13524 bytes 1302952 (1.2 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth67adcac: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether fa:18:38:94:d5:3c txqueuelen 0 (Ethernet)
        RX packets 678 bytes 45224 (44.1 KiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 13264 bytes 964188 (941.5 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth6da02a8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether f6:39:c7:fd:37:a5 txqueuelen 0 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 12964 bytes 922513 (900.8 KiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

veth9ff9edb: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether b6:90:ee:1e:78:51 txqueuelen 0 (Ethernet)
        RX packets 108208 bytes 9429802 (8.9 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 120591 bytes 23724853 (22.6 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vethd053472: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 16:b7:76:95:61:f9 txqueuelen 0 (Ethernet)
        RX packets 3750956 bytes 261156264 (249.0 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 3955400 bytes 6859510043 (6.3 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

vethe102d85: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 0a:a9:0b:9d:4e:4a txqueuelen 0 (Ethernet)
        RX packets 17844 bytes 1561338 (1.4 MiB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 26341 bytes 12955492 (12.3 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

virbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
        inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
        ether 52:54:00:65:b6:6b txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

virbr0-nic: flags=4098<BROADCAST,MULTICAST> mtu 1500
        ether 52:54:00:65:b6:6b txqueuelen 1000 (Ethernet)
        RX packets 0 bytes 0 (0.0 B)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 0 bytes 0 (0.0 B)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions
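With this many interfaces, spotting shared MAC addresses by eye is error-prone. The sketch below groups interface names by their `ether` address from `ifconfig -a` text in the layout shown in this thread (one `name: flags=` header per interface). The function names are illustrative, and the sample is a shortened excerpt of the output above. Note that members of a bond normally report the bond's address, so a shared MAC is a starting point for investigation rather than proof of a fault.

```python
import re
from collections import defaultdict

IFACE_RE = re.compile(r"^(\S+): flags=")
ETHER_RE = re.compile(r"\bether ([0-9a-f]{2}(?::[0-9a-f]{2}){5})\b", re.IGNORECASE)


def group_by_mac(ifconfig_text):
    """Map each MAC address to the interface names that report it."""
    groups = defaultdict(list)
    current = None
    for line in ifconfig_text.splitlines():
        header = IFACE_RE.match(line)
        if header:
            current = header.group(1)
        ether = ETHER_RE.search(line)
        if ether and current:
            groups[ether.group(1).lower()].append(current)
    return dict(groups)


def shared_macs(ifconfig_text):
    """Return only the MAC addresses used by more than one interface."""
    return {mac: names
            for mac, names in group_by_mac(ifconfig_text).items()
            if len(names) > 1}


# Shortened excerpt of the output posted above.
sample = """\
bond0: flags=5443<UP,BROADCAST,RUNNING,PROMISC,MASTER,MULTICAST> mtu 1500
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
docker0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        ether 02:42:86:76:2c:e7 txqueuelen 0 (Ethernet)
eth0: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
eth118: flags=4098<BROADCAST,MULTICAST> mtu 1500
        ether 30:b5:c2:05:3b:2f txqueuelen 1000 (Ethernet)
"""

print(shared_macs(sample))
# {'30:b5:c2:05:3b:2f': ['bond0', 'eth0', 'eth118']}
```

In the excerpt, eth118 carries the bond's address even though it lacks the SLAVE flag that eth0 shows, which is the kind of detail the grouping makes easy to see.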
bonienl Posted August 2, 2016

@afoard, can you post your diagnostics?
space Posted August 2, 2016

Do I have to edit my Win10 VM config? Updated from the last beta to RC3. Everything works (also the xubuntu VM) except the Win10 VM (with GPU passthrough).

Execution error

internal error: early end of file from monitor, possible problem:
2016-08-02T11:32:03.472214Z qemu-system-x86_64: -device vfio-pci,host=03:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: error opening /dev/vfio/29: Operation not permitted
2016-08-02T11:32:03.472246Z qemu-system-x86_64: -device vfio-pci,host=03:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: failed to get group 29
2016-08-02T11:32:03.472254Z qemu-system-x86_64: -device vfio-pci,host=03:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device initialization failed
eschultz Posted August 2, 2016

> Do I have to edit my Win10 VM config? Updated from the last beta to RC3. Everything works (also the xubuntu VM) except the Win10 VM (with GPU passthrough).
>
> Execution error
>
> internal error: early end of file from monitor, possible problem:
> 2016-08-02T11:32:03.472214Z qemu-system-x86_64: -device vfio-pci,host=03:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: error opening /dev/vfio/29: Operation not permitted
> 2016-08-02T11:32:03.472246Z qemu-system-x86_64: -device vfio-pci,host=03:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: failed to get group 29
> 2016-08-02T11:32:03.472254Z qemu-system-x86_64: -device vfio-pci,host=03:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device initialization failed

Yes, just edit the Win10 VM, click Save, and see if the VM will start after that.
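The error above names IOMMU group 29. As a diagnostic aid only (the fix in this thread was simply to edit and re-save the VM), the sketch below lists the PCI devices that belong to a given group by reading the Linux sysfs tree under /sys/kernel/iommu_groups. The function name and the `sysfs_root` parameter are illustrative assumptions; the parameter exists so the code can be pointed at a test directory.

```python
import os


def devices_in_iommu_group(group, sysfs_root="/sys/kernel/iommu_groups"):
    """Return the sorted PCI addresses in an IOMMU group, or [] if absent."""
    path = os.path.join(sysfs_root, str(group), "devices")
    try:
        return sorted(os.listdir(path))
    except FileNotFoundError:
        return []  # group does not exist, or IOMMU is not enabled


if __name__ == "__main__":
    # Group 29 is the one named in the error message above.
    print(devices_in_iommu_group(29))
```

If the GPU at 03:00.0 shares its group with other devices, every device in that group generally has to be handed to the VM together.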
space Posted August 2, 2016

Thanks! Lots of things added, I see. The VM now starts, but USB passthrough is broken, so no mouse or keyboard action.

My USB controller was listed under "Other PCI devices" as Intel C610/X99 series chipset USB xHCI Host Controller | USB controller (00:14.0), which is correct. It is still listed in the XML as:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x00' slot='0x14' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
</hostdev>

like before. Well, the guest-side <address type='pci'> line is different (slot 0x0b now, 0x07 before). Old code:

<address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>

But this is auto-generated, right?

Edit: Solved.

Removed all hostdev tags in the XML except the USB controller.
Started the VM (no screen, but the mouse lit up).
Stopped the VM.
Edited the VM in the GUI and chose the GPU. Saved.
Opened the XML. Voila! New tags <alias name='hostdev1'/> and <alias name='hostdev2'/>.
Started the VM. Everything is working again.
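When comparing VM definitions before and after an edit, it helps to pull out just the host-side PCI addresses that are being passed through. The following is a small sketch using Python's standard `xml.etree.ElementTree`; the `<domain>` and `<devices>` wrappers in the sample are added here only so the hostdev fragment quoted above parses as a complete document, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET


def pci_hostdev_sources(domain_xml):
    """Return host PCI addresses (dddd:bb:ss.f) of all PCI hostdev entries."""
    root = ET.fromstring(domain_xml)
    result = []
    for hostdev in root.iter("hostdev"):
        if hostdev.get("type") != "pci":
            continue
        addr = hostdev.find("./source/address")  # host side, not guest side
        if addr is None:
            continue
        result.append("{:04x}:{:02x}:{:02x}.{:x}".format(
            int(addr.get("domain"), 16),
            int(addr.get("bus"), 16),
            int(addr.get("slot"), 16),
            int(addr.get("function"), 16),
        ))
    return result


# The hostdev block from the post above, wrapped so that it parses.
sample_xml = """
<domain>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x14' function='0x0'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </hostdev>
  </devices>
</domain>
"""

print(pci_hostdev_sources(sample_xml))
# ['0000:00:14.0']
```

The result matches the 00:14.0 USB controller named in the post; the second `<address>` element is the guest-side slot and is deliberately ignored.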
Squid Posted August 3, 2016

> Can someone clarify for me what I should be doing for Docker and VMs in reference to the shares? It was my understanding that I should map docker .img and appdata to a disk directly and not use a user share, as it can cause issues. Is this correct? Or should I use a user share and create hard links? I'm confused now ha. I'm not using a cache drive btw.

> While LT states that hardlinks do now work on user shares, it is still a new feature, without much evidence yet as to its stability or usability. For the immediate future you'll be safest putting docker appdata directly onto a disk share.

> Although if it's not used then any remaining issues won't get found.

Not sure where to post this: here, in the hardlink programming thread, or in defect reports, because TBH the meaning of this is beyond me.

Here's my rsync command:

Executing rsync: /usr/bin/rsync -avXHq --delete --exclude "/mnt/cache/docker.img" --log-file="/var/lib/docker/unraid/community.applications.datastore/appdata_backup.log" "/mnt/cache/appdata/" "/mnt/user/Backups/Docker Appdata/[email protected]" > /dev/null 2>&1

Everything was working perfectly with CA's appdata backup to user shares until I wound up installing a new container just to test that it installed OK. Now I'm getting this error:

2016/08/02 12:44:34 [6720] rsync: mknod "/mnt/user/Backups/Docker Appdata/[email protected]/gitlab-ce/data/gitlab-rails/sockets/gitlab.socket" failed: Function not implemented (38)
2016/08/02 12:44:34 [6720] rsync: mknod "/mnt/user/Backups/Docker Appdata/[email protected]/gitlab-ce/data/gitlab-workhorse/socket" failed: Function not implemented (38)
2016/08/02 12:44:34 [6720] rsync: mknod "/mnt/user/Backups/Docker Appdata/[email protected]/gitlab-ce/data/postgresql/.s.PGSQL.5432" failed: Function not implemented (38)

(Lots of fun finding 3 errors in a log composed of 200K lines.)

Going directly to a disk share instead of a user share works perfectly. TBH not sure how commonly used this function is (I suspect it's very rarely used in appdata).
johnodon Posted August 3, 2016

> Going directly to a disk share instead of a user share works perfectly. TBH not sure how commonly used this function is (I suspect it's very rarely used in appdata).

Everything I see on Google about that error talks about Linux exFAT partitions. Is that the case here? Here is one example...

http://blog.marcelotmelo.com/linux/ubuntu/rsync-to-an-exfat-partition/

> It happens that I automatically use rsync as follows:
>
> rsync -av [SRC] [DESTINATION]
>
> I always use the av switch, which stands for:
>
> v: increases verbosity, shows the files being synchronized
> a: archive, replaces the rlptgoD switches (recurse dirs, preserve symlinks, preserve permissions, preserve modification times, preserve groups, preserve owner and preserve device files).
>
> The problem is that the Linux exFAT does not cope well with the switches that relate to permissions (the pgo), so the solution is to run rsync with the following switches, removing the p, g and o:
>
> rsync -rltDv [SRC] [DESTINATION]
Squid Posted August 3, 2016

> Everything I see on Google about that error talks about Linux exFAT partitions. Is that the case here? Here is one example...
>
> http://blog.marcelotmelo.com/linux/ubuntu/rsync-to-an-exfat-partition/
>
> The problem is that the Linux exFAT does not cope well with the switches that relate to permissions (the pgo), so the solution is to run rsync with the following switches, removing the p, g and o:
>
> rsync -rltDv [SRC] [DESTINATION]

Everything is xfs and works 100% when bypassing user shares, so it's just that shfs isn't implementing that particular function.
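All three failing paths in the log above are socket files, and rsync's -a option includes -D, which asks it to recreate device and special files on the destination via mknod. To find out in advance which entries in a source tree would need that, a short scan is enough. The sketch below is an illustration only; the function name is an assumption, and it reports sockets, FIFOs and device nodes without following symlinks.

```python
import os
import stat


def find_special_files(root):
    """Return sorted (kind, path) pairs for sockets, FIFOs and device nodes."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            mode = os.lstat(full).st_mode  # lstat: do not follow symlinks
            if stat.S_ISSOCK(mode):
                kind = "socket"
            elif stat.S_ISFIFO(mode):
                kind = "fifo"
            elif stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
                kind = "device"
            else:
                continue
            found.append((kind, full))
    return sorted(found)


if __name__ == "__main__":
    # Source path taken from the rsync command quoted in this thread.
    for kind, path in find_special_files("/mnt/cache/appdata/"):
        print(kind, path)
```

Sockets are recreated by the owning service when it starts, so excluding them from a backup is usually harmless, but that is a judgment to make per container rather than something this thread confirms.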
ryanm91 Posted August 3, 2016

I've noticed that since upgrading to this release, my cache will no longer move to the array. The mover script simply runs and then stops.
Frank1940 Posted August 3, 2016

> I've noticed that since upgrading to this release, my cache will no longer move to the array. The mover script simply runs and then stops.

Where is your diagnostics file?