mr-hexen Posted April 10, 2016 Share Posted April 10, 2016 Look at your VM mappings... some of those shares don't even exist. You had it mapped to /mnt/user/system/libvirt when it should have been /mnt/user/libvirt. From what I can see, you changed it to /mnt/disk8/libvirt to get it working, which = /mnt/user/libvirt as far as the system cares. Link to comment
clevoir Posted April 11, 2016 Share Posted April 11, 2016 Please note that I awoke this morning to find that my daily backup of the array had failed overnight. On checking I found that all mapped shares to the array had been dropped in Windows Explorer on a networked PC. I was able to access the GUI, and on trying to access the mapped drives manually, they suddenly became accessible again. Please find attached the diagnostics file which I obtained this morning. tower-diagnostics-20160411-0643.zip Link to comment
eschultz Posted April 11, 2016 Share Posted April 11, 2016 Please note that I awoke this morning to find that my daily backup of the array had failed overnight. On checking I found that all mapped shares to the array had been dropped in Windows Explorer on a networked PC. I was able to access the GUI, and on trying to access the mapped drives manually, they suddenly became accessible again. Please find attached the diagnostics file which I obtained this morning. Not sure what time your backup starts, but I noticed in your logs that either the network was unplugged or the switch/router that's connected to your unRAID box was reset:
Apr 10 18:51:16 Tower kernel: e1000e: eth0 NIC Link is Down
Apr 10 18:51:17 Tower ntpd[1573]: Deleting interface #2 eth0, 192.168.1.10#123, interface stats: received=75, sent=75, dropped=0, active_time=7137 secs
Apr 10 18:51:17 Tower ntpd[1573]: 192.168.1.200 local addr 192.168.1.10 -> <null>
Apr 10 18:52:14 Tower kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Link to comment
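As an aside, pulling the link transitions out of a long syslog is easy to script. A small sketch, assuming kernel lines shaped like the excerpt above (the regex and sample data are illustrative, not part of unRAID):

```python
import re

# Matches kernel lines like:
# "Apr 10 18:51:16 Tower kernel: e1000e: eth0 NIC Link is Down"
LINK_RE = re.compile(r"^(\w+ +\d+ [\d:]+) .*NIC Link is (Up|Down)")

def link_events(log_lines):
    """Return (timestamp, state) for every NIC link transition."""
    events = []
    for line in log_lines:
        m = LINK_RE.search(line)
        if m:
            events.append(m.groups())
    return events

sample = [
    "Apr 10 18:51:16 Tower kernel: e1000e: eth0 NIC Link is Down",
    "Apr 10 18:51:17 Tower ntpd[1573]: Deleting interface #2 eth0",
    "Apr 10 18:52:14 Tower kernel: e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex",
]
print(link_events(sample))
# [('Apr 10 18:51:16', 'Down'), ('Apr 10 18:52:14', 'Up')]
```

Feeding it the whole syslog shows at a glance whether the NIC bounced once (a cable swap) or flaps repeatedly (a failing switch port).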
ryoko227 Posted April 11, 2016 Share Posted April 11, 2016 I made reference to this issue in another post located here, but thought it better to focus here on the issues related to OVMF and beta 21 that I'm having. http://lime-technology.com/forum/index.php?topic=48241.0 When I try to transfer files around the shares from the VM's vdisks, or access them from the primary/secondary vdisk, my Win10 OVMF-440 VM crashes, loses the ability to find files, crashes unRAID, etc. For the first time tonight I was able to successfully transfer 22GB from a share to the primary vdisk without this sort of crash. However, shortly afterwards, while trying to access and update the game located in those folders, I started getting multiple "file does not exist", "cannot find specified files", etc. types of errors. Attached is the diagnostics file from after the VM started acting strangely. For ease of use, I will also repost the XML and VM log as well. The system is...
MB - MSI X99A SLI Plus
CPU - Intel Xeon E5 2670 V3
Mem - 2x8GB Kingston DDR4-2133
GPU - (2) MSI GTX960 GAMING 4G
SSD - (1) 250GB SK hynix (cache)
HDD - (2) 3TB Seagate (parity and storage)
I have Docker activated, but nothing installed. Only the 2 default plugins, and I have only added the vrom tag to the XML file.
XML
<domain type='kvm' id='1'>
  <name>Win10OVMF</name>
  <uuid>685fc4b3-40bb-64df-d571-cdf37b27f929</uuid>
  <metadata>
    <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
  </metadata>
  <memory unit='KiB'>7340032</memory>
  <currentMemory unit='KiB'>7340032</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>10</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='4'/>
    <vcpupin vcpu='3' cpuset='5'/>
    <vcpupin vcpu='4' cpuset='6'/>
    <vcpupin vcpu='5' cpuset='14'/>
    <vcpupin vcpu='6' cpuset='15'/>
    <vcpupin vcpu='7' cpuset='16'/>
    <vcpupin vcpu='8' cpuset='17'/>
    <vcpupin vcpu='9' cpuset='18'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
    <nvram>/etc/libvirt/qemu/nvram/685fc4b3-40bb-64df-d571-cdf37b27f929_VARS-pure-efi.fd</nvram>
  </os>
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor id='none'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='5' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='hypervclock' present='yes'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/vdisks/Win10OVMF/vdisk1.img'/>
      <backingStore/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <alias name='virtio-disk2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/ISOs/OS iso/Windows 10 Pro 64bit.iso'/>
      <backingStore/>
      <target dev='hda' bus='sata'/>
      <readonly/>
      <boot order='2'/>
      <alias name='sata0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/ISOs/virtio iso/virtio-win-0.1.113.iso'/>
      <backingStore/>
      <target dev='hdb' bus='sata'/>
      <readonly/>
      <alias name='sata0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0' model='nec-xhci'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='sata' index='0'>
      <alias name='sata0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:1f:4d:ff'/>
      <source bridge='br0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-Win10OVMF/org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <rom file='/boot/vbios.rom'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
      </source>
      <alias name='hostdev1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x056e'/>
        <product id='0x0035'/>
        <address bus='5' device='3'/>
      </source>
      <alias name='hostdev2'/>
    </hostdev>
    <hostdev mode='subsystem' type='usb' managed='no'>
      <source>
        <vendor id='0x1c4f'/>
        <product id='0x0002'/>
        <address bus='5' device='8'/>
      </source>
      <alias name='hostdev3'/>
    </hostdev>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
    </memballoon>
  </devices>
</domain>
VM log
2016-04-11 10:46:00.476+0000: starting up libvirt version: 1.3.1, qemu version: 2.5.1, hostname: Beast
LC_ALL=C PATH=/bin:/sbin:/usr/bin:/usr/sbin HOME=/ QEMU_AUDIO_DRV=none /usr/local/sbin/qemu -name Win10OVMF -S -machine pc-i440fx-2.5,accel=kvm,usb=off,mem-merge=off -cpu host,hv_time,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_vendor_id=none -drive file=/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/etc/libvirt/qemu/nvram/685fc4b3-40bb-64df-d571-cdf37b27f929_VARS-pure-efi.fd,if=pflash,format=raw,unit=1 -m 7168 -realtime mlock=on -smp 10,sockets=1,cores=5,threads=2 -uuid 685fc4b3-40bb-64df-d571-cdf37b27f929 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-Win10OVMF/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=localtime -no-hpet -no-shutdown -boot strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x7 -device ahci,id=sata0,bus=pci.0,addr=0x3 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/mnt/usio-pci,host=03:00.0,id=hostdev0,bus=pci.0,addr=0x6,romfile=/boot/vbios.rom -device vfio-pci,host=03:00.1,id=hostdev1,bus=pci.0,addr=0x8 -device usb-host,hostbus=5,hostaddr=3,id=hostdev2 -device usb-host,hostbus=5,hostaddr=8,id=hostdev3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9 -msg timestamp=on
Domain id=1 is tainted: high-privileges
Domain id=1 is tainted: host-cpu
char device redirected to /dev/pts/0 (label charserial0)
2016-04-11T12:02:53.561741Z qemu-system-x86_64: terminating on signal 15 from pid 6261
2016-04-11 12:02:53.756+0000: shutting down
beast-diagnostics-20160411-1950.zip Link to comment
clevoir Posted April 11, 2016 Share Posted April 11, 2016 My backup was carried out at 02:00 today; the brief loss of network connectivity is where I tried a different network cable. Link to comment
yippy3000 Posted April 11, 2016 Share Posted April 11, 2016 I have a Docker container that always says "update ready" in the Unraid Docker page. I can update it 6 times in a row and it will still list that. When I do run the update, the logs say that the container is up to date and does not download anything, just re-launches it. The Docker is the official Postgres Docker https://hub.docker.com/_/postgres/ I am running 6.2.0 beta 21 but this has been happening since at least 6.2.0 beta 18 (when I upgraded to the beta). Diagnostics attached aeris-diagnostics-20160411-1055.zip Link to comment
FrozenGamer Posted April 11, 2016 Share Posted April 11, 2016 I am creating a 2nd server - testing it for now, then putting data on it once I trust it. I have a few questions if someone can help. 1- The unRAID IRC channel says this beta is broken - should I just wait for beta 22? At the moment all I need to do is load it up with data and share it over Samba; no VMs, though at some point I would use Docker containers. 2- I precleared 4 Seagate 8TB drives harvested from USB enclosures, 2 cycles each. After that I made 1 a parity drive and assigned 3 as data, then I just tried to mount; it appears that there are some errors/warnings. Are these errors normal, in that the server sees the drives aren't formatted and formats them? I see a number of other errors in my syslog as well - is there something wrong? 3- It is doing a parity sync/data rebuild, at 2.2%. I know that a parity sync is normal when creating an array for the first time, but is it normal for it to take so long when there is no actual data? If someone has time to look at my log/answer my questions I would appreciate it, before I move forward with transferring data over and enabling my 2nd unRAID Pro key on this trial. pipe-diagnostics-20160411-1152.zip Link to comment
itimpi Posted April 11, 2016 Share Posted April 11, 2016 2- I precleared 4 Seagate 8TB drives harvested from USB enclosures, 2 cycles each. After that I made 1 a parity drive and assigned 3 as data, then I just tried to mount; it appears that there are some errors/warnings. Are these errors normal, in that the server sees the drives aren't formatted and formats them? I see a number of other errors in my syslog as well - is there something wrong? It is normal that new drives are seen as unformatted. 3- It is doing a parity sync/data rebuild, at 2.2%. I know that a parity sync is normal when creating an array for the first time, but is it normal for it to take so long when there is no actual data? Parity has no idea what is on the disks and as such is unaware of the data that is on them at the file system level. It just sees each disk as a bunch of sectors that need protecting against the failure of any disk, so yes, on any new system the whole of each disk will be read as part of creating parity. Link to comment
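To illustrate why the whole disk must be read: single parity is just the byte-wise XOR of the same-numbered sector on every data disk, computed with no knowledge of filesystems or files. A toy sketch of the principle (not unRAID's actual md driver):

```python
from functools import reduce

def parity_block(blocks):
    # Byte-wise XOR across the same-numbered block on each "disk".
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Three tiny "disks", one 4-byte block each.
d1 = b"\x0f\x00\xff\x12"
d2 = b"\xf0\x0f\x00\x34"
d3 = b"\x00\xf0\x0f\x56"
p = parity_block([d1, d2, d3])

# Lose any one disk: XOR of parity and the survivors rebuilds it.
assert parity_block([p, d2, d3]) == d1
assert parity_block([p, d1, d3]) == d2
```

Since every sector participates in the XOR, a "parity sync" on empty-but-formatted 8TB drives still reads all 8TB of each of them, which is why it takes many hours regardless of how much data exists.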
FrozenGamer Posted April 11, 2016 Share Posted April 11, 2016 2- I precleared 4 Seagate 8TB drives harvested from USB enclosures, 2 cycles each. After that I made 1 a parity drive and assigned 3 as data, then I just tried to mount; it appears that there are some errors/warnings. Are these errors normal, in that the server sees the drives aren't formatted and formats them? I see a number of other errors in my syslog as well - is there something wrong? It is normal that new drives are seen as unformatted. 3- It is doing a parity sync/data rebuild, at 2.2%. I know that a parity sync is normal when creating an array for the first time, but is it normal for it to take so long when there is no actual data? Parity has no idea what is on the disks and as such is unaware of the data that is on them at the file system level. It just sees each disk as a bunch of sectors that need protecting against the failure of any disk, so yes, on any new system the whole of each disk will be read as part of creating parity. Would this email/popup be normal?
Event: unRAID Parity disk error
Subject: Warning [PIPE] - Parity disk, parity-sync in progress
Description: ST8000AS0002-1NA17Z_Z8408NKD (sdb)
Importance: warning
Link to comment
Helmonder Posted April 11, 2016 Share Posted April 11, 2016 Yes... that is normal. It is telling you there is no parity yet and that parity is being created. Link to comment
Adam64 Posted April 11, 2016 Share Posted April 11, 2016 This morning my GUI was unresponsive, and I had to power off/power on to get it working again. I was able to telnet into the box and get a diag (attached). I'm suspecting (without much data to base it on) that it had something to do with my plex docker. I tried to stop the plex docker from the command line (docker stop plex) and it would just hang the command line until I hit ctrl-c. Diags attached. Would appreciate input as I've found this to be less reliable than I would like. I had to do the power toggle a couple of times yesterday too. unraid-diagnostics-20160411-0912.zip Link to comment
jbartlett Posted April 11, 2016 Share Posted April 11, 2016 I have a Docker container that always says "update ready" in the Unraid Docker page. I can update it 6 times in a row and it will still list that. When I do run the update, the logs say that the container is up to date and does not download anything, just re-launches it. The Docker is the official Postgres Docker https://hub.docker.com/_/postgres/ I am running 6.2.0 beta 21 but this has been happening since at least 6.2.0 beta 18 (when I upgraded to the beta). Diagnostics attached I have the same issue with the "mysql" app. Likely due to the single name/no user repository. https://registry.hub.docker.com/_/mysql/ Link to comment
eschultz Posted April 12, 2016 Share Posted April 12, 2016 I have a Docker container that always says "update ready" in the Unraid Docker page. I can update it 6 times in a row and it will still list that. When I do run the update, the logs say that the container is up to date and does not download anything, just re-launches it. The Docker is the official Postgres Docker https://hub.docker.com/_/postgres/ I am running 6.2.0 beta 21 but this has been happening since at least 6.2.0 beta 18 (when I upgraded to the beta). Diagnostics attached I have the same issue with the "mysql" app. Likely due to the single name/no user repository. https://registry.hub.docker.com/_/mysql/ We plan on fixing the update issue for official Docker images for a future beta. Link to comment
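For background, and only as a guess at the root cause rather than a statement about unRAID's internals: Docker Hub stores official images under the implicit library/ namespace, so an update check that compares repository names literally can mismatch "postgres" against "library/postgres" forever. A hypothetical normalization step that would resolve such a mismatch:

```python
def normalize_repo(name):
    """Map a Docker Hub repository name to its canonical form.

    Official images ("postgres", "mysql") live in the implicit
    'library/' namespace, so a bare name and its 'library/' form
    refer to the same repository and must compare equal.
    """
    if "/" not in name:
        return "library/" + name
    return name

assert normalize_repo("postgres") == "library/postgres"
assert normalize_repo("linuxserver/plex") == "linuxserver/plex"
```

If the local tag and the registry response use different forms of the name, a naive string comparison would report "update ready" on every check even though the digests match.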
clevoir Posted April 12, 2016 Share Posted April 12, 2016 I awoke this morning and found that my array had been successfully backed up overnight @ 2am, and mapped drives were also shown in Windows Explorer. I copied a file into one of the shares and deleted it OK; however, if I then try to access any further folders & files in any of the other shares, there is a time delay of several seconds before they appear? Please find attached the diagnostics file. tower-diagnostics-20160412-0625.zip Link to comment
Squid Posted April 12, 2016 Share Posted April 12, 2016 I awoke this morning and found that my array had been successfully backed up overnight @ 2am, and mapped drives were also shown in Windows Explorer. I copied a file into one of the shares and deleted it OK; however, if I then try to access any further folders & files in any of the other shares, there is a time delay of several seconds before they appear? Please find attached the diagnostics file The Dynamix Cache Dirs plugin will help to alleviate this Link to comment
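For anyone curious what that plugin does under the hood: it repeatedly walks the share trees so directory entries stay in the kernel's dentry/inode caches, and listings then return without waiting for spun-down disks. A minimal single-pass sketch of the idea (the real plugin is a shell script with far more logic; `warm_dirs` is an illustrative name, not its API):

```python
import os

def warm_dirs(root):
    """Walk the tree once; reading every directory pulls its entries
    into the kernel's caches, so later listings of those directories
    can be answered from RAM instead of spinning up disks."""
    entries = 0
    for dirpath, dirnames, filenames in os.walk(root):
        entries += 1 + len(filenames)  # this directory plus its files
    return entries
```

Cache Dirs reruns a scan like this on a timer, which is why listings stay instant as long as the cached entries aren't evicted by memory pressure.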
optim Posted April 12, 2016 Share Posted April 12, 2016 This morning my GUI was unresponsive, and I had to power off/power on to get it working again. I was able to telnet into the box and get a diag (attached). I'm suspecting (without much data to base it on) that it had something to do with my plex docker. I tried to stop the plex docker from the command line (docker stop plex) and it would just hang the command line until I hit ctrl-c. Diags attached. Would appreciate input as I've found this to be less reliable than I would like. I had to do the power toggle a couple of times yesterday too. I'm in the same situation. I've tried disabling all plugins, vm's and dockers but it still doesn't improve the lock ups. The system becomes unresponsive with a load (according to top) of >50. Iotop doesn't show any IO activity and top says the CPU is not busy, yet load remains extremely high. Dmesg doesn't have anything of interest, with the last messages being about spindowns. The system will not shut down once the load gets that high, so I have to resort to powering off. I should also mention that I can connect through telnet while this is happening, but depending on what command I issue the session will lock up. For example a "btrfs fi sh" will never return. I also noticed that any significant concurrent IO will bring on the problem quickly, which made me wonder if perhaps there is some kind of deadlock/race condition happening with the new dual parity code. Totally unsubstantiated (sorry Tom, I'm not trying to point fingers!), just offering my uninformed guess at least. My other thought was maybe BTRFS was dying under the concurrent IO, but then again BTRFS was solid when I transferred the 70+ TB from my ZFS disks onto BTRFS so I could move back to Unraid. I did that using Ubuntu 16.04 Beta which used the 4.4 kernel as well, and had 3 disks copying at the same time (from 3 other disks, not thrashing). 
Average throughput on the hardware saturated a SATA2 connection and I never had a lockup in the two weeks it took me to move the data. All 27 data drives in the array are formatted with BTRFS and are spread across two Norco 24 bay enclosures using an Intel SAS expander. This setup was reliable using Ubuntu 14.04 and ZoL so I know the hardware is solid. Something just needs to be tweaked a little to make it reliable. Also, FWIW, heavy IO that brings on the lock up was not using any of the drives on the expander. It was done by moving data from one drive to another using mc in an ssh session and having NzbGet working on uncompressing a large 200GB download. Diags are attached and you can PM me if you want me to test anything for you... Thanks to the LimeTech staff and volunteers for all your efforts! unmedia-diagnostics-20160412-0734.zip Link to comment
Bjonness406 Posted April 12, 2016 Share Posted April 12, 2016 Is it possible to have turbo write on one share and not another? Would be awesome Link to comment
clevoir Posted April 12, 2016 Share Posted April 12, 2016 I previously had Cache Dir installed, and the problem was happening. I went back to bare metal to try and remove plugins in my testing. Sometimes instead of a delay, I get a hard lockup of Terracopy / Windows Explorer Link to comment
Adam64 Posted April 12, 2016 Share Posted April 12, 2016 I previously had Cache Dir installed, and the problem was happening. I went back to bare metal to try and remove plugins in my testing. Sometimes instead of a delay, I get a hard lockup of Terracopy / Windows Explorer You might try turning off SMB2&3 on your client PC (assuming it's Win10?) and see if that helps. 6.2 includes an updated Samba, but I had issues with it that seemed to go away when I turned off SMB2&3 on my Win10 machine. Link to comment
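If changing every Windows client is impractical, the same experiment can be run once on the server side by capping the dialect Samba will negotiate. A hypothetical fragment for unRAID's "Samba extra configuration" box (which, assuming the usual mechanism, ends up in /boot/config/smb-extra.conf - verify on your version):

```ini
[global]
    # Testing only: force clients down to SMB1/NT1 to see whether the
    # SMB2/3 code path is at fault. SMB1 is insecure -- revert afterwards.
    server max protocol = NT1
```

This narrows the problem from "one client's settings" to "the protocol version itself" in a single change.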
Squid Posted April 12, 2016 Share Posted April 12, 2016 I previously had Cache Dir installed, and the problem was happening. I went back to bare metal to try and remove plugins in my testing. Sometimes instead of a delay, I get a hard lockup of Terracopy / Windows Explorer Little tip: it helps to quote responses in this thread because there are many conversations going on at once. PM me if you're running docker apps. I have a possible theory but need someone to try it. (and have been trying to justify it to myself). Sent from my LG-D852 using Tapatalk Link to comment
jonp Posted April 13, 2016 Share Posted April 13, 2016 Is it possible to have turbo write on a share and not other? Would be awesome Not possible. Sent from my Nexus 6 using Tapatalk Link to comment
optim Posted April 13, 2016 Share Posted April 13, 2016 Did a little more testing related to the high load/unresponsive server tonight. I re-enabled all the disabled dockers and queued up some downloads. I now have a par2 repair stuck in the download queue that puts enough IO strain on the server to have it lock up within 10 minutes of booting. I've rebooted 3 times to ensure that it will lock up consistently. I then did something daring (or maybe stupid) to eliminate the possibility of it being the dual parity. I unassigned both my parity drives and rebooted the server to see if it would lock up. The good news is that it locked up within 10 minutes of booting, so I'm now assuming it does not have anything to do with the new/changed dual parity code (sorry for doubting you Tom!). The bad news is I'm more stumped than ever as to what it could be. Below is what top and iotop are reporting at roughly the same time. The server is sitting in an unresponsive state right now, although my previously connected ssh sessions continue to update the top and iotop screens. You'll notice that the top command shows high load and the wa figure indicates it's waiting on IO of some sort. But iotop doesn't show any significant disk use. In fact there is no disk use, and if I leave it long enough the drives spin down as per their settings (seen in syslog and dmesg). I'm not enough of a Linux guru to figure out where to look next, so if anyone has suggestions on what next steps could be, please pass them along. Thanks!
top:
top - 20:53:21 up 42 min, 3 users, load average: 38.53, 38.31, 32.92
Tasks: 1031 total, 2 running, 1029 sleeping, 0 stopped, 0 zombie
%Cpu(s): 10.7 us, 9.2 sy, 0.0 ni, 0.0 id, 80.2 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 32989816 total, 17111804 free, 1649940 used, 14228072 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 30570024 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
21618 nobody 20 0 513616 11372 3148 R 99.7 0.0 29:55.60 /usr/bin/par2 r /incomplete-d+
18167 nobody 20 0 432220 162120 37060 S 60.2 0.5 17:19.48 ./Plex Media Server
7785 root 20 0 83092 21380 7160 S 6.9 0.1 2:42.74 /usr/bin/python /usr/sbin/iot+
8696 root 20 0 25892 4212 2468 R 1.0 0.0 0:22.84 top
292 root 39 19 0 0 0 S 0.3 0.0 0:00.22 [khugepaged]
11685 root 20 0 25772 3880 2368 S 0.3 0.0 0:21.00 top
1 root 20 0 4372 1640 1532 S 0.0 0.0 0:07.00 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.03 [kthreadd]
3 root 20 0 0 0 0 S 0.0 0.0 0:00.21 [ksoftirqd/0]
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 [kworker/0:0H]
7 root 20 0 0 0 0 S 0.0 0.0 0:01.18 [rcu_preempt]
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [rcu_sched]
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [rcu_bh]
10 root rt 0 0 0 0 S 0.0 0.0 0:00.01 [migration/0]
11 root rt 0 0 0 0 S 0.0 0.0 0:00.01 [migration/1]
12 root 20 0 0 0 0 S 0.0 0.0 0:00.08 [ksoftirqd/1]
14 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 [kworker/1:0H]
15 root rt 0 0 0 0 S 0.0 0.0 0:00.01 [migration/2]
16 root 20 0 0 0 0 S 0.0 0.0 0:00.02 [ksoftirqd/2]
18 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 [kworker/2:0H]
19 root rt 0 0 0 0 S 0.0 0.0 0:00.01 [migration/3]
20 root 20 0 0 0 0 S 0.0 0.0 0:00.07 [ksoftirqd/3]
21 root 20 0 0 0 0 S 0.0 0.0 0:00.09 [kworker/3:0]
22 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 [kworker/3:0H]
23 root rt 0 0 0 0 S 0.0 0.0 0:00.01 [migration/4]
24 root 20 0 0 0 0 S 0.0 0.0 0:00.06 [ksoftirqd/4]
26 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 [kworker/4:0H]
27 root rt 0 0 0 0 S 0.0 0.0 0:00.01 [migration/5]
28 root 20 0 0 0 0 S 0.0 0.0 0:00.02 [ksoftirqd/5]
30 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 [kworker/5:0H]
iotop:
Total DISK READ : 0.00 B/s | Total DISK WRITE : 0.00 B/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
4945 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.03 % [kworker/u16:12]
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % init
2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
3 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/0]
5 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/0:0H]
7 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [rcu_preempt]
8 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [rcu_sched]
9 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [rcu_bh]
10 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/0]
11 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/1]
12 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/1]
14 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/1:0H]
15 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/2]
16 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/2]
18 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/2:0H]
19 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/3]
20 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/3]
21 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/3:0]
22 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/3:0H]
23 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/4]
24 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/4]
26 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/4:0H]
27 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/5]
28 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/5]
30 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/5:0H]
31 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/6]
32 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [ksoftirqd/6]
33 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/6:0]
34 be/0 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/6:0H]
35 rt/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [migration/7]
dmesg | tail -n 50:
[ 646.361780] eth0: renamed from veth85dfe66
[ 646.370907] docker0: port 4(veth71faf9b) entered forwarding state
[ 646.370923] docker0: port 4(veth71faf9b) entered forwarding state
[ 646.466434] docker0: port 2(veth415f089) entered forwarding state
[ 648.772820] device veth6254b33 entered promiscuous mode
[ 648.772972] docker0: port 5(veth6254b33) entered forwarding state
[ 648.772988] docker0: port 5(veth6254b33) entered forwarding state
[ 648.773860] docker0: port 5(veth6254b33) entered disabled state
[ 651.842532] docker0: port 3(vethcda3321) entered forwarding state
[ 653.168843] eth0: renamed from veth3c62677
[ 653.173696] docker0: port 5(veth6254b33) entered forwarding state
[ 653.173712] docker0: port 5(veth6254b33) entered forwarding state
[ 661.378664] docker0: port 4(veth71faf9b) entered forwarding state
[ 667.530740] BTRFS info (device loop1): disk space caching is enabled
[ 667.530744] BTRFS: has skinny extents
[ 668.226770] docker0: port 5(veth6254b33) entered forwarding state
[ 668.917968] BTRFS info (device loop1): new size for /dev/loop1 is 1073741824
[ 668.925642] tun: Universal TUN/TAP device driver, 1.6
[ 668.925643] tun: (C) 1999-2004 Max Krasnyansky <[email protected]>
[ 670.188941] device virbr0-nic entered promiscuous mode
[ 670.303816] virbr0: port 1(virbr0-nic) entered listening state
[ 670.303830] virbr0: port 1(virbr0-nic) entered listening state
[ 670.326386] virbr0: port 1(virbr0-nic) entered disabled state
[ 2500.891496] mdcmd (63): spindown 19
[ 2501.318623] mdcmd (64): spindown 21
[ 2503.464412] mdcmd (65): spindown 9
[ 2503.868055] mdcmd (66): spindown 10
[ 2504.154809] mdcmd (67): spindown 11
[ 2505.158121] mdcmd (68): spindown 14
[ 2505.585249] mdcmd (69): spindown 17
[ 2507.589672] mdcmd (70): spindown 1
[ 2508.026666] mdcmd (71): spindown 3
[ 2509.029710] mdcmd (72): spindown 4
[ 2509.456308] mdcmd (73): spindown 8
[ 2510.460182] mdcmd (74): spindown 12
[ 2510.897515] mdcmd (75): spindown 13
[ 2511.184286] mdcmd (76): spindown 15
[ 2511.471072] mdcmd (77): spindown 16
[ 2511.757819] mdcmd (78): spindown 20
[ 2513.185802] mdcmd (79): spindown 5
[ 2514.473687] mdcmd (80): spindown 6
[ 2518.143213] mdcmd (81): spindown 26
[ 2520.572942] mdcmd (82): spindown 2
[ 2527.584083] mdcmd (83): spindown 22
[ 2533.878293] mdcmd (84): spindown 23
[ 2535.306545] mdcmd (85): spindown 7
[ 2535.593328] mdcmd (86): spindown 18
[ 2536.022367] mdcmd (87): spindown 24
[ 2536.449974] mdcmd (88): spindown 25
[ 2536.736169] mdcmd (89): spindown 27
tail -n 50 /var/log/syslog:
Apr 12 20:22:19 unmedia root: Starting libvirtd...
Apr 12 20:22:19 unmedia kernel: tun: Universal TUN/TAP device driver, 1.6
Apr 12 20:22:19 unmedia kernel: tun: (C) 1999-2004 Max Krasnyansky <[email protected]>
Apr 12 20:22:19 unmedia emhttp: nothing to sync
Apr 12 20:22:19 unmedia rc.unRAID[18670][18674]: Processing /etc/rc.d/rc.unRAID.d/ start scripts.
Apr 12 20:22:20 unmedia kernel: device virbr0-nic entered promiscuous mode
Apr 12 20:22:21 unmedia avahi-daemon[12607]: Joining mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1.
Apr 12 20:22:21 unmedia avahi-daemon[12607]: New relevant interface virbr0.IPv4 for mDNS.
Apr 12 20:22:21 unmedia avahi-daemon[12607]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Apr 12 20:22:21 unmedia kernel: virbr0: port 1(virbr0-nic) entered listening state
Apr 12 20:22:21 unmedia kernel: virbr0: port 1(virbr0-nic) entered listening state
Apr 12 20:22:21 unmedia dnsmasq[19079]: started, version 2.75 cachesize 150
Apr 12 20:22:21 unmedia dnsmasq[19079]: compile time options: IPv6 GNU-getopt no-DBus i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
Apr 12 20:22:21 unmedia dnsmasq-dhcp[19079]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Apr 12 20:22:21 unmedia dnsmasq-dhcp[19079]: DHCP, sockets bound exclusively to interface virbr0
Apr 12 20:22:21 unmedia dnsmasq[19079]: reading /etc/resolv.conf
Apr 12 20:22:21 unmedia dnsmasq[19079]: using nameserver 192.168.10.1#53
Apr 12 20:22:21 unmedia dnsmasq[19079]: read /etc/hosts - 2 addresses
Apr 12 20:22:21 unmedia dnsmasq[19079]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Apr 12 20:22:21 unmedia dnsmasq-dhcp[19079]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Apr 12 20:22:21 unmedia kernel: virbr0: port 1(virbr0-nic) entered disabled state
Apr 12 20:38:02 unmedia sshd[24198]: Accepted none for root from 192.168.10.248 port 50879 ssh2
Apr 12 20:52:51 unmedia kernel: mdcmd (63): spindown 19
Apr 12 20:52:52 unmedia kernel: mdcmd (64): spindown 21
Apr 12 20:52:54 unmedia kernel: mdcmd (65): spindown 9
Apr 12 20:52:54 unmedia kernel: mdcmd (66): spindown 10
Apr 12 20:52:54 unmedia kernel: mdcmd (67): spindown 11
Apr 12 20:52:55 unmedia kernel: mdcmd (68): spindown 14
Apr 12 20:52:56 unmedia kernel: mdcmd (69): spindown 17
Apr 12 20:52:58 unmedia kernel: mdcmd (70): spindown 1
Apr 12 20:52:58 unmedia kernel: mdcmd (71): spindown 3
Apr 12 20:52:59 unmedia kernel: mdcmd (72): spindown 4
Apr 12 20:53:00 unmedia kernel: mdcmd (73): spindown 8
Apr 12 20:53:01 unmedia kernel: mdcmd (74): spindown 12
Apr 12 20:53:01 unmedia kernel: mdcmd (75): spindown 13
Apr 12 20:53:01 unmedia kernel: mdcmd (76): spindown 15
Apr 12 20:53:02 unmedia kernel: mdcmd (77): spindown 16
Apr 12 20:53:02 unmedia kernel: mdcmd (78): spindown 20
Apr 12 20:53:03 unmedia kernel: mdcmd (79): spindown 5
Apr 12 20:53:05 unmedia kernel: mdcmd (80): spindown 6
Apr 12 20:53:08 unmedia kernel: mdcmd (81): spindown 26
Apr 12 20:53:11 unmedia kernel: mdcmd (82): spindown 2
Apr 12 20:53:18 unmedia kernel: mdcmd (83): spindown 22
Apr 12 20:53:24 unmedia kernel: mdcmd (84): spindown 23
Apr 12 20:53:26 unmedia kernel: mdcmd (85): spindown 7
Apr 12 20:53:26 unmedia kernel: mdcmd (86): spindown 18
Apr 12 20:53:26 unmedia kernel: mdcmd (87): spindown 24
Apr 12 20:53:27 unmedia kernel: mdcmd (88): spindown 25
Apr 12 20:53:27 unmedia kernel: mdcmd (89): spindown 27
Apr 12 21:01:01 unmedia sshd[27286]: Accepted none for root from 192.168.10.248 port 51560 ssh2
Link to comment
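One way to chase this from a telnet session: the Linux load average counts tasks in uninterruptible sleep (state "D") as well as runnable ones, which is exactly the "high load, high %wa, zero iotop throughput" picture described above. Listing the D-state tasks names what is actually stuck. A sketch that reads /proc directly (run on the server itself; `tasks_in_d_state` is an illustrative helper, not an unRAID tool):

```python
import os

def tasks_in_d_state():
    """Return (pid, comm) for every task in uninterruptible sleep."""
    stuck = []
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % pid) as f:
                raw = f.read()
        except OSError:
            continue  # task exited while we were scanning
        # /proc/<pid>/stat is "pid (comm) state ...": comm can contain
        # spaces, so split on the last ')' instead of on whitespace.
        comm = raw[raw.index("(") + 1 : raw.rindex(")")]
        state = raw[raw.rindex(")") + 1 :].split()[0]
        if state == "D":
            stuck.append((int(pid), comm))
    return stuck

for pid, comm in tasks_in_d_state():
    print(pid, comm)
```

If the par2 and Plex processes (or a kworker) show up here every time the box wedges, that points at the filesystem or driver call they are blocked in rather than at raw disk throughput.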
RobJ Posted April 13, 2016 Share Posted April 13, 2016

Did a little more testing related to the high load/unresponsive server tonight. I re-enabled all the disabled dockers and queued up some downloads. I now have a par2 repair stuck in the download queue that puts enough I/O strain on the server to lock it up within 10 minutes of booting. I've rebooted 3 times to confirm that it locks up consistently.

...[snipped]...

tail -n 50 /var/log/syslog

Apr 12 20:22:19 unmedia rc.unRAID[18670][18674]: Processing /etc/rc.d/rc.unRAID.d/ start scripts.
Apr 12 20:22:20 unmedia kernel: device virbr0-nic entered promiscuous mode
Apr 12 20:22:21 unmedia avahi-daemon[12607]: Joining mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1.
Apr 12 20:22:21 unmedia avahi-daemon[12607]: New relevant interface virbr0.IPv4 for mDNS.
Apr 12 20:22:21 unmedia avahi-daemon[12607]: Registering new address record for 192.168.122.1 on virbr0.IPv4.
Apr 12 20:22:21 unmedia kernel: virbr0: port 1(virbr0-nic) entered listening state
Apr 12 20:22:21 unmedia kernel: virbr0: port 1(virbr0-nic) entered listening state
Apr 12 20:22:21 unmedia dnsmasq[19079]: started, version 2.75 cachesize 150
Apr 12 20:22:21 unmedia dnsmasq[19079]: compile time options: IPv6 GNU-getopt no-DBus i18n IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
Apr 12 20:22:21 unmedia dnsmasq-dhcp[19079]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Apr 12 20:22:21 unmedia dnsmasq-dhcp[19079]: DHCP, sockets bound exclusively to interface virbr0
Apr 12 20:22:21 unmedia dnsmasq[19079]: reading /etc/resolv.conf
Apr 12 20:22:21 unmedia dnsmasq[19079]: using nameserver 192.168.10.1#53
Apr 12 20:22:21 unmedia dnsmasq[19079]: read /etc/hosts - 2 addresses
Apr 12 20:22:21 unmedia dnsmasq[19079]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Apr 12 20:22:21 unmedia dnsmasq-dhcp[19079]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Apr 12 20:22:21 unmedia kernel: virbr0: port 1(virbr0-nic) entered disabled state
Apr 12 20:38:02 unmedia sshd[24198]: Accepted none for root from 192.168.10.248 port 50879 ssh2

I'm in no way an expert here, but what's curious is that the internal bridge is set up and then disabled. That would leave anything using it hanging, although it may not matter here, since nothing has had time to start using it. Some possible steps to narrow down what's happening: try taking dnsmasq out of the equation (I don't know if you can), try disabling avahi just to see if anything changes, and perhaps try running without the internal bridge at all. Link to comment
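The elimination steps RobJ suggests can be tried from the unRAID console. A rough sketch, not a definitive procedure: it assumes a standard libvirt install (where the `default` NAT network owns both virbr0 and its dnsmasq instance) and unRAID's Slackware-style rc script for avahi; paths and network names may differ on your build.

```shell
# Tear down the libvirt "default" NAT network. This removes virbr0 and
# kills the dnsmasq instance bound to it, taking both out of the picture.
virsh net-destroy default

# Keep the network from being recreated on the next libvirtd start.
virsh net-autostart default --disable

# Stop avahi separately to see if mDNS is a factor
# (path assumes unRAID's Slackware-style rc scripts).
/etc/rc.d/rc.avahidaemon stop
```

Reversing the test is `virsh net-start default` plus re-enabling autostart, so each variable can be toggled independently between reboots.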
Bjonness406 Posted April 13, 2016 Share Posted April 13, 2016 Is it possible to have turbo write on one share and not another? Would be awesome Not possible. Sent from my Nexus 6 using Tapatalk Guess it is time for a feature request then Thanks Link to comment
FrozenGamer Posted April 13, 2016 Share Posted April 13, 2016 What is the disadvantage of turbo write? It made a huge difference in my test array with 4 Seagate 8TB drives: sustained writes went from 16 MB/s to 40 MB/s while moving a terabyte of files. Link to comment
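For anyone wanting to experiment with the trade-off being discussed, turbo write (reconstruct write) can be toggled from the console on unRAID releases that expose the tunable. A sketch based on the 6.2-era `md_write_method` setting; the exact name and accepted values may differ on other versions:

```shell
# Enable reconstruct ("turbo") write: parity is recomputed by reading
# every data disk in parallel instead of the usual read-modify-write
# cycle. Faster, but it requires ALL array drives to be spun up, which
# is the main disadvantage (power use and no spun-down disks).
mdcmd set md_write_method 1

# Revert to the default read-modify-write method, which touches only
# the target disk and parity.
mdcmd set md_write_method 0

# Confirm the current setting.
mdcmd status | grep md_write_method
```

Since the setting is array-wide, this also explains the earlier answer: it cannot be applied to one share and not another.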
Recommended Posts