stottle




  1. The error turned out to be a port mismatch between the two routers (I had mixed up which port was internal and which was external). Also, to the earlier person who mentioned still getting "insecure" messages because staging was set to `true`: thanks, I hit that as well.
  2. I've followed Spaceinvaderone's video for setting up SWAG, but the docker container is giving an error:

         Requesting a certificate for <mySubDomain>.duckdns.org
         Certbot failed to authenticate some domains (authenticator: standalone). The Certificate Authority reported these problems:
           Domain: <mySubDomain>.duckdns.org
           Type: unauthorized
           Detail: Invalid response from http://<mySubDomain>.duckdns.org/.well-known/acme-challenge/U9o-N70woR3z5jnFl0cEVPWd711PJT8SAqRPiZLYAXc [<My IP>]: "<html>\r\n<head><title>404 Not Found</title></head>\r\n<body>\r\n<center><h1>404 Not Found</h1></center>\r\n<hr><center>nginx</center>\r\n"
         Hint: The Certificate Authority failed to download the challenge files from the temporary standalone webserver started by Certbot on port 80. Ensure that the listed domains point to this machine and that it can accept inbound connections from the internet.
         Some challenges have failed.

     I have two gateways, AT&T for the ISP and a Google WiFi mesh, but I believe I have the port forwarding correct, for two reasons:

       1) I can see my Plex server, so the two-hop forwarding to that container is working.
       2) I was getting timeout errors in the log, but those have now changed to this unauthorized/404 error.

     For SWAG, I have AT&T forwarding 80 and 443 directly (the only option I saw), and Google remapping those ports to 180 and 1443. SWAG is set up for 180 and 1443. I'm trying to get http auth working, as that seemed like the best place to start. I need to understand the other options better, too. Any tips for debugging?
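One way to sanity-check a two-hop forwarding setup like the one in this post is to write each router's table down and follow a port through both hops. The sketch below is only a paper check: the tables are hypothetical and mirror the numbers quoted in the post, `trace_port` is an illustrative helper, and nothing here talks to a real router or proves the ports are reachable from the internet.

```python
from typing import Dict, List, Optional


def trace_port(external_port: int, hops: List[Dict[int, int]]) -> Optional[int]:
    """Follow one inbound port through each router's forwarding table, in order.

    Returns the port that arrives at the final host, or None if some hop
    has no rule for the port it receives.
    """
    port = external_port
    for table in hops:
        if port not in table:
            return None
        port = table[port]
    return port


# Hypothetical tables that mirror the setup described in the post.
att_gateway = {80: 80, 443: 443}      # AT&T forwards 80 and 443 unchanged
google_wifi = {80: 180, 443: 1443}    # Google WiFi remaps to the container ports
swag_listens_on = {180, 1443}         # ports SWAG is configured for

for external in (80, 443):
    final = trace_port(external, [att_gateway, google_wifi])
    verdict = "ok" if final in swag_listens_on else "MISMATCH"
    print(f"external {external} -> {final}: {verdict}")
```

If an internal and external port are swapped on either hop, the trace returns `None` or lands on a port the container is not listening on, which is the kind of mismatch that turned out to be the cause here.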
  3. Thanks for all of the work here. I've got nextcloud/letsencrypt working with duckdns, which I wouldn't have tried without the support here and the tutorials. One annoyance: is there an easy way to get unmatched URLs (https://mydomain.duckdns.org/random_garbage) to return a 404 instead of the default "Welcome to our server" page? Google searches for 404 and "welcome to our server" don't help...
  4. Balance failed and there are other errors. Here's a snippet:

         Feb 15 18:30:52 Tower2 emhttp: shcmd (147): set -o pipefail ; /sbin/btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache |& logger &
         Feb 15 18:30:52 Tower2 emhttp: shcmd (148): sync
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1937353211904 flags 1
         Feb 15 18:30:52 Tower2 emhttp: shcmd (149): mkdir /mnt/user0
         Feb 15 18:30:52 Tower2 emhttp: shcmd (150): /usr/local/sbin/shfs /mnt/user0 -disks 62 -o noatime,big_writes,allow_other |& logger
         Feb 15 18:30:52 Tower2 emhttp: shcmd (151): mkdir /mnt/user
         Feb 15 18:30:52 Tower2 emhttp: shcmd (152): /usr/local/sbin/shfs /mnt/user -disks 63 2048000000 -o noatime,big_writes,allow_other -o remember=0 |& logger
         Feb 15 18:30:52 Tower2 emhttp: shcmd (153): cat - > /boot/config/plugins/dynamix/mover.cron <<< "# Generated mover schedule:#01240 3 * * * /usr/local/sbin/mover |& logger#012"
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1936279470080 flags 1
         Feb 15 18:30:52 Tower2 emhttp: shcmd (154): /usr/local/sbin/update_cron &> /dev/null
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1935205728256 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1934131986432 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1933058244608 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1931984502784 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1930910760960 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1929837019136 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1928763277312 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1927689535488 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1926615793664 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1925542051840 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1924468310016 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1923394568192 flags 1
         Feb 15 18:30:52 Tower2 emhttp: Starting services...
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1922320826368 flags 1
         Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1921247084544 flags 1
         Feb 15 18:30:52 Tower2 kernel: ata8.00: exception Emask 0x10 SAct 0x70000 SErr 0x400000 action 0x6 frozen
         Feb 15 18:30:52 Tower2 kernel: ata8.00: irq_stat 0x08000000, interface fatal error

     Diagnostics attached: tower2-diagnostics-20170216-1754.zip
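The two `ata8.00` lines at the end of that snippet are easy to miss among the block-group relocation messages. A minimal sketch for pulling kernel ATA messages out of a syslog excerpt follows; the regular expression is an assumption based only on the line format quoted in this post, and `ata_events` is an illustrative name, not an existing tool.

```python
import re
from typing import Iterable, List

# Assumed shape, taken from the quoted log: "... kernel: ata8.00: <message>"
ATA_LINE = re.compile(r"kernel: ata\d+(?:\.\d+)?: ")


def ata_events(lines: Iterable[str]) -> List[str]:
    """Return only the log lines that look like kernel ATA link/device messages."""
    return [line.rstrip("\n") for line in lines if ATA_LINE.search(line)]


# Three lines copied from the snippet in the post.
sample = [
    "Feb 15 18:30:52 Tower2 kernel: BTRFS info (device sdb1): relocating block group 1921247084544 flags 1",
    "Feb 15 18:30:52 Tower2 kernel: ata8.00: exception Emask 0x10 SAct 0x70000 SErr 0x400000 action 0x6 frozen",
    "Feb 15 18:30:52 Tower2 kernel: ata8.00: irq_stat 0x08000000, interface fatal error",
]

for event in ata_events(sample):
    print(event)
```

Run against a full syslog (for example by passing an open file object), this keeps only the ATA lines, which makes it easier to see whether the interface errors line up in time with the btrfs failures.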
  5. Something seems to be wrong. The GUI is running very slowly (several seconds to load each page), and I've clicked "Balance" twice without any change on the screen. That is, it still looks like the image I sent previously and doesn't say that a balance operation is running. I'm seeing the following repeated in the logs:

         Feb 15 18:57:41 Tower2 root: ERROR: unable to resize '/var/lib/docker': Read-only file system
         Feb 15 18:57:41 Tower2 root: Resize '/var/lib/docker' of 'max'
         Feb 15 18:57:41 Tower2 emhttp: shcmd (461): /etc/rc.d/rc.docker start |& logger
         Feb 15 18:57:41 Tower2 root: starting docker ...
         Feb 15 18:57:51 Tower2 emhttp: shcmd (463): umount /var/lib/docker |& logger
         Feb 15 18:58:09 Tower2 php: /usr/local/emhttp/plugins/dynamix/scripts/btrfs_balance 'start' '/mnt/cache' '-dconvert=raid1 -mconvert=raid1'
         Feb 15 18:58:09 Tower2 emhttp: shcmd (475): set -o pipefail ; /usr/local/sbin/mount_image '/mnt/cache/docker.img' /var/lib/docker 20 |& logger
         Feb 15 18:58:09 Tower2 root: truncate: cannot open '/mnt/cache/docker.img' for writing: Read-only file system
         Feb 15 18:58:09 Tower2 kernel: BTRFS info (device loop1): disk space caching is enabled
         Feb 15 18:58:09 Tower2 kernel: BTRFS info (device loop1): has skinny extents
  6. It isn't looking like a balance started automatically. Under Main->Cache Devices, both SSDs were listed. When I ran blkdiscard and refreshed, Cache 2's icon turned from green to blue, so I started the array. Now the cache details look like the attached image, with no balance seeming to be running. Diagnostics attached as well. tower2-diagnostics-20170215-1843.zip
  7. @johnnie.black - thanks for your help and patience. I've updated to 6.3.1, powered down, disconnected cache2 and restarted. Then I started the array. At that point, cache mounts to /mnt/cache. So far so good. Should I just add the 2nd drive and then balance, or do you have other suggestions in this case? FYI:

         root@Tower2:~# btrfs dev stats /mnt/cache
         [devid:1].write_io_errs 441396
         [devid:1].read_io_errs 407459
         [devid:1].flush_io_errs 2047
         [devid:1].corruption_errs 0
         [devid:1].generation_errs 0
         [/dev/sdb1].write_io_errs 0
         [/dev/sdb1].read_io_errs 0
         [/dev/sdb1].flush_io_errs 0
         [/dev/sdb1].corruption_errs 0
         [/dev/sdb1].generation_errs 0

     I assume that the errors are leftovers, as I didn't zero the stats before I removed the 2nd cache drive. NB: the serial matches cache1, the drive that wasn't seeing errors (S21HNXAGC11924P).
  8. <swearing> So I tried changing the SATA port for the problematic drive on my mobo. Array and cache drives looked ok on reboot, so I tried stats and got

         root@Tower2:~# btrfs dev stats /mnt/cache
         ERROR: cannot check /mnt/cache: No such file or directory
         ERROR: '/mnt/cache' is not a mounted btrfs device

     Hmm, ok. Maybe the cache isn't available until I start the array, so I started the array. It now says

         unmountable disk present
         Cache • Samsung_SSD_850_EVO_500GB_S21HNXAGC11924P (sdb)

     I immediately turned off the array. Apologies, but I don't want to touch anything until I get some advice. Diagnostics attached. Main->Cache Devices shows both drives. How do I recover the cache pool? Note, it is the good drive that isn't being recognized. tower2-diagnostics-20170213-1908.zip
  9. Not sure if this is progress or not. The above error looked like a passthrough issue, so I opened the edit window and changed the soundcard from the nvidia GPU to "None". So it's VNC instead of the GPU for audio/video, and I removed the passthrough of my PCIe USB controller. With these changes, the VM will actually start, but it goes immediately into a BSOD (Windows ran into a problem) in the VNC window. I would have thought Windows would have all the necessary drivers for VNC, so I'm not sure what the problem is here. Help? And to make matters worse, sdh is already showing new errors after the count was zeroed (earlier runs showed all zero):

         root@Tower2:~# btrfs dev stats /mnt/cache
         [/dev/sdh1].write_io_errs 441286
         [/dev/sdh1].read_io_errs 407435
         [/dev/sdh1].flush_io_errs 2040
         [/dev/sdh1].corruption_errs 0
         [/dev/sdh1].generation_errs 0
         [/dev/sdg1].write_io_errs 0
         [/dev/sdg1].read_io_errs 0
         [/dev/sdg1].flush_io_errs 0
         [/dev/sdg1].corruption_errs 0
         [/dev/sdg1].generation_errs 0
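Since `btrfs dev stats` output gets compared by eye several times in this thread, here is a small sketch that parses it and lists which devices have any non-zero counter. It assumes only the `[device].counter value` line format shown in these posts; `parse_dev_stats` and `devices_with_errors` are illustrative names, and the sample text is the output quoted in the post above.

```python
import re
from collections import defaultdict
from typing import Dict, List

# One counter per line, e.g. "[/dev/sdh1].write_io_errs 441286"
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")


def parse_dev_stats(text: str) -> Dict[str, Dict[str, int]]:
    """Turn `btrfs dev stats` output into {device: {counter: value}}."""
    stats: Dict[str, Dict[str, int]] = defaultdict(dict)
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:  # lines that do not look like counters (prompts, blanks) are skipped
            stats[match["dev"]][match["counter"]] = int(match["value"])
    return dict(stats)


def devices_with_errors(stats: Dict[str, Dict[str, int]]) -> List[str]:
    """Devices for which at least one counter is non-zero."""
    return sorted(dev for dev, counters in stats.items() if any(counters.values()))


SAMPLE = """\
[/dev/sdh1].write_io_errs 441286
[/dev/sdh1].read_io_errs 407435
[/dev/sdh1].flush_io_errs 2040
[/dev/sdh1].corruption_errs 0
[/dev/sdh1].generation_errs 0
[/dev/sdg1].write_io_errs 0
[/dev/sdg1].read_io_errs 0
[/dev/sdg1].flush_io_errs 0
[/dev/sdg1].corruption_errs 0
[/dev/sdg1].generation_errs 0
"""

print(devices_with_errors(parse_dev_stats(SAMPLE)))
```

Saving the output at two points in time and comparing the parsed dictionaries would show whether counters are still climbing or are leftovers from before the stats were zeroed.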
  10. Ok, powered off and replaced the SATA cables. I had tried starting the VM after running scrub with corrections enabled, but received the same error. Now, after powering back on after swapping cables, I'm getting a new message:

         Execution error
         internal error: qemu unexpectedly closed the monitor:
         2017-02-13T00:07:35.264400Z qemu-system-x86_64: -device vfio-pci,host=01:00.1,id=hostdev0,bus=pci.0,addr=0x6: vfio: error, group 1 is not viable, please ensure all devices within the iommu_group are bound to their vfio bus driver.
         2017-02-13T00:07:35.264413Z qemu-system-x86_64: -device vfio-pci,host=01:00.1,id=hostdev0,bus=pci.0,addr=0x6: vfio: failed to get group 1
         2017-02-13T00:07:35.264421Z qemu-system-x86_64: -device vfio-pci,host=01:00.1,id=hostdev0,bus=pci.0,addr=0x6: Device initialization failed
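The "group 1 is not viable" message asks that every device in the IOMMU group be bound to a vfio driver, so a first debugging step is to see what else shares the group with 01:00.1. The sketch below reads the Linux sysfs layout (`/sys/kernel/iommu_groups/<n>/devices/` and the `driver` symlink under `/sys/bus/pci/devices/<address>/`). The function names are illustrative, and it has not been run on this particular server.

```python
import os
from pathlib import Path
from typing import Dict, List, Optional


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> Dict[int, List[str]]:
    """Map each IOMMU group number to the PCI addresses it contains."""
    groups: Dict[int, List[str]] = {}
    base = Path(root)
    if not base.is_dir():  # IOMMU disabled or not a Linux host
        return groups
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if group_dir.name.isdigit() and devices.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups


def bound_driver(pci_address: str, root: str = "/sys/bus/pci/devices") -> Optional[str]:
    """Name of the kernel driver bound to a PCI device, or None if unbound."""
    link = Path(root) / pci_address / "driver"
    if not link.is_symlink():
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    for group, devices in sorted(iommu_groups().items()):
        for address in devices:
            print(f"group {group}: {address} driver={bound_driver(address)}")
```

If group 1 lists both 0000:01:00.0 and 0000:01:00.1 and only one of them shows a vfio driver, that would be consistent with the error, though other causes are possible.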
  11. Hmm, I turned off all running dockers (array is still running, but this is the cache drive) and tried running a 2nd readonly scrub. I was curious how repeatable it was. It actually has a few FEWER errors.

         root@Tower2:/# btrfs scrub start -rdB /mnt/cache > /boot/logs/scrub_cache2.log
         root@Tower2:/# cat /boot/logs/scrub_cache2.log
         scrub device /dev/sdh1 (id 1) done
                 scrub started at Sun Feb 12 18:00:23 2017 and finished after 00:06:36
                 total bytes scrubbed: 75.84GiB with 175178 errors
                 error details: verify=679 csum=174499
                 corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
         scrub device /dev/sdg1 (id 2) done
                 scrub started at Sun Feb 12 18:00:23 2017 and finished after 00:06:33
                 total bytes scrubbed: 75.84GiB with 0 errors
         root@Tower2:/# btrfs dev stats /mnt/cache
         [/dev/sdh1].write_io_errs 181614787
         [/dev/sdh1].read_io_errs 147213104
         [/dev/sdh1].flush_io_errs 3842528
         [/dev/sdh1].corruption_errs 349010
         [/dev/sdh1].generation_errs 1493
         [/dev/sdg1].write_io_errs 0
         [/dev/sdg1].read_io_errs 0
         [/dev/sdg1].flush_io_errs 0
         [/dev/sdg1].corruption_errs 0
         [/dev/sdg1].generation_errs 0

     Diagnostics also attached. tower2-diagnostics-20170212-1810.zip
  12. Thanks for the help. Any suggestions for determining what is causing the errors?
  13. I'm trying to see if there is something else that might be causing the problem. I'm running a btrfs raid1 cache, but get the following:

         root@Tower2:/# btrfs scrub start -rdB /mnt/cache > /boot/logs/scrub_cache.log
         root@Tower2:/# vi /boot/logs/scrub_cache.log
         reading /boot/logs/scrub_cache.log
         Read /boot/logs/scrub_cache.log, 8 lines, 416 chars
         scrub device /dev/sdh1 (id 1) done
                 scrub started at Sun Feb 12 17:23:16 2017 and finished after 00:06:38
                 total bytes scrubbed: 75.88GiB with 175313 errors
                 error details: verify=814 csum=174499
                 corrected errors: 0, uncorrectable errors: 0, unverified errors: 0
         scrub device /dev/sdg1 (id 2) done
                 scrub started at Sun Feb 12 17:23:16 2017 and finished after 00:06:38
                 total bytes scrubbed: 75.88GiB with 0 errors

     I assume errors in the VM image could cause the issue I am seeing. Since one disk has errors but the 2nd one doesn't, I assume I can run scrub again without the readonly flag? Any risk in doing this? Would I be better off reverting from 6.3 back to 6.2.4? Thanks
  14. Already tried that; it didn't help. The original xml was from before I tried the steps listed in the release notes. I tried those steps, with no luck, then tried disabling all passthrough devices as well. Same error message. My current xml is:

         <domain type='kvm'>
           <name>Win10</name>
           <uuid>449c8082-8631-ef95-bd97-1bdad139ddc7</uuid>
           <description>Windows 10</description>
           <metadata>
             <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
           </metadata>
           <memory unit='KiB'>8388608</memory>
           <currentMemory unit='KiB'>8388608</currentMemory>
           <memoryBacking>
             <nosharepages/>
           </memoryBacking>
           <vcpu placement='static'>1</vcpu>
           <cputune>
             <vcpupin vcpu='0' cpuset='0'/>
           </cputune>
           <os>
             <type arch='x86_64' machine='pc-i440fx-2.7'>hvm</type>
             <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
             <nvram>/etc/libvirt/qemu/nvram/449c8082-8631-ef95-bd97-1bdad139ddc7_VARS-pure-efi.fd</nvram>
           </os>
           <features>
             <acpi/>
             <apic/>
           </features>
           <cpu mode='host-passthrough'>
             <topology sockets='1' cores='1' threads='1'/>
           </cpu>
           <clock offset='localtime'>
             <timer name='rtc' tickpolicy='catchup'/>
             <timer name='pit' tickpolicy='delay'/>
             <timer name='hpet' present='no'/>
           </clock>
           <on_poweroff>destroy</on_poweroff>
           <on_reboot>restart</on_reboot>
           <on_crash>restart</on_crash>
           <devices>
             <emulator>/usr/local/sbin/qemu</emulator>
             <disk type='file' device='disk'>
               <driver name='qemu' type='raw' cache='writeback'/>
               <source file='/mnt/cache/VM/Win10/vdisk1.img'/>
               <target dev='hdc' bus='virtio'/>
               <boot order='1'/>
               <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
             </disk>
             <controller type='usb' index='0' model='nec-xhci'>
               <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
             </controller>
             <controller type='pci' index='0' model='pci-root'/>
             <controller type='virtio-serial' index='0'>
               <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
             </controller>
             <interface type='bridge'>
               <mac address='52:54:00:6b:d2:ee'/>
               <source bridge='br0'/>
               <model type='virtio'/>
               <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
             </interface>
             <serial type='pty'>
               <target port='0'/>
             </serial>
             <console type='pty'>
               <target type='serial' port='0'/>
             </console>
             <channel type='unix'>
               <target type='virtio' name='org.qemu.guest_agent.0'/>
               <address type='virtio-serial' controller='0' bus='0' port='1'/>
             </channel>
             <input type='tablet' bus='usb'>
               <address type='usb' bus='0' port='1'/>
             </input>
             <input type='mouse' bus='ps2'/>
             <input type='keyboard' bus='ps2'/>
             <graphics type='vnc' port='-1' autoport='yes' websocket='-1' listen='0.0.0.0' keymap='en-us'>
               <listen type='address' address='0.0.0.0'/>
             </graphics>
             <video>
               <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
               <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
             </video>
             <hostdev mode='subsystem' type='pci' managed='yes'>
               <driver name='vfio'/>
               <source>
                 <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
               </source>
               <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
             </hostdev>
             <memballoon model='virtio'>
               <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
             </memballoon>
           </devices>
         </domain>
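The XML above still contains one `<hostdev>` entry even though the post mentions disabling passthrough. A quick way to list which host PCI devices a libvirt domain definition passes through is sketched below using Python's standard `xml.etree.ElementTree`. The function name is illustrative and the sample is a trimmed copy of the XML in the post, keeping only the `<hostdev>` element.

```python
import xml.etree.ElementTree as ET
from typing import List


def passthrough_pci_addresses(domain_xml: str) -> List[str]:
    """Return host PCI addresses (dddd:bb:ss.f) of every <hostdev type='pci'>."""
    root = ET.fromstring(domain_xml)
    found: List[str] = []
    for hostdev in root.findall("./devices/hostdev[@type='pci']"):
        address = hostdev.find("./source/address")
        if address is None:
            continue
        # Attribute values are hex strings such as '0x01'; int(..., 16) accepts the 0x prefix.
        domain, bus, slot, function = (
            int(address.get(key, "0"), 16) for key in ("domain", "bus", "slot", "function")
        )
        found.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return found


# Trimmed copy of the domain XML from the post (only the hostdev is kept).
SAMPLE_XML = """
<domain type='kvm'>
  <name>Win10</name>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
  </devices>
</domain>
"""

print(passthrough_pci_addresses(SAMPLE_XML))
```

For the XML in this post the only address found is the one at bus 0x01, slot 0x00, function 0x1, which is the same `host=01:00.1` device named in the vfio error quoted in post 10.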
  15. Note: this isn't the "Trying to start my VM gives an 'Invalid Machine Type' error" issue noted in the release notes. I've tried the suggestions listed there and they have no effect. Any other suggestions?