dAigo

Members
  • Posts: 108

Everything posted by dAigo

  1. I am using it and have no issues, from inside or outside my network. I have made NO changes to the OwnCloud docker, so it still runs on port 443 with whatever cert they are using. The docker itself runs on port 8000 and is proxied through nginx like any other docker. No issues so far. site-conf/default:
     server {
         listen 443 ssl;
         server_name owncloud.domain.com;
         location / {
             include /config/nginx/proxy.conf;
             proxy_pass https://<unraid_ip>:<oc-docker Port>/;
         }
     }
  2. I guess they are trying ... But because it's too much for LT alone to manage, they need to take the community approach. So spread the word and post the stuff you found useful or that helped with an issue HERE. That being said, I already found something interesting that may be a useful tweak in terms of disk performance
  3. Depending on your definition of "noticeable" it works for me. But unless you take advantage of the features that kvm brings (caching & sharing resources) it can't be faster than bare-metal, unless (like in your case) it is compared to other/older hardware. In addition to the disk-cache options I posted, I would suggest experimenting with CPU pinning like dlandon did HERE (a small sketch follows below). Keeping cpu cores "clean" should result in fewer context switches and with that reduce latency. And keeping threads on corresponding cores will speed things up due to shared cpu cache. I only tested it once, but yes, performance will be "near" bare-metal (like GPU -> 95-99%). But you have to consider that not every operating system supports booting/installing on an nvme disk. And you need to be careful with your boot order, because it happened to me once that my system booted from the OS of a VM instead of the USB stick (after a BIOS update). I had no issues with that VM afterwards, but who knows what another OS might do.
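     Roughly how pinning can be tried on a running VM with virsh (the VM name and core numbers are placeholders, not dlandon's exact setup):
       # pin vCPU 0 and 1 of the guest to host cores 2 and 3
       virsh vcpupin Windows10 0 2
       virsh vcpupin Windows10 1 3
       # keep the qemu emulator threads off those cores
       virsh emulatorpin Windows10 0-1
       # check the resulting pinning
       virsh vcpupin Windows10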
  4. You can change the amount of caching by editing your vm.xml. In your disk settings, you have an option "cache=writeback". To avoid any caching through qemu, you should go with "directsync". Other options are explained HERE. To completely bypass any cache, you would still need to disable caching IN your VM or use software that does not use cache for benchmarks. Otherwise even passthrough would not bypass the caching in the VM, just outside. A good start to make sure there is not much caching going on is to run very long benchmarks (hours) with very large amounts of random data (multiple times the amount of RAM; 10GB+). That way it's very unlikely to hit any cached data. To achieve that, I would recommend IOMETER, you can test all sorts of possible workloads. But on the other side, 50-70% of bare-metal is basically what you can expect of kvm (depending on the workload). HERE is a comparison between bare-metal, vmware and kvm (from 2013). So don't blame unraid or LT, it's a trade-off for a very cheap, highly compatible and very versatile system. There is always room for optimization, like caching. I recommend reading THIS page and, more importantly, all the linked sources. It explains very well where the problem is and how it "could" be solved. The main problem is the latency that comes with adding layers of virtualization to support almost every hardware. They are implementing methods to make better use of cpu power / threads, but it's not yet there.
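     If you would rather stay on the command line than use IOMETER, a rough dd variant of the same idea could look like this (sizes and the test-file path are just placeholders, pick something well above your host's RAM):
       # write random data with direct I/O, so the guest's page cache can't absorb it
       dd if=/dev/urandom of=/root/testfile bs=1M count=20480 oflag=direct
       # read it back with direct I/O as well
       dd if=/root/testfile of=/dev/null bs=1M iflag=direct
       rm -f /root/testfile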
  5. Ah, I see where I was wrong. We tried other tunables during the first tests and I just assumed you asked me to retry one of them after getting better results with ReiserFS for the Linux VM. But it seems we actually did not try num_stripes before... I'll test it later today, I still have enough xfs disks in the array so it's no big deal. Indeed, #1 seems to do the job, #2 & #3 are not needed, sorry for the confusion. And yes, with XFS transfer speed is back to normal.
  6. Ah, I see where I was wrong. We tried other tunables during the first tests and I just assumed you asked me to retry one of them after getting better results with ReiserFS for the Linux VM. But it seems we actually did not try num_stripes before... I'll test it later today, I still have enough xfs disks in the array so it's no big deal.
  7. Yes, running a VM with any disk on the array would result in a completely unresponsive array. So everything that is related to the array breaks. But after countless tries and a ton of support from Eric, I was able to find a workaround that worked for me.
     1) Tunable "md_num_stripes" = 8192 (or higher)
     2) Set the "cache" mode of the vDisk that is on the array to "directsync" (see the sketch below)
     3) Change the filesystem of the disk in the array to ReiserFS (XFS & BTRFS did not work)
     Since I have been running these settings (~4 weeks) I have not had a single crash or other issue to report. But it's a rather drastic workaround and both 2) and 3) definitely do reduce disk speed in my case. Apart from that, you could also go back to 6.1 or move your vDisks from the array to a disk that is not part of the array (like the cache disk)
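     A sketch of how the cache mode can be checked and changed from the command line (the VM name is a placeholder; the GUI editor works too):
       # show the current cache setting of the vDisk's driver line
       virsh dumpxml MyVM | grep "cache="
       # then change cache='writeback' to cache='directsync' in the disk's <driver> element
       virsh edit MyVM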
  8. Well, it's not "too fast" or "too slow", it's just the whole system from the view of a VM. 2.5/1.2 are the theoretical/tested values in a "normal" system. Which means the driver of the OS can access your pci-nvme device directly and that device is directly connected to the cpu with at least 4x PCIe3.0 lanes. In that system, latency and bandwidth would usually not be reduced or enhanced, so the measured results would be your disk. Changing anything will end up with other results. Adding multiple virtualized layers (vDisks, vCPUs, vMEM, etc.) will normally increase latency and, depending on the system, reduce bandwidth. In that case, you would end up with ~50% to 60% of the raw power the disk alone would be capable of. As you can see in the "too low" values. On the other hand, virtualization can add a lot of performance. For example, changing the vDisk's cache mode will result in the "too fast" values. The VM thinks that a block was written, but the VM host actually just put it in its RAM and writes it at a later time. So your VM can write to its disk as fast as the host can put it in its RAM... Great to move big chunks of non-critical data around, for example while doing a backup or installing a new system. That is obviously very unsafe in case of a power failure (or similar errors), which would result in data loss and a potentially corrupt filesystem. If you want what you paid for (~2.4/1.2), you need to go with pci-passthrough. The overhead is very limited and you would achieve 95%+ of your nvme disk. But the disk could only be used by one VM... Outside of any VM (as a cache device or "unassigned disk") it would count as a default config and should give you ~2.4/1.2. As far as I know, VMware is currently the best vendor in keeping that overhead to a minimum, so you could achieve maybe 70%-80% on an ESXi host, but that comes at the cost of compatibility, consumer hardware won't get you far in VMware. Qemu/KVM is getting there and LimeTech is doing a good job of optimizing performance/compatibility/usability. My personal opinion is, 60%-70% of the raw speed of NVMe is still more than enough for any workload and I can share it with many VMs and even unRAID itself. It's still faster than any SATA-SSD would be at 100%. I hope that answers your question, if not, be more specific
  9. It's hard to tell if virtualization is cause or effect of the issue. I think there are some that don't use VMs, or at least do not have them on the array. It may just be that the VM workload is an easy way to find issues that came with introducing smb3 or dual parity. And in my case, it's "just" the array that stops working... anything that has nothing to do with the array works fine. So judging from my beta experience, everything is working, but the NAS part may have some unidentified issues under certain workloads... For the next beta version, Soon™ is okay. For the final release, I prefer when it's done™... it indicates that the release date is more based on the state of the product. Now ?-------------- Very Soon -------- Soon™ -------- Soon-ish(er) ------- Soon-ish --------? End of Time | when it's done™
  10. The Windows VM crashed during the backup this night. :'( So I guess ReiserFS does not "fix" the issue, but delays it or softens it (at least in my case). Or it plays better with linux VMs, investigation ongoing -.- Played around with some settings (thanks to Eric!) and found a working combination for my Windows VM. As I mentioned, going "back" to ReiserFS worked for the Linux VM, but that alone was not a fix for my Windows VM. I changed the cache mode of the vDisk that is placed on the array to 'directsync' (like the Ubuntu VM), but that alone did not help. After changing the Tunable (md_num_stripes) to 20480, my Windows VM has no issues at all. I tried those settings with XFS, but it did not help. ReiserFS + 'directsync' + Tunable is a working solution for me. Normal usage, nightly backups, everything works right now. But performance while copying vDisks from that disk to the cache disk (sequential read) went from ~450MB/s on XFS to ~350MB/s on BTRFS to now "only" ~280MB/s on ReiserFS. The performance hit seems rather big, don't know if it's related. It may have other reasons (trim, etc.) While that's no fix and even a debatable workaround, I hope it helps LimeTech/Tom to find the underlying issue. For those that don't want to go back to 6.1.9, but need a quick&dirty fix, you may try these settings and see if it helps.
  11. Well, I think you are doing something wrong. But without any more information (xml config, screenshots, etc.), I could only guess... Are you using the default "Server2016" template that comes with unraid? Can you confirm that your vDisk is at least 30G? Did you select the "viostor->w2kr2->amd64" folder when you were loading the driver? I had no trouble in 6.1.X with an OVMF virtual machine and on the 6.2 beta it runs just as smooth. Installed Prev. 4 yesterday and Prev. 5 today.
  12. Server 2016 would be the Win10 kernel, and Win10 VMs were working with the Win8.1 drivers. I had no issues with the 2012R2 drivers back in v109, so I would use the latest stable/beta version of the 2012R2 drivers.
  13. If it's 6.2 related, increasing the image does not seem like a reasonable solution. Did you try to find out which container consumes the space, to see if it's a beta-related issue? If not, I would suggest installing "cAdvisor" from Community Applications and then having a look into the resource monitor of Community Applications.
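      A quick way to narrow it down from the shell, if you prefer that over cAdvisor (assuming the docker.img is loop-mounted at /var/lib/docker, which is how I understand unRAID does it):
        # list all containers with the size of their writable layer
        docker ps -a --size
        # see which container directories (logs, metadata) take up space inside the image
        du -sh /var/lib/docker/containers/*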
  14. The Windows VM crashed during the backup this night. :'( So I guess ReiserFS does not "fix" the issue, but delays it or softens it (at least in my case). Or it plays better with linux VMs, investigation ongoing -.-
  15. Ok, I had some kind of breakthrough... Testing the other filesystems was on the list and I finally had some spare time to move the files around. Switching from XFS to BTRFS didn't change anything, but after going to ReiserFS the mentioned Ubuntu VM managed to not just complete the first part of the script, but the whole thing. Even with very decent performance, considering the parity and qemu overhead... I need to test more, but until now that script NEVER even once completed... Btw, I forgot to mention that I changed the vDisk from "cache=writeback" to "cache=directsync" in all the tests, because it seemed to be a disk issue, not a cache issue. So if anyone tried to reproduce this, the result may be different without "cache=directsync", sorry!
  16. Good point, I hadn't tried yet. I can confirm they do not work with the setup I mentioned above. Despite the documentation, adding this didn't help: proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "upgrade"; Has anyone else been able to get this working? From what I understand, you need to specify the correct port of your application in the proxy_pass. But it may not even be websockets, it just makes sense from what I see and read. For example, VNC runs on ports from 5900 upwards (random each time you start the vm, unless you set it in the vm config). I do not know how the "log" part works, but I think you could look through the source code of the unraid gui... The little "pop-ups" when you pull a docker, upgrade a plugin and so on are also not working, because they are just a tail of the console. A decent way to start would be HERE. *EDIT: Btw, this topic is almost 2 years old, and it seems aptalca even gave his opinion on it (use vpn...) But that was before LetsEncrypt.
  17. All my filesystems are XFS (cache & array). I found a very easy and fast way to reproduce the issue. I installed an Ubuntu VM (15.04, desktop, default settings) and created a script that runs some dd commands. The script contains variants up to 8G and 1000000 count (for I/O stress). When I place the vDisk on a disk in the array and run it, the first line dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=direct does not even succeed. Everything hangs after 5s - 15s. It works perfectly fine while being placed on the cache (850MB/s and more on NVMe cache). I can now test the issue without putting my Windows VMs in danger and I can rule out a Windows issue. Maybe other people want to try that and see if their system keeps running or hangs like mine.
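      Roughly what the script does, for anyone who wants to reproduce it (exact sizes differ, and the 4k block size for the small-write line is just my choice here):
        #!/bin/bash
        # escalating direct-I/O writes; on my system the very first 1G write already hangs
        for count in 1 2 4 8; do
            dd if=/dev/zero of=/root/testfile bs=1G count=$count oflag=direct
        done
        # many small direct writes for I/O stress
        dd if=/dev/zero of=/root/testfile bs=4k count=1000000 oflag=direct
        rm -f /root/testfile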
  18. First of all, aptalca's example is already very good, to want more is more like a sport... There are a few things at some points in the config, and 100/100/100/100 needs certificate pinning, which is overkill to be honest. I see it more from an educational view, so it's fun and you learn a lot.
      a) The following line blacklists all "bad" ciphers. It looks cleaner, but unless you know what's left, you have no idea what cipher you are using (I don't, but if it's all 100% it's safe I guess ^^)
         ssl_ciphers HIGH:!aNULL:!MD5:!3DES:!CAMELLIA:!AES128;
      b) You also need 384bit+ ECDH
         ssl_ecdh_curve secp384r1;
      c) Certificate and Public Key Pinning. TL;DR: you put the signed fingerprint of at least TWO certs into your site, so even if a cert contains the right domains AND is generally trustworthy, the browser won't accept it if the hash is not found in the header. As of now, I think only Chrome actually checks it.
         add_header Public-Key-Pins 'pin-sha256="79xH6q4gKV/LQ1LPmI67/hkuNxMB+J6YbCQJWrdavOw="; pin-sha256="Ox0UeNDINSfUUFCkKw0vvnYprdL/Ep7bzb8wDl36ByA="; max-age=15768000; includeSubdomains';
      So I have 2 certificates for my domain, one is LIVE, the other is offline. Chrome has a store per domain for these hashes and it only gets refreshed once the site correctly loads, so you can't just renew the cert, because a client that was on your site before that renewal won't accept the new cert, because the hash was not available last time... Before removing the old cert, ALL your clients need to visit your site to get the new one. That may be a problem with the 90-day certs... You could use the hash from the LE cert which lives longer, but that would mean every cert that comes from LE and has your domain is valid, which counters the idea of pinning... This container can't and probably won't for a long time (if ever) be able to keep 2 (or more) separate certs for the same domain, live and offline, and rotate their hashes into your proxy.conf every ~45 days (or less)... While it definitely COULD be done, not even I dare to propose such a feature to aptalca (no sane person would like to support such a mechanism and only Chrome actually cares, so it does not add much)
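      For reference, a sketch of how such a pin-sha256 value can be generated with openssl (the cert path is just a placeholder):
        # hash the certificate's public key and base64-encode it for the Public-Key-Pins header
        openssl x509 -in your-cert.pem -pubkey -noout \
          | openssl pkey -pubin -outform der \
          | openssl dgst -sha256 -binary \
          | openssl enc -base64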
  19. There are a couple of interesting bits to get it working:
      1. It needs its own url, such as unraid.yourdomain.url (the unraid interface does not run in a directory, such as yourdomain.url/unraid )
      2. You don't want to expose unraid's authentication to the internet, let nginx handle the authentication instead (that way fail2ban can lock out failed authentication attempts). That is a little tricky, because that means nginx has to know how to log in to unraid. This article explains the concept: http://shairosenfeld.blogspot.com/2011/03/authorization-header-in-nginx-for.html
      I believe some parts of the unraid gui also use websockets (tailing LOGs, VNC console). Websockets through nginx is possible, but another tricky thing to do. https://www.nginx.com/blog/websocket-nginx/ I could be wrong though, at least those two things won't work for me with a standard reverse proxy. And to your question of what application you may want to put behind the proxy instead of using vpn: in my case, I count how many different users/devices are accessing the app. It's easier to provide https access than to install 10+ vpn clients for different users. But since I am the only one accessing unraid and do so only from 1-2 devices, putting the unraid gui behind the proxy (with all its features) seems overkill. But keep in mind, you could use the LE certs for a web VPN. Not for the auth, but it should work for the encryption. (as seen HERE) Haven't found a vpn solution that can import the cert at intervals though.
  20. Due to other issues, I did not have the time yet to do performance tests. As of now, there is no reason to believe these other issues are nvme related, so I can at least say it's running stable. And I can see/feel no difference from 6.1.9, where I had to use it outside of the array. I don't know if there is any use-case in unRAID that would make you see or feel any difference between AHCI M.2 and NVMe M.2. At least through kvm, the latency gain won't be as high as bare-metal. I placed my old sata ssd in the array and use the 750 as cache. Write speed to the sata ssd is limited by the parity disk, but I enjoy a very fast, array-protected read speed for my media stuff (plex is very fast now) and some rarely used VMs boot quite fast in case I do need them.
  21. Yep, and I am glad for the tip, but that plug-in only works for /mnt/cache and it does what I wrote. Dynamix SSD TRIM: "Dynamix SSD trim allows you to create a cronjob to do regular SSD TRIM operations on the cache device(s). The command 'fstrim -v /mnt/cache' is executed at a given interval." So, if trim is a solution for anybody with these kinds of issues, there should be trim support for ssds in the array. But that may be more of a feature request than a bug report. Benefits of sata ssds in the array are limited, I know, but a sata ssd is "too slow" to pool it with a nvme disk in the cache. And having a parity-protected ssd for read-heavy workloads (big mp3 / picture collections) is really nice. Then again, trim is not really needed on read-heavy workloads... But if it prevents server lockups, it's kind of useful. All my disks are xfs, maybe I'll switch the ssd to btrfs and see if anything changes.
  22. I tried to trim the ssd I moved to the array. From what I read, it should be "fstrim [options] device-mountpoint", so "fstrim /mnt/disk5" in my case. I got an error that said something like "discard not supported". Tried the same command on "/mnt/cache" and it worked. It seems that trim on xfs filesystems (which all of my disks are) only works if the disk is mounted with the "-o discard" parameter (source). After stopping the array and mounting the ssd correctly, trim worked. However, it did not help in any way with the issue. I moved the vm from the ssd in the array to another disk (hdd) with the same result.
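      A sketch of the commands involved, for anyone who wants to try the same (device and mountpoint are placeholders, and whether "-o discard" is really required may depend on the md layer):
        # mount the SSD's filesystem with the discard option, then trim it
        mount -o discard /dev/sdX1 /mnt/disk5
        fstrim -v /mnt/disk5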
  23. At least for me, it's not a big deal, because I know how to recreate it and can avoid it. Others are not so lucky it seems. If I couldn't run a VM on the array anymore, I would have no problem with it. I am testing so much stuff and try to give as much feedback as possible, because it's part of the fun. The thing is, even on my system, I can't see any unusual log events at all, not before, not while and not after the crash occurs. I tried to increase the log_level of libvirt (to 1), but the settings Eric pointed me to didn't add any messages (not even one...), not to syslog and not to libvirtd.log. I would like to configure logging to be more verbose on libvirt, samba and maybe some kernel events (if you think it may be related). Anyway, I think you are doing a great job, using bleeding edge technology always comes with a price. And if you try to please every single unraid user (which is how it seems to me), it's not going to be easy. I started with version 4.x and never had any issues that weren't taken care of or were my own fault. If I find anything on my end, I'll post it here and of course after every new beta, I'll see if anything changes. Until then, enjoy your weekend. I'll add that to the list of things to test. At least the last few weeks, I only used my old ssd to test, because I don't have to wait that long for a vm to boot. I guess trimming or even secure erasing might be an option, or I'll just see what happens when I use a normal hdd instead of a ssd.
  24. Indeed, release notes sound great. I'll report back when renewal comes up
  25. Eric asked me to switch the vDisk to IDE, so I did. Ok, I would say going to IDE improved stability. All Windows disk tools were able to finish. Even the disk benchmark did not crash the system, although the task manager did from time to time fail to show any values. Performance however had a mixed result: it went from 0.xx KB/s to very short bursts of 100MB/s back to 0KB/s. So, according to my scale, it would probably be an 8.5.... Very slow, but seemingly working... Hoping I might have an improved solution, I rebooted into normal mode. That one vm was still working as "good" as in safe mode. So I went ahead and started a second VM, which has a vDisk (system) on the cache and an additional vDisk on the array (which I also changed to IDE). It's a Server2012R2 and uses the second disk as a backup target. After ~1GB of data written, the first VM completely freezes, no mouse movement through vnc and the clock stops going forward. The Server2012R2 VM itself stops writing anything to the backup disk but is still able to abort the backup and shut down. "virsh list" however shows both VMs as running and "virsh destroy" still doesn't do anything (see below). I was able to reproduce that three times:
      1) normal mode, with disk shares (attached in next post due to file size restrictions)
      2) safe mode, with disk shares (attached)
      3) safe mode, no disk shares (attached)
      jonp's suggestion to disable disk shares had no effect that I am aware of. The system is still crashing as fast and processes that access a disk in the array get stuck (WebGUI, MC, ...)
      *A new sidenote: I had to copy the second disk to the array first, because I placed it on the cache until we find a solution. The first vm was already running. Copying the disk through "mc" within a ssh session had constantly 60-62MB/s (cache->array). I even started a benchmark in the vm, but it did not drop. So while everything in the VM is slow, anything else is working fine (until the VM goes off the cliff...)
      safe_no-disk-share_unraid-diagnostics-20160413-2130.zip
      safe_with-disk-share_unraid-diagnostics-20160413-2122.zip
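      For reference, the virsh commands I mean (the VM name is a placeholder):
        # both VMs still show up as "running" even though they are frozen
        virsh list --all
        # normally force-stops a VM; here it just does nothing
        virsh destroy Server2012R2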