tsawind

Everything posted by tsawind

  1. This is a link to a Spaceinvader One video that covers your original problem. From what I understand, NVIDIA has something in the BIOS of their gaming video cards that makes them resistant to virtualization. When the card is in the first slot, it needs a custom virtual BIOS .ROM file to work around this. I hope this solves your original issue; maybe someone else will read this and give a better explanation!
  2. Okay, after a couple of hours of sleep I have made some progress. I enabled syslog to save to flash, and I believe that when the old VM started up it was using stale USB pass-through entries and other wrong settings in the XML. I have the second GPU VM operating and am currently working to get display output from the main (1st slot) GPU. Boom, success!!! A new VM using a custom hacked graphics BIOS .ROM file for the GPU, and now the "handoff" is working!
  3. Correct, hard freeze. I understand that the gpu that unraid is showing up on is supposed to be "handed off" to the VM. unRAID becomes inaccessible from the network as well, so yes a hard freeze.
  4. Hard freeze when trying to click the button to pass through the gpus and restart... Have to pull the USB drive again and delete /config/vfio-pci.cfg just to access the server again.
  5. So I have been working on this problem for hours. I have an Asus Crosshair Hero VII (WiFi) with a 2700x and BIOS #2008. I ran two 1070 Ti cards with 2 Windows 10 VMs daily for over 2 years (2 gamers, 1 CPU). Well, I just got a new CPU, a 3950x, and I had to update my BIOS, so I am now on 4603. Now both GPUs have identical PCIe ID names and virtually everything I try ends up crashing unRAID; I have even corrupted my USB a couple of times. They are in different IOMMU groups. I have been able to get ONE of the cards to pass through to a VM, but if I try to start the second one the system hard crashes. I will continue searching for a way to get my daily system up and running; I would really hate to roll back the BIOS and go back to an 8-core CPU.
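One way to double-check that two identical cards really sit in separate IOMMU groups is to walk sysfs. This is only a minimal sketch, assuming the usual Linux layout `/sys/kernel/iommu_groups/<group>/devices/<pci-address>`; the function name and the `root` parameter are made up for illustration and are not part of unRAID.

```python
from pathlib import Path


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_number: [pci_address, ...]} read from sysfs.

    `root` is a parameter only so the function can be pointed at a test
    directory; on a real Linux host the default is the usual location.
    """
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled, or not a Linux host
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            # Each entry under devices/ is named after a PCI address.
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    return dict(sorted(groups.items()))
```

If both GPUs show up under different group numbers, they can in principle be handed to different VMs even though their vendor:device IDs are the same.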
  6. I too have been battling similar latency issues for over 2 years now, I just call it the unRAID stutter... Mine is the worst when I write to my array without a cache drive, but dual parity, I believe that the parity calculations overwhelm the cpu and cause major stutter. This is an old link roughly showing my system: https://pcpartpicker.com/b/6vGG3C This has been helpful, I just have one question. What is htep and where do I find it?
  7. Limetech is the best! I can't say enough about the transparency of this company! Thank you! I have been running unRAID as a daily driver for over a year now, and it has been GREAT. If it quits working properly because of a security breach or something, well, that is okay; I'm sure it would get fixed as a top priority. unRAID isn't an enterprise-grade product, though I think it performs better than some, and there aren't thousands of workers behind the scenes. If you want that level of stuff, then you have to deal with constant forced updates and pay the premium for it. I will back this company for as long as it stands!
  8. I have been using unRAID as my daily driver, and only computer, for about 14 months now. Getting my windows 10 VM to recognize this SMT just gained me like 25% performance, and fixed my microstutter problem in games!
  9. So after reading into this and researching more, it seems that if we just wait for possible integration in unRAID 6.9, or pick an unRAID version with a Linux kernel of 5.4 or so, the Error 127 bug will be fixed. This all seems to revolve around the newest generation of AMD CPUs and the newer BIOSes conflicting with older Linux kernels. In my particular situation, I have an Asus Crosshair Hero VII (WiFi) x470 motherboard. I can't update past Version 2008 (2019/03/14) without getting the 127 bug, and according to this article: https://www.asus.com/News/EtaH71Hbjuio1arV I need at least BIOS version 2302 to use Ryzen 3k CPUs. I believe that running one of the RC versions of unRAID would fix this issue, e.g. unRAID version 6.8.0-RC1 has Linux kernel 5.4.
  10. Which version of unRAID are you using? You aren't using any special kernel? I am unfamiliar with unRAID legacy mode.
  11. I was REALLY hoping that unRAID 6.8 would fix this issue. I have been running version 6.6.6 for almost a year as a daily driver with two Windows VMs on a 2700x with two 1070 Ti cards, on an ASUS Crosshair Hero VII (WiFi). The latest BIOS that works without the error 127 bug is version 2008. I just upgraded to unRAID 6.8 without issues. I then tried to update to the newest BIOS, version 2901, which the ASUS website seems to state has AGESA version 1.0.0.3. This broke GPU passthrough. I am going to either revert the BIOS back to version 2008 or try to learn how to do this kernel hack. EDIT: I will just BIOS flashback and wait, hopefully to see this kernel update in the stable unraid 5.8. I don't want to learn how to roll back unRAID to an RC version. Thanks for all the hard work!!!
  12. Thanks, I will do that, I am writing down notes and my "append vfio-pci.ids=1912:0014,1022:145f,10ec:b822,1022:1457 isolcpus=1-7,9-15 pcie_acs_override=downstream,multifunction initrd=/bzroot" settings as well. I do have my UUID for the VMs saved as well, Thank you!
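For anyone keeping the same kind of notes, the append line can be split into its individual kernel parameters so each one is easier to read and compare after an upgrade. A small sketch, assuming the line is copied exactly as quoted above; `parse_append_line` is a hypothetical helper name, not an unRAID tool.

```python
def parse_append_line(line):
    """Split a syslinux 'append' line into a dict of kernel parameters.

    Bare flags map to None; comma-separated values become lists.
    """
    tokens = line.split()
    if tokens and tokens[0] == "append":
        tokens = tokens[1:]
    params = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        params[key] = value.split(",") if sep else None
    return params


APPEND = (
    "append vfio-pci.ids=1912:0014,1022:145f,10ec:b822,1022:1457 "
    "isolcpus=1-7,9-15 pcie_acs_override=downstream,multifunction "
    "initrd=/bzroot"
)

params = parse_append_line(APPEND)
for vendor_device in params["vfio-pci.ids"]:
    vendor, device = vendor_device.split(":")
    print(f"bound to vfio-pci at boot: vendor {vendor}, device {device}")
```

Note that `vfio-pci.ids` matches on vendor:device pairs, so two cards of the same model cannot be told apart by this parameter alone.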
  13. Hello, I am using unRAID as our daily driver! This is the only computer we have, and it is set up so that when we turn it on, it auto-starts two VMs running retail Windows 10 Pro. Both VMs have their own USB controllers and their own NVIDIA 1070 Ti, running multiple 4k/60 screens. A full hardware breakdown is here: https://pcpartpicker.com/list/LVhGPn I'm about to attempt an upgrade from unRAID version 6.6.6 to unRAID version 6.8. I am going to try to make full backups of all my settings and figure out how to easily roll back to my stable version if it breaks. After I successfully upgrade unRAID 🤞, I am going to attempt a BIOS update from ASUS ROG CROSSHAIR VII HERO (WI-FI) Version 2008 (2019/03/14) to Version 2901 (2019/11/08) 🤞 The last time I tried to update the BIOS I could not get past version 2008 without breaking the GPU hardware passthrough ("Unknown PCI header type 127"). Luckily I have a BIOS flashback button and a USB stick to roll back. This is in preparation for a CPU upgrade to the 16-core 3950x. I will reply to this thread with the results; please let me know if there are certain things I should try, I have lots of time on my hands. Thanks, TsA!
  14. I got into unRAID because it was one of the only ways I found to build a NAS and 2 Windows 10 4k gaming machines in one box. We are currently doing travel nursing around the U.S. Every 3 months we pack up the hardware and set it up again in our new hotel or apartment. I like the fact that we run a parity check every 3 months; it helps determine if our HDDs survived the trip!
  15. So I have added backgrounds on the bare metal boot-up... started the server up again and now windows is activated on both boot-ups. Very odd indeed, anyways it is working!! YaY!!
  16. Work in progress, mobile all in one gaming/entertainment center. Main 4k gaming station with 3 28" surround monitors all wired Logitech gear, and full surround sound. Secondary 4k gaming rig with 1 4k monitor, and mirrored image to 4k living room and bedroom TVs. all wireless Logitech gear. Expandable NAS with unRAID PRO using 8tb WD drives with parity protection. Motherboard has onboard wifi with external antenna, able to cast 4k movies through walls over long distances. unRAID array has USB backup drives disconnected from the network. Still building and working out compatibility issues here is a pcparts list: https://pcpartpicker.com/list/WBV6ZR
  17. I have attempted to dual boot my computer setup. I spent an hour on the phone to get Microsoft to fix my product key; I think I activated it once on a VM.... Anyways, I have a 500gb SSD that is passed through unRAID for my Windows 10 Pro. I booted my system directly from the drive ("bare metal") and activated Windows. I ran wmic csproduct get UUID in cmd on bare metal to get the UUID, then copy/pasted it into the XML for my Windows 10 VM, which uses the same SSD passed through as a SATA drive. Windows is not activated in the VM, and I am hesitant to try to enter my product key while in the VM. What do I do?
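The copy/paste step can be scripted so a mistyped UUID is caught before it lands in the VM definition. A rough sketch, assuming a libvirt-style domain XML with a top-level `<uuid>` element; the sample XML and the UUID below are placeholders, and whether Windows activation actually follows the UUID is not something this code can confirm.

```python
import uuid
import xml.etree.ElementTree as ET


def set_domain_uuid(domain_xml, new_uuid):
    """Return the domain XML with its <uuid> element set to new_uuid."""
    canonical = str(uuid.UUID(new_uuid))  # raises ValueError on a bad paste
    root = ET.fromstring(domain_xml)
    node = root.find("uuid")
    if node is None:
        node = ET.SubElement(root, "uuid")
    node.text = canonical
    return ET.tostring(root, encoding="unicode")


# Placeholder VM definition and UUID, for illustration only.
SAMPLE_XML = "<domain type='kvm'><name>Windows 10</name><uuid>old-value</uuid></domain>"
updated = set_domain_uuid(SAMPLE_XML, "4C4C4544-0000-1010-8000-B0C04F000000")
print(updated)
```

The value passes through `uuid.UUID` first, so a paste with a missing character fails loudly instead of producing a VM with a malformed UUID.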
  18. Keep in mind I didn't realize that I was overclocking my memory! This is what made my instability so hard to diagnose. Especially with my motherboard on the Asus website showing pictures of 4 sticks of ram running at 3600 mhz!!!! So I bought 3600mhz Ram, thinking I was doing the right thing, but my memory controller wasn't fast enough to keep up, so I was overclocked even though I was running with the recommended XMP profile! The wikichip.org shows the ratings for my integrated memory controller for my 2700x. Also something to note, if I run a bare metal boot-up with windows 10 from an SSD, I am perfectly stable with the overclock.
  19. So I just re-read part of this post, and I think you just explained why I have been having BSODs. They have been: attempt_to_write_to_read_only, memory_management, dxkrnl.sys, netkvm.sys, etc. I have an overclocked build with a 2700x. I bought two 8 GB sticks of good high-end RAM, G.SKILL Trident Z 3600 CAS 17. I have them running at 3600 MHz, command rate 1, timings 17-18-18-38, 1.35 V. While overclocking the speed and timings I noticed that these particular sticks don't want to run faster than 3800 MHz, and they like extra voltage to improve timings. From an overclocker's standpoint they only went unstable at too high a speed, or if the temps went over 50 degrees C. We were diagnosing the BSODs during a dual 4k gaming session and came to a rough conclusion that the problem existed somewhere along the DDR4 RAM. I assumed this was due to heat, as one day it was much hotter in the room and we would have a BSOD every 15-30 minutes; one time the entire system froze solid, even failing to POST. The day before it was much cooler in the room and we had a 5+ hour session of heavy gaming with no crashes. The game we are playing is Elder Scrolls Online through Steam. The high resolution and extreme memory usage required me to install the game directly on the isolated SATA SSDs; the normal unRAID array wasn't fast enough for both VMs. This could be solved with a fast cache drive in my array. So I started really digging and FINALLY found this picture: We looked at the problem from a different point of view, and realized there is a LOT of pressure on Channel A in the memory controller between the 2 sticks of RAM. I had realized this while I was deciding on and buying parts for this build, which is why I bought the more expensive, faster, lower-latency RAM. I try to think of each of those arrows on the picture as a river. In these rivers runs electricity (bandwidth), and if the river gets full, it floods.
Electricity overflows in the form of extra heat, Blue Screens of DEATH, and speed reductions.... lol I know I am crazy, and that is way overly dumbed down and only partially true. It is relevant though, because we are talking about going back to the basics; this might not be a driver issue or some quirky thing that nobody knows anything about. It might literally be: "You told the river to flow faster, but it can't flow that fast all the time." My point is I believe the chipset is under heavy stress already: using the entire unRAID array, 2 SSDs directly to the VMs, the multiple USB controllers, the network connections, and anything else on the chipset. Now add 2 graphics cards AND 8 powerful CPU cores and make everything go BOTH WAYS in one tiny river in the memory controller (Channel A). 2 more sticks should help a lot in my opinion. I will keep you guys posted if this fixes my stability issues. I am going to reduce the speed, and check the max voltage allowed to the memory controller for my Ryzen CPU. https://en.wikichip.org/wiki/amd/ryzen_7/2700x From what I see here I have "G.SKILL TridentZ Series 16GB (2 x 8GB) 288-Pin DDR4 SDRAM DDR4 3600 (PC4 28800) Intel Z170 / Z270 / Z370 / X299 Desktop Memory Model F4-3600C17D-16GTZ", which appears to be single rank. I should run this RAM on this particular build at around 1.35 volts and DDR4-2133. I believe I should wait to do the major timing "overclock" until I have all 4 DIMMs installed? It takes forever, and the only way I know how to do it is: bare-metal boot, turn down a timing number, stress test, rinse and repeat. Anybody see anything I am missing?? Thank you for this post, I think it helped solve a major problem for me! I bought a SanDisk Cruzer Fit CZ33 32GB USB 2.0 Low-Profile Flash Drive - SDCZ33-032G-B35. I am hoping this is a decent stick for my unRAID. My build: https://pcpartpicker.com/list/tJ39QZ (some misc parts are missing, like the custom case, a USB controller, USB hubs, extension cables, etc.)
My next buy is: unRAID PRO!!! My next upgrades: adding 16 GB of matching RAM, a UPS, and a 1 TB M.2 drive. I believe this should stabilize the entire system and hopefully there will be no more issues. My wishlist: 1600 W Corsair PSU, UHD Blu-ray burner and a decent supply of M-DISCs, 4 more screens, two 7.1 surround systems, 4 more 8 TB hard drives, ROG router. I'm sure I could come up with more things to add here 😃 like 2 VRs, gaming chairs, car simulators, flight sims, Steam controllers, etc. Looking for answers to: 1. Is that a proper USB stick? 2. Anything I should test for the community? Thank you and I hope this helps someone, TsA
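To put a number on the "river" in the post above: the "PC4 28800" label on the RAM is just the transfer rate times the width of one memory channel. A back-of-the-envelope sketch, assuming the standard 64-bit DDR4 channel; these are theoretical peaks, not measurements from this build, and the function name is made up for illustration.

```python
def peak_bandwidth_mb_s(transfer_rate_mt_s, channels=1, bus_width_bits=64):
    """Theoretical peak DDR bandwidth in MB/s.

    transfer_rate_mt_s is the DDR4-xxxx number (mega-transfers per second);
    each transfer moves bus_width_bits / 8 bytes per channel.
    """
    return transfer_rate_mt_s * (bus_width_bits // 8) * channels


for rate in (2133, 3600):
    single = peak_bandwidth_mb_s(rate)
    dual = peak_bandwidth_mb_s(rate, channels=2)
    print(f"DDR4-{rate}: {single} MB/s per channel, {dual} MB/s dual channel")
```

Dropping from DDR4-3600 to DDR4-2133 cuts the theoretical peak by roughly 40%, which is the cost of the more conservative setting.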
  20. Thank you for the suggestions!! I'll post what I put into my Amazon cart before I buy it.
  21. I don't care about the VMs, I'm talking about the array, where the important stuff is. (I don't want to have to run parity for 20 hours if the power spikes.) This is a 2-headed gaming setup; each head has its own SSD passthrough. I made full backups of each install after I did all the tweaks like turning off Windows search, power settings, etc. These backups are saved on the unRAID array, so if a VM starts to fail or gets slow or ANYTHING, I just bleep boop bleep, reformat the SSD, and restore the pre-setup "clean" install. So as far as I'm concerned, if the UPS drops from 100% to 75% battery it can force-close the VMs, keeping the router and unRAID array online as long as possible. Looking for at least 30 minutes of power for unRAID before it shuts down the cluster. Do I just need a wall plug from Amazon that measures the watts drawn by the system? Then take the largest number drawn during a stress test and divide by 30 minutes, I assume? Sounds like being apcupsd compatible (whatever that is) is a requirement. Buy a name-brand UPS. Ideally the system will restart and the array will come online when power comes back on. Thanks, TsA
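On the sizing question in the post above: battery energy is power multiplied by time, so the measured watts get multiplied by the runtime in hours rather than divided by it. A minimal sketch of that arithmetic; the 400 W load and the 90% inverter efficiency are made-up placeholder figures, not measurements from this system.

```python
def required_battery_wh(load_watts, runtime_minutes, inverter_efficiency=0.9):
    """Watt-hours the UPS battery must deliver to carry the load.

    inverter_efficiency is an assumed figure; check the UPS datasheet.
    """
    if not 0 < inverter_efficiency <= 1:
        raise ValueError("inverter_efficiency must be in (0, 1]")
    return load_watts * (runtime_minutes / 60) / inverter_efficiency


# Placeholder example: a 400 W peak reading from a wall-plug meter, 30 minutes.
print(f"{required_battery_wh(400, 30):.0f} Wh needed")
```

UPS units are usually advertised by VA and watt rating rather than watt-hours, so the manufacturer's runtime chart for the measured load is still the number to trust.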
  22. YAY, parity is at 100% now, took all day.... Anyways, I found out Xcel Energy did have a power outage this morning. First time in months. I am not beginning to recover data. What should I be looking for in a UPS (minimum specs)? Thanks, TsA
  23. Update: I believe I have some form of mental disability (like autism), just to put things into perspective. I just had an aha moment, a huge breakthrough, like party time. I just learned about M-DISCs. I am ordering drives and discs to incorporate into the system, and I am not really too concerned about extreme redundancy in the array if I can use M-DISCs. They will theoretically last around 1,000 years. They come as 4.7 GB DVDs and 25, 50, and 100 GB Blu-rays. They are write-once and exactly what I have been looking for, for literally 50+ hours in the past month. And realistically I have been seeking this data medium for years. Couldn't talk myself into turning off the parity check; it is at 25%. M-DISC is THE BEST solution I have come up with for redundancy. MAJOR MAJOR AHA MOMENT.
  24. BTW, figured I would mention my luck and how I got here. Every time I order new hardware this happens to me. Needed an upgrade; gaming rig was 10 years old and still running strong. Ordered a 1070 Ti. Bought 4 new sticks of DDR2 and some used fans. Installed a fan onto the motherboard and shorted it out (my fault). Succession of deaths and suicides to follow: 1 motherboard murdered. Like 1 or 2 fans dead. 2 fans committed suicide in front of me (both were 120mm). 1 Seagate 250gb drive committed suicide, just started clicking, no recovery. 1 Seagate 250gb drive spun up, read all data and copied onto 1.5tb WD Green. Format attempted and it committed suicide. 1 WD 750gb spun up, read almost all data (bad sectors) and copied onto 1.5tb WD Green. Format attempted and it committed suicide. Other miscellaneous murders, suicides, and deaths from old age have been witnessed during the rebuild. Now I am looking at this 1.5tb "DATE: 06 MAR 2010" drive that is full, a 500gb "Date: 10301" (I'm guessing full) that I haven't dared plug in, and a dozen USB flash drives and camera flash drives on my desk in front of me. I also have some really old misc. 80gb drives to check through and trash during this process. I'd say I'm a bit nervous ATM and I believe that is the definition of Fragmented. So an unknown, unclean shutdown of a brand new NAS is a bit concerning to me ATM. Wishing myself luck, TsA
  25. The cat thing is funny, because it actually was on my list of things to check. I haven't re-covered the power button on top of the computer (taped hard plastic over it) since the rebuild. I am trying to recover data from some 10-year-old hard drives etc., which is very risky and is why I am asking. I have not plugged the drives into power in months (part of why I am building a NAS). I will just deal with unRAID later; I'm canceling parity and rebooting to a clean Windows 10 install on an SSD that I know is stable. I'll check these forums later today. Wish me luck that I don't have to start calling data recovery centers today! Thank you, TsA