X399 and Threadripper



22 minutes ago, skunky99 said:

Thanks for updating! I hope you get the ram issue resolved! Can't wait for more!

 

I haven't been able to get it to POST with all 4 sticks of RAM, even at the default (non-XMP) speed and on the latest beta BIOS. I left the two sticks that *are* working in their slots and tried the other two in every other slot with no success.

 

After updating to the beta BIOS and giving up on getting all 4 sticks working, I tried bumping the two working sticks to 3600 using the XMP profile. It looked like it booted (RAM LEDs lit up and everything), but I had no video output and had to press the little button on the mobo to revert to "safe" BIOS settings.

 

I started unboxing the water cooling equipment last night and figuring out where I'm going to put everything. I need to reconfigure a bit because even with my gigantic case I have a drive cage in front of where one of the pump/reservoirs is supposed to go. Should be plenty of room for the fans and radiators though. I'll probably pick up some distilled water this weekend and try to get all those goodies installed.

 

I'm really bummed about the RAM. I need quad-channel to get the most out of the CPU, and I'm not sure I'm going to be able to get that working. RAM is so expensive that just buying another kit is ludicrous, at best.

 

I need to get ahold of another video card (pretty sure I don't have a spare that'll fit any of these slots, but I'll check) so I can see if I can get the GPU passthrough working properly. Worst-case scenario, I put all the old parts back together with the Intel chip and use that as a separate gaming machine, but I can't fit the radiator I bought for the GPU in the old case. Limetech is aware of the VM stability issue and I assume it'll be fixed in the next RC.

Link to comment
4 hours ago, gilahacker said:

 

That's all way above my head. If I had to guess, I think they're editing kernel stuff and recompiling (but I'm really not sure). I doubt the required stuff is installed by default in unRAID to compile your own kernel. Might be something to point out to the guys actually building unRAID though.
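For anyone curious whether their unRAID box could even attempt a kernel rebuild, a quick sketch from the console: just probe for the usual kernel-build toolchain (this tool list is my assumption of the minimum, not anything official from Limetech):

```shell
# Check for the toolchain a kernel rebuild would need.
# Likely mostly "missing" on a stock unRAID install.
missing=0
for t in gcc make ld bc perl; do
  if command -v "$t" >/dev/null 2>&1; then
    echo "$t: found"
  else
    echo "$t: missing"
    missing=$((missing + 1))
  fi
done
echo "missing tools: $missing"
```

If anything comes back missing, you'd be looking at installing a build environment (e.g. via the NerdPack-style plugins people use) before compiling anything.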

Hey, thanks! 

 

Will do!

Link to comment
17 hours ago, skunky99 said:

Any news?

Unfortunately, I haven't had any free time to work on the server and I'm flying out of state tomorrow and won't be back for a week. I went through some of my stuff and couldn't find my old spare video card so I may take you up on your offer, if needed.

 

@david279 - Thanks for the link! If I get a chance tonight I'll try the updated kernel and see if the VM will boot with the single video card. I don't know if it's even supposed to work with only a single video card, as I've always had onboard video before and never tried passthrough until I got the Titan X.
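As a side note for single-card passthrough: one quick thing to check from the unRAID console is which kernel driver currently owns the GPU, since passthrough needs it bound to vfio-pci rather than a host display driver. A minimal sketch (the PCI address here is a placeholder; find yours with `lspci | grep -i vga`):

```shell
# Show which kernel driver is bound to the GPU.
# 0000:42:00.0 is a hypothetical address; substitute your own from lspci.
GPU=0000:42:00.0
drv=none
link="/sys/bus/pci/devices/$GPU/driver"
if [ -L "$link" ]; then
  drv=$(basename "$(readlink "$link")")
fi
echo "driver for $GPU: $drv"   # want: vfio-pci for passthrough
```

If the host console is using that card (common with only one GPU), the driver will show something other than vfio-pci, which is exactly why single-GPU passthrough is tricky.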

Link to comment

And... I guess I screwed up. Just updated to the latest RC and rebooted and then updated the kernel with the ugly patch and rebooted again and now I can't start my VM because IOMMU isn't enabled in the BIOS (I'm guessing it got reset with the latest BIOS update?). About to leave the house. If I manage to get back to it tonight I'll drop an update. Otherwise it'll be a week or so.
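For the next time a BIOS update silently resets things: you can confirm from the unRAID console whether IOMMU is actually on, because with it enabled the kernel populates /sys/kernel/iommu_groups with one directory per group. A minimal check:

```shell
# With IOMMU (AMD-Vi) enabled in the BIOS and kernel, this directory
# is populated with one subdirectory per IOMMU group.
if [ -d /sys/kernel/iommu_groups ] && \
   [ -n "$(ls -A /sys/kernel/iommu_groups 2>/dev/null)" ]; then
  iommu=enabled
else
  iommu=disabled
fi
echo "IOMMU: $iommu"
```

If it reports disabled, no VM with passthrough devices will start, which matches the error unRAID shows.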

Link to comment
  • 4 weeks later...

Finally got to mess with it again today. Got it booting with all 4 sticks of RAM at 3333 MHz with the latest beta BIOS (0902) so I finally have 64 GB to play with. I also threw in another SSD so I can make my cache RAID 1, but haven't set that up yet.
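On the RAID 1 cache: unRAID's GUI normally handles the conversion itself when you assign the second SSD to the cache pool, but under the hood it amounts to a btrfs balance with conversion filters, roughly like this (mount point assumed to be /mnt/cache):

```shell
# Sketch of the btrfs conversion unRAID performs when a second cache
# device is added: rebalance data (-d) and metadata (-m) to raid1.
# /mnt/cache is an assumed mount point.
CACHE=/mnt/cache
CMD="btrfs balance start -dconvert=raid1 -mconvert=raid1 $CACHE"
echo "$CMD"
# After the second device is actually in the pool, run it with: eval "$CMD"
```

Afterwards, `btrfs filesystem df /mnt/cache` should report Data and Metadata as RAID1.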

 

I'm on 6.4.0 Stable and it boots off my "basic" video card (GeForce 710, IIRC) just fine. It allowed me to assign the Titan X to the Windows 10 VM and it started up and was accessible via RDC, but there was no video output. I stopped the VM and tried to restart it, but got "internal error: Unknown PCI header type '127'". I tried to create a new VM and assign it and got the same error again (screenshot attached).
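That "127" is actually telling: 127 is 0x7f, which is what you see when the card never comes back from its last reset, so config-space reads return all-ones (0xff) and QEMU masks off the multifunction bit (0x80) from the header-type byte. A rough way to inspect it from the unRAID console (the PCI address is a placeholder; find yours with lspci):

```shell
# Read the PCI header-type byte (offset 0x0e in config space).
# A value of 127 means config reads are returning all-ones, i.e. the
# card is stuck after reset. GPU address below is hypothetical.
GPU=0000:42:00.0
cfg="/sys/bus/pci/devices/$GPU/config"
hdr=absent
if [ -r "$cfg" ]; then
  hdr=$(od -An -tu1 -j 14 -N 1 "$cfg" | tr -d ' ')
fi
echo "header type byte: $hdr"
# A remove/rescan sometimes recovers the card without a full host reboot:
#   echo 1 > "/sys/bus/pci/devices/$GPU/remove"
#   echo 1 > /sys/bus/pci/rescan
```

This is the classic symptom of the Threadripper PCIe reset issue the "ugly patch" works around.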

 

A Google search for the error brings up some posts from this forum, but I haven't read through them yet.

Screenshot_20180113-162846.png

Link to comment

Had to dial back the RAM to the "default" speed (2000-something MHz) as I had two complete lock-ups and one unexpected reboot. It's been up for nearly 14 hours now, so I'm pretty sure it was the RAM speed setting.

 

The first lock-up happened when I tried to re-encode a Blu-ray with Handbrake. The server completely froze up within seconds of starting the encode. The subsequent unexpected reboot and lock-up happened while I was not actively using the server, but someone could have been streaming from Plex.

 

My only theory (guess) at this point: part of the chip runs at the RAM speed, and I still have that woefully insufficient all-in-one cooler on there (haven't had time to figure out and install all the water cooling stuff), so the chip may simply have been running hot and overheated easily at the higher RAM speed.

 

However, I had managed to do some encodes with Handbrake before (with half as much RAM running at a lower speed), which definitely strained the processor, and didn't run into any problems, so who knows. I still don't have any kind of CPU temperature monitoring and can only guess at where it's running hot by the sound of the fan on the all-in-one.
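For a rough temperature reading without any extra plugins, you can dump whatever hwmon sensors the kernel already exposes. Caveat: early-2018 kernels needed a patched k10temp driver to read Threadripper, so this may show nothing useful on that setup; the sketch just reads whatever is there:

```shell
# Print every hwmon temperature the kernel exposes, in degrees C.
# On an early-2018 Threadripper box this may find no CPU sensor at all
# (k10temp support for TR landed later / via patches).
found=0
for f in /sys/class/hwmon/hwmon*/temp*_input; do
  [ -e "$f" ] || continue
  label=$(cat "${f%_input}_label" 2>/dev/null || echo "$f")
  awk -v l="$label" '{printf "%s: %.1f C\n", l, $1/1000}' "$f"
  found=$((found + 1))
done
echo "sensors found: $found"
```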

Link to comment

For anyone lurking here: if you're on 6.4.0 and want to give the "Ugly Patch" a shot, I've compiled it; see this post.

My system has only been running it for about eight hours now.

 

Edited follow-up report (1/21/2018): I've been running my compiled "Ugly Patch" since the time of this post. For me it's been 100% functional on a one-GPU VM; I have yet to try two-GPU VMs (waiting on my friend to upgrade his computer (with my hardware) and 'deal out all the cards in my deck').

Edited by Jcloud
Follow up info of the narcissistic kind, but implying running stable.
Link to comment

I downloaded the kernel thing @Jcloud made and rebooted. My VM still output no video. I noticed a comment on the Reddit thread about the dirty patch that SeaBIOS doesn't work, but OVMF does. Sure enough, my VM was using SeaBIOS.

 

Unfortunately, there doesn't seem to be any way to change it. I created a new OVMF VM and holy crap, we have video output! However, it won't boot off my existing virtual disk (despite many attempts to repair the startup with the Windows DVD image and some manual command-line-fu I found online), so I just created a new one and got Windows installed. I installed the GeForce Experience software and had it update the driver, and that's all working fine so far. Now to figure out how I had my games folder mapped from the array, re-install Steam, and try a game.
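For reference, the firmware choice lives in the VM's libvirt XML, which is why the GUI won't flip it on an existing VM. A SeaBIOS machine uses the default BIOS, while OVMF needs a pflash loader block, roughly like this (the machine type and firmware paths here are assumptions; check what your unRAID build actually ships):

```xml
<os>
  <type arch='x86_64' machine='pc-q35-2.10'>hvm</type>
  <!-- paths below are assumed examples, not guaranteed unRAID paths -->
  <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
  <nvram>/etc/libvirt/qemu/nvram/myvm_VARS-pure-efi.fd</nvram>
</os>
```

The old vdisk refusing to boot is likely not a corruption issue: a Windows install done under SeaBIOS partitions the disk as MBR, while OVMF boots UEFI and expects GPT, so the firmware simply can't find a bootable layout.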

Link to comment

Can't get any games to launch. Steam says they're running but nothing ever pops up. Within a second or two it changes from "running" to "syncing".

 

Installed the Passmark Performance Test and it locked up and rebooted the VM when trying to access the 3D test page. Not running the test, just accessing the page you run it from, so I'm guessing something is wonky with the video card detection? GeForce Experience sees it fine, Windows is running at 4k60, etc.

 

Can't even get Steam to launch now. Going to install all the Windows 10 updates and try again.

 

I'm not sure if it has anything to do with the issues I'm seeing, but Windows is listing several devices, including the GPU, as if they're removable devices (see attached screenshot).

Capture.PNG

Edited by gilahacker
Link to comment
2 hours ago, gilahacker said:

I'm not sure if it has anything to do with the issues I'm seeing, but Windows is listing several devices, including the GPU, as if they're removable devices (see attached screenshot).

 

I just checked mine; sure enough, it's the same. Thinking to myself, "huh, well lookie there." It makes a bit of sense in the context of Threadripper's ugly patch. The ugly patch is all about the microcode failing to recognize a specific reset command on the PCIe bus, which matters for things like unplugging the device, something that's allowed according to the white-paper specification.

 

There's also a sort of "soft-unplug" that occurs with VMs and their associated hardware every time you start and (really just) stop a VM. So, while still ignorant of the details, I could see KVM/SeaBIOS/OVMF treating all of the hardware as ejectable devices, and perhaps Windows is picking up on that?

 

I'd bet the fix for this would be a registry hack, flipping the "can eject hardware" bit to zero.
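For what it's worth, later libvirt/QEMU releases (libvirt 6.3+ with QEMU 5.0+, i.e. newer than anything in this thread) grew a cleaner switch than a registry hack: marking the virtual PCIe root port as non-hotpluggable, which stops Windows from listing passthrough devices as ejectable. A sketch of the relevant domain XML fragment:

```xml
<!-- On a Q35 machine: disable hotplug on the root port feeding the GPU. -->
<controller type='pci' model='pcie-root-port'>
  <!-- hotplug='off' requires libvirt >= 6.3.0 and QEMU >= 5.0 -->
  <target hotplug='off'/>
</controller>
```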

Edited by Jcloud
Link to comment
