Anybody planning a Ryzen build?



9 hours ago, Pauven said:

Oh, wow, just realized your 62w reading was with the 65w 1700, and not the 95w 1700X!!!  Something was definitely wrong there.

 

Yes, but like I said, I have a bunch of extra stuff connected.  I was not trying to see how low it goes, but how it compared to a Kaby Lake system under equal circumstances.

Link to comment

So I think I've finally figured out the higher power consumption on the ASRock X370 Fatal1ty Professional Gaming in Linux.

 

As part of my testing, I booted into the latest openSUSE Tumbleweed.  AMD is supposedly very active in the openSUSE distro, and recent tests show Ryzen performing best here, so I figured this would be a great place to evaluate Ryzen on Linux outside of unRAID.

 

Power consumption did not change from unRAID, still sitting at 56.5w idle in openSUSE.

 

I then checked to see if any drivers were missing, and sure enough the Aquantia AQC108 5Gb/s LAN controller drivers were missing.  These are supposedly in 4.11, and openSUSE Tumbleweed is currently at 4.10.2.

 

I also noticed on the motherboard that the Aquantia chip has a pretty big heatsink on it, and it was warm to the touch in Linux (I haven't checked in Windows yet).

 

Also missing are the Realtek ALC1220 audio codec drivers, again slated for 4.11.  I think the combination of these two chips missing drivers could very easily be behind the extra power consumption in Linux vs. Windows, especially with the Aquantia chip.  It's also nice to know that another distro has the same power consumption as unRAID, so this definitely is not an unRAID specific issue.
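
For anyone who wants to check their own board for devices that no driver has claimed, here is a rough Python sketch.  It scans the text produced by `lspci -k` and lists every device entry that has no "Kernel driver in use" line.  The sample text is invented for illustration (it is not output from this ASRock board), and `devices_without_driver` is just a hypothetical helper name.

```python
def devices_without_driver(lspci_k_output: str) -> list[str]:
    """Return the header line of every device entry with no bound kernel driver."""
    unbound = []
    current = None       # header line of the device entry being read
    has_driver = False
    for line in lspci_k_output.splitlines():
        if line and not line[0].isspace():
            # A non-indented line starts a new device entry.
            if current is not None and not has_driver:
                unbound.append(current)
            current, has_driver = line.strip(), False
        elif "Kernel driver in use:" in line:
            has_driver = True
    # Do not forget the final entry in the listing.
    if current is not None and not has_driver:
        unbound.append(current)
    return unbound


# Made-up sample in the layout `lspci -k` uses; vendor names are placeholders.
SAMPLE = """\
01:00.0 Ethernet controller: Example Vendor 5G NIC
\tSubsystem: Example Vendor Device 0001
02:00.0 SATA controller: Example Vendor AHCI
\tSubsystem: Example Vendor Device 0002
\tKernel driver in use: ahci
"""

print(devices_without_driver(SAMPLE))
```

On a live system the input would come from running `lspci -k` and feeding its output to the function.  A device showing up in the list only means no driver is bound to it; whether that explains extra power draw is still a guess.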

 

This ASRock MB is the only AM4 board I've seen with the Aquantia 5G LAN chip, so if this is the issue, it doesn't affect any other Ryzen motherboards at this time.

 

Last thought on power consumption:  try as I might, the BIOS won't let me disable the wi-fi or bluetooth on this motherboard.  It also won't let me disable all of the SATA ports (and this MB has a ton of them), as at least 3 keep reactivating themselves.  I don't know if any of this is affecting power consumption; it's impossible to tell without better BIOS control.  I expect it will take many months, and a lot of helpful prodding, to get ASRock to fix this BIOS - and there are a lot of more pressing BIOS issues that need addressing first, primarily memory support.

 

Unfortunately, I'm still struggling with stability in unRAID.  Sometimes it will hard crash in an hour, and at other times it will go half a day.  I'm running the memory tests now, as that's the most likely culprit.  There's no way I can move my production server onto Ryzen until this stability issue is resolved.

 

-Paul

Edited by Pauven
Link to comment

Yeah, I think you are on the right track regarding power consumption.

 

Based on other reviews, Ryzen should be fairly economical at idle, perhaps 5-10W higher than Kaby Lake. 5 Gbit/10 Gbit LAN controllers are known to be very power hungry. It'd be quite disappointing if it turns out that the BIOS only disables access but not power to the chipset.

 

Gaming mainboards in particular don't seem to pay much attention to these details because what's another 10W here and there when you have 8 case fans and six LED strips.

I suspect a basic B350 board without all the extra features would idle much lower but unfortunately these usually have very questionable 4+2 phase power layouts.

 

Idle power consumption between a 1700 and 1800X should be almost identical when all cores are clocked down. TDP really only suggests power consumption at maximum load and in the case of Ryzen that's maximum load on one core at boost.

Actual power consumption at 100% load on all cores is usually much higher.

Edited by lionceau
Link to comment
On 3/15/2017 at 10:52 AM, Bureaucromancer said:

So as of now, having run through all the usual including a manual BIOS dump I can't get any IOMMU to work.  No errors, cards selected, and it appears to unbind properly if I use primary, but nothing initializes.

 

Unless anyone has something to suggest I guess the best case for me is whenever a new kernel comes along.

 

Can you post your system specs, and what it is you're trying to pass through?  What are the VM's specs as well?

Link to comment

Update:  I have effectively completed my Ryzen build from a component perspective:

unRAID Server Pro 6.3.2 (Dual Parity) • ASUS Prime X370-PRO MB • AMD Ryzen 7 1800X 8-Core 3.6GHz • Crucial CT16G4DFD8213 DDR4 2133 (64GB) • Seasonic SS-660XP2 660W 80 PLUS PLATINUM • Asus Radeon 6450 1GB (Desktop) Graphics Card • 4 x Hitachi/HGST Deskstar 7K4000 4TB • Crucial C300 128GB SSD (Cache) • VM: Windows 10

 

Specific to the Windows 10 VM, I am passing in the Radeon 6450 GPU, and have added USB audio and a PCIe card for USB support:

Startech USB Audio: https://www.newegg.com/Product/Product.aspx?Item=N82E16829128004

Vantech UGT-PC341 USB Card:  https://www.newegg.com/Product/Product.aspx?Item=N82E16815287016

 

I had to download the drivers from the Startech website to get clean audio in Windows 10.  The USB card provides 3.0 speeds and seems happy with plug-and-play.  On my MB, I had to put it in the second PCIe x16 slot just to get it into a properly isolated IOMMU group ... it's the VIA controller at the bottom:

IOMMU group 2
[1022:1452] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
[1022:1453] 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
[1022:1453] 00:03.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
[1002:677b] 28:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos PRO [Radeon HD 7450]
[1002:aa98] 28:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Caicos HDMI Audio [Radeon HD 6450 / 7450/8450/8490 OEM / R5 230/235/235X OEM]
[1106:3483] 29:00.0 USB controller: VIA Technologies, Inc. VL805 USB 3.0 Host Controller (rev 01)

 

To pass in 29:00.0 (USB), I have to pass in both 28:00.0 (GPU/VGA) and 28:00.1 (HDMI sound), even though I'm not using the latter.  Sad that there are two completely empty IOMMU groups, yet all the other slots wanted to put the USB card into IOMMU group 1 where all sorts of stuff lives that I can't pass through.
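
To make the grouping constraint concrete, here is a small Python sketch that parses the listing above and reports which other devices share a group with a given address.  It assumes the listing format shown in this post, and it assumes that bridge entries do not themselves need to be handed to the VM, so it filters them out.  `parse_groups` and `must_travel_with` are hypothetical names.

```python
import re

# The IOMMU group 2 listing from this post, copied verbatim.
LISTING = """\
IOMMU group 2
[1022:1452] 00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
[1022:1453] 00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
[1022:1453] 00:03.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
[1002:677b] 28:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Caicos PRO [Radeon HD 7450]
[1002:aa98] 28:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Caicos HDMI Audio [Radeon HD 6450 / 7450/8450/8490 OEM / R5 230/235/235X OEM]
[1106:3483] 29:00.0 USB controller: VIA Technologies, Inc. VL805 USB 3.0 Host Controller (rev 01)
"""

GROUP_RE = re.compile(r"^IOMMU group (\d+)")
# "[vendor:device] bus:dev.fn Device class: description"
DEVICE_RE = re.compile(r"^\[[0-9a-f]{4}:[0-9a-f]{4}\]\s+(\S+)\s+([^:]+):")


def parse_groups(listing):
    """Map group number -> list of (pci_address, device_class) tuples."""
    groups = {}
    current = None
    for line in listing.splitlines():
        group = GROUP_RE.match(line)
        if group:
            current = int(group.group(1))
            groups[current] = []
            continue
        device = DEVICE_RE.match(line)
        if device and current is not None:
            groups[current].append((device.group(1), device.group(2)))
    return groups


def must_travel_with(groups, address):
    """Other non-bridge devices in the same IOMMU group as `address`."""
    for devices in groups.values():
        if any(addr == address for addr, _ in devices):
            return [
                addr
                for addr, dev_class in devices
                if addr != address and "bridge" not in dev_class.lower()
            ]
    raise KeyError(address)


print(must_travel_with(parse_groups(LISTING), "29:00.0"))
```

For the listing above this prints `['28:00.0', '28:00.1']`, which matches what had to be passed through along with the USB card.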

 

For now, I'm still on my test setup.  I'll merge this system with "Cortex" (see signature) so I'll have to add a Dell PERC H310 controller at that point ... hope the existing balancing act holds together.

 

Otherwise, the talk about power consumption intrigues me.  I'll have to run some tests on my own system, and see if I can disable things like the onboard audio that I'm not using to save on power.  So far uptime has been good, though I had some hangs earlier on.  I'm chalking that up to oddball VM/IOMMU configurations that can crater the system.  Haven't had a hang situation in days, so knock on wood.

 

Edited by ufopinball
Clarity
Link to comment
Just now, ufopinball said:

 

Can you post your system specs, and what it is you're trying to pass through?  What are the VM's specs as well?

 

The issue was resolved by switching out the GTX 560 for a 970.  Lime Tech docs actually say compatibility back to the 600 series, and the 560 was really just something I had lying around so I'm really not going to pursue that.  I'm getting the same issues trying to passthrough an RX 460, but this is the primary card so I'm not terribly surprised.  Going to put some real time into that on the weekend, but passing both cards through is more of a "would be cool" than a core part of anything I'm trying to accomplish.

 

Specs though are an 1800X running on a Crosshair VI Hero (BIOS 0902) with 64GB (G.Skill Aegis at 2400) RAM, an RX 460 in the primary slot and a GTX (970 or 560) in the secondary.  I've been experimenting with a Windows VM with something in the 50GB range of RAM and between 4 and 7 cores.

Edited by Bureaucromancer
Link to comment

My stability issues may be memory related (shocker, I know).  I've never seen Memtest86 errors before, so not sure what the right next steps are.

 

I'm running the memory at DDR4-2400 speeds.  Since this is 4 sticks of Dual-Rank memory, technically Ryzen only supports speeds up to DDR4-1866 in this configuration.

 

I'm thinking my next step would be to slow the memory back down to 2133 (for whatever reason, my motherboard won't boot at 1866, go figure), and test again to see if that improves things.

 

Any way I can tell from my Memtest86 results if the problem is with a specific stick of RAM?

 


 

-Paul

Edited by Pauven
Link to comment
On 3/15/2017 at 8:24 PM, johnnie.black said:

I see 1w difference between those settings, still seems very little, freq on the 1700 is 1550MHz for power save and 3000MHz for performance, a few more readings (+/- 1w):

 

I just dropped my memory speeds back to 2133 from 2400, and noticed that the power consumption at the wall dropped maybe half a watt too.

 

Now, when I change between Performance and Power Saver, I can see what appears to be less than a half watt difference between the two.  I think the difference is so small, I missed it before with the higher memory speed.  The difference is certainly smaller than on Win10 switching between Performance and Power Saver, where I saw a couple watts easy at idle.

 

One thought I had is that maybe voltages weren't dropping as much in Linux vs. Windows.  In CPU-Z on Win10 (assuming it is reporting accurately) I see voltages dropping lower in Power Saver (i.e. ~0.35v lows) vs. Performance (~0.45v lows), which makes perfect sense.  Max voltages I saw were around ~1.38v (not overclocked).

 

Unfortunately, I haven't found good voltage data in Linux.  Using the nct6779 sensors driver, the voltages don't appear to be reporting correctly.  The only voltage that moves enough (like Windows) to be Vcore is the top one actually labeled "Vcore", but the values are way off.  On the other hand, in10 (which is commonly Vcore on nct6779), is in the right range, but is holding steady.  If in10 is the real Vcore, it is not ramping down.

 

root@TESTTower:~# sensors
nct6779-isa-0290
Adapter: ISA adapter
Vcore:        +0.18 V  (min =  +0.00 V, max =  +1.74 V)
in1:          +1.28 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
AVCC:         +3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
+3.3V:        +3.36 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:          +1.87 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:          +0.90 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:          +1.21 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
3VSB:         +3.47 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
Vbat:         +3.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:          +0.00 V  (min =  +0.00 V, max =  +0.00 V)
in10:         +1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:         +1.08 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:         +1.70 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:         +0.93 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:         +1.84 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan2:         531 RPM  (min =    0 RPM)
fan3:         507 RPM  (min =    0 RPM)
Array Fan:      0 RPM  (min =    0 RPM)
CPU Temp:     +30.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
MB Temp:      +22.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:      +15.0°C    sensor = thermistor
AUXTIN1:      +25.0°C    sensor = thermistor
AUXTIN2:      +21.0°C    sensor = thermistor
AUXTIN3:      -27.0°C    sensor = thermistor
intrusion0:  ALARM
intrusion1:  ALARM
beep_enable: disabled
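
One way to tell whether a rail such as in10 ever ramps is to sample `sensors` repeatedly and keep the lowest and highest value seen per label.  The Python sketch below does that on the text format shown above.  The helper names are hypothetical, and `sample_live` assumes the `sensors` command is installed and on the PATH; it is defined but not called here.

```python
import re
import subprocess
import time

# Matches lines such as "Vcore:        +0.18 V  (min = ...)"; fan and
# temperature lines do not end their first value with " V" and are skipped.
VOLTAGE_RE = re.compile(r"^([^:]+):\s+([+-]?\d+(?:\.\d+)?)\s+V\b")


def parse_voltages(sensors_output):
    """Return {label: volts} for every voltage line in `sensors` output."""
    readings = {}
    for line in sensors_output.splitlines():
        match = VOLTAGE_RE.match(line)
        if match:
            readings[match.group(1).strip()] = float(match.group(2))
    return readings


def update_ranges(ranges, readings):
    """Fold one sample into a running {label: (lowest, highest)} table."""
    for label, volts in readings.items():
        low, high = ranges.get(label, (volts, volts))
        ranges[label] = (min(low, volts), max(high, volts))
    return ranges


def sample_live(count=60, interval_s=1.0):
    """Poll `sensors` `count` times and return the observed ranges."""
    ranges = {}
    for _ in range(count):
        output = subprocess.run(
            ["sensors"], capture_output=True, text=True, check=True
        ).stdout
        update_ranges(ranges, parse_voltages(output))
        time.sleep(interval_s)
    return ranges
```

Running `sample_live()` while switching between idle and load would show a flat rail as a pair of identical low and high values, and a rail that ramps as a wider range.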

 

Edited by Pauven
typos...
Link to comment
4 hours ago, Pauven said:

My stability issues may be memory related (shocker, I know).  I've never seen Memtest86 errors before, so not sure what the right next steps are.

 

I'm running the memory at DDR4-2400 speeds.  Since this is 4 sticks of Dual-Rank memory, technically Ryzen only supports speeds up to DDR4-1866 in this configuration.

 

I'm thinking my next step would be to slow the memory back down to 2133 (for whatever reason, my motherboard won't boot at 1866, go figure), and test again to see if that improves things.

 

I'm running my 64GB of DDR4-2133 dual-rank at rated speed.  Can't expect much more with that much memory.  CPU-Z shows 1064.5 MHz, so you double that (DDR) to get to ~2133.  This is on straight Windows 10, not in an unRAID VM.
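
The doubling is simple enough to write down.  A tiny Python sketch, where the list of speed grades is the usual set of DDR4 ratings and the function names are made up for illustration:

```python
# Common DDR4 speed grades in MT/s.
DDR4_GRADES = [1600, 1866, 2133, 2400, 2666, 2933, 3200]


def ddr_transfer_rate(clock_mhz):
    """DDR moves data on both clock edges, so MT/s is twice the clock in MHz."""
    return 2 * clock_mhz


def nearest_grade(rate_mts):
    """Pick the rated speed grade closest to a measured transfer rate."""
    return min(DDR4_GRADES, key=lambda grade: abs(grade - rate_mts))


measured = ddr_transfer_rate(1064.5)   # clock reported by CPU-Z in this post
print(measured, nearest_grade(measured))
```

The measured clock of 1064.5 MHz works out to 2129 MT/s, which is closest to the DDR4-2133 grade.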

 

CPU-Z (Memory).png

 

Quote

Any way I can tell from my Memtest86 results if the problem is with a specific stick of RAM?

 

 

The stuff I have read suggests you have to install and test each stick individually.  I only completed a pass or two with all four, but my memory completed the test without any errors ... otherwise, I'd be doing the same.  Was this a 4 stick kit, or individual sticks?  Mine was the latter because no one sells a 64GB kit.  I also heard that if one stick is bad out of a 4 stick kit, they make you send all of it back.  Possibly for the best?

 

 

Edited by ufopinball
Link to comment

For comparison to my idle voltages above, here are my voltages with 16 sessions of "cat /dev/urandom > /dev/null", fully loading the CPU to 100% on all cores:

 

The only numbers that have changed are Vcore and in10.  I think in10 is RAM voltage, as I am now running a single stick of 4GB DDR4 @ 1.2v, where before I was running 4 sticks of 16GB @ 1.35v.
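
As a sanity check on which rails really moved, here is a short Python sketch comparing the idle and load readings from the two `sensors` dumps in this thread.  The 0.05 V threshold is an arbitrary choice to ignore the 0.01-0.02 V wiggle on the other rails, and `changed_rails` is a hypothetical helper.

```python
# Voltage readings (volts) copied from the idle and loaded `sensors` dumps.
IDLE = {
    "Vcore": 0.18, "in1": 1.28, "AVCC": 3.36, "+3.3V": 3.36, "in4": 1.87,
    "in5": 0.90, "in6": 1.21, "3VSB": 3.47, "Vbat": 3.30, "in9": 0.00,
    "in10": 1.01, "in11": 1.08, "in12": 1.70, "in13": 0.93, "in14": 1.84,
}
LOAD = {
    "Vcore": 0.62, "in1": 1.27, "AVCC": 3.34, "+3.3V": 3.34, "in4": 1.86,
    "in5": 0.91, "in6": 1.21, "3VSB": 3.47, "Vbat": 3.31, "in9": 0.00,
    "in10": 0.58, "in11": 1.08, "in12": 1.70, "in13": 0.93, "in14": 1.83,
}


def changed_rails(before, after, threshold=0.05):
    """Return {label: delta} for rails that moved by more than `threshold` volts."""
    return {
        label: round(after[label] - before[label], 2)
        for label in before
        if label in after and abs(after[label] - before[label]) > threshold
    }


print(changed_rails(IDLE, LOAD))
```

With that threshold only Vcore (+0.44 V) and in10 (-0.43 V) stand out; everything else stays within a couple of hundredths of a volt.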

 

That just leaves Vcore, and it seems pegged at 0.62v, while the PC is consuming 135w at the wall.

 

The difference between idle Vcore (0.18v) and peak Vcore (0.62v) is 0.44v.  In Win10 + CPU-Z, the observed difference between idle (~0.35v) and peak (~1.38v) is more than 1 volt.  I was hoping this was simply an offset, and that I could add 0.75v to both values, but that only works if the real idle Vcore never drops below 0.93v (0.18v idle + 0.75v offset = 0.93v), which doesn't match the ~0.35v lows I saw in Windows.

root@TESTTower:~# sensors | grep V
Vcore:        +0.62 V  (min =  +0.00 V, max =  +1.74 V)
in1:          +1.27 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
AVCC:         +3.34 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
+3.3V:        +3.34 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:          +1.86 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:          +0.91 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:          +1.21 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
3VSB:         +3.47 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
Vbat:         +3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:          +0.00 V  (min =  +0.00 V, max =  +0.00 V)
in10:         +0.58 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:         +1.08 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:         +1.70 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:         +0.93 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:         +1.83 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
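
The offset idea can be checked with a few lines of arithmetic.  This Python sketch uses the Linux Vcore readings from the two dumps and the approximate CPU-Z figures quoted earlier; the 0.05 V tolerance and the function names are assumptions made for illustration.

```python
# Vcore as reported by the nct6779 driver in Linux (volts).
LINUX = {"idle": 0.18, "load": 0.62}
# Approximate CPU-Z lows and highs in Win10 (volts), as quoted in this thread.
WINDOWS = {"idle": 0.35, "load": 1.38}


def implied_offsets(reported, reference):
    """Offset needed at each state to turn the reported value into the reference."""
    return {state: round(reference[state] - reported[state], 2) for state in reported}


def is_constant_offset(offsets, tolerance=0.05):
    """A single fixed offset only fits if every state needs about the same correction."""
    values = list(offsets.values())
    return max(values) - min(values) <= tolerance


offsets = implied_offsets(LINUX, WINDOWS)
print(offsets)
print(is_constant_offset(offsets))
```

The idle reading would need about +0.17 V and the load reading about +0.76 V, so no single offset reconciles the two sets of numbers.  That fits the observation that the swing in Linux (0.44 V) is well under half the swing seen in Windows.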

 

Link to comment
13 minutes ago, ufopinball said:

The stuff I have read suggests you have to install and test each stick individually.  I only completed a pass or two with all four, but my memory completed the test without any errors ... otherwise, I'd be doing the same.  Was this a 4 stick kit, or individual sticks?  Mine was the latter because no one sells a 64GB kit.  I also heard that if one stick is bad out of a 4 stick kit, they make you send all of it back.  Possibly for the best?

 

For determining stability, you should test all 4 sticks together (as I did).  If there is a problem, you can try testing them individually, to determine which one causes the problem (assumes a physical defect with a single stick).

 

But it is possible for the problem to only show up in dual-channel mode (requires 2 sticks minimum), or only when all DIMMs are installed (puts more load on the CPU's memory controller).

 

I dropped my speeds down to 2133 (yes, that's 1066MHz, double data rate), and within an hour had another crash.  I've now pulled all the memory, and I am testing just a single stick of 4GB DDR4-3600, running at 2133.  I'm now trying to see if the board is stable in this configuration.

 

I'm thinking of buying a new memory kit that is actually validated with my motherboard.  The kit I have is not on the memory QVL for my board, and I think I am dealing with incompatibility issues.

 

-Paul

Link to comment
8 minutes ago, Pauven said:

 

For determining stability, you should test all 4 sticks together (as I did).  If there is a problem, you can try testing them individually, to determine which one causes the problem (assumes a physical defect with a single stick).

 

But it is possible for the problem to only show up in dual-channel mode (requires 2 sticks minimum), or only when all DIMMs are installed (puts more load on the CPU's memory controller).

 

I dropped my speeds down to 2133 (yes, that's 1066MHz, double data rate), and within an hour had another crash.  I've now pulled all the memory, and I am testing just a single stick of 4GB DDR4-3600, running at 2133.  I'm now trying to see if the board is stable in this configuration.

 

I'm thinking of buying a new memory kit that is actually validated with my motherboard.  The kit I have is not on the memory QVL for my board, and I think I am dealing with incompatibility issues.

 

-Paul

 

Sounds like you know more than I do about this.  I didn't get that you were also testing 64GB of memory, at much higher speeds than I was.  I bought my CPU/MB on launch day, but I had pre-bought the memory prior to the QVL being available.  There wasn't even a QVL from ASUS on launch day (!!).  I ended up installing it anyway and getting lucky, but then I'm also not pushing the envelope on memory speed.

 

So maybe going with QVL-approved memory is the best way to go?  At least saves on frustration, which can amount to a lot when dealing with all new hardware.  I bought my RAM from NewEgg so if it's mail-order you'll still have time to test the individual sticks, etc.

 

- Bill

 

Edited by ufopinball
Link to comment

I'm in the same boat as you, bought my memory before a QVL was posted.

 

After communicating with ASRock tech support on my BIOS flashing issues, I have almost zero confidence in their ability to qualify the RAM I have already purchased.  Luckily I can still return the RAM, but will have to pay a 5% restocking fee.

 

Yes, I'm thinking QVL is the best way to go, especially with all the BIOS issues being reported.  

 

After going through this motherboard's QVL, I was only able to find a single 64GB DDR4-2400 kit from the list actually available for purchase in the US.  But at least there is one.  I went ahead and ordered it.

 

If I were you, I would test all four sticks together, with SMT enabled in the test, and let the test run for a day or two at least.  Maybe you did get lucky, but better to be safe.

 

-Paul

Link to comment
30 minutes ago, Pauven said:

If I were you, I would test all four sticks together, with SMT enabled in the test, and let the test run for a day or two at least.  Maybe you did get lucky, but better to be safe.

 

Thus far, I haven't had any issues with the memory.  I installed them once, and haven't had reason to (re)move them.

 

At least one web comment indicated that a single MemTest pass will catch 99% of memory issues, but certainly more passes are going to be better.

 

Since system stability is looking good, and there isn't much more configuration for me to fiddle with, another round of MemTest is certainly worth doing.  I'll probably start it today and just let it run through the weekend.  That should come in at around 10 passes or so?

 

- Bill

Link to comment
1 hour ago, ufopinball said:

Since system stability is looking good, and there isn't much more configuration for me to fiddle with, another round of MemTest is certainly worth doing.  I'll probably start it today and just let it run through the weekend.  That should come in at around 10 passes or so?

 

Memtest didn't show errors on mine until the 3rd pass.  I almost decided to cancel it after the second pass, but figured I would let it run overnight anyway.  Glad I did.

 

Of course, I was already having stability issues.  Sometimes they would crop up in under an hour, and sometimes it was still up and running 14 hours later.  No rhyme or reason to it.  Still doesn't make sense why it has only affected unRAID so far - it was completely stable in Windows.

 

So far it seems to be running better on the single stick of replacement RAM I put in, 4 hours and counting.  Time will tell.

Link to comment

The 1800X just crashed again, so I'm now thinking this is not memory.  Plus, I was able to get more info this time. 

 

Since unRAID doesn't have mcelog/edac_mce_amd support working, I configured the motherboard to boot into Win10 by default, and I then manually booted into unRAID. 

 

Sure enough, after crashing in unRAID, Win10 booted up, and I can see MCE info in the windows logs.

 

I have two of these errors, identical except for the APIC ID (the other one being APIC ID 10):

A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 6

The details view of this entry contains further information.

I think this is describing a problem with the cache in the processor. 

 

I have no idea if this means the processor is bad, or if it is a Linux compatibility issue.

 

I obviously had memory issues too, as reported in Memtest86.  Not sure if a processor issue is causing the memory issue, or if these are two separate issues.

 

My cooling has been excellent, so I don't think this is a cooling issue at all.

 

-Paul

Link to comment
4 hours ago, Pauven said:

The 1800X just crashed again, so I'm now thinking this is not memory.  Plus, I was able to get more info this time. 

 

Since unRAID doesn't have mcelog/edac_mce_amd support working, I configured the motherboard to boot into Win10 by default, and I then manually booted into unRAID. 

 

Sure enough, after crashing in unRAID, Win10 booted up, and I can see MCE info in the windows logs.

 

I have two of these errors, identical except for the APIC ID (the other one being APIC ID 10):


A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 6

The details view of this entry contains further information.

I think this is describing a problem with the cache in the processor. 

 

I have no idea if this means the processor is bad, or if it is a Linux compatibility issue.

 

I obviously had memory issues too, as reported in Memtest86.  Not sure if a processor issue is causing the memory issue, or if these are two separate issues.

 

My cooling has been excellent, so I don't think this is a cooling issue at all.

 

-Paul

 

Not sure what to tell you.  Might be worth contacting the MemTest people to see if they have any advice?

 

Some say you should try reverting to BIOS defaults and testing again.

 

Others suggest trying a different power supply?  If you have a spare, it's an easy thing to test.

 

How do you judge your cooling setup?  Do you have a separate temperature display?  It would be nice if that were integrated into MemTest.

 

- Bill

 

Link to comment

In Win10, I ran the AIDA64 stress tests for CPU & RAM while at the same time running 3DMark Time Spy.  System was perfectly stable, and CPU temps never rose above 38C.  Ran it for hours, and saw peak wattage of 304w at the wall.  Cooling seems adequate to me.

 

I too began thinking it was my test power supply causing the issue. My only other "spare" power supply is in my server, so I bit the bullet and swapped this motherboard in. I also have my server running on a pure sinewave battery backup, so I was hopeful better AC/DC regulation would solve these issues. 

 

Nope. 

 

I've had two hard hangs in as many hours, even the console hangs. 

 

The last time there was a new message on the console:

 

Hangcheck: hangcheck value past margin! 

 

I've booted into Windows for the night, will check tomorrow to see if it is stable in Windows or crashes there too. 

 

Resetting BIOS is a good idea. I'll try that tomorrow. Replacement QVL memory also arrives tomorrow. Hopefully something works, else I've got a bad CPU or motherboard. 

 

-Paul 

Link to comment

On the memory note I'll suggest two things.  The first is that with the current state of the boards I'd try EVERYTHING.  I've got four sticks of 16, and they happily run at their rated 2400 as I have them now (several passes, 0 errors and a stable system a few days later), but to even get it to detect I needed to muck about with what order the sticks were in.  The second is that while I'm running the cheaper Aegis I work in retail sales for this stuff and my day-to-day observation (Ryzen and other AMD included, but mostly Intel based) is that G.Skill in general appears to have better compatibility and stability than Corsair (which we actually have a hard time with) and Kingston stuff (I really don't sell enough to comment on other brands).

Link to comment
2 hours ago, Bureaucromancer said:

On the memory note I'll suggest two things.  The first is that with the current state of the boards I'd try EVERYTHING.  I've got four sticks of 16, and they happily run at their rated 2400 as I have them now (several passes, 0 errors and a stable system a few days later), but to even get it to detect I needed to muck about with what order the sticks were in.  The second is that while I'm running the cheaper Aegis I work in retail sales for this stuff and my day-to-day observation (Ryzen and other AMD included, but mostly Intel based) is that G.Skill in general appears to have better compatibility and stability than Corsair (which we actually have a hard time with) and Kingston stuff (I really don't sell enough to comment on other brands).

 

Yeah, I have a lot of G.Skill memory, both DRAM and SDHC.  I recall doing some research, but don't remember exactly what led me to purchase the Crucial memory.

 

It's a shame about Kingston.  They were once very highly regarded, but they seem to get a lot of unhappy reviews about their thumb drives these days.  Apparently they sell a USB 3.0 compatible thumb drive that doesn't actually spec at USB 3.0 speeds.  If you need the high speeds, they have a Premium USB 3.0 line, which really feels like a bait and switch, and I understand why customers are disappointed.  Their headquarters are located up the street from me.  I've considered looking for a job there, I heard it was once an amazing place to work.  Not sure how things are going now, but last I checked their current openings weren't in my areas of expertise.

 

- Bill

Link to comment

It's funny, almost everything you guys are describing (and I greatly appreciate the insights, thanks!), I've already been through on this build.

 

My 64GB kit is G.Skill, and it initially only booted in single channel mode.  I had to rotate the DIMMs by 1 slot to get it into 2-channel mode (probably just re-seating the DIMMs addressed the issue).  This is the kit that showed errors on the 3rd Memtest pass, when running at 2400.

 

My spare 4GB stick is also G.Skill.  It passed Memtest, at least 4 passes with no errors.

 

ASRock has not validated any G.Skill 64GB kits for this motherboard.  My choices were HyperX, ADATA, and SANMAX (never heard of them before).  There were 7 ADATA kits validated (though they all appeared to be the same sticks with just different heatsinks and lighting).  There was only one HyperX, and this lone kit was the only one available locally in the states.

 

Unfortunately, HyperX is a Kingston brand.  ASRock is basically forcing me to go in a different direction than common wisdom.  It has good reviews, at least for the few reviews it has garnered.

 

From what I've been reading online, as far as hitting high memory speeds with Ryzen, Samsung memory chips seem to have the best compatibility.  No idea what is used in these HyperX sticks, the heat spreaders prevent reading the chip markings.  I might be able to tell from a program like CPU-Z, though.

 

An update on stability:  

  • After 2 back-to-back hangs in unRAID, the system has run over 12 hours in Win10 with no problems whatsoever.  In all my testing with Win10, including overnight torture tests, I've never had a single problem in Windows.  And yes, this was with the 64GB memory kit that showed errors in Memtest.
  • Though I only used it for about 4 hours, I also had no hangs in openSUSE.  
  • But in unRAID, I've had at least 10 hangs/crashes/reboots.  I've lost count, actually.  These crashes are occurring regardless of what memory or memory timings I am using.

I'm not a statistics major, but even I'm starting to see a pattern emerge.  :/

 

My next troubleshooting steps:

  • Swap in one stick of the HyperX memory that is ASRock QVL'd (UPS just delivered it, boy is Amazon quick).  In all honesty, I don't think this will make a lick of difference for my stability issue.
  • Reset the BIOS to the system defaults (CPU/RAM settings are currently at defaults, but I turned off a lot of stuff I didn't intend to use, maybe that caused an issue)
  • Create a new test unRAID USB stick (perhaps it is corruption on the current USB stick that is causing the problem).
  • If problems continue in unRAID, I'll boot into openSUSE again and see if it is stable or also has issues

 

One last thought, on the HangCheck issue that I noticed last night.  In reading up on it, primarily what is said about it is "why would you turn this on?".  The purpose of HangCheck seems to be to identify when a system is hung, and reboot it.  Apparently, the scenario that identifies a hung system is when a clock/timer gets out of sync with another clock/timer (cheesy layman terms, sorry).  Interestingly enough, searching for the HangCheck term on Google primarily pointed back at the Lime-Tech forums.
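
For what it's worth, the mechanism as described above can be modelled in a few lines of Python.  This is a toy illustration of the idea (a timer that should fire after a fixed tick, checked against a second clock), not the kernel's implementation.  The tick and margin values are placeholders, and the function names are hypothetical.

```python
import time


def past_margin(tick_s, margin_s, elapsed_s):
    """True when a timer meant to fire after `tick_s` fired more than `margin_s` late."""
    return elapsed_s > tick_s + margin_s


def run_check(tick_s, margin_s, sleep=time.sleep, clock=time.monotonic):
    """Wait one tick, then compare the wait against an independent clock."""
    start = clock()
    sleep(tick_s)               # stands in for the periodic timer
    elapsed = clock() - start   # how long the second clock says it took
    return past_margin(tick_s, margin_s, elapsed)


# Placeholder numbers: a 180 s tick with a 60 s margin.
print(past_margin(180, 60, 200))   # fired 20 s late, within the margin
print(past_margin(180, 60, 300))   # fired 120 s late, past the margin
```

Seen this way, the message only says that the timer fired much later than it should have.  That would be consistent with the machine having stalled for a while, which points toward symptom rather than cause, though the sketch proves nothing about this particular system.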

 

This makes me wonder, is the HangCheck error a symptom, or a cause...

 

-Paul

Link to comment
On 3/14/2017 at 6:15 PM, ufopinball said:

 

Thanks, I found my IOMMU setting, but still don't know why it's disabled by default?  It also turns out that "Auto" is the necessary setting.  "Enabled" didn't work for me on the Asus Prime X370-PRO.

 

So now I have a Windows 10 VM running with the GPU pass-through.  Thus far, I also have the USB keyboard, mouse and a sound device being passed through.  The USB sound is kinda crackly, so I may have to replace it with a PCIe card.  Also, I don't seem to be able to find a happy combination of IOMMU groups and the available USB hubs, so I'm looking at getting a USB PCIe card as well.

 

Specs:  AMD 1800X CPU, Asus Prime X370-PRO MB, Crucial CT16G4DFD8213.16FA1 DDR4 2133 (4x16G), Asus Radeon 6450 GPU (desktop, not for gaming)

 

At some point, I'll add a Dell HV52W PERC H310 8-Port controller card, so combined with the 8 SATA ports on the MB, I can still support 16 drives.

 

Hope all those PCIe cards work smoothly together (GPU, SAS/SATA, USB and sound).  Need to dig through the manual to see if there are any gotchas.

 

Did you do anything special to get your Windows 10 VM running?  I finally got my ASUS Prime X370-PRO yesterday and everything seems to be running ok except VMs stop at the shell.

 

Thoughts anyone?

 

 

Link to comment
Just now, chadjj said:

 

Did you do anything special to get your Windows 10 VM running?  I finally got my ASUS Prime X370-PRO yesterday and everything seems to be running ok except VMs stop at the shell.

 

Thoughts anyone?

 

 

Are you running them on GPU or VNC?  I'd make sure everything is working properly with VNC before making a serious attempt to sort out IOMMU issues.

Link to comment
3 hours ago, Bureaucromancer said:

Are you running them on GPU or VNC?  I'd make sure everything is working properly with VNC before making a serious attempt to sort out IOMMU issues.

 

VNC.  The plan is to get that working before attempting any IOMMU pass through.

 

I believe I have everything going now.  I re-downloaded the image from MS and this time it's working.  Who knows.  Corrupt image possibly?

Edited by chadjj
Link to comment
