Anybody planning a Ryzen build?


Recommended Posts

On 11/03/2017 at 1:33 PM, Beancounter said:

I am happy to report I have unraid working on the following setup.

 

ASUS Prime B350 Plus

Ryzen 1700

32G ram @ 3000

AMD Radeon 270 Graphics 

 

Running 4 dockers and a Win 10 VM.  The graphics card had no issues passing through.   SNIP rest

 

 

Hi, can you confirm it is possible to boot the system with the GPU in the PCIe x16 2.0 (x4) slot, so that a GPU in the PCIe x16 3.0 slot can be assigned to a VM, please?

Link to comment
On 11/7/2017 at 9:55 PM, phbigred said:

If you run the most recent 6.4rc10b you shouldn't have to have C-states off. I haven't run with them turned off since rc7. I don't know what the ill effects of having them off since rc8 might be; that's when I flipped the option in BIOS.

 

Unfortunately the system hung this morning; the console was not responding, nor could I SSH into the system, so I needed to perform a hard reboot.  Aside from a reported memory error message, which seemed like a Plex issue and has only been experienced once since August, the system is rock solid if I leave C-States disabled.

 

I see that rc11i was released... thoughts, anyone?

Link to comment

I got some hangs last week. I thought it was the new PCIe card I had installed, but it was that strange AMD bug caused by the C-states thing. It's a very random thing too: I went 6 days without a crash when I first moved to Ryzen, then got it twice in one day and once the next day. After that I said what the hell, turned off the C-states, and OCed using the zenstates script. It's been a solid week now with no crashes while OCed. Any time I would OC before, the entire system would freeze up, and I mean freeze like everything gets stuck and I have to hard reset it.
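
For reference, the zenstates approach mentioned above is usually wired into the unRAID `go` file so it runs at every boot. This is a minimal sketch only, assuming the `zenstates.py` script from the ZenStates-Linux project has been copied onto the flash drive; the path is illustrative and the flags should be checked against your copy of the script:

```shell
# /boot/config/go (fragment) - runs at boot, before emhttp starts
# Disable the C6 state implicated in the Ryzen idle hangs.
# Assumes zenstates.py (ZenStates-Linux) was copied to /boot/config/.
modprobe msr                                    # zenstates.py reads/writes CPU MSRs
python /boot/config/zenstates.py --c6-disable   # turn off C6 on all cores
python /boot/config/zenstates.py --list         # optional: print current P-states to syslog
```

This keeps C-states enabled in BIOS while neutralizing only the C6 state, which is why some people prefer it over the all-or-nothing BIOS toggle.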

Link to comment

@luisv I know you've kept c-states off a while now due to the issues. 

 

I don't think anyone has reported C-State issues since the 'fix' was added for it, which was a while ago. I'm just wondering if it's time to try to figure out what else it could be.

 

@everyone-else Has anyone else had C-State issues with the recent versions? 

 

EDIT: Maybe I should have read @david279's post first.

Edited by Tuftuf
Link to comment

Sure @Tuftuf, I'm all for trying to figure out what the issue is as I feel disabling C-State is just masking the true underlying issue and as always, thanks for your help! 

 

The system hangs started around August 14th, a few days after the system was originally built; however, here's a rundown of system changes after the initial build.  System details are in my signature:

 

Added Intel PRO/1000 Dual Port Server Adapter on August 23rd

Added the final WD Red 4TB HD on October 22nd 

 

I migrated from a 4 year old Synology NAS and during my initial testing found read / write speeds were horrifically slow.  Since my Synology had dual NICs and was bonded (I have managed switches), I purchased the above adapter to try and compare apples to apples.  I was initially trading PMs with @Tuftuf as he had a Ryzen system and ran 2 dockers I was interested in running... Plex and the Unifi controller, so he received the brunt of my frustrations.   After numerous tests, playing with network settings, enabling turbo write, disabling spin down, etc. I was able to achieve similar speeds between unRAID and the Synology NAS.  

 

November 3rd I had a memory error.  After posting in the forums, it seemed to be related to Plex so I upgraded to rc10b.   Memory error hasn't happened since.  

 

https://forums.lime-technology.com/topic/59024-unraid-os-version-640-rc7a-available/?page=6&tab=comments#comment-600394

 

While writing this, I was notified that my Unifi Video docker is not reachable.  The unRAID console is not loading in the browser; yet I'm able to play a movie via Plex, both remotely and locally.   I can reach the Plex console and other dockers via IP:port, but the unRAID console is not reachable via IP or DNS name.   I can't SSH via IP, but I'm able to log into the console locally.   Before the logon prompt, it shows the following:

 

default via (IP of DHCP server) dev br0 metric 209 linkdown
(Subnet range) dev br0 proto kernel scope link src (IP address of server) linkdown

I kicked off TimeMachine backups on 3 Macs and all are working fine.  From Windows and Mac systems I'm able to browse shares on the unRAID server via DNS name and IP. 

 

So I'm confused to say the least... any help is appreciated.   

 

*** Edit - Forgot to mention, the system is on the latest BIOS, 1201.

 

 

Edited by luisv
Link to comment
On 11/15/2017 at 6:50 AM, mikeyosm said:

Can someone running UNRAID 6.4 RC10 and Threadripper please try disabling ASPM in their BIOS (ASROCK boards have this, not sure about others) and also adding pcie_aspm=off  to the flash/boot section in the UNRAID GUI and test GPU passthrough again? I have read reports that this may resolve the D3/sleep issue and allow us to reboot VMs without having to reboot the host. If it works, could be an acceptable workaround until the wider problem is resolved.

 

I can't find solid confirmation that this might work, especially for NVIDIA boards, which is all I have. 

 

https://forum.level1techs.com/t/threadripper-pcie-bus-errors/118977/21

 

I'd be willing to rip up my Intel setup and reinstall my AMD one (again), if there was at least a little more indication that it might work. 
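
For anyone trying this, the "flash/boot section" change amounts to adding the parameter to the kernel `append` line in syslinux.cfg on the flash drive. A sketch only; the label and the rest of the append line will vary per install:

```shell
# /boot/syslinux/syslinux.cfg (fragment)
# Add pcie_aspm=off ahead of the existing initrd entry.
label unRAID OS
  menu default
  kernel /bzimage
  append pcie_aspm=off initrd=/bzroot
```

The GUI's flash settings page edits this same file, so either route should produce the same boot line.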

Link to comment
8 hours ago, Tuftuf said:

@luisv I know you've kept c-states off a while now due to the issues. 

 

I don't think anyone has reported C-State issues since the 'fix' was added for it, which was a while ago. I'm just wondering if it's time to try to figure out what else it could be.

 

@everyone-else Has anyone else had C-State issues with the recent versions? 

 

EDIT: Maybe I should have read @david279's post first.

 

I was the original discoverer of the Ryzen stability issue and C-state solution.  My server is extremely susceptible to the C-state issue, typically crashing in 4-8 hours when the issue is present.

 

I'm running 6.4.0-rc7a, with C-states enabled, and my uptime is 52 days.

 

I have avoided all of the recent 'Really Close' releases since 7a, as the changes just seemed too scary for me to be a guinea pig.  I think it was the introduction of the block level device encryption.  I don't plan to use it, but I have nightmares thinking that a beta version could somehow misbehave and accidentally encrypt my precious data, so that I never get it back.  I know the odds of that happening are pretty much zilch, though if it could happen it would likely happen to me.  I'm waiting for the next stable public release.

 

Anyway, perhaps something has changed since 7a that lost the fix for the Ryzen C-state issue.

 

Paul

Link to comment
20 minutes ago, Pauven said:

I was the original discoverer of the Ryzen stability issue and C-state solution.  My server is extremely susceptible to the C-state issue, typically crashing in 4-8 hours when the issue is present.

 

I'm running 6.4.0-rc7a, with C-states enabled, and my uptime is 52 days.

 

This issue is related to the linux kernel.

 

unRAID 6.4.0-rc7a uses linux kernel 4.12.3.

6.4.0-rc8q uses kernel 4.12.10

6.4.0-rc9f uses kernel 4.12.14

6.4.0-rc10b uses kernel 4.13.10

 

Typically those patch releases (last number) are bug fixes only, not massive code changes (though that's not a hard rule).  It might be interesting to upgrade to -rc9f to see if your server is still stable.
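
The distinction being drawn here (patch releases within one stable series vs. a jump to a new minor series) can be made mechanical. A small illustrative Python sketch, using the version strings from the list above:

```python
def parse_kver(v: str) -> tuple:
    """Split a kernel version like '4.12.14' into (4, 12, 14)."""
    return tuple(int(part) for part in v.split("."))

def same_stable_series(a: str, b: str) -> bool:
    """True when both kernels share major.minor, i.e. they differ only by
    bug-fix patch releases rather than belonging to different series."""
    return parse_kver(a)[:2] == parse_kver(b)[:2]

# rc7a (4.12.3) -> rc9f (4.12.14): same 4.12 series, patch fixes only
print(same_stable_series("4.12.3", "4.12.14"))   # True
# rc9f (4.12.14) -> rc10b (4.13.10): a jump to the 4.13 series
print(same_stable_series("4.12.14", "4.13.10"))  # False
```

Which is why rc9f is the interesting test: it is the last release still on the 4.12 series where the C-state fix is known to work.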

 

Have you considered chiming in on this topic?

https://bugzilla.kernel.org/show_bug.cgi?id=196683

 

I think other people there have reported "better" stability on pre-4.13 kernels, but not everyone has.

 

Personally I think this is a h/w issue with Ryzen that AMD doesn't care about because it doesn't show up on Windows.  The various Linux kernels do a better or worse job of "masking" the problem; meaning the problem is there, and timing differences in code paths just make it happen more or less frequently.

 

In our case, we cannot revert to earlier kernels because then we give up improvements/bug fixes in other subsystems such as btrfs, xfs, etc.

 

btw: do not fear the encryption!

Link to comment
On 11/14/2017 at 10:04 AM, Tuftuf said:

 

I followed the Nivida GPU passthrough guide that showed you how to dump your vbios and then provide the path to it within the XML. 

 

 

 

 

Unfortunately I'm on a B350 matx board with only one full size pci-e slot so I can't move the card to dump my bios.

 

I tried the method for editing the vbios off techpowerup and that is still causing the code 43.  I've also attempted jayseejc's tip about adding video=efifb:off without success as well.

 

I'll try swapping in another card this weekend and see if maybe that does the trick.

 

Thanks for the help.

 

P.S. Just to add to the data set: I've been running stable for a few months now on 6.3.5 with the board's original BIOS, C-states off, 2 Windows VMs, and 8-10 Docker containers.  The system would crash if any form of Linux VM was run for more than 2 hours.  I upgraded the BIOS and unRAID to rc10 this past weekend and turned C-states back on, and everything seems to be running fine other than the passthrough problems. I have not tried a Linux VM after these updates yet, though. This is all on a 1700X, an Asus B350M-A/CSM, and 2x16GB Corsair Dominator Platinum.
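
For context on the vBIOS-in-XML step being discussed: in the VM's libvirt XML, the dumped ROM is referenced from the GPU's `hostdev` stanza. A rough sketch; the PCI address and file path are illustrative placeholders, not values from anyone's system here:

```xml
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <!-- PCI address of the GPU on the host (example values) -->
    <address domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
  </source>
  <!-- Point QEMU at the dumped vBIOS instead of the on-card copy,
       which the host may have shadowed during its own boot -->
  <rom file='/mnt/user/vbios/gpu-dump.rom'/>
</hostdev>
```

The whole point of dumping from a secondary slot is that the ROM read back from the boot GPU is often already modified by the host BIOS, which is one known cause of code 43.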

Link to comment
18 hours ago, Pauven said:

 

I was the original discoverer of the Ryzen stability issue and C-state solution.  My server is extremely susceptible to the C-state issue, typically crashing in 4-8 hours when the issue is present.

 

I'm running 6.4.0-rc7a, with C-states enabled, and my uptime is 52 days.

 

I have avoided all of the recent 'Really Close' releases since 7a, as the changes just seemed too scary for me to be a guinea pig.  I think it was the introduction of the block level device encryption.  I don't plan to use it, but I have nightmares thinking that a beta version could somehow misbehave and accidentally encrypt my precious data, so that I never get it back.  I know the odds of that happening are pretty much zilch, though if it could happen it would likely happen to me.  I'm waiting for the next stable public release.

 

Anyway, perhaps something has changed since 7a that lost the fix for the Ryzen C-state issue.

 

Paul

 

 

Not sure, but it's been frustrating for sure... especially when I forget to disable C-State after a BIOS upgrade and find the system hung two or so days later.   

 

When I checked console access, it was also hung, so I needed to perform another hard reboot.  I left C-State enabled.   Once at the log-on prompt, I noticed the following line:

192.168.1.0/24, dev br0 proto kernel scope link src 192.168.1.10, linkdown

However, after the reboot the web console, all dockers, VMs and shares have been accessible.  Uptime is 21 hrs and 23 mins.  With C-State enabled, prior lockups occurred around the 2-day mark.  So needless to say, I'm very confused, as the console shows linkdown, but it's obviously up.   

Link to comment
55 minutes ago, luisv said:

So needless to say, I'm very confused as the console shows linkdown, but it's obviously up

 

This is an asynchronous process. The welcome message and link state are displayed while initialization is still taking place in the background. Depending on how fast the background activity completes (i.e. on your hardware), the link state can be shown as either up or down.

Link to comment
20 hours ago, bonienl said:

 

This is an asynchronous process. The welcome message and link state are displayed while initialization is still taking place in the background. Depending on how fast the background activity completes (i.e. on your hardware), the link state can be shown as either up or down.

 

Thanks for the explanation!

 

Any thoughts on why the system keeps freezing with C-State enabled as the fixes are supposed to be incorporated into the current RCs?    Current uptime is 1 day, 18 hours, 50 minutes.   

Link to comment

Thought I'd chime in here. Haven't read the thread (yet) as it's huge, but:

 

1950X / X399 Taichi build.

 

On 6.3.5 at the moment, and have been for a day or two now. Originally I was having lots of problems with an HBA, so I ended up leaving SVM disabled. Ran for over a day with no problems. Turned SVM on today and everything's back up, with the only change being I can't see my CPU usage on the dashboard (though it works in Stats). I should mention I've changed no settings at all, other than enabling SR-IOV, SVM, and IOMMU in BIOS.

 

Now, the problems with 6.4:

 

If I boot into 6.4 with BIOS defaults, my HBA isn't detected.

 

If I then enable IOMMU, my HBA is detected, but my logs are flooded with:

 


Nov 19 17:27:44 server kernel: pcieport 0000:40:01.3: AER: Multiple Corrected error received: id=0000
Nov 19 17:27:44 server kernel: pcieport 0000:40:01.3: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=400b(Transmitter ID)
Nov 19 17:27:44 server kernel: pcieport 0000:40:01.3:   device [1022:1453] error status/mask=00001180/00006000
Nov 19 17:27:44 server kernel: pcieport 0000:40:01.3:    [ 7] Bad DLLP              
Nov 19 17:27:44 server kernel: pcieport 0000:40:01.3:    [ 8] RELAY_NUM Rollover    
Nov 19 17:27:44 server kernel: pcieport 0000:40:01.3:    [12] Replay Timer Timeout  
Nov 19 17:27:44 server kernel: mpt3sas 0000:41:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=4100(Transmitter ID)
Nov 19 17:27:44 server kernel: mpt3sas 0000:41:00.0:   device [1000:0072] error status/mask=00001101/00002000
Nov 19 17:27:44 server kernel: mpt3sas 0000:41:00.0:    [ 0] Receiver Error         (First)
Nov 19 17:27:44 server kernel: mpt3sas 0000:41:00.0:    [ 8] RELAY_NUM Rollover    
Nov 19 17:27:44 server kernel: mpt3sas 0000:41:00.0:    [12] Replay Timer Timeout  

 

I'm hesitant to start my array in this condition, so I've gone back to 6.3.5, which (weirdly) seems to be working completely fine so far. I'm not using passthrough for any of my VMs, so perhaps that's why. I also haven't disabled C-States.

 

Does anyone have anything I can try? As I said, I haven't read through the thread yet, so there could well be something simple I'm missing.

server-diagnostics-20171118-1501.zip

server-diagnostics-20171119-1730.zip

Link to comment

I tried :)  It's an issue I live with daily; generally I use Safari for most things if I'm actually on a Mac, but then I can't see CPU usage. Webhooks issue. 

 

I'm still tempted by a 1950X, but it seems they are struggling a little more than the Ryzen range with passthrough and related issues. 

 

I suggest posting in the Pre Release section with the error and diags for 6.4.

 

EDIT - Looks like you have already :)

 

 

Edited by Tuftuf
Link to comment
4 hours ago, Tuftuf said:

I tried :)  It's an issue I live with daily; generally I use Safari for most things if I'm actually on a Mac, but then I can't see CPU usage. Webhooks issue. 

 

I'm still tempted by a 1950X, but it seems they are struggling a little more than the Ryzen range with passthrough and related issues. 

 

I suggest posting in the Pre Release section with the error and diags for 6.4.

 

EDIT - Looks like you have already :)

 

 

 

I agree.  I almost got the 1950X, and may still once passthrough and IOMMU are fixed.   I am running an R7 1700 at 3.9GHz with a Kraken X61 cooler with no issues.     The stock Wraith Spire ran at 3.8GHz with no issues, and I don't want to use 1.7V to get to 4GHz, so I will stick with a stable 3.9GHz.  This will last me until the price on TR4 comes down or its bugs are fixed.

Link to comment

The system froze a few hours ago and I couldn't SSH nor log in via the console, so I had to perform a hard reboot.  Upon reboot I disabled C-State.  With C-State enabled, the longest uptime was around 2 days, with it disabled, it was 20 days and 11 hours... I restarted the system due to a BIOS upgrade.   If anyone wants to see a diagnostic file or if they feel I should upgrade to RC13 please let me know.   

Link to comment
34 minutes ago, luisv said:

The system froze a few hours ago and I couldn't SSH nor log in via the console, so I had to perform a hard reboot.  Upon reboot I disabled C-State.  With C-State enabled, the longest uptime was around 2 days, with it disabled, it was 20 days and 11 hours... I restarted the system due to a BIOS upgrade.   If anyone wants to see a diagnostic file or if they feel I should upgrade to RC13 please let me know.   

 

You've definitely seen the same crash with C-States disabled?

Link to comment

Just FYI: on previous releases my 1700 had to have C-States disabled, otherwise it would hang within 24 hrs. But on later releases, enabling C-States seems to cause no issues.

Yesterday I upgraded to the latest BIOS (with the new AGESA 1071) and RC13; I'll let it run for a while to check.

 

Since I never leave it on for 2 days or more, I can't say whether it's stable or not.

 

Update: With the latest BIOS (3203) & RC13, I found the messages below in the syslog. I tried changing the setting in BIOS (auto/enabled/disabled); same result. The BIOS version can't be downgraded.

Does that mean the kernel won't control C-States? 9_9

 

Nov 23 04:25:26 X370 kernel: ACPI Error: Needed [Integer/String/Buffer], found [Region] ffff88081e872318 (20170531/exresop-424)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
Nov 23 04:25:26 X370 kernel: [Firmware Bug]: ACPI MWAIT C-state 0x0 not supported by HW (0x0)
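
One way to see what the kernel is actually driving, rather than guessing from those firmware-bug lines, is the standard Linux cpuidle sysfs interface. A diagnostic sketch; the driver and state names vary by kernel and hardware:

```shell
# Which cpuidle driver (if any) is managing C-states
cat /sys/devices/system/cpu/cpuidle/current_driver

# The idle states the kernel exposes for CPU 0, and how often each was entered
grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name
grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/usage
```

If no driver is loaded or only a shallow state is listed, the BIOS toggle is effectively moot, which would be consistent with the messages above.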

Edited by Benson
Link to comment
On 11/20/2017 at 4:05 AM, -Daedalus said:

Thought I'd chime in here. Haven't read the thread (yet) as it's huge, but:

 

1950X / X399 Taichi build.

 

On 6.3.5 at the moment, and have been for a day or two now. Originally I was having lots of problems with an HBA, so I ended up leaving SVM disabled. Ran for over a day with no problems. Turned SVM on today and everything's back up, with the only change being I can't see my CPU usage on the dashboard (though it works in Stats). I should mention I've changed no settings at all, other than enabling SR-IOV, SVM, and IOMMU in BIOS.

 

Now, the problems with 6.4:

 

If I boot into 6.4 with BIOS defaults, my HBA isn't detected.

 

If I then enable IOMMU, my HBA is detected, but my logs are flooded with:

 

 

I'm hesitant to start my array in this condition, so I've gone back to 6.3.5, which (weirdly) seems to be working completely fine so far. I'm not using passthrough for any of my VMs, so perhaps that's why. I also haven't disabled C-States.

 

Does anyone have anything I can try? As I said, I haven't read through the thread yet, so there could well be something simple I'm missing.

server-diagnostics-20171118-1501.zip

server-diagnostics-20171119-1730.zip

 

Sad that TR has these results. BTW, thanks for your post.

 

Would you try disabling PCIe ASPM?

Edited by Benson
Link to comment

So I got my Crosshair VI Hero board and ran sensors-detect; here is my output.

Any ideas how I can force-add sensors to unRAID so I can see my CPU temps?

 

# sensors-detect revision 6284 (2015-05-31 14:00:33 +0200)
# Board: ASUSTeK COMPUTER INC. CROSSHAIR VI HERO
# Kernel: 4.13.13-unRAID x86_64
# Processor: AMD Ryzen 7 1700 Eight-Core Processor (23/1/1)

This program will help you determine which kernel modules you need
to load to use lm_sensors most effectively. It is generally safe
and recommended to accept the default answers to all questions,
unless you know what you're doing.

Some south bridges, CPUs or memory controllers contain embedded sensors.
Do you want to scan for them? This is totally safe. (YES/no): YES
Silicon Integrated Systems SIS5595...                       No
VIA VT82C686 Integrated Sensors...                          No
VIA VT8231 Integrated Sensors...                            No
AMD K8 thermal sensors...                                   No
AMD Family 10h thermal sensors...                           No
AMD Family 11h thermal sensors...                           No
AMD Family 12h and 14h thermal sensors...                   No
AMD Family 15h thermal sensors...                           No
AMD Family 16h thermal sensors...                           No
AMD Family 15h power sensors...                             No
AMD Family 16h power sensors...                             No
Intel digital thermal sensor...                             No
Intel AMB FB-DIMM thermal sensor...                         No
Intel 5500/5520/X58 thermal sensor...                       No
VIA C7 thermal sensor...                                    No
VIA Nano thermal sensor...                                  No

Some Super I/O chips contain embedded sensors. We have to write to
standard I/O ports to probe them. This is usually safe.
Do you want to scan for Super I/O sensors? (YES/no):
Probing for Super-I/O at 0x2e/0x2f
Trying family `National Semiconductor/ITE'...               No
Trying family `SMSC'...                                     No
Trying family `VIA/Winbond/Nuvoton/Fintek'...               No
Trying family `ITE'...                                      Yes
Found unknown chip with ID 0x8665
    (logical device 4 has address 0x290, could be sensors)
Probing for Super-I/O at 0x4e/0x4f
Trying family `National Semiconductor/ITE'...               No
Trying family `SMSC'...                                     No
Trying family `VIA/Winbond/Nuvoton/Fintek'...               No
Trying family `ITE'...                                      No

Some systems (mainly servers) implement IPMI, a set of common interfaces
through which system health data may be retrieved, amongst other things.
We first try to get the information from SMBIOS. If we don't find it
there, we have to read from arbitrary I/O ports to probe for such
interfaces. This is normally safe. Do you want to scan for IPMI
interfaces? (YES/no):
Probing for `IPMI BMC KCS' at 0xca0...                      No
Probing for `IPMI BMC SMIC' at 0xca8...                     No

Some hardware monitoring chips are accessible through the ISA I/O ports.
We have to write to arbitrary I/O ports to probe them. This is usually
safe though. Yes, you do have ISA I/O ports even if you do not have any
ISA slots! Do you want to scan the ISA I/O ports? (YES/no):
Probing for `National Semiconductor LM78' at 0x290...       No
Probing for `National Semiconductor LM79' at 0x290...       No
Probing for `Winbond W83781D' at 0x290...                   No
Probing for `Winbond W83782D' at 0x290...                   No

Lastly, we can probe the I2C/SMBus adapters for connected hardware
monitoring devices. This is the most risky part, and while it works
reasonably well on most systems, it has been reported to cause trouble
on some systems.
Do you want to probe the I2C/SMBus adapters now? (YES/no):
Found unknown SMBus adapter 1022:790b at 0000:00:14.0.
Sorry, no supported PCI bus adapters found.
Module i2c-dev loaded successfully.

Next adapter: SMBus PIIX4 adapter port 0 at 0b00 (i2c-0)
Do you want to scan it? (YES/no/selectively):
Client found at address 0x50
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 No
Probing for `EDID EEPROM'...                                No
Client found at address 0x51
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 No
Client found at address 0x52
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 No
Client found at address 0x53
Probing for `Analog Devices ADM1033'...                     No
Probing for `Analog Devices ADM1034'...                     No
Probing for `SPD EEPROM'...                                 No

Next adapter: SMBus PIIX4 adapter port 2 at 0b00 (i2c-1)
Do you want to scan it? (YES/no/selectively):

Next adapter: SMBus PIIX4 adapter port 3 at 0b00 (i2c-2)
Do you want to scan it? (YES/no/selectively):

Next adapter: SMBus PIIX4 adapter port 4 at 0b00 (i2c-3)
Do you want to scan it? (YES/no/selectively):

Sorry, no sensors were detected.
Either your system has no sensors, or they are not supported, or
they are connected to an I2C or SMBus adapter that is not
supported. If you find out what chips are on your board, check
http://www.lm-sensors.org/wiki/Devices for driver status.
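
sensors-detect found an ITE Super I/O chip with the unknown ID 0x8665, which the stock it87 driver of this era doesn't recognize. A common, unsupported workaround is forcing the driver to treat the chip as a model it does know, via the driver's `force_id` module parameter. This is a hedged sketch, not a confirmed fix for this board, and the substitute ID is a guess that may misreport or miss sensors entirely:

```shell
# Force-load it87 pretending the chip is a supported ITE model.
# 0x8628 (IT8628E) is a commonly tried stand-in for newer ASUS boards;
# readings from an actual IT8665E may be wrong or incomplete.
modprobe it87 force_id=0x8628
sensors   # check whether any readings appear
```

If it works, the modprobe line would need to go in the `go` file to survive reboots, since unRAID's root filesystem is rebuilt at every boot.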

 

Link to comment
On 11/16/2017 at 9:11 PM, Clay Smith said:

Unfortunately I'm on a B350 matx board with only one full size pci-e slot so I can't move the card to dump my bios.

 

I tried the method for editing the vbios off techpowerup and that is still causing the code 43.  I've also attempted jayseejc's tip about adding video=efifb:off without success as well.

 

I'll try swapping in another card this weekend and see if maybe that does the trick.

 

Thanks for the help.

 

P.S. Just to add to the data set: I've been running stable for a few months now on 6.3.5 with the board's original BIOS, C-states off, 2 Windows VMs, and 8-10 Docker containers.  The system would crash if any form of Linux VM was run for more than 2 hours.  I upgraded the BIOS and unRAID to rc10 this past weekend and turned C-states back on, and everything seems to be running fine other than the passthrough problems. I have not tried a Linux VM after these updates yet, though. This is all on a 1700X, an Asus B350M-A/CSM, and 2x16GB Corsair Dominator Platinum.

 

I think you are on to something with Linux VMs crashing the system. I swapped my motherboard into my existing unRAID setup: 

 

Unraid 6.3.5

 

Old System 

Asus Sabertooth X79

3930k

SUPERMICRO AOC-SAS2LP-MV8 PCI-Express 2.0 x8 SATA

64 GB RAM

 

New Hardware

Biostar x370gtn

Ryzen 1800x

G.SKILL Ripjaws V Series 16GB x 2

 

I intended to run headless, and with the C-State issues still going on, I disabled them on my Ryzen system. I did some trial and error to make sure the machine would run headless. 

 

I was doing some simple bench tests on VMs and all was fine until I launched the Ubuntu VM. I actually need those VMs up at all times, so I didn't go on experimenting; it's my main server. I don't mind messing around, but not at the expense of my data. I'm staying with my Intel machine for now.

 

I didn't want to upgrade to a release candidate; the last time I upgraded versions of Unraid, I had major growing pains.

 

I'm scavenging hard drives to use for experimenting on the Ryzen build. I don't plan to do any GPU passthrough; I'm going to verify the issue with Linux VMs as well as see just how stable the RCs are. I'll just import all the VMs and try to recreate the crash I had in 6.3.5.

Link to comment
