Frequent Unraid crashes on 6.9.1


I recently migrated from an older Intel server build to an AMD CPU, and since then I have had frequent crashes where the server becomes unresponsive and I have to reboot it. Below are the specs I am running:

 

M/B: ASUSTeK COMPUTER INC. ROG STRIX X370-F GAMING Version Rev X.0x - s/n: 180221242700511

BIOS: American Megatrends Inc. Version 5603. Dated: 07/28/2020

CPU: AMD Ryzen 7 3700X 8-Core @ 3600 MHz

Cache: 512 KiB, 4 MB, 32 MB

Memory: 32 GiB DDR4 (max. installable capacity 128 GiB)

 

I have added my syslogs.

 

I have:

Set Global C-state control to disabled

Set Power Supply Idle Control to Typical Current Idle

syslog-192.168.1.50.log

unraid-diagnostics-20210315-0714.zip

Edited by antonio3427

Since changing the memory speed to 2400 MT/s, I'm seeing much longer periods between crashes, around 8-10 hours:

 

M/B: ASUSTeK COMPUTER INC. ROG STRIX X370-F GAMING Version Rev X.0x - s/n: 180221242700511

BIOS: American Megatrends Inc. Version 5603. Dated: 07/28/2020

CPU: AMD Ryzen 7 3700X 8-Core @ 3600 MHz

Memory: 32 GiB DDR4 (max. installable capacity 128 GiB)

 Corsair CMK32GX4M2B3200C16, 16 GiB DDR4 @ 2400 MT/s 

 Corsair CMK32GX4M2B3200C16, 16 GiB DDR4 @ 2400 MT/s

 

  • 2 weeks later...

Has anyone been able to figure this out? I have done the following:

 

Turned my memory down to 2400 MT/s, which my motherboard supports.

Set Global C-state control to disabled

Set Power Supply Idle Control to Typical Current Idle

Adding 'rcu_nocbs=0-15' in syslinux config

 

While I have been getting 1-2 stable days, I still get crashes.

 

At this point I may go back to an Intel motherboard (Supermicro) and an Intel CPU.

On 3/17/2021 at 2:04 PM, antonio3427 said:

 Corsair CMK32GX4M2B3200C16, 16 GiB DDR4 @ 2400 MT/s 

 

You ought to be able to clock it faster than that. I don't see your specific memory on the Asus QVL though - I might have missed it because it's difficult to search. With your motherboard and a 3700X the full 3200 MT/s is possible with the correct RAM, without overclocking the memory controllers. That said, 2400 MT/s should be easily achievable, so leave it at that speed for the time being. Have you run a MemTest? Preferably, download the latest free version of MemTest86 and make a dedicated USB stick for it - NOT your Unraid boot device!

 

6 hours ago, antonio3427 said:

Set Global C-state control to disabled

 

Turn that back on.

 

6 hours ago, antonio3427 said:

Adding 'rcu_nocbs=0-15' in syslinux config

 

Don't use that, either.

 

6 hours ago, antonio3427 said:

Set Power Supply Idle Control to Typical Current Idle

 

This^ is the only one you might need, and even that's probably cosmetic with a 3000-series CPU. But it does no harm.

 

BOOT_IMAGE=/bzimage pcie_acs_override=downstream,multifunction vfio_iommu_type1.allow_unsafe_interrupts=1 initrd=/bzroot

 

Revert this to the default, for the time being. In other words:

 

BOOT_IMAGE=/bzimage initrd=/bzroot
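
If you want to double-check which extras are on your boot line compared with the default above, here is a rough Python sketch. It only compares two strings; the function name `extra_boot_params` is made up for illustration, and nothing here reads or edits your syslinux config.

```python
# Default boot line, as quoted in the post above.
DEFAULT_CMDLINE = "BOOT_IMAGE=/bzimage initrd=/bzroot"


def extra_boot_params(cmdline, default=DEFAULT_CMDLINE):
    """Return the parameters in cmdline that are not in the default line.

    Hypothetical helper: splits both lines on whitespace and keeps the
    original order of whatever is left over.
    """
    default_params = set(default.split())
    return [param for param in cmdline.split() if param not in default_params]


if __name__ == "__main__":
    # The boot line quoted earlier in this post.
    current = (
        "BOOT_IMAGE=/bzimage pcie_acs_override=downstream,multifunction "
        "vfio_iommu_type1.allow_unsafe_interrupts=1 initrd=/bzroot"
    )
    for param in extra_boot_params(current):
        print(param)
```

On a running Linux system the active line can be read from /proc/cmdline and passed in as `cmdline`. Anything the function returns is a candidate to remove while you are testing for stability.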

 

You have the latest BIOS, so nothing to change there. What power supply are you using?

 

Run the MemTest for at least 24 hours, then see how long your server runs if you start it in Safe Mode, without Docker containers and VMs.

 
