Hardware upgrade woes


I've been running UnRAID for a long time... likely 10 years at this point, and for most of that time I've been running great on a Xeon E3 1230 with 32GB of ECC memory. I have 3 SAS2LP controllers that tie into the case backplane where I currently have 19 disks running. Every so often I've had video playback stuttering... either through Plex, or if I have a drive mapped and start playing a video using a Win10 PC.

 

As my hardware was getting on in years I thought I'd upgrade to a Ryzen system: ASUS Pro WS X570-ACE MB, Ryzen 3900X and 32GB of ECC RAM.

 

My issue is that since switching I am getting a ton more buffering (to the point where videos can be unplayable), and I really don't understand why. Other than MB, CPU and RAM the infrastructure is the same... and everything should be much faster (on Passmark I went from an 8K on the Xeon to 32K on the Ryzen).

 

I'm looking to run the disk tunables scripts to ensure those are set correctly as I last did it 5-6 years ago (I think I was still on UnRAID 5.X the last time I ran this), but I don't know what else to check.

 

If anyone has suggestions I would greatly appreciate them. I've included a diagnostic dump for anyone who can pull useful information from it.

 

cydstorage-diagnostics-20200702-2124.zip

Edited by bkastner
Link to comment

Please minimise simultaneous reads and writes on the array. Adjusting some of the tunable parameters could also help.

 

Your current settings:

nr_requests="128"
md_scheduler="auto"
md_num_stripes="3464"
md_queue_limit="80"
md_sync_limit="5"
md_write_method="auto"
 

Edited by Benson
Link to comment

I'm running through the full tunables testing now to see if it helps.

 

Even during a parity check, though, I am assuming I should be able to watch a few video streams without major impact. Again... other than changing MB, CPU and RAM the system is exactly the same, and I was able to do this before the hardware upgrade.

Link to comment

Okay... I've now confirmed the stutter isn't just with the parity check. I had only built the server on June 30th, so I thought it was only an issue when parity was running, but if I try to start a video now I still get massive buffering.

 

Does anyone have any suggestions? Again... for what I changed I don't understand why it would have negatively affected performance like this.

Link to comment

I have a gigabit switch that the UnRAID server ties directly into, as does my internet router. The playback was horrible, so I started copying files from UnRAID to my local PC and noticed how slow the transfer was, which started leading me down the same thoughts. Everything from the cables to the switch onward is the same as before, but with the new MB there are new NICs and I am guessing that is where the issue lies. When poking around I thought I saw my one active NIC showing 10Mbps instead of 100 or 1000, and assumed it was just reporting in error... now I am wondering if that's really the case, though I have no idea where I actually saw that reported.
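For anyone wanting to check the negotiated link speed from the command line: on Linux it is exposed under /sys/class/net/<iface>/speed, in Mb/s. Below is a minimal sketch, assuming a standard sysfs layout. The function name link_speeds and its optional directory argument are made up for illustration (the argument only exists so the function can be pointed at a test directory).

```shell
#!/bin/sh
# Print "<iface> <speed>" for every network interface found in sysfs.
# The kernel reports the speed in Mb/s; the read can fail (for example on
# loopback or when the link is down), in which case "unknown" is printed.
link_speeds() {
    root="${1:-/sys/class/net}"   # hypothetical override, for testing only
    for dev in "$root"/*; do
        [ -e "$dev/speed" ] || continue
        speed=$(cat "$dev/speed" 2>/dev/null) || speed="unknown"
        [ -n "$speed" ] || speed="unknown"
        echo "$(basename "$dev") $speed"
    done
}

link_speeds
```

A port that reports 10 while the switch and cabling are gigabit would match the symptom described here.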

 

I can talk through other parts of the network, but I assume the problem is isolated to the UnRAID server, as Plex both inside and outside the house is affected, as is local direct playback with Windows Film & TV Player.

 

The ASUS Pro WS X570-ACE has 2 NICs, one Intel and one Realtek (the Realtek is for management, I think), and they both show up in UnRAID. I think UnRAID tried to bond them when I first started up, but I didn't know whether bonding NICs from 2 different manufacturers was a good idea, so I turned it off. Maybe I should try with it turned back on?

Edited by bkastner
Link to comment

Okay... so I brought the other NIC up and bonded them to test... network speeds are definitely better and I don't have the stuttering, but it does show eth0 as 10Mbps and eth1 as 1000 Mbps... no idea why, and I am open to any suggestions to help.

 

I know others have used the same MB, so I'm not sure why I am having difficulties.

sysinfo2.png

Link to comment

Okay... last comment before bed. It seems my issue is the Realtek RTL8117 NIC, which had been assigned as eth0. I've broken the bond, switched the NICs so the Intel I211-AT is eth0, and disabled the Realtek NIC, and network performance (at least locally) is normal again. I also don't get any drop errors anymore.
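To see which kernel driver each port is bound to and whether drops are still accumulating, the same information can be read from sysfs. This is a rough sketch, assuming the usual /sys/class/net layout and a readlink that supports -f (GNU coreutils and BusyBox both do). The name nic_report and its optional directory argument are hypothetical.

```shell
#!/bin/sh
# Print "<iface> driver=<name> rx_dropped=<n> tx_dropped=<n>" per interface.
nic_report() {
    root="${1:-/sys/class/net}"   # hypothetical override, for testing only
    for dev in "$root"/*; do
        # Skip entries that are not interfaces (e.g. the bonding_masters file).
        [ -d "$dev/statistics" ] || continue
        # device/driver is a symlink to the bound kernel driver;
        # virtual interfaces such as lo or bond0 have none.
        if [ -e "$dev/device/driver" ]; then
            driver=$(basename "$(readlink -f "$dev/device/driver")")
        else
            driver="none"
        fi
        rx=$(cat "$dev/statistics/rx_dropped")
        tx=$(cat "$dev/statistics/tx_dropped")
        echo "$(basename "$dev") driver=$driver rx_dropped=$rx tx_dropped=$tx"
    done
}

nic_report
```

The counters are cumulative since boot, so running it twice a minute or so apart shows whether drops are still happening.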

 

Is there a known issue with the RTL8117? Or is it unique to me? I am hoping to get a 10Gb NIC down the road, so both on-board NICs will likely be turned off eventually, but I'd be curious to know if the issue is just my MB for some reason.

 

Hopefully everyone's Plex experience is back to normal and I can let this lie.

Link to comment
  • 5 months later...

This may be related to https://bugzilla.kernel.org/show_bug.cgi?id=209725

If you can run a v5.5 or newer kernel, try this to see if it's a workaround:

 

  # echo 0 > /sys/devices/pci0000:00/0000:00:01.2/0000:01:00.0/link/l1_aspm

 

More gory details at:

https://lore.kernel.org/r/[email protected]

 

If you can reproduce this problem and the workaround above works, please let us know at [email protected].  We would really like help to get this fixed.  The output of "sudo lspci -vv" would be helpful.
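The PCI path in the command above is specific to the original poster's machine, so it will likely differ on other boards. As a hedged sketch, the loop below lists every PCI device that exposes the l1_aspm attribute along with its current value (1 = L1 enabled, 0 = disabled). It assumes a kernel that provides the attribute under /sys/bus/pci/devices; the function name list_l1_aspm and its optional directory argument are made up for illustration. It does not decide which device is the right one to change.

```shell
#!/bin/sh
# Print "<pci-address> l1_aspm=<0|1>" for each PCI device exposing the attribute.
list_l1_aspm() {
    root="${1:-/sys/bus/pci/devices}"   # hypothetical override, for testing only
    for dev in "$root"/*; do
        [ -r "$dev/link/l1_aspm" ] || continue
        echo "$(basename "$dev") l1_aspm=$(cat "$dev/link/l1_aspm")"
    done
}

list_l1_aspm
```

The addresses printed can be matched against "lspci" output to find the NIC or the bridge above it before trying the echo 0 workaround as root.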

Link to comment
