Unraid Issues New Build



Hello everyone. I have been using Unraid for years, and in the last couple of days I decided to build a newer, faster server.

 

Intel i7-4790K 4.0 GHz CPU with 32 GB RAM and 3 gigabit NICs.

 

The problem I am having is buffering and errors. Playback buffers every second and then throws an error saying there isn't enough bandwidth. Let me explain my setup, and hopefully someone can help me figure this out.

 

Eth0 and Eth1 are bonded using 802.3ad, and I use all Ubiquiti networking in the house. On the 24-port switch, ports 23-25 are set up together as an aggregate. This bond is on my default network, which is the network that has public access.

 

Eth2 is by itself on its own VLAN, and I use this network and NIC to move my data back to the Unraid shares or to download things.

 

Right now I am moving about 1.6 TB to the Unraid server over Eth2 with no problems at all.

 

On the default network I turn my Roku Ultra on and try to stream an episode of a TV show, and it sits there with a percentage meter and takes about 3 minutes to start the show. Once it starts, it buffers like crazy. I thought, this is crazy. So then I went into the Plex app on the Roku and manually set it to use the IP of the Eth0 bond, thinking these are 2 different NICs and the Eth0 bond is a 2 Gb pipe. Nope, still a problem. I look at the Unraid dashboard and all resources are fine. Every now and then I will see a CPU core go red, but nothing major.

 

I look at the bandwidth on the bond and on Eth2, and I see the traffic from my file transfer, and the speeds are good. Then I look at Eth0 and it is good too, with plenty of room: only about 5mb outbound. However, everything buffers and then throws the error that there is not enough bandwidth. I just can't seem to figure it out.

 

My shares are set up using Most-Free, and XFS is the file system. I have 8 x 4 TB Seagate drives with no errors, and on the dashboard they are barely being used even while copying data. While copying all this data I have not assigned a cache drive yet.

 

One would think that with the potential of 3 Gb of network bandwidth and an i7 4.0 GHz CPU I would have NO bandwidth errors at all.

 

If I stop the file copy, I can then stream 2 things at a time, but that is about it before the buffering starts. With my kids and my family, I can at times have 4-6 people watching Plex at once, and my old AMD server handled that better than this one does.

 

 

If anyone has any ideas or suggestions, please share them with me.  I am about to pull my hair out. LOL

Link to comment

Please make sure the bond and eth2 are not in the same IP subnet. (Is eth2 connected to both the untagged 192.168.2.x network and the VLAN 192.168.10.x network at the same time?)

I usually don't put tagged and untagged traffic on the same interface. If the machine has more than one interface, configuring this properly on the switch side is fine and simpler.

 

eth2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.2.200  netmask 255.255.255.0  broadcast 192.168.2.255
        inet6 fe80::226:55ff:fee5:5fbb  prefixlen 64  scopeid 0x20<link>
        ether 00:26:55:e5:5f:bb  txqueuelen 1000  (Ethernet)
        RX packets 288133287  bytes 438181193969 (408.0 GiB)
        RX errors 0  dropped 144  overruns 0  frame 0
        TX packets 16044  bytes 1398463 (1.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 18  memory 0xdfd40000-dfd60000  

 

eth2.10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.10.100  netmask 255.255.255.0  broadcast 192.168.10.255
        inet6 fe80::226:55ff:fee5:5fbb  prefixlen 64  scopeid 0x20<link>
        ether 00:26:55:e5:5f:bb  txqueuelen 1000  (Ethernet)
        RX packets 135580433  bytes 425739116534 (396.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 6326  bytes 531842 (519.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 

Edited by Benson
Link to comment

Thanks for the response.

I have a couple questions about everything.

 

When you ask me not to put them in the same subnet, what do you really mean? I thought 192.168.2.x and 192.168.10.x were different subnets. Also, where in the GUI do I change the tagged and untagged settings, and how? I do have the switch set up with tagged and untagged, I think.
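
As a quick sanity check on the subnet question, here is a minimal Python sketch using the standard `ipaddress` module. It assumes the addresses and /24 netmasks shown in the ifconfig output above are what is actually configured; the bond's address is not shown in the thread, so it is not checked here.

```python
import ipaddress

# Addresses and netmasks copied from the ifconfig output earlier in the thread.
eth2 = ipaddress.ip_interface("192.168.2.200/255.255.255.0")
eth2_vlan10 = ipaddress.ip_interface("192.168.10.100/255.255.255.0")

# The network each interface address belongs to.
print(eth2.network)         # 192.168.2.0/24
print(eth2_vlan10.network)  # 192.168.10.0/24

# Two addresses share a subnet only if their networks overlap.
print(eth2.network.overlaps(eth2_vlan10.network))  # False
```

So the two addresses are in different subnets; the open question is whether one physical port should carry both the untagged and the tagged network.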

 

I do appreciate any more help you can provide to help me out.

 

Thanks,

 

AH

Link to comment

From your diagnostics, eth2 appears to be configured with two subnets. I suspect that is the source of the problem.

 

Or could you simply try disconnecting Eth1 and check whether the problem goes away? (This is to test whether the problem is related to the aggregate or not.)

 

If that makes no difference, please post a screenshot of the network settings in Unraid for reference.

Edited by Benson
Link to comment

Well, I have started moving files back to the Unraid server over Eth2 and then started streaming 1 TV show, and it still freezes and buffers.

 

I do have a question. I wanted to bond eth0 and eth1 to create a 2 Gb pipe to my switch. What is the best bonding method to use?

 

Also, if I want the fastest read and write times on the HDDs, what is the best method for that? I set my shares to Most-Free so that the space is used evenly across drives. If you have any other recommendations, please let me know.

 

Thanks

Link to comment

The network settings look normal.

 

Does the streaming buffer issue still happen if you stop the file transfer on eth2? (Ideally with no other disk activity.)


Lastly, was this test run during an array recovery? From the log, it started on 19-Mar and completed on 20-Mar.

 

Mar 19 16:27:00 MediaNAS kernel: md: recovery thread: recon P ...
Mar 19 16:27:08 MediaNAS kernel: md: recovery thread: exit status: -4

Mar 19 16:27:14 MediaNAS emhttpd: Stopping services...

 

Mar 19 16:34:47 MediaNAS kernel: md: recovery thread: recon P ...
Mar 19 16:40:52 MediaNAS kernel: md: recovery thread: exit status: -4

Mar 19 16:41:59 MediaNAS emhttpd: Stopping services...

 

Mar 19 16:43:41 MediaNAS kernel: md: recovery thread: recon P ...
Mar 19 16:44:04 MediaNAS kernel: md: recovery thread: exit status: -4


Mar 19 21:12:01 MediaNAS kernel: md: recovery thread: recon P ...
Mar 20 09:26:27 MediaNAS kernel: md: recovery thread: exit status: 0
Mar 20 11:32:52 MediaNAS emhttpd: Stopping services...
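
To make the timeline in those log lines easier to read, here is a small Python sketch that pairs each "recon P" start with the following "exit status" line and prints the elapsed time. It only uses the kernel lines quoted above. Syslog lines carry no year, so 2019 is an assumption taken from the date in the SMART report file name mentioned in this thread.

```python
import re
from datetime import datetime

# Kernel lines copied from the syslog excerpt above (emhttpd lines left out).
LOG = """\
Mar 19 16:27:00 MediaNAS kernel: md: recovery thread: recon P ...
Mar 19 16:27:08 MediaNAS kernel: md: recovery thread: exit status: -4
Mar 19 16:34:47 MediaNAS kernel: md: recovery thread: recon P ...
Mar 19 16:40:52 MediaNAS kernel: md: recovery thread: exit status: -4
Mar 19 16:43:41 MediaNAS kernel: md: recovery thread: recon P ...
Mar 19 16:44:04 MediaNAS kernel: md: recovery thread: exit status: -4
Mar 19 21:12:01 MediaNAS kernel: md: recovery thread: recon P ...
Mar 20 09:26:27 MediaNAS kernel: md: recovery thread: exit status: 0
"""

YEAR = 2019  # assumption: syslog omits the year

LINE = re.compile(r"^(\w{3} +\d+ [\d:]{8}) \S+ kernel: md: recovery thread: (.*)$")


def recovery_runs(log, year=YEAR):
    """Return (start, duration, exit_status) for each recovery attempt."""
    runs, start = [], None
    for line in log.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        when = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        msg = m.group(2)
        if msg.startswith("recon"):
            start = when
        elif msg.startswith("exit status:") and start is not None:
            status = int(msg.split(":")[1])
            runs.append((start, when - start, status))
            start = None
    return runs


for start, duration, status in recovery_runs(LOG):
    print(f"started {start}  ran {duration}  exit status {status}")
```

Read this way, the first three attempts ended within minutes with a non-zero status, and only the run that started at 21:12 on 19-Mar ran to completion, about 12 hours later.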

 

 

 

Edited by Benson
Link to comment

Yes, the buffering stops when I stop copying files, as long as only 1 stream is going at a time. But as soon as I start to copy files on Eth2, it starts to buffer and then throws an error that the server doesn't have enough bandwidth. On the Roku I have manually set it to use Eth0 and still no luck.

 

With this being a brand new build, I hadn't done a parity sync yet. I ran it last night and it completed successfully.

 

 

Link to comment

So for what it is worth.  

 

I added all 3 NICs to the bond and updated my switch to put the 3 ports in the LAG, thinking a 3 Gb pipe would be plenty of bandwidth. Of course, as soon as I start copying to the Unraid box (inbound traffic) and start streaming from Plex (outbound traffic), I get the same issues. It makes no sense to me.

Link to comment

Since you have 32 GB of RAM, could you take 2 files of about 5 GB each, cache both in memory, and see whether streaming them makes any difference?

 

How do you put a file in the cache? Just read the whole file.

How do you know both files are fully cached? There will be no more disk activity while streaming.

 

** I always cache media files in RAM when I access them, but only because I don't want the drives to spin up just to watch a movie. **
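
One way to "just read the whole file" is sketched below in Python. It relies on the Linux page cache keeping recently read data in free RAM; nothing pins the data there, so it can be evicted under memory pressure. The function name and the example path are hypothetical.

```python
def warm_page_cache(path, chunk_size=8 * 1024 * 1024):
    """Read a file from start to finish, discard the data, return bytes read.

    The read itself is what pulls the file into the kernel page cache;
    the chunk size only limits how much this process holds at once.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


# Example call (hypothetical path on an Unraid user share):
# warm_page_cache("/mnt/user/Movies/example.mkv")
```

If a second read of the same file causes no disk activity on the dashboard, the file was most likely served from RAM.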

 

Edited by Benson
Link to comment

So you are writing directly to the array with a 3 Gb stream and expecting to be able to read from the array at the same time? I see you have a cache drive, but you will quickly fill it up at 3 Gb/sec, and the writes will then bypass the cache and go straight to the array. Are these writes small data-type files or Blu-ray ISOs? Small files can generate tremendous amounts of file handling overhead.
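
To put a rough number on "quickly fill it up", here is a back-of-the-envelope Python sketch. The cache sizes are hypothetical (the thread does not state the SSD's capacity), and the rates are raw line rates, so real transfers would take somewhat longer.

```python
def seconds_to_fill(capacity_gb, rate_gbit_per_s):
    """Seconds to write capacity_gb (decimal gigabytes) at a rate in gigabits/s."""
    return capacity_gb * 8 / rate_gbit_per_s


# Hypothetical cache sizes; 3 Gb/s is the full 3-NIC aggregate, 1 Gb/s a single link.
for size_gb in (250, 500):
    for rate in (3, 1):
        minutes = seconds_to_fill(size_gb, rate) / 60
        print(f"{size_gb} GB at {rate} Gb/s: about {minutes:.0f} minutes")
```

Under these assumptions a 250 GB cache would be full in roughly 11 minutes at 3 Gb/s, or about 33 minutes at a single gigabit link's rate.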

 

Plus in the SMART section is this report:

ST4000DM000-1F2168_Z30158FJ-20190320-0234 parity (sdb) - DISK_INVALID.txt

Is this true?  What is going on here?  Are you rebuilding parity while you are attempting to read and write from the array? 

Link to comment

The Parity hadn't been synced yet.   I did resolve that problem and all those errors are gone.

 

I will be introducing a cache drive after I copy all my data to the new build. I have about 4 TB of data left to move, and yes, that would fill my cache drive really quickly.

 

Yes the files remaining are movies and they all have a .mkv extension and range from 1.5 GB to 75 GB in size each.  

 

I see what you are saying, but I assumed my server could handle it. Also, if it helps, I did divide my SATA drives across 2 PCIe cards and the SATA ports built into the motherboard.

 

 

Link to comment

If you are filling up your array, you could be saturating the hard drive, as it could now be the slowest link in the chain, and buffering could become an issue if you are both writing a new file and reading a different file from the same disk. (Most of the time the slowest link has been the network speed!) As I recall, most hard drives cannot use all of the bandwidth that SATA2 provides.

Link to comment

So after 2 days of tinkering with this, hoping that using the cache drive would resolve the problem, it hasn't.

 

For example, if I am using the Roku Ultra on a network cable (not on wifi), watching a simple 4 Mbps 720p TV show like Big Bang, and the mover starts, everything stops and it throws an error that the server doesn't have enough bandwidth.

 

Is there anything I am missing..   

 

The server build itself is:

4790K i7 CPU
32 GB RAM
8 HDDs connected to 2 PCIe SATA cards and the motherboard ports
1 SSD cache
3 x 1 Gb NICs: 2 are bonded and 1 is on a different subnet

 

As I have said, there are plenty of resources here for the server to multitask.

 

If anyone can help that would be great.

 

Thanks in advance for any help.

 

Link to comment

Ok after posting my last post I thought of 1 more test to run.

 

I copied 28 GB to the cache drive. Then I opened Plex on my Roku and changed the IP address to the eth2 NIC, which is not bonded and is on a different subnet. Remember, eth0 and eth1 are bonded and my Ubiquiti switch has been set to aggregate.

 

So I started playing an 8 Mbps 1080p TV show and then started the mover, and there were no problems at all. The mover finished moving the data and the TV show didn't buffer a single time.

 

 

Any recommendations on bonding methods would be greatly appreciated.

 

 

Link to comment

Hi again,

 

So if I'm reading your message correctly, when you are copying data to the cache drive in this last test, that copy is happening over a bonded connection on Unraid.  Separately your Roku is then accessing the system over a separate NIC port, and then everything is working fine.  But if you try to copy over a bonded connection while trying to play content on that same bonded connection, you have an issue with buffering, yes?

 

For more information on the various bonding methods Unraid supports, you can click the help text on the Network Settings page.  Here's a copy of that text:

 

Quote

Mode 0 (balance-rr)
This mode transmits packets in a sequential order from the first available slave through the last. If two real interfaces are slaves in the bond and two packets arrive destined out of the bonded interface the first will be transmitted on the first slave and the second frame will be transmitted on the second slave. The third packet will be sent on the first and so on. This provides load balancing and fault tolerance.

 

Mode 1 (active-backup) - default
This mode places one of the interfaces into a backup state and will only make it active if the link is lost by the active interface. Only one slave in the bond is active at an instance of time. A different slave becomes active only when the active slave fails. This mode provides fault tolerance.

 

Mode 2 (balance-xor)
This mode transmits packets based on an XOR formula. Source MAC address is XOR'd with destination MAC address modulo slave count. This selects the same slave for each destination MAC address and provides load balancing and fault tolerance.

 

Mode 3 (broadcast)
This mode transmits everything on all slave interfaces. This mode is least used (only for specific purpose) and provides only fault tolerance.

 

Mode 4 (802.3ad)
This mode is known as Dynamic Link Aggregation. It creates aggregation groups that share the same speed and duplex settings. It requires a switch that supports IEEE 802.3ad dynamic link. Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option. Note that not all transmit policies may be 802.3ad compliant, particularly in regards to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Different peer implementations will have varying tolerances for noncompliance.

 

Mode 5 (balance-tlb)
This mode is called Adaptive transmit load balancing. The outgoing traffic is distributed according to the current load and queue on each slave interface. Incoming traffic is received by the current slave.

 

Mode 6 (balance-alb)
This mode is called Adaptive load balancing. This includes balance-tlb + receive load balancing (rlb) for IPV4 traffic. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the server on their way out and overwrites the src hw address with the unique hw address of one of the slaves in the bond such that different clients use different hw addresses for the server.

 

Mode 1 (active-backup) is the recommended setting. Other modes allow you to set up a specific environment, but may require proper switch support. Choosing an unsupported mode can result in disrupted communication.
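
To illustrate the XOR formula quoted for Mode 2 (and described as the default policy for Mode 4), here is a simplified Python sketch. It follows the help text literally, (source MAC XOR destination MAC) modulo slave count; the real bonding driver's hash differs in detail. The server MAC is the eth2 address from the ifconfig output, used only as a stand-in for the bond's MAC, and the client MACs are made up.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer."""
    return int(mac.replace(":", ""), 16)


def xor_slave(src_mac, dst_mac, slave_count):
    """Simplified balance-xor choice: (src XOR dst) modulo slave count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % slave_count


server = "00:26:55:e5:5f:bb"  # stand-in for the bond's MAC (eth2's MAC in the thread)
clients = {                   # hypothetical client MAC addresses
    "roku": "aa:bb:cc:00:00:01",
    "desktop": "aa:bb:cc:00:00:02",
}

for name, mac in clients.items():
    print(name, "-> slave", xor_slave(server, mac, slave_count=2))
```

The point of the sketch is that a given client always lands on the same slave, so under a hash like this a single stream or file copy is limited to one link's speed even though the bond's combined capacity is higher.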

 

Link to comment
