Copy Performance & Replacing Updated Files Issues


Recommended Posts

Forgive me if other topics already address this - I did a cursory search and couldn't find it brought up before, though it seems unlikely to be an unknown issue. Maybe I just couldn't find a previous thread that covers these things.

Either way, I am looking for some help. ;-)

First of all, I am running an UnRAID server that is probably pushing some limits. It's a SuperMicro box from a few years ago that has dual Xeon E5-2600 processors, 16GB of DDR3 RAM, and an LSI 9210-8i HBA/RAID card in IT (Initiator Target) mode, with 24 hot-swap SATA/SAS drive bays and dual 1200W power supplies.

I started with 4 x 14TB Seagate IronWolf Pro drives, 4 x 8TB Western Digital Red NAS drives, and 1 x 6TB Western Digital Black drive, and the remainder was filled with 2TB drives, some Seagate and some Western Digital: some WD Greens, some WD Blacks, some WD Enterprise drives, and a couple of Seagate Barracudas.

 

I have slowly been replacing the 2TB drives with more 14TB IronWolf Pros as we needed more space and could afford it. I have now replaced all but 4 of the 2TB drives, leaving 2 x Barracudas and 2 x WD Enterprise. Three are 7200RPM SATA 6Gb/s with 64MB cache; one of the WDs is the same except it's 3Gb/s SATA, and it will be the next one replaced. Not sure if all that is relevant, but it can't hurt.

For anyone doing the math, that means I now have 15 x 14TB drives and the array itself is now 242TB. Probably a unique build, I suspect, but so are my disk requirements.

There are 3 things I am experiencing, and I am wondering whether they are unique to my build. I am hoping someone might have some light to shed that could help address these performance issues.

First, I deal with a lot of video files and they get updated at times. The issue is that sometimes, when I am copying a new version of a file over (exact same name), I run into an error on my Windows 10 box during the copy. I can retry and usually it will resume and finish, but it might take 2, 3, or even 4 attempts before that happens.

I think the problem is that the old file sits on an individual disk in the array that is nearly full and, since the new file is larger, that disk cannot accommodate the new size; hence the error. Retrying seems to eventually move the file to another disk with adequate space.

Of course, this is in theory. But I also experience this same issue when I get low on space in the array overall. I may have a few GB free on a number of disks and more than enough free space overall on the array, but no single disk with enough room to hold the file, and the same thing occurs. I then replace one of the 2TB drives with another 14TB and the problem goes away, so I've learned to keep adequate free space on the array.

But that doesn't help with replacing files with newer, larger versions. Since these are video files, they typically start at 1-2 GB, but newer versions can reach 8-10 GB or more. If I delete the original before the copy, UnRAID seems to seek out the best place to put the file and the error doesn't occur; but if I wait for the error, the original file has already been deleted and it is just a matter of retrying a few times before it writes the file elsewhere successfully.

The error I am getting is pretty consistent, I believe, though I haven't recorded it every single time. Here is what I am getting currently as I try to update some files:

"An unexpected error is keeping you from copying the file. If you continue to receive this error, you can use the error code to search for help with this problem."

"Error 0x8007003A: The specified server cannot perform the requested operation."

If it helps, I am running a Windows domain and the array is joined to it for permissions, but I do not think it is a permissions issue. The error itself hasn't really shed much light.

The 2nd issue is performance when copying. The UnRAID server has dual 100Mb built-in NICs and I have bonded them, thinking I would get better performance that way. Instead, copying these large files I get at best 60-80MB/s, but often it drops to 10MB/s or less... sometimes even as low as 2MB/s. Everything else in the network is capable of 1Gb.

And when I copy files in succession (say a 40GB-100GB group of files or more in one go), performance seems to drop the longer it runs (but not always). Other times I get only 6MB/s right out of the gate, but it then spikes to 100MB/s or more for short periods. It isn't consistent like our old array.

I don't need blazing performance but this seems to be a bottleneck that rears its head at the most inopportune times. Could bonding be contributing to the issue? Could it be drivers - either NIC or for the LSI adapter? I am considering adding quicker NICs but am concerned that the problem might not go away with that change.

This also contributes to the 3rd issue. Multiple users may be viewing these videos at any one time. It is only 3-5 users at max right now, but if I am updating files or the monthly parity check is running (which now takes 18 hours or more), playback can become an issue as well.

We aren't running a lot of users, and I didn't think this server would struggle with this since it is still more capable than most NAS boxes you can buy. I am hoping I am just running into driver issues, or maybe stretching UnRAID into territory it isn't suited for, but other than these issues the box has been pretty stable and has met our needs.

Anyone got any insight into what I may be running into? I don't use a cache drive because these files are so big that it often exhausts the cache too quickly when I am transferring large amounts of files. I was running a Dell PowerVault previously but we were running out of space even with 2 expansion boxes already added because the array was limited to 2TB SAS drives.

In the end we were at 72TB, couldn't expand further, and sourcing replacement drives was becoming an issue. It did, however, perform much better, probably due to RAID 5 and the combined performance of multiple drives in the array, but it also used a lot of power. Our bills have dropped considerably with the replacement SuperMicro, not to mention the heat it gave off was hard to cool; our cooling system has run less since the cutover.

I did expect performance to drop since we would lose the combined striping performance of RAID 5, and we could accept that given the other features we would gain (larger drives, mixed sizes, and the ability to replace them over time), but this seems like more than that.

So I am looking for some suggestions - maybe I shouldn't have bonded the NICs, maybe I should upgrade to some 1Gb NICs, maybe it's a driver issue, maybe I should look at a newer server, or maybe UnRAID isn't the best option for this purpose. It could still function as an adequate backup, but should I look at FreeNAS or something ZFS-based instead?

Anyone able to bestow some wisdom here? I thank you in advance for any advice that can assist with my problems. Thanks.

Motor

Edited by Motor
Link to comment
16 hours ago, Motor said:

this is in theory

There are definitely some gaps in your theory. Let me see if I can fill in some of the details of how things work.

 

Each data disk in the Unraid parity array is an independent filesystem. Each disk, and all files on that disk, can be read independently from all the others. Folders can span disks (User Shares), but files cannot.

 

Once Unraid has chosen a disk to write a file to, if the file won't fit, the write fails. Unraid has no way of knowing how large a file will become when it chooses a disk for it.

 

Each User Share has a Minimum Free setting. If a disk has more than Minimum Free, the disk can be chosen depending on other settings and conditions for the User Share.

 

For example, Minimum Free for the User Share is set to 10GB. The disk has 15GB free. Since that is more than Minimum, the disk can be chosen. If you write a 10GB file to the share, and the disk is chosen, the write will succeed because there is enough space. After that successful write, the disk will have 5GB free, which is less than Minimum, so it won't be chosen again for that User Share (but see below for other conditions that might override that).

 

Another example, Minimum Free for the User Share is set to 10GB. The disk has 15GB free. Since that is more than Minimum, the disk can be chosen. If you write a 20GB file to the share, and the disk is chosen, the write will fail because there isn't enough space.

 

Once a disk has been chosen for the file, it never stops and decides it needs to move things around or choose another disk. The write just fails if it runs out of space.
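
Here is a rough Python sketch of those rules, just to make them concrete. It is only an illustration of the behavior described above, not Unraid's actual code; the names are invented and the disk-picking order is simplified (see Allocation Method below):

```python
# Illustrative sketch only, not Unraid's implementation. A disk is
# eligible if its free space exceeds the share's Minimum Free; the
# chosen disk is never revisited, and the write simply fails if the
# file turns out not to fit.

def choose_disk(disks, minimum_free):
    for disk in disks:
        if disk["free"] > minimum_free:
            return disk
    return None  # no eligible disk

def write_file(disks, minimum_free, file_size):
    disk = choose_disk(disks, minimum_free)
    if disk is None:
        return "error: no disk above Minimum Free"
    # File size was unknown at selection time; only now do we learn
    # whether it actually fits.
    if file_size > disk["free"]:
        return "error: write fails, no retry on another disk"
    disk["free"] -= file_size
    return f"written to {disk['name']}"

# The second example above: 15GB free, Minimum Free 10GB, 20GB file.
disks = [{"name": "disk1", "free": 15}, {"name": "disk2", "free": 200}]
print(write_file(disks, minimum_free=10, file_size=20))  # write fails
```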

 

Cache (or pools in 6.9+) also has a Minimum Free setting that works in a similar manner. For versions before 6.9, Minimum Free for cache is in Global Share Settings. For 6.9+ with multiple pools, there is a separate setting for each pool; click on the first disk in the pool to get to the settings page for the pool. (edit: I added more about this in a later post below).

 

You must set Minimum Free for the User Share (and for cache or pool if the share uses one) to something larger (maybe much larger in your case) than the largest file you expect to write to the share.

 

There are other conditions that affect which disk is chosen. The most important one for your scenario is which disk is chosen when you overwrite a file. The answer is the same disk the file is already on, regardless of any other conditions, including Minimum Free. Also, Split Level overrides Minimum Free and Allocation Method, so if Split Level says a file belongs on the same disk as other files, that disk will be chosen.
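
In sketch form, that precedence looks something like this (again invented names, not Unraid code):

```python
# Illustration of the precedence just described (not Unraid code).
# Overwrite beats everything; Split Level beats Minimum Free and
# Allocation Method.

def pick_disk(path, disks, minimum_free):
    # 1. Overwrite: an existing file stays on its current disk,
    #    regardless of Minimum Free or anything else.
    for d in disks:
        if path in d["files"]:
            return d
    # 2. Split Level: if the file's folder is pinned to a disk,
    #    that disk wins even if it is below Minimum Free.
    parent = path.rsplit("/", 1)[0]
    for d in disks:
        if parent in d["pinned_folders"]:
            return d
    # 3. Otherwise the Allocation Method picks among disks above
    #    Minimum Free (most-free shown here for simplicity).
    eligible = [d for d in disks if d["free"] > minimum_free]
    return max(eligible, key=lambda d: d["free"]) if eligible else None

disks = [
    {"name": "disk1", "free": 5,  "files": {"Movies/old.mkv"}, "pinned_folders": set()},
    {"name": "disk2", "free": 80, "files": set(),              "pinned_folders": set()},
]
# Overwriting goes back to the nearly full disk1, which is exactly
# the situation in the original post.
print(pick_disk("Movies/old.mkv", disks, minimum_free=10)["name"])
```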

 

Allocation Method is another factor in choosing a disk. Highwater is the default and should be good enough for most purposes; in most cases it is better than the other Allocation Method settings.
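
For what it's worth, here is my understanding of the Highwater idea in sketch form (a simplification, not the actual algorithm):

```python
# Rough sketch of Highwater as I understand it, not Unraid's code.
# The mark starts at half the largest disk and halves whenever no
# disk has free space above it, so writes tend to fill one disk at
# a time instead of bouncing between all of them.

def highwater_choice(disks):
    mark = max(d["size"] for d in disks) // 2
    while mark > 0:
        for d in disks:  # lowest-numbered disk first
            if d["free"] > mark:
                return d
        mark //= 2  # every disk is below the mark; lower it
    return None

disks = [{"name": "disk1", "size": 14000, "free": 6500},
         {"name": "disk2", "size": 14000, "free": 9000}]
print(highwater_choice(disks)["name"])  # disk2: above the 7000GB mark
```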

 

16 hours ago, Motor said:

probably pushing some limits

The limit you are pushing is that you are letting your disks get too full. You may have to manually move files to other disks to make room if you are going to be overwriting files with larger versions.

 

16 hours ago, Motor said:

I don't use a cache drive because these files are so big that it often exhausts the cache too quickly when I am transferring large amounts of files.

Good decision. There is never any point in caching if you are writing more than the cache can hold; it is impossible to move files from cache to the array as fast as you can write them to cache.
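
A quick back-of-envelope illustration of why, with hypothetical numbers:

```python
# Hypothetical numbers, purely to illustrate the point above.
cache_size_gb = 500
ingest_mb_s = 110  # roughly what a saturated 1Gb link can deliver
mover_mb_s = 50    # parity-protected array writes are slower

hours_to_fill = cache_size_gb * 1024 / ingest_mb_s / 3600
hours_to_drain = cache_size_gb * 1024 / mover_mb_s / 3600
print(f"fill: {hours_to_fill:.1f} h, drain: {hours_to_drain:.1f} h")
# The cache fills faster than the mover can empty it, so any
# transfer larger than the cache stalls once the cache is full.
```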

 

I'll let someone else comment on your bonded NICs, but I will just note that there are multiple bonding modes each with their own features and requirements.

 

For the performance during parity check, you can set things up so parity check only runs at certain hours. Take a look at the Parity Check Tuning plugin.

Link to comment

You might ask why Unraid doesn't take a more reactive approach and try moving things around or trying different disks when it runs out of space.

 

This could severely impact performance due to a lot of "mysterious thrashing" as it tries different disks. And keeping in mind that it still doesn't know how large the file will become, it might have to do this multiple times. And there still might not be enough space.

 

So instead, Unraid has given you settings so you can take a proactive approach to prevent this from happening in the first place.

Link to comment
6 hours ago, Motor said:

The UnRAID server has dual 100Mb built-in NICs and I have bonded

 

You would only get around 10MB/s over a 100Mb NIC (100Mb/s divided by 8 bits per byte is 12.5MB/s raw, less protocol overhead). Which bonding mode are you using?

 

Mode 0 (balance-rr)
This mode transmits packets in sequential order from the first available slave through the last. If two real interfaces are slaves in the bond and two packets arrive destined out of the bonded interface, the first will be transmitted on the first slave and the second on the second slave. The third packet will be sent on the first, and so on. This provides load balancing and fault tolerance.

Mode 1 (active-backup) - default
This mode places one of the interfaces into a backup state and only makes it active if the link is lost by the active interface. Only one slave in the bond is active at a time; a different slave becomes active only when the active slave fails. This mode provides fault tolerance.

Mode 2 (balance-xor)
This mode transmits packets based on an XOR formula: the source MAC address is XOR'd with the destination MAC address, modulo the slave count (see the sketch after this list). This selects the same slave for each destination MAC address and provides load balancing and fault tolerance.

Mode 3 (broadcast)
This mode transmits everything on all slave interfaces. It is the least used mode (only for specific purposes) and provides only fault tolerance.

Mode 4 (802.3ad)
This mode is known as Dynamic Link Aggregation. It creates aggregation groups that share the same speed and duplex settings. It requires a switch that supports IEEE 802.3ad dynamic link aggregation. Slave selection for outgoing traffic is done according to the transmit hash policy, which may be changed from the default simple XOR policy via the xmit_hash_policy option. Note that not all transmit policies may be 802.3ad compliant, particularly in regard to the packet mis-ordering requirements of section 43.2.4 of the 802.3ad standard. Different peer implementations will have varying tolerances for noncompliance.

Mode 5 (balance-tlb)
This mode is called Adaptive transmit load balancing. The outgoing traffic is distributed according to the current load and queue on each slave interface. Incoming traffic is received by the current slave.

Mode 6 (balance-alb)
This mode is called Adaptive load balancing. This includes balance-tlb + receive load balancing (rlb) for IPV4 traffic. The receive load balancing is achieved by ARP negotiation. The bonding driver intercepts the ARP Replies sent by the server on their way out and overwrites the src hw address with the unique hw address of one of the slaves in the bond such that different clients use different hw addresses for the server.

Mode 1 (active-backup) is the recommended setting. Other modes allow you to set up a specific environment, but may require proper switch support. Choosing an unsupported mode can result in disrupted communication.
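
Here is a small worked example of the Mode 2 formula quoted above, sketched in Python (simplified; the real driver also folds the packet type into the XOR):

```python
# Worked example of the balance-xor formula: (source MAC XOR
# destination MAC) modulo slave count. Simplified from the real
# driver, which also XORs in the packet type.

def xor_slave(src_mac: str, dst_mac: str, n_slaves: int) -> int:
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (src ^ dst) % n_slaves

# Every frame between the same two MACs lands on the same slave,
# which is why a single client never exceeds one link's speed.
print(xor_slave("00:11:22:33:44:55", "66:77:88:99:aa:bb", 2))
```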

 

Even with Mode 4, which I use for my dual 1Gb NICs, a single client machine will only utilise 1Gb, but with 2 clients running, each would get 1Gb.

Link to comment
12 hours ago, SimonF said:

Even with Mode 4, which I use for my dual 1Gb NICs, a single client machine will only utilise 1Gb, but with 2 clients running, each would get 1Gb.

 

You can achieve what you describe by simply using Mode 5 or 6, which work with a cheap, dumb switch. Mode 4 is capable of faster (i.e. 2 Gb/s to a single client, in your example) but, as the text you quoted says, everything needs to be compliant.

 

Link to comment
7 minutes ago, John_M said:

 

You can achieve what you describe by simply using Mode 5 or 6, which work with a cheap, dumb switch. Mode 4 is capable of faster (i.e. 2 Gb/s to a single client, in your example) but, as the text you quoted says, everything needs to be compliant.

 

Is there a way to change this option?

 

Option xmit_hash_policy

The option xmit_hash_policy is used to select the transmit hash policy for slave selection. There are three possible values:

Layer 2

Uses XOR of hardware MAC addresses to generate the hash. This algorithm will place all traffic to a particular network peer on the same slave.

Layer 2+3

Uses a combination of layer2 and layer3 protocol information to generate the hash. Traffic for different remote hosts is mapped to different slaves; one remote host is always mapped to the same slave.

Layer 3+4

Uses upper-layer protocol information, when available, to generate the hash. Traffic for different connections to the same remote host is mapped to different slaves; each individual connection is always mapped to the same slave. Each connection is identified by IP address plus port number.
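
To illustrate the difference between those policies, here is a rough Python sketch (the real driver's arithmetic differs, but the behaviour is the same: Layer 2 keys only on MAC addresses, while Layer 3+4 adds IPs and ports, so separate connections between the same two hosts can land on different slaves):

```python
import zlib

# Simplified illustration of the hash policies above; the real
# bonding driver computes these differently.

def flow_hash(*fields) -> int:
    return zlib.crc32("|".join(map(str, fields)).encode())

def layer2(src_mac, dst_mac, n_slaves):
    # Same two MACs -> always the same slave.
    return flow_hash(src_mac, dst_mac) % n_slaves

def layer3_4(src_ip, src_port, dst_ip, dst_port, n_slaves):
    # Each connection (IP + port) hashes independently, so two
    # connections between the same hosts can use different slaves.
    return flow_hash(src_ip, src_port, dst_ip, dst_port) % n_slaves

n = 2  # two slaves in the bond
print(layer2("00:11:22:33:44:55", "66:77:88:99:aa:bb", n))
print(layer3_4("10.0.0.5", 50000, "10.0.0.9", 445, n))
print(layer3_4("10.0.0.5", 50001, "10.0.0.9", 445, n))
```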

Link to comment

Thanks, trurl, for the information. It kind of solidifies what I figured I was running into, although I was unaware of the "Minimum Free" setting, so that could help a bit with large copies. I guess it's just a symptom of how UnRAID works that I will have to learn to work with.

So is the "Split Level" feature the option that lets folders span disks? I assume if folders aren't allowed to span disks, then all the files in a folder have to be placed on the same disk, right?

As for "letting my disks get too full", that's a fair assessment, although I still run into the issue occasionally even when other drives have lots of space. Your explanation addresses that, though, and setting "Minimum Free" and allowing folders to span disks will probably help.

I will also take a look at that Parity Check Tuning plugin as I am manually pausing and resuming it at times now given how long it takes each month.

 

Thanks.

Link to comment

SimonF, thanks for the information about bonding. I recall going through it when I first set up the server a couple of years ago, and I guess I selected Active-Backup (Mode 1) at that time. It sounds like Mode 4, 5, or 6 might be a better option?

With a small number of users I expected 1 NIC would be enough and opted for redundancy, but given the performance issues I am experiencing, maybe I should try one of those other options to see if it makes a difference. I suspect testing the different modes is the way to determine which one is best suited to my environment.

Obviously, I must have forgotten about the multiple options, but the breakdown makes sense. Thanks.

Link to comment

You can toggle Help for the whole webUI by clicking Help (?) on the main menu. You can toggle help for a specific setting by clicking on its label. Also, you can go directly to the Unraid Documentation by clicking the "manual" link in the bottom right corner of the webUI. (None of that is meant to discourage you from asking for help, advice, or further explanation.)

 

The default settings for a User Share already allow folders to span disks. User Shares are simply the aggregate of the top level folders on array and pools.

 

When you create a User Share, Unraid creates a top level folder named for the User Share on array or pool as needed according to the settings for the User Share.

 

Conversely, if you create a folder at the top level of array disk or pool, it is automatically a User Share named for the folder with default settings.

 

Split Level is about keeping files together on the same disk so other disks don't have to spin up (with a delay while another disk spins up) when accessing multiple files. For example, if you are old school and like to listen to Music as a whole Album, you might configure Split Level so that all Tracks on the Album are stored on the same disk. See the help and documentation for Split Level.
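
A loose sketch of the idea, if it helps (my own illustration, not how Unraid implements it):

```python
# Loose illustration of Split Level, not Unraid's code. Folders at
# or above the split level may spread across disks; anything deeper
# must stay on the same disk as its siblings.

def must_stay_together(relative_path: str, split_level: int) -> bool:
    folders = relative_path.split("/")[:-1]  # drop the filename
    return len(folders) > split_level

# A "Music" share with split level 1: Artist folders can split
# across disks, but each Album's tracks stay on one disk.
print(must_stay_together("Beatles/Abbey_Road/track01.mp3", 1))  # True
```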

 

Link to comment

Thanks again, trurl. So the "Split Level" feature actually works in reverse of what I assumed, which is fine. Obviously, in my case, I don't need to touch it to have folders span drives.

 

And I was aware of the drives not spinning up until needed. There is a small impact with that, but it's worthwhile so the drives don't run all the time like in a RAID 5 array... probably better longevity that way.

Although I do remember coming across an option where you can tag certain drives, or all of them, to spin all the time to combat that. I think it was in a video where some of the early developers/adopters of UnRAID were talking about it. I couldn't find it in the documentation at the time - do you know the feature I am referring to?

Other than that, I have scheduled some downtime to change the bonding option and I will see whether it makes a difference.

Thanks for all the help.

Link to comment
