ARC and/or L2ARC



I've been thinking about suggesting/requesting this idea and just recently found out it is an actual thing that has already been released. I'm wondering if Unraid will be adding this feature, as it would be a great addition. For those that don't know, you can find more details about what ARC is below.

 

(credits to zfsbuilds.com)

Quote

ZFS includes two exciting features that dramatically improve the performance of read operations. I’m talking about ARC and L2ARC. ARC stands for adaptive replacement cache. ARC is a very fast cache located in the server’s memory (RAM). The amount of ARC available in a server is usually all of the memory except for 1GB.

For example, our ZFS server with 12GB of RAM has 11GB dedicated to ARC, which means our ZFS server will be able to cache 11GB of the most accessed data. Any read requests for data in the cache can be served directly from the ARC memory cache instead of hitting the much slower hard drives. This creates a noticeable performance boost for data that is accessed frequently.
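(Side note from me, not part of the quoted article: on Linux with OpenZFS you can check the live ARC numbers yourself. A minimal sketch, assuming the usual /proc/spl/kstat/zfs/arcstats interface is present on your box:)

```python
# Sketch: report current ARC size vs. its ceiling on Linux/OpenZFS.
# Assumes /proc/spl/kstat/zfs/arcstats exists (OpenZFS on Linux).

def read_arcstats(path="/proc/spl/kstat/zfs/arcstats"):
    stats = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            # Each data row is: name, type, value; headers don't match this shape.
            if len(parts) == 3 and parts[1].isdigit():
                stats[parts[0]] = int(parts[2])
    return stats

stats = read_arcstats()
gib = 1024 ** 3
print(f"ARC size:    {stats['size'] / gib:.2f} GiB")
print(f"ARC maximum: {stats['c_max'] / gib:.2f} GiB")
```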

As a general rule, you want to install as much RAM into the server as you can to make the ARC as big as possible. At some point, adding more memory is just cost prohibitive. That is where the L2ARC becomes important. The L2ARC is the second level adaptive replacement cache. The L2ARC is often called “cache drives” in the ZFS systems.

These cache drives are physically MLC style SSD drives. These SSD drives are slower than system memory, but still much faster than hard drives. More importantly, the SSD drives are much cheaper than system memory. Most people compare the price of SSD drives with the price of hard drives, and this makes SSD drives seem expensive. Compared to system memory, MLC SSD drives are actually very inexpensive.

When cache drives are present in the ZFS pool, the cache drives will cache frequently accessed data that did not fit in ARC. When read requests come into the system, ZFS will attempt to serve those requests from the ARC. If the data is not in the ARC, ZFS will attempt to serve the requests from the L2ARC. Hard drives are only accessed when data does not exist in either the ARC or L2ARC. This means the hard drives receive far fewer requests, which is awesome given the fact that the hard drives are easily the slowest devices in the overall storage solution.
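(Another aside from me: the read path described above is easy to picture as a two-tier cache. A toy sketch follows; the real ARC uses an adaptive replacement policy, and plain LRU stands in for it here just to show the tiering:)

```python
from collections import OrderedDict

class TieredReadCache:
    """Toy model of the ARC -> L2ARC -> disk read path (LRU, not true ARC)."""

    def __init__(self, arc_blocks, l2arc_blocks, read_from_disk):
        self.arc = OrderedDict()      # tier 1: RAM
        self.l2arc = OrderedDict()    # tier 2: SSD
        self.arc_blocks = arc_blocks
        self.l2arc_blocks = l2arc_blocks
        self.read_from_disk = read_from_disk

    def read(self, key):
        if key in self.arc:                      # ARC hit: served from RAM
            self.arc.move_to_end(key)
            return self.arc[key]
        if key in self.l2arc:                    # L2ARC hit: served from SSD,
            value = self.l2arc.pop(key)          # then promoted back into ARC
        else:
            value = self.read_from_disk(key)     # full miss: the slow disks
        self._insert(key, value)
        return value

    def _insert(self, key, value):
        self.arc[key] = value
        if len(self.arc) > self.arc_blocks:      # evict coldest ARC block...
            old_key, old_val = self.arc.popitem(last=False)
            self.l2arc[old_key] = old_val        # ...down into the L2ARC
            if len(self.l2arc) > self.l2arc_blocks:
                self.l2arc.popitem(last=False)   # falls out of cache entirely

# Example: tiny ARC, bigger L2ARC, disks simulated by a lambda.
cache = TieredReadCache(3, 8, read_from_disk=lambda k: f"block-{k}")
```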

In our ZFS project, we added a pair of 160GB Intel X25-M MLC SSD drives for a total of 320GB of L2ARC. Between our ARC of 11GB and our L2ARC of 320GB, our ZFS solution can cache a total of 331GB of the most frequently accessed data! This hybrid solution offers considerably better performance for read requests because it reduces the number of accesses to the large, slow hard drives.

Things to Keep in Mind
There are a few things to remember. The cache drives don't get mirrored; when you add cache drives, you cannot set them up as mirrored, but there is no need to, since the content is already mirrored on the hard drives. The cache drives are just a cheap alternative to RAM for caching frequently accessed content.

Another thing to remember is that you still need to use SLC SSD drives for the ZIL drives, even when you use MLC SSD drives as cache drives. The SLC SSD drives used for the ZIL dramatically improve the performance of write actions, while the MLC SSD drives used as cache drives are used to improve read performance.

If you decide to use MLC SSD drives for actual storage instead of using SATA or SAS hard drives, then you don’t need to use cache drives. Since all of the storage drives would already be ultra fast SSD drives, there would be no performance gained from also running cache drives. You would still need to run SLC SSD drives for ZIL drives, though, as that would reduce wear on the MLC SSD drives that were being used for data storage.

If you plan to attach a lot of SSD drives, remember to use multiple SAS controllers. The SAS controller in the motherboard for our ZFS Build project is able to sustain 140,000 IOPS. If you use enough SSD drives, you could actually saturate the motherboard’s SAS controller. As a general rule of thumb, you may want to have one additional SAS controller for every 24 MLC style SSD drives.
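(My arithmetic, not the article's: the one-controller-per-24-SSDs rule falls straight out of the 140,000 IOPS figure they quote.)

```python
import math

# Figures from the article; the per-drive budget is derived, not measured.
controller_iops = 140_000
ssds_per_controller = 24

iops_budget_per_ssd = controller_iops / ssds_per_controller
print(f"~{iops_budget_per_ssd:,.0f} IOPS per SSD before one controller saturates")

def controllers_needed(n_ssds, per_controller=ssds_per_controller):
    """Rule of thumb from the article: one SAS controller per 24 SSDs."""
    return math.ceil(n_ssds / per_controller)

print(controllers_needed(40))  # -> 2 controllers for 40 SSDs
```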

Effective Caching to Virtualized Environments
At this point, you are probably wondering how effectively the two levels of caching will be able to cache the most frequently used data, especially when we are talking about 9TB of formatted RAID10 capacity. Will 11GB of ARC and 320GB L2ARC make a significant difference for overall performance? It will depend on what type of data is located on the storage array and how it is being accessed. If it contained 9TB of files that were all accessed in a completely random way, the caching would likely not be effective. However, we are planning to use the storage for virtual machine (VPS) file systems and this will cache very effectively for our intended purpose.

When you plan to deploy hundreds of virtual machines, the first step is to build a base template that all of the virtual machines will start from. If you were planning to host a lot of Linux/cPanel virtual machines, you would build the base template by installing CentOS and cPanel. When you get to the step where you would normally configure cPanel through the browser, you would shut off the virtual machine. At that point, you would have the base template ready. Each additional virtual machine would simply be chained off the base template. The virtualization technology will keep the changes specific to each virtual machine in its own child or differencing file.

When the virtualization solution is configured this way, the base template will be cached quite effectively in the ARC (main system memory). This means the main operating system files and cPanel files should deliver near RAM-disk performance levels. The L2ARC will be able to effectively cache the most frequently used content that is not shared by all of the virtual machines, such as the content of the files and folders in the most popular websites or MySQL databases. The least frequently accessed content will be pulled from the hard drives, but even that should show solid performance, since it will be RAID10 across 20 drives, and the frequently accessed read requests will never burden the RAID10 volume because they are already served from the ARC or L2ARC.



TL;DR - Let the SSDs also cache the most frequently accessed data, and/or cache data according to whatever settings best fit the administrator's server. (This would be on top of what the cache SSDs already do with writes.)


I don't think that is going to be implemented.

 

The cache drive kind of acts as an L2ARC for writing, but there's nothing for reads. The closest you get is when a file still happens to be on the cache drive. I believe any unused memory is also used when writing to the array.

 

It has been requested that the mover add a "move older than xxx date" option. This would allow us to keep the cache drive somewhat full with the most recent files, kind of acting as an L2ARC for reading.
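A rough sketch of what such a mover pass might look like (entirely hypothetical: the mount points, the age threshold, and the whole policy are my illustration, not anything Unraid ships):

```python
import os
import shutil
import time

# Hypothetical "move older than N days" mover pass. The cache/array
# mount points below are illustrative, not Unraid's actual layout.
CACHE_ROOT = "/mnt/cache"
ARRAY_ROOT = "/mnt/array"
MAX_AGE_DAYS = 30

cutoff = time.time() - MAX_AGE_DAYS * 86_400

for dirpath, _dirnames, filenames in os.walk(CACHE_ROOT):
    for name in filenames:
        src = os.path.join(dirpath, name)
        if os.path.getmtime(src) < cutoff:          # older than the threshold
            dst = os.path.join(ARRAY_ROOT, os.path.relpath(src, CACHE_ROOT))
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.move(src, dst)                   # recent files stay on SSD
```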


Right, but that is why I'm suggesting this be added. For example, since one of the biggest uses of Unraid is Plex, if you play an episode of a TV series it would be good if it cached the whole folder and its subfolders (giving us options for caching like we have options for split levels on shares), so that you get better performance, the drive can spin down while you watch the series, and so on. Also, as you mentioned, we already have the mover, which could simply delete the read-cache files from the SSD and move the write-cache files to the drives. The 'MOVER' would just need to be renamed to something more fitting at that point. 😁
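To make the pre-warming idea concrete (again hypothetical: the share layout and the copy-to-cache approach are my sketch, not an existing Unraid feature), when one episode is opened, the rest of its season folder could be staged onto the SSD:

```python
import os
import shutil

# Hypothetical pre-warm: when an episode is played from the array,
# stage its sibling files (the rest of the season) onto the SSD cache.
ARRAY_ROOT = "/mnt/array"
CACHE_ROOT = "/mnt/cache"

def prewarm_folder(played_file):
    """Copy everything in the played file's folder onto the cache drive."""
    season_dir = os.path.dirname(played_file)
    for name in os.listdir(season_dir):
        src = os.path.join(season_dir, name)
        if not os.path.isfile(src):
            continue
        dst = os.path.join(CACHE_ROOT, os.path.relpath(src, ARRAY_ROOT))
        if not os.path.exists(dst):                 # skip already-cached files
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)                  # array disk can then spin down

prewarm_folder("/mnt/array/TV/Some Show/Season 01/Episode 01.mkv")
```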

