ZFS Performance Tuning - Getting the most out of your UnRAID server and containers on ZFS


BVD


Hello All!

 

As I'd alluded to in my earlier SR-IOV guide, I've been (...slowly...) working on turning my server config/deployment notes into something that'd at least have the opportunity to be more useful to others as they're using UnRAID.

 

To get to the point as quickly as possible:

The UnRAID Performance Compendium

 

I'm posting this in the General section as it's all eventually going to run the gamut, from stuff that's 'generically UnRAID', to container/DB performance tuning, VMs, and so on. It's all written from the perspective of *my* servers though, so it's tinged with ZFS throughout - what this means in practice is that, while not all of the information/recommendations provided will apply to each person's systems, at least some part of them should be useful to most, if not all (all is the goal!).

 

I've been using ZFS almost since its arrival on the open source scene, starting with the release of OpenSolaris back in late 2008, and using it as my filesystem of choice wherever possible ever since. I've been slowly documenting my setup as time's gone on, and as I was already doing so for myself, I thought it might be helpful to build it out a bit further in a form that could be referenced by others (if they so choose).

 

I derive great satisfaction from doing things like this, relishing the times when work's given me projects where I get to create and then present technical content to technical folks... But with the lockdown, I haven't gotten out much, and work's been so busy with other things, I haven't much been able to scratch that itch. However, I'm on vacation this week, and finally have a few of them polished up to the point that I feel like they can be useful!

 

Currently guides included are (always changing, updated 08.03.22):

 

  • The Intro
    1.  Why would we want ZFS on UnRAID? What can we do with it? - A primer on what our use-case is for adding ZFS to UnRAID, what problems it helps solve, and why we should care. More of an opinion piece, but with enough backing data that I feel comfortable and confident in the stance taken here. Also details some use cases for ZFS's feature sets (automating backups and DR, simplifying the process of testing upgrades of complex multi-application containers prior to implementing them into production, things like that).
  • Application Deployment and Tuning:
    1. Ombi - Why you don't need to migrate to MariaDB/MySQL to be performant even with a massive collection / user count, and how to tune SQLite to get there
    2. Sonarr/Radarr/Lidarr - This is kind of a 'less done' version of the Ombi guide currently (as it's just SQLite as well), but with some work (in progress / not done) towards getting around a few of the limitations put in place by the application's hard-coded values
    3. Nextcloud - Using nextcloud, onlyoffice, elasticsearch, redis, postgres, nginx, some custom cron tasks, and customization of the linuxserver container (...and zfs) to get highly performant app responsiveness even while using apps like facial recognition, full text search, and online office file editing. Haven't finished documenting the whole of the facial recog part, nor elasticsearch.
    4. Postgres - Keeping your applications' performance snappy using PG to back systems with millions of files, tens or even hundreds of applications, and how to monitor and tune for your specific HW with your unique combination of applications
    5. MariaDB - (in progress) - I don't use Maria/MySQL much personally, but I've had to work with it a bunch for work and it's pretty common in homelabbing with how long of a history it has and the dev's desire to make supporting users using the DB easier (you can get yourself in a whole lot more trouble a whole lot quicker by mucking around without proper research in PG than My/Maria imo). Personally though? Postgres all the way. Far more configurable, and more performant with appropriate resources/tuning.
  • General UnRAID/Linux/ZFS related:
    1. SR-IOV on UnRAID - The first guide I created specifically for UnRAID, posted directly to the forum as opposed to in github. Users have noted going from tens of MB/s up to 700MB/s when moving from default VM virtual NICs over to SR-IOV NICs (see the thread for details).
    2. Compiled general list of helpful commands - This one isn't ZFS specific, and I'm trying to add things from my bash profile aliases and the like over time as I use them. This one will be constantly evolving, and includes things like "How many inotify watchers are in use... And what the hell is using so many?", restarting a service within an LSIO container, bulk downloading from archive.org, and commands that'll allow you to do unraid UI-only actions from the CLI (e.g. stop/start the array, others).
    3. Common issues/questions/general information related to ZFS on UnRAID - As I see (or answer) the same issues fairly regularly in the zfs plugin thread, it seemed to make sense to start up a reference for these so it could just be linked to instead of re-typing each time lol. Also includes information on customization of the UnRAID shell and installing tools that aren't contained in the Dev/Nerdpacks so you can run them as though they're natively included in the core OS.
      • Hosting the Docker Image on ZFS - squeezing the most performance out of your efforts to migrate off of the xfs/btrfs cachepool - if you're already going through the process of doing so, might as well make sure it's as highly performant as your storage will allow

 

You can see my (incomplete / more to be added) backlog of things to document as well on the primary page in case you're interested. I plan to post the relevant pieces where they make sense as well (e.g. the Nextcloud one to the lsio nextcloud support thread, cross-post this link to the zfs plugin page... probably not much else at this point, but just so it reaches the right audience at least).

 

Why Github for the guides instead of just posting them here to their respective locations?

I'd already been working on documenting my homelab config information (for rebuilding in the event of a disaster) using Obsidian, so everything's already in markdown... I'd asked a few times about getting markdown support for the forums so I could just dump them here, but I think it must be too much of a pain to implement, so github seemed the best combination of minimizing amount of time re-editing pre-existing stuff I'd written, readability, and access.

 

Hope this is useful to you fine folks!

 

HISTORY:

- 08.04.2022 - Added Common Issues/general info, and hosting docker.img on ZFS doc links

- 08.06.2022 - Added MariaDB container doc as a work-in-progress page prior to completion due to individual request

- 08.07.2022 - Linked original SR-IOV guide, as this is closely tied to network performance

- 08.21.2022 - Added the 'primer' doc, Why ZFS on UnRAID and some example use-cases

Edited by BVD
updated history section, adding ZFSonUnRAID primer
Link to comment

Thanks for this. Great help for those of us who like to play but do not have a lot of experience with the deeper end of servers. I have not used all the guides (yet) but got my way through some of them. I do have a couple of questions. You said you have a larger Nextcloud server on your postgres DB. Did you have to substantially increase max_connections? After install, max_connections was set to 100 (the default). My nextcloud docker, when running with the maps app installed, would start throwing these two errors:

Doctrine\DBAL\Exception: Failed to connect to the database: An exception occurred in the driver: SQLSTATE[08006] [7] FATAL: remaining connection slots are reserved for non-replication superuser connections

Doctrine\DBAL\Exception: Failed to connect to the database: An exception occurred in the driver: SQLSTATE[08006] [7] FATAL: sorry, too many clients already

One of the search results I found stated that client connections might not be closing on the backend side, and suggested increasing max_connections as a temporary measure. I increased it to 200, then hit the NC server with a maps refresh: connections to the DB quickly rose from 8-9 to 127-129 (watched through the query tool in pgAdmin), then dropped back down to 8-9 not long after. I ask because I did not find much in my google search about this error in relation to Nextcloud, so I assume I either misconfigured something or everyone else is just smart enough to increase max_connections to begin with.

 

Also, in the Nextcloud guide you referenced a "go" file for DB tuning. Where would one find this, or if I missed a guide that has this mentioned in more detail can you point me in its direction? Would like to read up on that some more. 

 

Attached image with the database errors. I know there are other errors as well (I think they are directly related to the maps app) that I will look into later, but I am focused on the DB errors as those are the ones that bring Nextcloud to a halt.

 

NC_errors.jpg

Edited by Dent_
had more than I thought in the original image
Link to comment

Your PHP tuning needs to be 'compatible' (for lack of a better term) with your database tuning - if, for instance, you allow 200 php processes, but only allow 100 DB connections in your DB config, nextcloud will complain. In addition, one tuning variable within postgresql.conf can impact another - the paragraph directly following the postgresql.conf sample variables in the guide has a brief explanation: you have a total number of worker processes, maintenance processes, and a memory allocation for each, which together should be less than or equal to your shared buffers. You'll want to check out the postgres config reference linked in the guide for full explanations of each of the options and their impacts.
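To make the 'compatible' point concrete, here is a rough sketch of the arithmetic. It assumes each PHP worker holds at most one DB connection at a time and that Postgres keeps its stock 3 superuser-reserved slots; the function name is made up for illustration, and the real numbers come from your own PHP-FPM and postgresql.conf settings.

```shell
#!/bin/sh
# Rough headroom check between PHP workers and Postgres connection slots.
# Usage: conn_headroom PHP_MAX_WORKERS PG_MAX_CONNECTIONS [RESERVED_SLOTS]
# Assumption: one DB connection per PHP worker, 3 reserved superuser slots.
conn_headroom() {
    php_workers=$1
    pg_max=$2
    reserved=${3:-3}
    # Slots left over once every PHP worker holds a connection.
    echo $(( pg_max - reserved - php_workers ))
}

# The example from this post: 200 php processes against 100 DB connections.
conn_headroom 200 100
```

A negative result means PHP can ask for more connections than Postgres will hand out, which is when the "too many clients already" style errors show up. Other applications sharing the same Postgres instance need slots too, so treat zero as too tight rather than as a target.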

 

The database tuning information is in the respective DB's page, linked from the main page above - the postgres one is more complete at this point, as it's what I've primarily used for anything that needs a database for many years, while my MariaDB/MySQL experience is strictly relegated to working on others' environments (so I don't have everything already written up for it to just copy from my deployment notes).

Link to comment

Thanks, I figured it was probably a variable I missed. Didn't think about amount of php processes to DB connections. As for the increase in connections I did increase the shared_buffers memory as well as the worker and maintenance variables based on your template and explanation at the same time.

 

 

Thanks again for your guide.  

Link to comment

Happy to help!

 

I've been at a cabin with limited connectivity the last 3 days or so, but am returning to civilization tomorrow and have a couple more days' worth of updates planned, so hopefully there'll be a bit more somewhere within the updates that'll help with some other parts of whatever you're workin on as well 👍

Link to comment
  • 2 weeks later...

I run some Darr apps and notice on my ZFS array they are probably some of the more heavy I/o requirements.  It's a 6 disk Enterprise INTEL SSD set so it has the IOPS.  Most of the time I've looked to optimise it, I come back to the advice that ZFS has variable record sizes so there's no point.  But if we use postgres perhaps there's an optimisation opportunity to match the page size to the record size.  I'd be interested in your thoughts on that - thanks for the link to the Radarr guide.

Link to comment

If you apply the sqlite tuning, I'd argue that postgres isn't really required. Even with all the media I've stored, everything loads in less than a second after applying the tuning, along with the benefit of having all the container's data in a single location. To me, the added complexity isn't worth the negligible benefit (not to mention the additional request load on PG that could be used for other more intensive applications such as nextcloud).

Link to comment

Thank you for the input BVD. I wanted to also ask if you planned on discussing what an optimal use of ZFS would be in unraid from a higher level. I know that you have already explained docker with a zvol (I've implemented this), but what are your recommendations for storing appdata for docker? (Details such as 1 dataset per app/container? 1 dataset for all appdata? etc.)

 

I am trying to figure out a way to optimally use 5 SSD's. As it stands I have them set up in this manner but I am unsure if it is even close to optimal. I have 128GB RAM available.

  • 2 860 EVO 1TB SATA SSD's running in a RAID0 BTRFS Cache pool (for landing downloads) part of the array.
  • 2 980 PRO 1TB NVME running in ZFS Mirror (docker appdata and docker image location)
    • 1 Zvol for the docker image. /pool/docker
    • 1 Dataset for docker appdata. /pool/appdata
  • 1 870 EVO 500GB unused for now (used to move data around for reformatting other drives)

I guess my question is I've only gotten this far and am wondering now how I move into really utilizing ZFS's benefits (snapshotting, backups, zfs send, etc). What would you suggest I do next?

Link to comment
19 hours ago, Partizanct said:

 

...I guess my question is I've only gotten this far and am wondering now how I move into really utilizing ZFS's benefits (snapshotting, backups, zfs send, etc). What would you suggest I do next?

 

I tried to allude to this in the guides themselves by first noting the fileset specifications recommended for that given application, but I guess it may not've been clear - I use a specific fileset for each application.

 

Not only is this the only way to actually tune zfs for each individual application (as you're applying filesystem level features specific to those applications), but it's also the only way to fully take advantage of the snapshot and backup features in a meaningful way - this way, you can apply specific snapshot requirements to your dbs that you don't necessarily need for your more static applications.

 

As a for-example, I've attached my zfs list output; filesets natively inherit the settings of their parents, so if you've a bunch of apps which have similar requirements, you don't have to keep manually setting them each time. I have my 'static-conf' filesets, which are almost never changed, the 'vms' fileset, which has all the img files, etc., and then just customize the few small changes needed for the others (...speaking of which... reminds me I need to move postgres over to wd/dock/dep - where all my databases [dependencies] are lol):

2033076589_ScreenShot2022-08-20at12_59_08AM.thumb.png.85aecb715a1a9d90ae78f918ed87fbf9.png
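For anyone who can't see the screenshot, here is a minimal sketch of what a parent-plus-children layout could look like. It only prints the zfs commands so they can be reviewed first. The pool and fileset names (pool/appdata, the app names) are hypothetical and not the layout from the screenshot; the 64K recordsize matches the SQLite page size discussed in this thread, while lz4 compression and xattr=sa are just assumed defaults.

```shell
#!/bin/sh
# Print (not run) the commands for one parent fileset with per-app children.
# PARENT and the app names are hypothetical placeholders.
PARENT="pool/appdata"

plan_filesets() {
    # Defaults set once on the parent are inherited by every child.
    echo "zfs create -o compression=lz4 -o xattr=sa $PARENT"
    # SQLite-backed apps: recordsize set locally to match a 64K page size.
    for app in sonarr radarr lidarr; do
        echo "zfs create -o recordsize=64K $PARENT/$app"
    done
    # Rarely-changing config: nothing overridden, inherits everything.
    echo "zfs create $PARENT/static-conf"
    # The 'source' column shows which values are local and which are inherited.
    echo "zfs get -r -o name,property,value,source recordsize,compression $PARENT"
}

plan_filesets
```

Because each app lives in its own fileset, each one can also be snapshotted or replicated on its own schedule, which is the point made above about databases versus more static applications.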

 

Edited by BVD
typo
Link to comment
On 8/18/2022 at 11:57 PM, Marshalleq said:

I run some Darr apps and notice on my ZFS array they are probably some of the more heavy I/o requirements.  It's a 6 disk Enterprise INTEL SSD set so it has the IOPS.  Most of the time I've looked to optimise it, I come back to the advice that ZFS has variable record sizes so there's no point.  But if we use postgres perhaps there's an optimisation opportunity to match the page size to the record size.  I'd be interested in your thoughts on that - thanks for the link to the Radarr guide.

 

I actually do that in the guides, setting up the recordsize and explaining why it's set to what it is - equally important for the sqlite applications (like the 'arrs), where we set the page size to the max (64k), and configure the fileset to match. Only way to make sonarr history viewing any kind of performant with tens of thousands of shows 👍
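As a rough sketch of the page-size half of that (the guides have the full procedure, this is not copied from them): SQLite only applies a new page_size when the database is rebuilt with VACUUM, and not while it is in WAL mode, so the journal mode is switched first and put back afterwards. Restoring WAL at the end assumes the app was using WAL to begin with, and the helper name is made up for illustration.

```shell
#!/bin/sh
# Emit the SQL to rebuild a SQLite database with a new page size.
# page_size takes effect on VACUUM and cannot change while in WAL mode,
# so leave WAL first and (assumption: the app uses WAL) restore it after.
pagesize_sql() {
    bytes=$1
    printf 'PRAGMA journal_mode=DELETE;\nPRAGMA page_size=%s;\nVACUUM;\nPRAGMA journal_mode=WAL;\n' "$bytes"
}

# Usage, with the container stopped and a backup of the file taken first:
#   pagesize_sql 65536 | sqlite3 lidarr.db
# Here we only print the statements for review.
pagesize_sql 65536
```

The fileset side is then a matter of setting recordsize=64K on the fileset holding the database, before the rebuilt file is written into it.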

Edited by BVD
Link to comment

It seems I have much to work on! Thanks

Link to comment
14 hours ago, BVD said:

 

I actually do that in the guides, setting up the recordsize and explaining why it's set to what it is - equally important for the sqlite applications (like the 'arrs), where we set the page size to the max (64k), and configure the fileset to match. Only way to make sonarr history viewing any kind of performant with tens of thousands of shows 👍

I've done what you've suggested on Lidarr, because that's by far the worst performing app for me, but don't really notice any difference so far. In particular the updating of the library (which annoyingly seems to be entirely scanned now when triggering only a single artist). My library is probably 6x the size of yours. What I was asking above was, does your understanding of zfs include why or why not the variable record sizes cover performance of different table sizes in a database? Because I've gone down this path before of optimising record sizes, jumped on some forums and been shot down because they were adamant that it's not needed with the variable record size feature of ZFS. Also, by reducing the record size, you apparently reduce the available compression (which I checked, and my DB went from 1.3G to 1.6G with the smaller record size, so it appears that at least that comment was correct). Personally, I think you're onto something here, because I assume ZFS cannot be aware of database page size in a large single file like it can be with individual files and cannot align a page to a record without some help, but I could be wrong. Since the official ZFS page has an example covering (I think) Postgres, that would seem to confirm it.

 

So I applaud your efforts here and await more commentary from others around any speed changes.

 

What I was hoping for was an increased refresh and scan speed and the subsequent 'reading file' which seems to happen twice on the whole library afterward to be a little quicker. But there are external factors with that.  In addition to the Lidarr DB on SSD, my audio is stored on a 6 disk Raidz2 but with a special vdev where all the metadata is stored, so it's about as good as I'm going to get without going all SSD.

 

I also use a product called Roon, which is a fantastic but expensive Music player.  That is the absolute slowest app I have.  It runs on Google Leveldb.  Any experience with that out of interest?

Link to comment
6 hours ago, Partizanct said:

It seems I have much to work on! Thanks

I'd recommend setting up some sensible defaults in the root that you think will apply to that whole drive, e.g. like turn compression on as most things will use it - probably the default record size can be left alone, xattr=sa and whatever else you want but those are the main ones from memory.  Then you tweak them per dataset.

Link to comment

Did you copy the data out and then back over after setting up the fileset? And what part of lidarr is slow - e.g. is it just general browsing, or looking at a specific set of pages? Could you share your zfs get output for lidarr, and maybe check the output of ioztat as well to see what its IOPS usage is like? Finally, how many songs+artists are we talkin?

 

I think you might be talking about the dnode size setting (re: variable records) - this is something I haven't covered in the guides/docs yet, but maybe I should... You may also be talking about how it treats multiple inodes from differing objects when writing a record though, not entirely sure.

 

I could take a look with you at some point if you'd like (we can get a webex going or something after taking this to DMs) - I wonder if maybe you're hitting a frequency limitation (single threaded for sqlite, so boost speeds are important). You might also disable the zfs txg timeout value I'd noted in the postgres guide; it'll help with anything DB related, but should only be used if all zpools on your system are redundant (no stripe only pools). If this doesn't clear it up (it can be unset after the fact, it's a kernel level change), I'd want to start looking to see what our interrupt counts are like during slowness periods, arc hit rates, and so on. There's no reason a 1950x shouldn't be able to make lidarr with sqlite at least usable, even with ~40k songs (that's how many I've got currently at least, so it's the most I could comment on for now).
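Since the txg timeout is a runtime module parameter, it's easy to try and then put back. A small sketch, assuming the standard OpenZFS-on-Linux sysfs location for module parameters; the helper name is made up, and the change does not survive a reboot unless you re-apply it at boot.

```shell
#!/bin/sh
# Set zfs_txg_timeout and print the previous value so it can be restored.
# PARAM can be overridden; the default is the usual OpenZFS sysfs path.
PARAM="${PARAM:-/sys/module/zfs/parameters/zfs_txg_timeout}"

set_txg_timeout() {
    new=$1
    # Needs root and a loaded zfs module; bail out cleanly otherwise.
    [ -w "$PARAM" ] || { echo "cannot write $PARAM" >&2; return 1; }
    old=$(cat "$PARAM")
    echo "$new" > "$PARAM"
    echo "$old"
}

# Usage (as root):
#   prev=$(set_txg_timeout 1)      # apply the value suggested in this thread
#   set_txg_timeout "$prev"        # later: put the old value back
```

Keeping the previous value around makes it a one-liner to undo if it doesn't help.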

 

Just lemme know - it sounds like audio's a pretty big deal for you, would be happy to poke at it a bit with you if you like, maybe get a better experience out of it for you.

Edited by BVD
clarifying what 'taking a look' meant
Link to comment
8 hours ago, Partizanct said:

It seems I have much to work on! Thanks

 

Keep in mind, while filesets can be renamed (and hence, their directory structure changed) on the fly, the existing data will still have been written in the prior fileset configuration. If you change things like recordsize, xattr, anything about the data itself, you'll need to copy the data off/back (or send to a new fileset with those parameters) in order to apply that configuration to the data (only applied at the time of write).
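A rough sketch of the 'copy it off and back' route, using a file-level copy so every block gets rewritten under the new settings. The fileset names are hypothetical, it assumes each fileset is mounted at /<name>, and by default it only prints what it would do (set DRY_RUN=0 to actually run it, with the container stopped).

```shell
#!/bin/sh
# Rewrite existing data under new fileset properties via a file-level copy.
# DRY_RUN=1 (the default) prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: migrate_fileset OLD NEW RECORDSIZE
# Assumption: filesets are mounted at /<fileset name>.
migrate_fileset() {
    old=$1
    new=$2
    rs=$3
    run zfs snapshot "$old@pre-migrate"          # safety net before touching anything
    run zfs create -o recordsize="$rs" "$new"    # new fileset with the new setting
    run rsync -aAX "/$old/" "/$new/"             # rewrite every file under it
    run zfs rename "$old" "$old-old"             # keep the original until verified
    run zfs rename "$new" "$old"                 # swap the new one into place
}

migrate_fileset pool/appdata/lidarr pool/appdata/lidarr-new 64K
```

The old fileset is renamed rather than destroyed, so it can be kept until the app has been checked against the new one.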

 

... I think I'm going to write a doc going over some more generalized zfs information at this point. I'd resisted the urge, as there are a couple detriments in my mind to doing so:

1. ZFS, while powerful, isn't for everyone - one has to have the desire/drive (and just as importantly, the TIME) to do some of their own background research and learning, or it's super easy for them to have a much worse experience with zfs than alternatives (or worse yet, cause themselves massive suffering by copy/pasting something they saw online :()

2. There's already so much out there on the basics, adding more stuff to the main page could mean that people just skip reading anything there altogether - if there's a short 'here's what you need to learn further', I feel like it's more likely to be read than if you add more than is absolutely necessary.

 

Both of these things though I think aren't necessarily going to be a problem here - I was helping out another fellow forum member with some nextcloud issues where I'd asked if he'd be interested in proof-reading some of it for me once I got the time to write it, and he seemed amenable to the idea (we got his data back, AND his nextcloud instance running again - WOOOT!!). I just didn't know how useful it would be... But it sounds like there would be enough use to serve a purpose at least ❤️ 

Link to comment
41 minutes ago, BVD said:

 

Did you copy the data out and then back over after setting up the fileset?

Basically, I created a new dataset and copied it over with rsync after setting it all up.  Have been doing ZFS for a while now.  Same for the array with special vdev - that was a while ago now and wasn't a small task, but got there in the end.

 

42 minutes ago, BVD said:

 

And what part of lidarr is slow - e.g. is it just general browsing, or looking at a specific set of pages? Could you share your zfs get output for lidarr, and maybe check the output of ioztat as well to see what its IOPS usage is like? Finally, how many songs+artists are we talkin?

My main gripes are not so much with the web pages, more to do with load times, e.g. startup from docker and the forever chugging away in the background. It may just be that my library is big. Plex says I have 114000 tracks / 1092 artists / 8463 albums. I hadn't seen ioztat before - I'm guessing that's better than zpool iostat because it goes down to the dataset level or something?

47 minutes ago, BVD said:

I think you might be talking about the dnode size setting (re: variable records) - this is something I haven't covered in the guides/docs yet, but maybe I should... You may also be talking about how it treats multiple inodes from differing objects when writing a record though, not entirely sure.

I'm talking about how ZFS will store in the default 128k block a block of up to 128k.  it's variable.  I believe it will literally turn a 128k block into a 64k one if it sees fit to do that.  I've always been suspicious about this though, particularly with databases.  The thing with ZFS is that just because one group of people told me, doesn't mean it's true - there are a lot of details to work through.  But I do know that ZFS has variable record sizes.

 

54 minutes ago, BVD said:

I could take a look with you at some point if you'd like (we can get a webex going or something after taking this to DMs) - I wonder if maybe you're hitting a frequency limitation (single threaded for sqlite, so boost speeds are important). You might also disable the zfs txg timeout value I'd noted in the postgres guide; it'll help with anything DB related, but should only be used if all zpools on your system are redundant (no stripe only pools). If this doesn't clear it up (it can be unset after the fact, it's a kernel level change), I'd want to start looking to see what our interrupt counts are like during slowness periods, arc hit rates, and so on. There's no reason a 1950x shouldn't be able to make lidarr with sqlite at least usable, even with ~40k songs (that's how many I've got currently at least, so it's the most I could comment on for now).

 

Just lemme know - it sounds like audio's a pretty big deal for you, would be happy to poke at it a bit with you if you like, maybe get a better experience out of it for you.

That's very kind thank you.  I may take you up on it in future as it would be fun to see your process of figuring it out.  Also, this is actually not on the 1950x (actually now it's a 2950x so I should update that), this is on the dual xeon machine.  Either way I'm basically unavailable for around 3 weeks due to things going on in my life so would have to be after that if it's anything more than comments in a forum.

Link to comment

114k tracks, you friggin MONSTER YOU!!! I'm super interested in taking a look at this, it sounds like an exciting challenge to me 🎉. Maybe start by getting some measurements to quantify the slowness, just something you can easily reproduce (like 'when I go to the activity page, it takes X seconds to load', stuff like that), then incrementally test a few things to see what we come out with. When you say super slow with operations... Are you using lidarr extended, referring to those scripts maybe, or do you mean the basic lidarr maintenance stuffs?

 

With that many tracks, what's your sqlite DB size? ~2GB or so maybe? At that size.. Man, there's a bunch of additional stuff to think about - the WAL size, the compile time parameters for max pages before a checkpoint, hell, even the option to cache all instead of metadata only as an option of last resort lol. Few starting points:
1. You want to vacuum that sucker regularly, especially after mass media changes - I'd start with this, especially if you've never done it before. With the container stopped of course, but just cd to the dir then

- "sqlite3 lidarr.db VACUUM;"

2. Definitely try the txg modification from the postgres side of things - as long as all your pools are redundant (z1/z2/z3/mirror) make the change, then start the container and evaluate.

- "echo 1 > /sys/module/zfs/parameters/zfs_txg_timeout"

3. In order to avoid NUMA issues, use lstopo, then pin the container to *just* one NUMA node worth of cores - at the very least, ensure only one CPU can operate the container's threads. Otherwise, you're certain to hit a significant IRQ penalty due to context switching... With the lidarr devs choosing to compile sqlite with the default variables (and hence, the db parameters set upon creation time are 'sqlite default'), we get hit multiple times by constraints that simply don't account for such massive DBs. For example, the default max number of pages prior to checkpointing the DB (a very 'expensive' operation) is 1000, and even with the maximum page size of 64K, that means every ~64MB worth of writes triggers a checkpoint. Assuming you're at ~2GB, you've got a minimum of ~30 checkpoints if you were to 're-write' the database, which is bonkers.
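
To put rough numbers on that checkpoint math, here's a minimal sketch in plain shell arithmetic. The function name `wal_checkpoints` is made up for illustration, the 2048 MiB size is just the ~2GB guess from above, and it assumes the DB runs in WAL mode with auto-checkpointing left at 1000 pages. 4 KiB is SQLite's stock default page size and 64 KiB is its maximum, so these two calls bracket the best and worst case.

```shell
#!/bin/sh
# wal_checkpoints DB_SIZE_MIB PAGE_SIZE_KIB AUTOCHECKPOINT_PAGES
# Hypothetical helper: estimates how many auto-checkpoints a full rewrite
# of the database would trigger (rounded up).
wal_checkpoints() {
    wc_db_kib=$(( $1 * 1024 ))          # database size in KiB
    wc_interval_kib=$(( $2 * $3 ))      # KiB written between checkpoints
    echo $(( (wc_db_kib + wc_interval_kib - 1) / wc_interval_kib ))
}

# Best case: maximum 64 KiB pages, 1000 pages per checkpoint
wal_checkpoints 2048 64 1000   # prints 33

# Assumed stock case: default 4 KiB pages, 1000 pages per checkpoint
wal_checkpoints 2048 4 1000    # prints 525
```

The real inputs depend on what `PRAGMA page_size;` and `PRAGMA wal_autocheckpoint;` report for your particular DB, so treat the output as a ballpark only.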

 

That's probably about as much as I'm comfortable recommending without actually looking at the thing. Beyond that, it's all on the table 😁 - everything from modifying the zfs dataset to remove sa xattr and go back to posixacl (they do have the potential to incur a performance penalty, and aren't necessary if you never access the data outside of the terminal shell of your unraid server - and I mean 'ever', so please, anyone else reading this after the fact, please leave xattr to sa unless you know you know better!) to throwing the database in a dedicated zvol, on down to crazy stuff like recompiling sqlite3 within the container image and rebuilding the database with new parameters to allow us to modify things like the sync behavior, that checkpoint option mentioned above, and a bunch of others.

 

Anyway, just lemme know if you try any of the above what the outcomes are like, and/or if you'd eventually like to take me up on the second set of eyes; you can probably tell, but I'm eager to take a whack if you end up being game for it hehehe.

 

_____

 

ioztat is basically 'iostat, but at the zfs fileset level' - super helpful when trying to track down latency issues especially. I briefly touch on it here

 

_____

 

As for the variable record size - this is absolutely true. However, there are a lot of other factors at play here as well - for instance, as you rsync'd the data in bulk, that data wrote everything in 64K chunks because it had 'one big-ass file' (your lidarr.db file) to write at once, so it was able to fill those 64K blocks you set the lidarr fileset to use (also resulting in 'right after the copy, it was the most performant it'd ever be with these settings' by proxy). On top of that, the extended attributes of the file itself can be anything from 255 bytes up to 64KB in Linux, so if we have xattr=sa, each file we've got extended attributes for has this Linux metadata (which zfs thinks of as 'just data', and has its own metadata for as it's contained within the file itself now), and *that* almost certainly won't be a full block...

 

My one concern here with the rsync is if those pages don't line up block-wise with the records written. Correcting this is as easy as vacuuming the db though, so the above options will cover that 👍
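
One narrow way to sanity-check the 'do pages line up with records' question is to compare the SQLite page size against the dataset's recordsize. This is only a sketch under assumptions: `records_align` is a made-up helper name, the dataset name in the comments is hypothetical, and 'lining up' is read here as nothing more than 'the recordsize is a whole multiple of the page size, so no page straddles two records'. Since both values are powers of two, that boils down to the page size not exceeding the recordsize.

```shell
#!/bin/sh
# records_align PAGE_SIZE_BYTES RECORDSIZE_BYTES
# Hypothetical helper: succeeds when the recordsize is a whole multiple of
# the SQLite page size.
records_align() {
    [ "$1" -gt 0 ] && [ $(( $2 % $1 )) -eq 0 ]
}

# On a real system the two inputs could come from (container stopped,
# dataset name is an assumption):
#   page_size=$(sqlite3 lidarr.db "PRAGMA page_size;")
#   recordsize=$(zfs get -H -p -o value recordsize tank/appdata/lidarr)
page_size=4096
recordsize=65536

if records_align "$page_size" "$recordsize"; then
    echo "aligned"
else
    echo "misaligned"
fi
```

It doesn't say anything about fragmentation inside the DB file itself, which is the part the vacuum takes care of.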

 

Anyway, this topic could get super long. Suffice it to say 'what you've read is very much true, but is wicked easy to take out of context, and often is taken as such by otherwise trusted online sources' lol. 

 

 

 

5 hours ago, BVD said:

 

Keep in mind, while filesets can be renamed (and hence, their directory structure changed) on the fly, the existing data will still have been written in the prior fileset configuration. If you change things like recordsize, xattr, anything about the data itself, you'll need to copy the data off/back (or send to a new fileset with those parameters) in order to apply that configuration to the data (only applied at the time of write).
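
A rough sketch of the 'copy the data off/back' route, under loudly stated assumptions: the dataset name `tank/appdata/lidarr`, the `_new`/`_old` suffixes, the `rewrite_dataset` and `run` helper names, and the idea that datasets are mounted at /mnt/<dataset name> are all hypothetical, so adjust to your own layout. It defaults to a dry run that only prints the commands, and the container should be stopped before doing any of it for real.

```shell
#!/bin/sh
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rewrite_dataset DATASET RECORDSIZE
# Creates a sibling dataset with the new recordsize, copies the files over
# (so every block is re-written under the new setting), then swaps names.
rewrite_dataset() {
    rd_src=$1
    rd_recordsize=$2
    rd_tmp="${rd_src}_new"

    run zfs create -o recordsize="$rd_recordsize" "$rd_tmp"
    # Assumes mountpoints follow /mnt/<dataset name>
    run rsync -aHAX "/mnt/$rd_src/" "/mnt/$rd_tmp/"
    run zfs rename "$rd_src" "${rd_src}_old"
    run zfs rename "$rd_tmp" "$rd_src"
}

# Dry run: prints the four commands without touching anything
rewrite_dataset tank/appdata/lidarr 64K
```

The `_old` dataset is left in place on purpose, so there's something to fall back to until the new copy has been checked.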

 

... I think I'm going to write a doc going over some more generalized zfs information at this point. I'd resisted the urge, as there are a couple detriments in my mind to doing so:

1. ZFS, while powerful, isn't for everyone - one has to have the desire/drive (and just as importantly, the TIME) to do some of their own background research and learning, or it's super easy for them to have a much worse experience with zfs than alternatives (or worse yet, cause themselves massive suffering by copy/pasting something they saw online :()

2. There's already so much out there on the basics, adding more stuff to the main page could mean that people just skip reading anything there altogether - if there's a short 'here's what you need to learn further', I feel like it's more likely to be read than if you add more than is absolutely necessary.

 

Both of these things though I think aren't necessarily going to be a problem here - I was helping out another fellow forum member with some nextcloud issues where I'd asked if he'd be interested in proof-reading some of it for me once I got the time to write it, and he seemed amenable to the idea (we got his data back, AND his nextcloud instance running again - WOOOT!!). I just didn't know how useful it would be... But it sounds like there would be enough use to serve a purpose at least ❤️ 

Yes, everything you've provided so far has been excellent and I will be leveraging my spare 500GB drive to recreate my pool today.

Edited by Partizanct

@Partizanct (and @Marshalleq if you're interested of course, more the merrier!) would you have time to give this a once over?

Why would we want ZFS on UnRAID? What can we do with it?

 

This is much less a 'technical thing can technically be done X way' doc than a 'here's why you might be interested in it, what problems it solves, and in what ways'. Given this, and that those types of reference material can often be interpreted numerous differing ways by different folks, I just want to make sure it's at least coherent, without going so deep into the weeds that someone newer to ZFS would just click elsewhere after seeing the Encyclopedia Britannica thrown at em as their 'introduction' lol.

 

Open to any and all feedback here - again, this isn't supposed to get super technical, and has a unique goal of explaining why someone should care, as opposed to the rest of them, which go over how to actually do the stuff once you've decided you *do* care enough to put forth the effort. So there's no such thing as 'bad' or 'useless' feedback for this type of thing imo.

 

Anyway, thanks for your time!

