Partizanct
Posted August 21, 2022

1 hour ago, BVD said:

@Partizanct (and @Marshalleq if you're interested of course, the more the merrier!) would you have time to give this a once-over? "Why would we want ZFS on UnRAID? What can we do with it?" This is much less a 'technical thing can technically be done X way' doc than a 'here's why you might be interested in it, what problems it solves, and in what ways'. Given that, and that this type of reference material can be interpreted in numerous different ways by different folks, I just want to make sure it's at least coherent, without going so deep into the weeds that someone newer to ZFS would click elsewhere after having the Encyclopedia Britannica thrown at them as their 'introduction' lol. Open to any and all feedback here - again, this isn't supposed to get super technical, and has the unique goal of explaining why someone should care, as opposed to the rest of the docs, which cover how to actually do the stuff once you've decided you *do* care enough to put forth the effort. So there's no such thing as 'bad' or 'useless' feedback for this type of thing, imo. Anyway, thanks for your time!

I would be interested in hearing your thoughts on the creation of the top-level pool itself and what settings you would recommend (since you can modify compression levels etc. per dataset).
BVD (Author)
Posted August 21, 2022

@Partizanct That was actually intentional on my part 😅 This is primarily because there's precious little that applies to 'everything'.

For instance, everyone says 'ashift=12' for the pool, right? But that means our physical-layer block size is set to 4K, and there's a huge amount of NAND out there that's 8K, even some that's 16K, so a blanket value leaves a lot to be desired. Or what about setting dnodesize to auto? This is great, but it really works best with xattr set to sa, and if you're not accessing the data primarily over NFS/iSCSI/SMB, you could actually lose some (not much, likely, but some) performance. Heck, setting xattr to sa also means that pool is Linux-only now, losing portability to BSD kernels (and others). I'd hate to recommend something like that too broadly, and then have the user find out years later, when they try to move the pool to some hot new BSD-based system with all the new bells and whistles, that they simply can't, because some guy online said it was a good idea and they never looked any further into it, right? Better that those values get researched, their implications understood, and folks choose what's best for them and their specific situation. Recommendations differ for HDD vs NAND as well.

The other part of my reasoning goes back to what I feel is required for someone to be successful with ZFS (the will to learn, the ability to research, and the time to invest in both). For this one doc at least, the idea isn't to give someone an all-inclusive summary of the best way to use ZFS on UnRAID overall, but to spark something that gets them into the game if they read it and find themselves thinking 'this could've saved me hours last week on X, I wonder what else it can do...'

I do give more explicit detail where possible though - for instance, postgres has its dataset configuration laid out, with explanations of why for each setting, same as I hope to keep doing for each other app as I find time to translate them from my deployment notes to the docs github.

I mentioned there's precious little I'd say applies globally, but what does boils down to:

atime = off
compression = at least something more than 'off' (whether to use lz4 or zstd still kinda depends - if someone's on old Westmere or Nehalem procs, lz4 is probably it for them)

Everything else has sane defaults for most systems, with recommendations for specific deployment needs...

I'm sorry in advance - I know this isn't super helpful in and of itself! I just hope my reasoning on why I did it this way makes some kind of sense at least!
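To make those couple of 'globals' a bit more concrete, here's a minimal sketch of how they'd be applied - the pool/dataset names ('tank', 'tank/appdata') and the mirror layout are placeholders, and the values are examples to research rather than recommendations:

# check what sector size your drives actually report before settling on ashift
lsblk -o NAME,PHY-SEC,LOG-SEC

# ashift is fixed per vdev at creation time, so it has to be right up front
zpool create -o ashift=12 tank mirror sdb sdc

# the per-dataset properties discussed above
zfs set atime=off tank
zfs set compression=lz4 tank             # or zstd on newer CPUs
zfs set xattr=sa tank/appdata            # Linux-only - skip if the pool may move to BSD later
zfs set dnodesize=auto tank/appdata      # pairs best with xattr=sa

# verify what a dataset actually inherited or overrode
zfs get atime,compression,xattr,dnodesize tank/appdata

Child datasets inherit from the parent unless overridden, which is why compression levels and the like can be tuned per dataset even though the pool-wide defaults are set once.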
dirkinthedark
Posted November 27, 2022

I have no idea what any of that stuff in the guide is, but I have a feeling I need to know it. Thank you - I'm sure I'll be back here soon.
ensnare
Posted January 25

I just switched from TrueNAS Scale to Unraid and am sharing my ZFS datasets via Samba shares defined in smb-extra.conf. Read performance over Samba went from about 1.1GB/s to 100-300MB/s with tremendous variance, and I can't quite pinpoint the cause of the slowness. This is a 15-wide RAIDZ2 pool on a machine with 512GB RAM. Would anyone happen to have sane/optimized ZFS module settings (modprobe.d/zfs.conf) that I can try? Right now my settings look like this - any input appreciated on how to get read speeds back up to the full 10-gbit line rate.

/boot/config/smb-extra.conf

[global]
server multi channel support = yes
aio read size = 1
aio write size = 1
local master = yes
preferred master = yes
dead time = 10
max smbd processes = 1000
vfs objects = catia shadow_copy2 fruit streams_xattr
shadow: snapdir = .zfs/snapshot
shadow: sort = desc
shadow: format = -%Y-%m-%d-%H%M
shadow: snapprefix = ^zfs-auto-snap_\(frequent\)\{0,1\}\(hourly\)\{0,1\}\(daily\)\{0,1\}\(monthly\)\{0,1\}
shadow: delimiter = -20
fruit:model = MacSamba
fruit:posix_rename = yes
fruit:veto_appledouble = no
fruit:nfs_aces = no
fruit:wipe_intentionally_left_blank_rfork = yes
fruit:delete_empty_adfiles = yes
fruit:resource = file
fruit:metadata = stream
fruit:encoding = native
fruit:advertise_fullsync = true
fruit:aapl = yes
log file = /var/log/samba/%m.log
max log size = 10000
log level = 1

[media]
path = /mediapool/media
browseable = yes
guest ok = yes
writeable = yes
read only = no
create mask = 0777
directory mask = 0775
delete veto files = Yes
veto files = /*.DS_Store/.apdisk/.TemporaryItems/.windows/.mac/
zfsacl:acesort = dontcare

/boot/config/modprobe.d/zfs.conf

options zfs l2arc_rebuild_enabled=1
options zfs zfs_prefetch_disable=1
options zfs l2arc_noprefetch=0
options zfs l2arc_write_max=524288000
options zfs l2arc_headroom=12
options zfs zfs_arc_max=350000000000
options zfs zfs_nocacheflush=1

# increase these so scrub/resilver completes more quickly, at the cost of other work
options zfs zfs_vdev_scrub_min_active=24
options zfs zfs_vdev_scrub_max_active=64

# sync writes
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32

# sync reads (normal)
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32

# async reads: prefetcher
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32

# async writes: bulk writes
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32

options zfs zfs_dirty_data_max_percent=40
options zfs zfs_txg_timeout=15

# default: 32768
options zfs zfs_immediate_write_sz=131072
JorgeB
Posted January 25

8 minutes ago, ensnare said:

Read performance over Samba went from about 1.1GB/s to 100-300MB/s

Did you run a single-stream iperf test in both directions to confirm LAN bandwidth is OK?
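For anyone following along, a minimal sketch of that test - run the server side on the Unraid box, then drive it from the client in both directions; the IP address here is a placeholder:

# on the Unraid host
iperf3 -s

# on the client: single stream toward the server, then reversed with -R
iperf3 -c 192.168.1.10 -t 30
iperf3 -c 192.168.1.10 -t 30 -R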
BVD (Author)
Posted January 26

@ensnare before diving too deeply into the configuration, my recommendation (as @JorgeB alluded to above) would be to narrow down the source a bit (confirming whether this pool was originally created on SCALE, not Core, would also be useful). Can you go over what testing you've done so far to better pinpoint this?

In general, assuming you've done nothing yet, a good start would be:

- Do you experience the same throughput with a generic IO stream? I'd start with an extended fio run using fully randomized IO that bypasses the ARC/L2ARC (I don't expect it's storage-related, but it doesn't take much time and rules out a ton of other junk), then hit it with bi-directional iperf as well, and finally NFS. Assuming you see 900+ MB/s for each of the above, THEN you can start focusing on SMB. (A rough example fio command is sketched below.)
- You've got a ton of additional config added for Samba, so I'd first try commenting all of it out and simply copying a file over, then see what (if anything) changes, to get a better idea of what direction to take this.
- Is *all* SMB traffic equally impacted, regardless of which host/application is attempting to write? Are read requests similarly impacted?

Really the biggest thing here is to do some troubleshooting to narrow the focus of your analysis/investigation. If you could share what you've already done on that front, it'll probably help us give you a better idea of where to go next.
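Something like the following is the general shape of that fio run - the directory, size, and runtime are placeholders, so treat it as a sketch rather than a prescription:

# sketch only - adjust the directory and sizes to your pool; with a very large ARC
# you'd want the total footprint (size x numjobs) well past zfs_arc_max so reads
# aren't simply served from RAM
fio --name=randrw-test --directory=/mediapool/fio-test \
    --ioengine=libaio --rw=randrw --bs=128k --size=32G \
    --numjobs=4 --iodepth=16 --runtime=300 --time_based --group_reporting

Running it once locally against the pool and once from a client against the SMB mount helps separate the storage question from the network/Samba one.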
ensnare
Posted January 28

On 1/26/2023 at 8:10 AM, BVD said:

@ensnare before diving too deeply into the configuration, my recommendation (as @JorgeB alluded to above) would be to narrow down the source a bit...

What is the best way to install fio for benchmarking? It looks like the NerdTools package isn't active anymore, and I can't find a Slackware 15 binary.

On 1/25/2023 at 6:58 AM, JorgeB said:

Did you run a single-stream iperf test in both directions to confirm LAN bandwidth is OK?

Yes. iperf3 maxes out at 9.2Gbps in both directions with a single stream.
BVD (Author)
Posted January 28

29 minutes ago, ensnare said:

What is the best way to install fio for benchmarking? It looks like the NerdTools package isn't active anymore, and I can't find a Slackware 15 binary.

Yes. iperf3 maxes out at 9.2Gbps in both directions with a single stream.

Ever since the nerd/dev packs went the way of the dodo, I've just set up mirrors for all the tools I've ended up using and compiled my own. So while I'm not really certain anymore where the Slackware stuff is, you can just build from source instead: https://git.kernel.dk/cgit/fio/

If you want to avoid mucking around with the hypervisor, you can always use a container of course. I pushed up a little container to github that I've used in the past in such situations, in case it's helpful - just clone the repo and build, then run the command noted: https://github.com/teambvd/docker-alpine_fio

Just be sure you've cd'd into the mountpoint path for your SMB share prior to running, to ensure the test is valid 👍
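If it helps, the build-from-source route is roughly the usual configure/make dance - this assumes a working compiler toolchain on the box (or inside a build container), and uses fio's GitHub mirror as the clone source since the exact clone URL behind the cgit page above may differ:

# build fio from source (toolchain assumed present)
git clone https://github.com/axboe/fio.git
cd fio
./configure
make
make install    # or just run ./fio straight from the build directory

# for the container route, follow the build/run steps in the docker-alpine_fio README linked above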
BVD (Author)
Posted September 13

On 8/20/2022 at 6:58 PM, Marshalleq said:

... My main gripes are not so much with the web pages, more to do with load times, e.g. startup from Docker and the forever chugging away in the background. It may just be that my library is big. Plex says I have 114000 tracks / 1092 artists / 8463 albums. I hadn't seen ioztat before - I'm guessing that's better than zpool iostat by going down to the dataset level or something? ...

So I apparently finally hit the tipping point towards experiencing what you were seeing with Lidarr - it seems to be somewhere in the 65-70k track range, where the way the queries to the DB are formulated means the sqlite DB just absolutely chonks in protest. I finished converting Lidarr over to postgres last night, and while it's still sub-optimal IMO from a query perspective, pg is basically able to just brute-force its way through. Start-up times are cut down to maybe a tenth of what they were previously, and all UI pages populate within a couple of seconds at most 👍