About Marshalleq

  • Birthday October 17


  • Gender
  • URL
  • Location
    New Zealand
  1. I randomly came across this OpenZFS man page, which lists a device type specifically for storing dedup tables. I was not aware this device type was available. I guess it exists to split the dedup tables out from the other metadata and small file blocks that come with a special vdev, for those that want to do that - though I suspect that's pretty niche, as the special device type should offer additional performance improvements in most cases. ZFS is awesome.
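For anyone who wants to try the dedicated dedup device type, here is a minimal sketch of adding one as a mirror. The pool name `mypool` and both device paths are placeholders, not anything from the post. The `run` helper is only there so the script prints the command by default; set `APPLY=1` to execute it for real.

```shell
#!/bin/sh
# Dry-run helper: print the command unless APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Add a mirrored vdev of class "dedup" so the dedup tables are allocated
# there rather than on the main data vdevs.
# Pool and device names are hypothetical placeholders.
add_dedup_vdev() {
    pool=$1
    disk_a=$2
    disk_b=$3
    run zpool add "$pool" dedup mirror "$disk_a" "$disk_b"
}

add_dedup_vdev mypool /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B
```

`zpool add` also has its own `-n` flag, which shows the resulting layout without changing the pool, and is worth running first.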
  2. Hey, I went through a few of these tests. You can add the special vdev at any time, but you do have to recopy the data. A quick way I ended up figuring out how to do that was to first rename a dataset to a new name, then send/receive that back to the original name, then delete the renamed set. There's also a rebalance script I could dig up, but there are caveats to that, so I ended up just doing the rename. I've read that you can take a special vdev out again in certain circumstances, but to be honest it was not very clear and sounded scary - most people in that discussion concluded it wasn't for them. Remember the special vdev holds all the information about where the files are stored (I guess it's like ZFS FAT), so if it dies, the array is effectively dead, because it doesn't know how to find the files. Though again, only files that have been modified since the special vdev was added are on it, so I'm not sure if the whole pool would die or not.

     EDIT: From that other thread: "Supposedly, special vdevs can be removed from a pool, IF the special and regular vdevs use the same ASHIFT, and if everything is mirrors. So it won’t work if you have raidz vdevs or mixed Ashift."

     Honestly the best thread I found on it is below, with a fantastic set of comments and questions at the bottom of it, worth reading if you're considering doing it. The opening paragraph from the article: "Introduction. ZFS Allocation Classes: It isn’t storage tiers or caching, but gosh darn it, you can really REALLY speed up your zfs pool."

     From the manual: "Special Allocation Class. The allocations in the special class are dedicated to specific block types. By default this includes all metadata, the indirect blocks of user data, and any deduplication tables. The class can also be provisioned to accept small file blocks."

     Link

     Happy reading!
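The rename and send/receive trick described in this post could look roughly like the sketch below. The dataset name, the `_old` suffix and the `@rewrite` snapshot name are all made up for illustration. It needs enough free space to hold a second copy of the dataset until the old one is destroyed, and it deliberately stops short of the destroy step so the copy can be checked first. By default it only prints the commands; set `APPLY=1` to run them.

```shell
#!/bin/sh
# Dry-run helper: print the command unless APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Rewrite a dataset so its blocks are allocated again (and so can land on a
# newly added special vdev): rename it, then send/receive it back to the
# original name. All names here are hypothetical.
rewrite_dataset() {
    ds=$1                    # e.g. mypool/data
    old="${ds}_old"          # temporary name for the original
    snap="${old}@rewrite"    # snapshot used as the send source

    run zfs rename "$ds" "$old" || return 1
    run zfs snapshot -r "$snap" || return 1
    if [ "${APPLY:-0}" = "1" ]; then
        # Note: plain sh only sees the exit status of the receive side.
        zfs send -R "$snap" | zfs receive "$ds" || return 1
    else
        echo "zfs send -R $snap | zfs receive $ds"
    fi
    # Left as a manual step on purpose: check the new copy before deleting.
    echo "# verify $ds, then: zfs destroy -r $old"
}

rewrite_dataset mypool/data
```

The received dataset keeps the `@rewrite` snapshot, which can be destroyed once it is no longer needed.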
  3. Have you found some doc that says performance of dedup with a special vdev is bad? I mean, it's going to be slower than RAM in most cases, but that doesn't mean we will notice it or that it becomes unusable. The other link that is continuously posted above is ambiguous. I've heard otherwise, and that aligns with my experience. Or are you just speaking generally from educated guesses? (Genuine question.) I have mine with HDDs, which is probably why I don't notice it.
  4. @subivoodoo Great feedback! Did you by chance try the special vdev (or want to) to see what the difference is in terms of RAM usage for dedup? I figure for this test any small SSD would do (though typically you'd want it to be mirrored).
  5. Nice find on the GPU-P - I hadn't realised we could do that now!
  6. Nice to hear @subivoodoo, yeah, encryption, that will do it! There's a video online somewhere about a guy doing something similar for multiple machines in his house. He configured Steam data to be on a ZFS pool for all computers, and the dedup meant he only had to store one copy. Cool idea - I would have thought the performance was bad, but apparently not. How's your RAM usage? @jortan I've seen you've replied, but I'm not going to read it, sorry - I can see it's just more of the same and I don't see the value for everyone else of having a public argument. I get that differences of opinion get annoying and it feels good to be right, so let's just say you're right. Have a great day and don't stress about it.
  7. Sigh, yes, it absolutely is, the original poster declared that a home scenario was what they were working on and you seem to keep comparing it to disaster scenarios. No-one here is saying don't be careful, don't plan, backup your data or whatever applies, people need to be given some credit, they're not all morons. LOL, if I answer this it's going to get into a flame war, so I'm just going to leave it (and the remainder of the points). The poster has the information and two opinions on it. I have given actual evidence, you have given your experience, which I'm sure is also extensive. They can make their own decision as to whether this works for their lab, or whatever they end up doing. Thanks for the info, have a great day. Marshalleq.
  8. Just saw the reply from @jortan (my previous reply was just foreseeing some of the questions and trying to be helpful). None of my comments were directed at you, just at the misinformation lying around the web - which is what you find when you google and get old documents. Some of the newer stuff now reflects the newer state, but unfortunately some of it is still being written by people who haven't tried it for quite a while and are repeating out-of-date experiences - special vdevs in particular being the main case of change here. I do believe special vdevs hold all the DDT for the pool; it even says that on the page you linked. Except for when it's full, of course. If you read that page a little deeper, it says this thrashing happens when the special vdev gets full, not 'constantly' as you say above - this is because it will start putting the DDT in the main pool instead of the special vdev once it starts getting full. Of course, this is talking about a busy corporate environment that is worrying about IOPS all the time; for the average person playing around at home (something that @subivoodoo seems to indicate is their scope, i.e. "My aproach is safe some space on clients... and play with IT stuff 😁 ") this would not be an issue. In any case, I have very high IOPS requirements and I am constantly marvelling at how well it does, considering I'm just running RaidZ1 on everything, have dedup on, my special vdev is running on great but older Intel SSDs that are actually quite slow, and there's the mail server, the various web services, automation and undoubtedly a ton of misaligned cluster sizes which are killing it, etc. It's under constant use and really it's incredible. Can we argue that in a corporate environment we could get more performance? Absolutely, but if it were one, we wouldn't be running unraid, it wouldn't all be on one box, and a whole bunch of other things.
     Sorry for the laborious post, but I think it's fair to say that the dedup scaremongering that's out there needs some balance - again, not directed at you. Marshalleq. PS, I've tried that lancache and ran it for a few years; it works well sometimes, others not so much. Definitely worth a try though.
  9. Some stats on my setup to give you an indication. I run two configs:
     1 - 4x480G SSDs in RaidZ1 - this hosts docker and virtual machines. I have only 3 VMs at present, totalling about 50G. I have a bunch of dockers, but only the VMs are deduped. There is 653G free and my dedup ratio across the whole pool is 1.11 (i.e. 11%).
     2 - 4x16TB HDDs in RaidZ1 with a 2x150G mirror for a special vdev, with small blocks up to 32k enabled. Most of the pool is unique data that cannot be deduped. There is 598G free on the array and 70G free on the special vdev. I dedup my backups folder, documents folders, isos and temp folder, which total about 530G, and I am getting a 1.14 dedup ratio across the whole pool, which is about 14%.
     I think these numbers are pretty good. I thoroughly tested the memory usage before and after for the RaidZ1 array, as I was unsure if all or some of it would go to the special vdev. I noticed no difference at all. I did the same for the virtual machines on the array without the special vdev and, while this was less scientific, also noticed no perceptible difference (I mention it because so many people cry out that dedup uses too much RAM). Now, I do have 96G of RAM in this system; however, before enabling dedup on anything the RAM usage was sitting around 93-96% full. It didn't change. I think this speaks well to the issue, as I would have had big failures if it did use a lot of RAM. I've been running it like this for a long time now with no issues yet. I hope that helps!
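A note on reading those ratios: the zpool documentation describes `dedupratio` as a multiplier, where 1.76 means 1.76 units of data were stored for 1 unit of disk space consumed. Under that definition, the share of logical data that never had to be written is `1 - 1/ratio`, which comes out slightly below the "ratio minus one" figure. A small sketch of the arithmetic, with a made-up function name:

```shell
#!/bin/sh
# Convert a dedup ratio (data stored / space consumed) into the percentage
# of the logical data that did not need its own space on disk.
# The function name is hypothetical; the formula is 1 - 1/ratio.
dedup_saving_pct() {
    awk -v r="$1" 'BEGIN { printf "%.1f\n", (1 - 1 / r) * 100 }'
}

dedup_saving_pct 1.11   # prints 9.9
dedup_saving_pct 1.14   # prints 12.3
```

So a ratio of 1.11 means 11% more data stored than space used, or roughly a 9.9% saving measured against the logical data.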
  10. Yeah, I'm just using image files on unraid. I found great returns on virtual machines, especially when based on the same install iso - but even on different ones. I found reasonable returns on isos and documents. I probably have a few duplicated isos with not so obvious names, so it saves having to sort that out. I don't think there's much benefit in dockers, but I could be wrong. And that's correct, I don't use ZVOLs. I've tried them and found them at best to be non-advantageous and a lot less flexible. I don't yet understand why anyone would use them really, except maybe for iSCSI targets.
  11. I'm using dedup quite successfully. What I've learnt is that most people who say it isn't worth it either haven't looked at it for a while (so are just continuing on old stories without checking) or are not applying it to the right type of data. In my case I'm running a special vdev. It works extremely well for the content that can be deduped (such as VMs). I've never noticed any extra memory being used either, as I do believe this is handled by the special vdev. I'm using unraid - I tried TrueNAS Scale but its containerisation is just awful - hopefully they figure out what market they're aiming for there and fix their strategy in a future version not too far away.
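A rough sketch of the kind of setup described in these posts: a mirrored special vdev, small blocks up to 32K directed to it, and dedup enabled only on a dataset that suits it. The pool, device and dataset names are placeholders, and the property only affects data written after it is set. The script prints the commands by default; set `APPLY=1` to run them.

```shell
#!/bin/sh
# Dry-run helper: print the command unless APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Add a mirrored special vdev, send small blocks to it, and turn dedup on
# for one dataset only. All names are hypothetical placeholders.
setup_special_vdev() {
    pool=$1
    ssd_a=$2
    ssd_b=$3
    dataset=$4
    run zpool add "$pool" special mirror "$ssd_a" "$ssd_b"
    run zfs set special_small_blocks=32K "$pool"
    run zfs set dedup=on "$dataset"
    # Per-vdev capacity, including how full the special vdev is.
    run zpool list -v "$pool"
}

setup_special_vdev mypool /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B mypool/vms
```

Keeping an eye on the special vdev's free space matters here, since the posts above note that the dedup tables spill back to the main pool once it fills up.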
  12. I would suggest that you log this upstream as it sounds like a bug.
  13. I just put all mine in /mnt. I am not sure that you can have two mount points for one pool, but ZFS is very powerful, so perhaps that's a feature I've not seen before. You can change the mount point of an existing ZFS pool with: zfs set mountpoint=/myspecialfolder mypool. As for getting your drives to show up as ZFS, I suspect your restore has lost you the Unassigned Devices / Plus plugin? Note that this is not the same as the unassigned devices heading you have above. At least that's what I think I'm seeing in your screenshot.
  14. To be honest I'm not sure I'm following you so much on the SMB and security side. Everything else was very well outlined though. I think you're saying you use a link from ZFS to the unraid share, which to me sounds absolutely horrible. So I only know the way I've done it, which is outlined below. SMB permissions with ZFS are done manually via smb-extra.conf (which is in the /boot/config/smb directory). The unraid SMB GUI does not like anything outside of its own array (I honestly don't know why you'd put in this artificial restriction, but they do). I've always preferred the console method anyway as it's more powerful. So the point here is that you're using the same SMB system that unraid uses, but you're bypassing the artificial restriction of their GUI. At least this is how I do it; someone else might have a better way. Here's a typical one, then a more advanced one to help you out.

      [isos]
      path = /mnt/Seagate48T/isos
      comment = ZFS isos Drive
      browseable = yes
      valid users = mrbloggs
      write list = mrbloggs
      vfs objects =

      [pictures]
      path = /mnt/Seagate48T/pictures
      comment = ZFS pictures Drive
      browseable = yes
      read only = no
      writeable = yes
      oplocks = yes
      dos filemode = no
      dos filetime resolution = yes
      dos filetimes = yes
      fake directory create times = yes
      csc policy = manual
      veto oplock files = /*.mdb/*.MDB/*.dbf/*.DBF/
      nt acl support = no
      create mask = 664
      force create mode = 664
      directory mask = 2775
      force directory mode = 2775
      guest ok = no
      vfs objects = fruit streams_xattr recycle
      fruit:resource = file
      fruit:metadata = netatalk
      fruit:locking = none
      fruit:encoding = private
      acl_xattr:ignore system acl
      valid users = mrbloggs
      write list = mrbloggs

      Also, about the access denied: with ZFS on unraid you do have to go through and set nobody:users on each of these shares at the file level. So basically:

      # chown -Rfv nobody:users /zfs

      Who knows, perhaps this is all you need to do to get your method to work. Good luck!
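After editing share definitions like the ones above, the remaining steps can be sketched as: fix ownership, let Samba's own `testparm` check the file, then ask the running daemons to reload. The config path follows the directory named in the post and the share root is taken from the example paths, so both are assumptions to adjust. On unraid, restarting Samba through its own scripts may be the more usual route. The script prints the commands by default; set `APPLY=1` to run them.

```shell
#!/bin/sh
# Dry-run helper: print the command unless APPLY=1 is set in the environment.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Set ownership on the shared tree, syntax-check the Samba config file,
# then tell the running Samba daemons to re-read their configuration.
# Both arguments are assumptions, not fixed locations.
apply_share_config() {
    conf=$1
    share_root=$2
    run chown -R nobody:users "$share_root" || return 1
    run testparm -s "$conf" || return 1
    run smbcontrol all reload-config
}

apply_share_config /boot/config/smb/smb-extra.conf /mnt/Seagate48T
```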
  15. Well, I particularly liked the user centric approach of the Time Machine style - but the more options the better!