ZFS plugin for unRAID


steini84

Recommended Posts

2 minutes ago, jortan said:

 

This seems about what I would expect.  You're not streaming data directly and uninterrupted to your spinning rust as you would be in a RAID0-like configuration.  For every write to one disk, ZFS is having to store that data and metadata redundantly on another disk. Then the first disk gets interrupted because it needs to write data/metadata to provide redundancy for another disk. You're not streaming data neatly in a row, there are seek times involved.

 

If you want performance with spinning rust, get more spindles and ideally switch to RAID10 (mirrored pairs).
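As an illustration of the mirrored-pairs layout mentioned above, a striped-mirrors (RAID10-style) pool can be created like this; the pool name and device paths are placeholders:

```shell
# Two mirrored pairs striped together (RAID10-style).
# "tank" and the /dev/sdX paths are placeholders for your own devices.
zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde

# More spindles = more throughput: another mirrored pair can be added later.
zpool add tank mirror /dev/sdf /dev/sdg
```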

 

Thanks @jortan. While you replied, I was about to edit my original post to add the output of "zpool iostat" for the two zpool layouts.

 

3-wide striped array (no parity):

              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs         21.9G  10.9T      0    727      0   728M
zfs         21.9G  10.9T      0    722      0   722M
zfs         21.9G  10.9T      0    733      0   648M
zfs         24.6G  10.9T      0  1.05K      0   693M
zfs         24.6G  10.9T      0    715      0   716M
zfs         24.6G  10.9T      0    753      0   738M
zfs         27.2G  10.8T      0  1.03K      0   647M
zfs         27.2G  10.8T      0    734      0   734M
zfs         27.2G  10.8T      0    722      0   723M
zfs         27.2G  10.8T      0    735      0   719M
zfs         29.9G  10.8T      0  1.05K      0   630M
zfs         29.9G  10.8T      0    724      0   725M
zfs         29.9G  10.8T      0    728      0   728M
zfs         29.9G  10.8T      0    736      0   737M
zfs         32.5G  10.8T      0  1.06K      0   655M
zfs         32.5G  10.8T      0    742      0   743M
zfs         32.5G  10.8T      0    739      0   740M
zfs         32.5G  10.8T      0    734      0   735M
zfs         35.1G  10.8T      0   1014      0   612M
zfs         35.1G  10.8T      0    727      0   727M
zfs         35.1G  10.8T      0    727      0   728M
zfs         35.1G  10.8T      0    743      0   728M
zfs         37.8G  10.8T      0    925      0   588M
zfs         37.8G  10.8T      0    705      0   706M
zfs         37.8G  10.8T      0    709      0   709M
zfs         37.8G  10.8T      0    713      0   714M
zfs         40.5G  10.8T      0  1.04K      0   657M
zfs         40.5G  10.8T      0    742      0   743M
zfs         40.5G  10.8T      0    730      0   716M
zfs         40.5G  10.8T      0    724      0   643M
zfs         43.1G  10.8T      0    978      0   685M
zfs         43.1G  10.8T      0    719      0   720M
zfs         43.1G  10.8T      0    721      0   722M

 

4-wide raidz1 array:

              capacity     operations     bandwidth 
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
zfs         96.3G  14.5T      8  6.36K   120K   544M
zfs         96.3G  14.5T      9  4.95K   124K   421M
zfs         96.2G  14.5T      8  5.44K   120K   546M
zfs         96.2G  14.5T      8  6.34K   120K   516M
zfs         96.2G  14.5T      8  6.53K   120K   568M
zfs         96.2G  14.5T      8  5.46K   120K   566M
zfs         96.2G  14.5T      8  5.94K   120K   583M
zfs         96.2G  14.5T     10  5.94K   148K   539M
zfs         96.2G  14.5T      9  5.44K   132K   528M
zfs         96.3G  14.5T      8  4.34K   120K   375M
zfs         96.3G  14.5T      5  6.08K  79.9K   525M
zfs         96.3G  14.5T      8  6.33K   120K   598M
zfs         96.3G  14.5T      8  5.98K   120K   565M
zfs         96.3G  14.5T      8  5.82K   120K   578M
zfs         96.3G  14.5T     11  5.75K   160K   550M
zfs         96.3G  14.5T      8  5.97K   120K   557M
zfs         96.3G  14.5T      8  6.00K   120K   566M
zfs         96.3G  14.5T     10  3.98K   128K   285M
zfs         96.3G  14.5T      5  4.55K  79.9K   561M
zfs         96.3G  14.5T      8  5.79K   120K   506M
zfs         96.3G  14.5T      8  6.23K   120K   495M
zfs         96.3G  14.5T      8  6.21K   120K   522M
zfs         96.3G  14.5T      5  5.96K  79.9K   577M

 

Before reading your reply, I thought that the reason for the lower performance was the reads interleaved between the writes.

I know very little about ZFS, but I wasn't expecting a write operation to require reads. Does this have something to do with the ZIL?

Link to comment
Posted (edited)
40 minutes ago, Andrea3000 said:

Before reading your reply, I thought that the reason for the lower performance was the reads interleaved between the writes.

I know very little about ZFS, but I wasn't expecting a write operation to require reads. Does this have something to do with the ZIL?

No, that's just how the system works.

 

You use raidz for data that absolutely must not be lost or altered. If you want the best performance for your files, use RAID-0, or RAID-10 for redundancy: two striped vdevs, or two striped mirrors.

RAID-0 is faster than RAID-1, which is faster than RAIDZ-1, which is faster than RAIDZ-2, which is faster than RAIDZ-3.

Edited by gyto6
Link to comment
Posted (edited)

What you can do is use a special device to boost your system. The special_small_blocks property lets you store not only metadata on it, but data as well.

Relative to your dataset's recordsize value, special_small_blocks is the threshold block size for including small file blocks in the special allocation class: blocks smaller than or equal to this value are assigned to the special allocation class, while larger blocks go to the regular class.

 

For my application datasets, I set special_small_blocks to the same value as recordsize, so my applications run entirely on my NVMe drives.
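A sketch of that configuration; the pool, dataset, and device names are examples, and special_small_blocks must not exceed the dataset's recordsize:

```shell
# Mirrored special vdev for the metadata/small-block allocation class
# ("tank" and the NVMe device paths are examples).
zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1

# With special_small_blocks equal to recordsize, every data block of
# this dataset is small enough to land on the special (NVMe) vdev:
zfs create -o recordsize=64K -o special_small_blocks=64K tank/apps
```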

 

As a result, the metadata of my SATA pool is stored on the NVMe drives for faster indexing, and it isn't disturbed by the applications doing the bulk of the R/W operations. I use two mirrored special devices, because if the special vdev fails, the whole pool is lost.

 

So my personal files, which must not be altered, have parity and checksumming running in their pool, and that accounts for most of the pool's activity since those files are rarely accessed. My always-running applications don't interfere with this, as they run on the more efficient NVMe special drives.

An app can be reinstalled; a corrupted picture cannot be recovered.

 

Btw, I don't risk any data loss: all my devices have Power Loss Protection (PLP) and sit behind a UPS set to shut the server down gracefully after 5 minutes without power. Finally, my data is saved every hour to my NAS elsewhere in my house, and to my SharePoint off-site.

 

Backups are a must to keep your data safe. Don't expect your system to take care of your data without any trouble.

 

What I meant is that you won't optimize your system without caveats or risks. You ALWAYS have to imagine the worst, think through everything it implies, then find a solution. You'd be better off leaving your drives in raidz for now and considering NVMe drives for the applications that require a lot of I/O.

Edited by gyto6
Link to comment
11 hours ago, gyto6 said:

What you can do is use a special device to boost your system. The special_small_blocks property lets you store not only metadata on it, but data as well.

Relative to your dataset's recordsize value, special_small_blocks is the threshold block size for including small file blocks in the special allocation class: blocks smaller than or equal to this value are assigned to the special allocation class, while larger blocks go to the regular class.

 

For my application datasets, I set special_small_blocks to the same value as recordsize, so my applications run entirely on my NVMe drives.

 

As a result, the metadata of my SATA pool is stored on the NVMe drives for faster indexing, and it isn't disturbed by the applications doing the bulk of the R/W operations. I use two mirrored special devices, because if the special vdev fails, the whole pool is lost.

 

So my personal files, which must not be altered, have parity and checksumming running in their pool, and that accounts for most of the pool's activity since those files are rarely accessed. My always-running applications don't interfere with this, as they run on the more efficient NVMe special drives.

An app can be reinstalled; a corrupted picture cannot be recovered.

 

Btw, I don't risk any data loss: all my devices have Power Loss Protection (PLP) and sit behind a UPS set to shut the server down gracefully after 5 minutes without power. Finally, my data is saved every hour to my NAS elsewhere in my house, and to my SharePoint off-site.

 

Backups are a must to keep your data safe. Don't expect your system to take care of your data without any trouble.

 

What I meant is that you won't optimize your system without caveats or risks. You ALWAYS have to imagine the worst, think through everything it implies, then find a solution. You'd be better off leaving your drives in raidz for now and considering NVMe drives for the applications that require a lot of I/O.

 

Thank you.

 

I experimented a bit with an NVMe special vdev, and while performance increased slightly, it isn't worth the effort (and risk) for me.

I got the SATA cables I was expecting and brought all 8 disks online in a raidz2 pool. With a 1M recordsize I can saturate the 10GbE bandwidth with sequential reads or writes.

For random reads/writes I'm around 180-200MB/s with a 1M recordsize. I'm pretty happy with that, and I'll probably stick with raidz2 for extra peace of mind.
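For reference, that kind of setup can be sketched as follows; the pool name and device paths are placeholders:

```shell
# 8-wide raidz2: any two disks can fail ("tank" and devices are placeholders).
zpool create tank raidz2 /dev/sd[b-i]

# Large records favour big sequential transfers over 10GbE:
zfs set recordsize=1M tank
```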

 

Thank you all for the valuable advice, much appreciated.

Link to comment

The other thing you can do is

zfs set sync=disabled poolname

 

This forces all writes to be asynchronous. It doesn't risk data corruption, but it does risk the last ~5 seconds of your data in the case of a power failure, which could lead to data loss in some scenarios. If you happen to be moving data to the array at the time, the sender has been told the data is moved, so the source is deleted, but the data won't actually be written to the destination for 0-5 seconds.

 

You could see significant performance benefit on a busy pool though - particularly one seeing lots of small synchronous writes.
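Note that sync is a per-dataset property, so the trade-off can be limited to datasets where losing a few seconds of writes is acceptable; the dataset names below are placeholders:

```shell
# Disable sync writes only on a scratch dataset (placeholder name):
zfs set sync=disabled tank/scratch

# Revert to the default (honour applications' sync requests):
zfs set sync=standard tank/scratch

# Inspect the current value:
zfs get sync tank/scratch
```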

Link to comment
6 hours ago, jortan said:

The other thing you can do is

zfs set sync=disabled poolname

 

This forces all writes to be asynchronous. It doesn't risk data corruption, but it does risk the last ~5 seconds of your data in the case of a power failure, which could lead to data loss in some scenarios. If you happen to be moving data to the array at the time, the sender has been told the data is moved, so the source is deleted, but the data won't actually be written to the destination for 0-5 seconds.

 

You could see significant performance benefit on a busy pool though - particularly one seeing lots of small synchronous writes.


Thanks @jortan.

Most of my use cases involve only asynchronous writes.

In addition, I have an Intel Optane P1600X that I'm using as a SLOG device to handle synchronous writes (it has power loss protection).

Therefore, I'm more inclined to keep synchronous writes enabled; my priority is not to lose any data.
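For reference, a separate log (SLOG) device like that is attached with zpool add; the pool name and device paths are placeholders:

```shell
# Attach the Optane as a SLOG; only synchronous writes are logged here
# before being flushed to the main vdevs (names are placeholders).
zpool add tank log /dev/nvme0n1

# Optionally mirror the SLOG to protect in-flight sync writes:
# zpool add tank log mirror /dev/nvme0n1 /dev/nvme1n1
```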

Link to comment

Sorry, I'm back with another ZFS related question.

 

I'm trying to access the ZFS snapshots via Samba on macOS.

I know that for Windows I can use the shadow_copy2 vfs object to make the snapshots appear as shadow copies.

 

But what about macOS?

I found online that this can be done in TrueNAS via the zfsacl vfs object:

https://www.truenas.com/community/threads/how-to-access-zfs-snapshots-over-smb.69449/ 

but this doesn't appear to be available in unRAID.

 

Is there a workaround?
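For context, the Windows-side shadow_copy2 setup mentioned above typically looks roughly like this in smb.conf; the share path and especially the shadow:format string are assumptions that must match how your snapshots are actually named:

```ini
[isos]
   path = /zfs-sas/isos
   vfs objects = shadow_copy2
   shadow: snapdir = .zfs/snapshot
   shadow: sort = desc
   ; must match your snapshot naming scheme (assumed example):
   shadow: format = zfs-auto-snap_%Y-%m-%d-%H%M
```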

Link to comment

Hello,

I have a pool I configured a couple of years ago with 5 drives in a SilverStone CS380 with 8 bays. I would like to fill the rest of the bays, which I understand requires me to shrink the pool to only 4 drives. I attempted to remove a device from the pool, but was met with the 'operation not supported on this type of pool' message.

I'm running zfs-2.0.6-1 and zfs-kmod-2.0.6-1

Here is the status of my pool:

zpool status
  pool: dumpster
 state: ONLINE
  scan: scrub repaired 0B in 05:05:24 with 0 errors on Thu Jun  2 05:00:25 2022
config:

        NAME        STATE     READ WRITE CKSUM
        dumpster    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sde     ONLINE       0     0     0
            sdf     ONLINE       0     0     0
            sdg     ONLINE       0     0     0

errors: No known data errors

 

Where should I go to get more information on how to resize this pool?

Link to comment
4 minutes ago, Shiddy said:

I attempted to remove a device from the pool, but was met with the 'operation not supported on this type of pool' message.

You can't remove devices from ZFS raidz pools; you'd need to back up and re-create the pool.
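One way to do that backup-and-recreate with ZFS's own tools, assuming a second pool with enough space; the backup pool name and device paths are placeholders:

```shell
# Recursive snapshot, replicate to a backup pool, rebuild, restore.
zfs snapshot -r dumpster@migrate
zfs send -R dumpster@migrate | zfs recv -F backup/dumpster

# After verifying the copy on "backup":
zpool destroy dumpster
zpool create dumpster raidz1 /dev/sdb /dev/sdd /dev/sde /dev/sdf
zfs send -R backup/dumpster@migrate | zfs recv -F dumpster
```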

Link to comment

Where do I find info about which versions this plugin supports? I've looked through this thread a bit but found no real answers. Is it safe to upgrade to 6.10? Apparently I'm having fatal crashes that might be fixed by the upgrade.

Link to comment
14 hours ago, myths said:

Where do I find info about which versions this plugin supports? I've looked through this thread a bit but found no real answers. Is it safe to upgrade to 6.10? Apparently I'm having fatal crashes that might be fixed by the upgrade.

 

Can confirm the current "next" build of 6.10.2-rc3 is also working if you wanted to try that.

 

VVV oops, you're right - thanks

Edited by jortan
Link to comment
2 minutes ago, jortan said:

Can confirm the current "next" build of 6.10.2-rc3 is also working if you wanted to try that.

That's an old build, btw...

Keep in mind that 6.10.2 is newer than 6.10.2-rc3.

Link to comment
  • 2 weeks later...

I have ZFS working on unraid 6.10.3 with the zfs plugins.
My pool is /zfs-sas
The dataset is /zfs-sas/isos
I make a symlink from /zfs-sas/isos to /mnt/user/isos.
It works, but the log starts filling up with:

Jun 19 01:30:19 unraid1 emhttpd: error: share_luks_status, 6151: Operation not supported (95): getxattr: /mnt/user/isos

I want to use zfs-sas/isos as the isos share.
Eventually I want to use other datasets like this: vms, etc.
These instructions were from SpaceInvader One's video.

What is the correct way to do this?

Link to comment
16 minutes ago, BrandonPollack said:

After upgrading, I guess I didn't wait long enough for the rebuild or something, and now no ZFS commands (zpool or zfs) are working (not in /usr/sbin).

 

Tried uninstalling and reinstalling the plugin, etc., but to no avail... any tips?

 

Scratch that. Uninstalling completely, rebooting, and then reinstalling completely solved it.

Link to comment
  • 2 weeks later...
On 6/19/2022 at 4:35 AM, sbrewer said:

I have ZFS working on unraid 6.10.3 with the zfs plugins.
My pool is /zfs-sas
The dataset is /zfs-sas/isos
I make a symlink from /zfs-sas/isos to /mnt/user/isos.
It works, but the log starts filling up with:

Jun 19 01:30:19 unraid1 emhttpd: error: share_luks_status, 6151: Operation not supported (95): getxattr: /mnt/user/isos

I want to use zfs-sas/isos as the isos share.
Eventually I want to use other datasets like this: vms, etc.
These instructions were from SpaceInvader One's video.

What is the correct way to do this?

 

 

Did you find anyway to get rid of those log entries?

 

I have had this same issue for the past year or so. I can symlink my datasets into a regular folder that already exists as a share and be fine, but then all datasets end up grouped in one folder. I map different datasets to different sharepoints on different machines with different permissions. I would like to go back to being able to use individual folders for shares, though.

Link to comment
Posted (edited)
On 1/25/2022 at 3:53 AM, subivoodoo said:

 

The rest can be configured within the iSCSI plugin; you can just pick this manually created backstore there:

 

[screenshot: iSCSI plugin backstore selection]

 

If you no longer need it, remove the initiator mapping and delete the backstore entry (note the zvol still exists):

 

And last but not least how to clone an existing zvol and/or delete it:

 

zfs snapshot YOURPOOLNAME/testzvol@yoursnapshotname
zfs clone -p YOURPOOLNAME/testzvol@yoursnapshotname YOURPOOLNAME/testzvol.myclone

zfs destroy YOURPOOLNAME/testzvol.myclone
zfs destroy YOURPOOLNAME/testzvol@yoursnapshotname
zfs destroy YOURPOOLNAME/testzvol

 

 

Hello, sorry to bother you, but this is exactly what I want to do :D

 

Could you please give more details on how to set it up?

 

I did everything you wrote.

 

I have a dataset, I have a volume, I can make snapshots, and I think I made a clone :D

 

The main volume I mapped via iSCSI works fine.

 

Buuuuut how do I mount that clone to my VM?

 

I had it working, but now I'm getting blown away by the amount of snapshots...

and now the clone is write-protected xD

Could you please make a bit more of a step-by-step manual for this kind of setup?

 

And one thing I would like to ask: would you share the user script you use to semi-automate the update process?

 

Regards, Dom

 

Edited by domrockt
Link to comment
Posted (edited)
13 hours ago, Defq0n said:

 

 

Did you find anyway to get rid of those log entries?

 

I have had this same issue for the past year or so. I can symlink my datasets into a regular folder that already exists as a share and be fine, but then all datasets end up grouped in one folder. I map different datasets to different sharepoints on different machines with different permissions. I would like to go back to being able to use individual folders for shares, though.

 

/mnt/user is where Unraid does its magic for shares backed by Unraid arrays / cache devices. My advice would be to leave /mnt/user alone and not attempt to manually mount (or link) filesystems there.

 

Maybe I'm missing something but can you not go to:

 

- Settings

- VM Manager

 

And set "Default ISO storage path" to /zfs-sas/isos?

 

Fairly sure this is what most Unraid/ZFS users are doing.

 

 

Edited by jortan
Link to comment
On 7/5/2022 at 11:52 AM, domrockt said:

 

Hello, sorry to bother you, but this is exactly what I want to do :D

 

Could you please give more details on how to set it up?

 

I did everything you wrote.

 

I have a dataset, I have a volume, I can make snapshots, and I think I made a clone :D

 

The main volume I mapped via iSCSI works fine.

 

Buuuuut how do I mount that clone to my VM?

 

I had it working, but now I'm getting blown away by the amount of snapshots...

and now the clone is write-protected xD

Could you please make a bit more of a step-by-step manual for this kind of setup?

 

And one thing I would like to ask: would you share the user script you use to semi-automate the update process?

 

Regards, Dom

 

 

I don't know how to give more step-by-step info than in my original post... background knowledge of ZFS and iSCSI is needed for this kind of officially unsupported solution.

 

Attach it to a VM?

The iSCSI solution has nothing to do with VMs. iSCSI is used to add remote disks to any client system on a network: a laptop, or whatever supports iSCSI as a client. Using a zvol as a disk in a VM on Unraid doesn't need iSCSI.
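For the local-VM case, a zvol is simply exposed as a block device under /dev/zvol and can be handed to the VM directly; the pool and zvol names are placeholders:

```shell
# Create a 50G zvol; it shows up as a block device (placeholder names):
zfs create -V 50G tank/vmdisk
ls -l /dev/zvol/tank/vmdisk

# Point the VM's disk at /dev/zvol/tank/vmdisk in the Unraid VM
# settings (or libvirt XML) - no iSCSI involved.
```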

 

But we can probably figure out how to fix your main issue (too many snapshots)?

 

User script: I've attached my personal user script for resetting the "game disks" used by 3 computers on my network (it deletes the iSCSI targets, deletes all the clones, makes new clones, and recreates the iSCSI targets).
=> but note that this script is 100% specific to my personal setup.

 

Regards

iSCSI-RenewAllKidsGames .sh

Link to comment

Hey all,

 

I tried @gyto6's script for putting the Docker containers on a zvol, this one:

 

On 3/24/2022 at 6:39 PM, gyto6 said:
zfs create -V 20G pool/docker                  # -V creates a ZVOL
cfdisk /dev/pool/docker                        # Create a partition interactively
mkfs.btrfs -q /dev/pool/docker-part1           # Format the partition with btrfs
mount /dev/pool/docker-part1 /mnt/pool/docker  # Mount at the expected mount point

 

The problem I ran into was that after creating the partition, the next line of the script failed:

mkfs.btrfs -q /dev/tank/docker-part1
probe of /dev/tank/docker-part1 failed, cannot detect existing filesystem.
WARNING: cannot read superblock on /dev/tank/docker-part1, please check manually

ERROR: use the -f option to force overwrite of /dev/tank/docker-part1

 

I'm not sure what to do at this point.  Anybody able to help?
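A possible way forward, assuming the partition table really was written and the zvol is freshly created and empty (the device path is the one from your post):

```shell
# Make sure the kernel sees the new partition, clear any stale
# signatures, then force the format as the error message suggests:
partprobe /dev/tank/docker
wipefs -a /dev/tank/docker-part1
mkfs.btrfs -f /dev/tank/docker-part1
```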

Edited by asopala
Link to comment
8 hours ago, asopala said:

Hey all,

 

I tried @gyto6's scripts for mounting the docker containers into a zvol, this one.

 

 

The problem I ran into was that after creating the partition, the next line of the script failed:

mkfs.btrfs -q /dev/tank/docker-part1
probe of /dev/tank/docker-part1 failed, cannot detect existing filesystem.
WARNING: cannot read superblock on /dev/tank/docker-part1, please check manually

ERROR: use the -f option to force overwrite of /dev/tank/docker-part1

 

I'm not sure what to do at this point.  Anybody able to help?

Are you sure you "wrote" the changes when you created the partition?

Writing is the final step of creating the partition.

Edited by gyto6
Link to comment
8 minutes ago, gyto6 said:

Are you sure you "wrote" the changes when you created the partition?

Writing is the final step of creating the partition.

 

I redid everything and hit Write on this. Does hitting Quit discard the write? It doesn't seem like it. Does this look right?

 

           Disk: /dev/tank/docker
                        Size: 20 GiB, 21474836480 bytes, 41943040 sectors
                  Label: gpt, identifier: B6A566AE-0026-D647-9992-460B2508AA95

    Device                      Start           End       Sectors      Size Type
>>  /dev/tank/docker1            2048      41943006      41940959       20G Linux filesystem

 ┌─────────────────────────────────────────────────────────────────────────────────────────────┐
 │Partition UUID: 2B9DD3E0-DE51-4241-A035-ED4142AE7E0D                                         │
 │Partition type: Linux filesystem (0FC63DAF-8483-4772-8E79-3D69D8477DE4)                      │
 └─────────────────────────────────────────────────────────────────────────────────────────────┘
       [ Delete ]  [ Resize ]  [  Quit  ]  [  Type  ]  [  Help  ]  [  Write ]  [  Dump  ]


                     Write partition table to disk (this might destroy data)

 

Link to comment
44 minutes ago, BVD said:

Also, why would we want btrfs on top of zfs?

At the time, my docker.img wouldn't start if the file was on a ZFS partition.

 

I've since switched to an XFS zvol. I'll try running it on a ZFS partition again.

Edited by gyto6
Link to comment
