ZFS plugin for unRAID


steini84

Recommended Posts

Hi Everyone,

 

I just wanted to share a problem I ran into.

 

For some unknown reason, I could no longer load parts of the Unraid GUI and some of my Docker containers had stopped. I tried to generate a diagnostics file from the settings, and it displayed a window showing a write to one of my NVMe drives, as if it were executing an operation on it.

 

The problem is that the system was no longer responsive, and after an hour I decided to reset it. After that, I lost my zpools because the NVMe drives were no longer recognized as ZFS disks.

 

Here is the configuration of the ZFS pool before the NVMe drives' partitions were lost.

 

Quote

zpool status fastraid
  pool: fastraid
 state: ONLINE
config:

        NAME                                      STATE     READ WRITE CKSUM
        fastraid                                  ONLINE       0     0     0
          raidz1-0                                ONLINE       0     0     0
            wwn-SSD1                ONLINE       0     0     0
            wwn-SSD2                ONLINE       0     0     0
            wwn-SSD3                ONLINE       0     0     0
            wwn-SSD4                ONLINE       0     0     0
        special
          mirror-2                                ONLINE       0     0     0
            nvme-CT2000P5SSD8_XX-part1  ONLINE       0     0     0
            nvme-CT2000P5SSD8_YY-part1  ONLINE       0     0     0
        logs
          mirror-3                                ONLINE       0     0     0
            nvme-CT2000P5SSD8_XX-part2  ONLINE       0     0     0
            nvme-CT2000P5SSD8_YY-part2  ONLINE       0     0     0

 

This is not the first time I've lost control of my unRAID server, but until now I hadn't lost anything. All I can say is that it started with version 2.0 of the plugin, but maybe it's linked to something else?
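
(For anyone hitting something similar: before assuming the pool itself is gone, it's worth checking whether the kernel still exposes the NVMe devices and their partitions at all. A rough sketch with standard commands; the actual device names will differ.)

lsblk -o NAME,SIZE,TYPE,FSTYPE           # do the nvme devices and their -part1/-part2 partitions still show up?
ls -l /dev/disk/by-id/ | grep -i nvme    # the by-id names used in the pool layout above
zpool import                             # with no arguments, lists any pools ZFS can still find on attached devices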

 

rohrer-enard.fr-diagnostics-20220220-1151.zip

Edited by gyto6
Link to comment

All I can see in the logs is the following; it looks like issues with the NVMe drives.

 

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme1: Device not ready; aborting reset, CSTS=0x1

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme1: Removing after probe failure status: -19

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme0: Removing after probe failure status: -19
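
(If you want to pull these lines yourself on a running server, something like the following works; /var/log/syslog is where unRAID keeps the system log.)

grep -i nvme /var/log/syslog | tail -n 20    # recent kernel messages mentioning the nvme devices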

Link to comment
3 hours ago, SimonF said:

All I can see in the logs is the following; it looks like issues with the NVMe drives.

 

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme1: Device not ready; aborting reset, CSTS=0x1

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme1: Removing after probe failure status: -19

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme0: Device not ready; aborting reset, CSTS=0x1

Feb 20 11:28:25 rohrer-enard kernel: nvme nvme0: Removing after probe failure status: -19

That's what I've seen too.

 

I disconnected the server from power, plugged it back in, and...

 

[screenshot]

 

I'll check if I can update my HBA card's firmware.

 

Sorry, everything is fine. And thanks for your work on the ZFS plugin. ☺️

Link to comment
Quote

Hi Everyone,

I just wanted to share a problem I ran into. [...]

rohrer-enard.fr-diagnostics-20220220-1151.zip

Are you keeping your docker.img on the zfs array? It can cause the lockup you are describing.


Sent from my iPhone using Tapatalk
Link to comment

I think I've found a workaround.

 

After a lot of research, it turns out I'm not the first to have trouble with the Crucial P5. In my case, the problem is that it tends to disappear for no reason. I found the same issue reported on this forum and tested @JorgeB's solution. Link

 

They suggested disabling IOMMU (AMD) or Intel VT-d (Intel) on my old motherboard.

 

As I'm not doing virtualization for now, simply disabling IOMMU works for me. My Crucial drives are back.
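
(For reference, a quick way to confirm from the unRAID terminal that IOMMU/VT-d is really off after changing the BIOS setting; both are standard Linux commands, nothing plugin-specific.)

dmesg | grep -i -e DMAR -e IOMMU    # no "IOMMU enabled"/DMAR lines once it is disabled
ls /sys/kernel/iommu_groups/        # empty when no IOMMU is active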

 

I'll be back if I run into the same trouble again.

 

Thanks a lot for your help.

Edited by gyto6
Link to comment

I have 4 x 4 TB IronWolf drives in raidz0. My upload speed to the share is 25 MB/s max and download is 130 MB/s max over a 10 Gb link. I have checked with iperf that I can saturate the link both to and from unRAID. Can someone please suggest what I can do to isolate the issue? Thanks

Link to comment

What kind of controllers? I assume encryption is off and your CPU is reasonable? If that is your typical target file size, have you tried setting recordsize=1M on the dataset? There are quite a few optimisations you can make, but that's probably the most obvious one for large files.

You can also try to ensure your ashift is set correctly; perhaps it's not?

Kinda guessing so far, to be honest.

If it's still the same, you could post your zpool get all and zfs get all; they might show something else.
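
(For reference, a rough sketch of checking and changing these; "pool/dataset" is just a placeholder, and recordsize only affects data written after the change.)

zfs get recordsize,compression,encryption pool/dataset   # current values for the dataset in question
zfs set recordsize=1M pool/dataset                        # larger records for big sequential files
zpool get ashift pool                                     # 0 means it was auto-detected at vdev creation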

Link to comment
16 hours ago, Marshalleq said:

What kind of controllers? I assume encryption is off and your CPU is reasonable? If that is your typical target file size, have you tried setting recordsize=1M on the dataset? There are quite a few optimisations you can make, but that's probably the most obvious one for large files.

You can also try to ensure your ashift is set correctly; perhaps it's not?

Kinda guessing so far, to be honest.

If it's still the same, you could post your zpool get all and zfs get all; they might show something else.

I am running on a PowerEdge T430 with 2 x E5-2630 v3 and 80 GB RAM. Encryption is enabled. The real use case for this would be to run a vSphere datastore, so it's not massive files being written all the time. The controller is a PERC H330. This is the output of the command:

 

root@UnRAID:~# zpool get all
NAME     PROPERTY                       VALUE                          SOURCE
citadel  size                           14.5T                          -
citadel  capacity                       40%                            -
citadel  altroot                        -                              default
citadel  health                         ONLINE                         -
citadel  guid                           14517773413838527564           -
citadel  version                        -                              default
citadel  bootfs                         -                              default
citadel  delegation                     on                             default
citadel  autoreplace                    off                            default
citadel  cachefile                      -                              default
citadel  failmode                       wait                           default
citadel  listsnapshots                  off                            default
citadel  autoexpand                     off                            default
citadel  dedupratio                     1.00x                          -
citadel  free                           8.64T                          -
citadel  allocated                      5.91T                          -
citadel  readonly                       off                            -
citadel  ashift                         0                              default
citadel  comment                        -                              default
citadel  expandsize                     -                              -
citadel  freeing                        0                              -
citadel  fragmentation                  2%                             -
citadel  leaked                         0                              -
citadel  multihost                      off                            default
citadel  checkpoint                     -                              -
citadel  load_guid                      9965434910978594573            -
citadel  autotrim                       off                            default
citadel  compatibility                  off                            default
citadel  feature@async_destroy          enabled                        local
citadel  feature@empty_bpobj            active                         local
citadel  feature@lz4_compress           active                         local
citadel  feature@multi_vdev_crash_dump  enabled                        local
citadel  feature@spacemap_histogram     active                         local
citadel  feature@enabled_txg            active                         local
citadel  feature@hole_birth             active                         local
citadel  feature@extensible_dataset     active                         local
citadel  feature@embedded_data          active                         local
citadel  feature@bookmarks              enabled                        local
citadel  feature@filesystem_limits      enabled                        local
citadel  feature@large_blocks           enabled                        local
citadel  feature@large_dnode            enabled                        local
citadel  feature@sha512                 enabled                        local
citadel  feature@skein                  enabled                        local
citadel  feature@edonr                  enabled                        local
citadel  feature@userobj_accounting     active                         local
citadel  feature@encryption             enabled                        local
citadel  feature@project_quota          active                         local
citadel  feature@device_removal         enabled                        local
citadel  feature@obsolete_counts        enabled                        local
citadel  feature@zpool_checkpoint       enabled                        local
citadel  feature@spacemap_v2            active                         local
citadel  feature@allocation_classes     enabled                        local
citadel  feature@resilver_defer         enabled                        local
citadel  feature@bookmark_v2            enabled                        local
citadel  feature@redaction_bookmarks    enabled                        local
citadel  feature@redacted_datasets      enabled                        local
citadel  feature@bookmark_written       enabled                        local
citadel  feature@log_spacemap           active                         local
citadel  feature@livelist               enabled                        local
citadel  feature@device_rebuild         enabled                        local
citadel  feature@zstd_compress          enabled                        local
citadel  feature@draid                  disabled                       local
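
(Side note on the ashift line above: the pool property reports 0 whenever auto-detection was used at creation, so it doesn't by itself mean a wrong sector size. A sketch for reading the ashift actually burned into the vdevs, using the pool name from the output:)

zdb -C citadel | grep ashift    # per-vdev ashift chosen at vdev creation
# if zdb can't find the pool config, point it at the cache file: zdb -U /etc/zfs/zpool.cache -C citadel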

Link to comment
On 2/4/2022 at 7:40 PM, Iker said:

@JJJonesJr33 @diannao The fixed version is already live. As I'm planning a new version with additional features, the fix was applied to the current version; what that means is that you have to uninstall and reinstall the plugin in order to get the version with the fix.

 

Thanks for your help!

 

I am up and running with the new update! Thank you! I just wish that the unRAID forums would have sent me an email when you replied :D
[screenshot]

  • Like 1
Link to comment

Hi

Did you guys make a new update?
Because last time I moved my Docker setup from the btrfs image to a ZFS directory, and now out of the blue I can't update, remove, or start Docker containers anymore?!

I even deleted the directory (zfs destroy -r) and reinstalled all the Docker containers. After 1 day I had the exact same issue again.

 

Quote

Execution error

Image can not be deleted, in use by other container(s)

 

Does anyone have a solution?

Link to comment
25 minutes ago, PyCoder said:

Did you guys make a new update?

No.

 

25 minutes ago, PyCoder said:

Does anyone have a solution?

Please see the first post under "unRAID settings".

I think this is one of the known issues when the Docker image or path is on a ZFS filesystem.

 

Maybe @steini84 has some more information about this.

Link to comment
1 hour ago, ich777 said:

No.

 

Please see the first post under "unRAID settings".

I think this is one of the known issues when the Docker image or path is on a ZFS filesystem.

 

Maybe @steini84 has some more information about this.

 

Yeah, but there was also an issue with docker.img on ZFS with update 2.0 or 2.1; that's why I changed it to a directory, which worked for weeks until 2 days ago.

Hmmm, I'll switch back to docker.img; if that doesn't work, I'll try a zvol with ext4.

 

Let's try :)

 

Edit: docker.img on ZFS blocked /dev/loop, and Docker in a ZFS directory messes up containers.

 

My solution with only ZFS:
zfs create -V 30gb pool/docker                                      # create a 30 GB zvol
mkfs.ext4 /dev/zvol/pool/docker                                     # format the zvol with ext4
echo "mount /dev/zvol/pool/docker /mnt/foobar" >> /boot/config/go   # mount it at boot via the go file

 

Working like a charm :)
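
(One hedged note on the go-file line above: mount needs the target directory to exist at boot, so depending on the setup a mkdir may be needed first, e.g.:)

mkdir -p /mnt/foobar                        # create the mount point used in the example above
mount /dev/zvol/pool/docker /mnt/foobar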


 

Edited by PyCoder
solution
Link to comment
On 2/18/2022 at 6:53 PM, Stokkes said:

 

Everything is pretty much already there; it seems like it would make the most sense to hook into ZED. The plugin includes the notification for scrubs; I'm just suggesting we add notifications for state changes as well and make them part of the package.

 

On 2/18/2022 at 7:56 PM, ich777 said:

Sure thing, but this needs to be tested and a custom script needs to be made.

The last thing I want is for you to get two or more messages when one operation finishes, for example...

 

I only built the scrub-finished notification through ZED into the plugin because a scrub is run after an unclean shutdown anyway, and it is much nicer to get a notification when it's finished.

 

If users now want more notifications/features, maybe a WebUI for the ZFS plugin is needed to turn notifications/features on and off, but that means serious work, and I'm also not too sure whether it will actually make sense, because ZFS will be built into unRAID itself sometime in the near future.

 

Also keep in mind that you can add your own script(s) to ZED for the notifications you want. I also don't know what makes more sense; what's your opinion about that, @steini84?

 

On 2/18/2022 at 8:57 PM, steini84 said:


What I think would make the most sense is to keep the ZFS plugin as vanilla as possible and move extra functionality to a companion plugin. Then the companion plugin would continue to work when ZFS becomes native.


Sent from my iPhone using Tapatalk

 

So I have been looking into this today after not receiving notifications through ZED since the latest ZFS for unRAID update.

The functionality is already there; the new 2.1 update just disabled it all...

The plugin runs zed as "zed -d /etc/zfs/zed.d", where -d specifies the directory that contains the enabled ZEDLETs (actions that get triggered when an event happens). The problem is that this directory now only contains the scrub_finish-notify.sh deployed by this plugin.

This means all other event handlers are disabled! The default ZEDLETs and rc file are located in /usr/etc/zfs/zed.d.

If you don't run zed with the -d parameter, it will use these, and everything will just work if you adjust your zed.rc file, for example to send notifications through the dynamix notify script.

I think this change in the latest version is not a good solution, since it basically crippled the default setup, which was working perfectly before.

 

This is my zed.rc file

ZED_EMAIL_ADDR="root@unRAID"
ZED_EMAIL_PROG="/usr/local/emhttp/plugins/dynamix/scripts/notify"
ZED_EMAIL_OPTS="-i warning -e 'ZFS event daemon' -s '@SUBJECT@' -d \"\`cat $pathname\`\""
ZED_NOTIFY_INTERVAL_SECS=60

ZED_NOTIFY_VERBOSE=1

# These are kept at their default value
ZED_USE_ENCLOSURE_LEDS=0
ZED_SYSLOG_TAG="zed"
ZED_SYSLOG_SUBCLASS_EXCLUDE="history_event"

 

So, just by starting zed without any additional parameters and using the zed.rc file above, you will receive notifications about everything going on.
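
(A rough sketch of what that looks like in practice, assuming the plugin started zed with -d as described above; adjust paths to your build.)

pkill zed                          # stop the instance started with -d /etc/zfs/zed.d
vi /usr/etc/zfs/zed.d/zed.rc       # apply the zed.rc settings shown above
zed                                # restart zed; without -d it uses the default ZEDLET directory
# optional: confirm the dynamix notify path works on its own
/usr/local/emhttp/plugins/dynamix/scripts/notify -i warning -e "ZFS event daemon" -s "zed test" -d "test message"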

Edited by 0xcd0e
Added zed.rc
Link to comment
5 minutes ago, 0xcd0e said:

scrub_finish-notify.sh

Exactly.

 

5 minutes ago, 0xcd0e said:

If you don't run zed with the -d parameter, it will use these, and everything will just work if you adjust your zed.rc file, for example to send notifications through the dynamix notify script.

I think this change in the latest version is not a good solution, since it basically crippled the default setup, which was working perfectly before.

This was done intentionally.

You can always disable the scrub on an unclean shutdown, which will remove everything related to the ZED notifications, by issuing this command from an Unraid terminal:

sed -i '/unclean_shutdown_scrub=/c\unclean_shutdown_scrub=false' "/boot/config/plugins/unRAID6-ZFS/settings.cfg"

and reboot afterwards.

 

If you do this, it is all reverted back to default and you can set up notifications as you like via ZED.

  • Thanks 1
Link to comment
19 minutes ago, 0xcd0e said:

So, just by starting zed without any additional parameters and using the zed.rc file above, you will receive notifications about everything going on.

I know; I crawled through the entire documentation, but that was not the goal of this modification. Not everyone needs every notification, because these can get a bit annoying if you leave everything at the default, and I really don't like how they look, or at least how they are formatted.

 

That's why I built the function to disable the scrub and the functionality entirely.

 

EDIT: Also keep in mind that somewhere in the near future ZFS will be integrated into Unraid natively, and I really don't know how this will be set up.

Link to comment
29 minutes ago, ich777 said:

I know; I crawled through the entire documentation, but that was not the goal of this modification. Not everyone needs every notification, because these can get a bit annoying if you leave everything at the default, and I really don't like how they look, or at least how they are formatted.

 

That's why I built the function to disable the scrub and the functionality entirely.

 

EDIT: Also keep in mind that somewhere in the near future ZFS will be integrated into Unraid natively, and I really don't know how this will be set up.

 

Yeah, I understand your intention and what this feature is meant to do; I was just a bit concerned because this new default disabled all of my ZEDLETs ^^

Thank you for continuing to build on this; I'm very excited for everything that is to come on unRAID.

  • Like 1
Link to comment
