ZFS Array transfers very slow


Solved by 905jay

Hi Everyone, 

Was wondering if someone could help point me in the right direction. 

My array disks were all XFS by default, and using unBALANCE to move data between the disks was only getting about 30 MB/s.

I was doing this with the intention of converting the array disks to ZFS on Unraid 6.12.3.

 

Enabling turbo write didn't seem to solve the problem.

 

Now I am trying to redistribute the data a little more evenly across the array disks, all of which are now ZFS.

 

Moving about 700 GB of data has so far taken about 9.5-10 hours.
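That works out to about 20 MB/s; a quick sanity check (assuming decimal GB and the midpoint of the time range):

```python
# Rough throughput implied by the numbers above
moved_gb = 700
hours = 9.75  # midpoint of the 9.5-10 hr range
mb_per_s = moved_gb * 1000 / (hours * 3600)
print(f"{mb_per_s:.1f} MB/s")  # prints 19.9 MB/s
```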

 

I have 2 Parity disks - Seagate IronWolf 8TB 

For the array I have 4 disks - Seagate IronWolf 8TB

 

They are connected to a dual-port LSI SAS 9211-8i HBA flashed to IT mode.

 

Diagnostics attached. I'd be happy to elaborate more but I didn't want to turn this into a novel :) 

 

EDIT: added the HBA model ( LSI SAS-9211-8i)

 

Screenshot from 2023-08-28 23-02-49.png

unraid-diagnostics-20230828-2303.zip

Edited by 905jay
Link to comment

Be aware that "zfs in the array" is currently a real killer for Unraid. There is a bug that slows everything down to snail speed. It does not hit everyone, but if you have it, you're on your own.

The only thing that "worked" so far (for me and others) is to revert to XFS and finally run "modprobe -r zfs" to get rid of this evildoer completely...

 

BTW: TWO parity disks for just 4 data drives??? Sounds a bit like overkill to me...

Edited by MAM59
Link to comment
9 hours ago, MAM59 said:

Be aware that "zfs in the array" is currently a real killer for Unraid. There is a bug that slows everything down to snail speed. It does not hit everyone, but if you have it, you're on your own.

The only thing that "worked" so far (for me and others) is to revert to XFS and finally run "modprobe -r zfs" to get rid of this evildoer completely...

 

BTW: TWO parity disks for just 4 data drives??? Sounds a bit like overkill to me...

 

2 parity drives isn't overkill. My use case for unRAID is much more than downloaded Linux torrents :)

I have moved off of OneDrive /Google Drive /Google Photos. This server has my life on it and irreplaceable memories and documents.  

Link to comment
1 minute ago, 905jay said:

Reverting back to XFS isn't a viable option considering how long these transfers are taking. But these transfer speeds are nowhere near acceptable.

Have I misconfigured something that would be causing these speeds?

You have done nothing wrong. And these crawl speeds will continue even after the transfer is finished someday.

"Isn't a viable option" isn't a viable option 🙂 Only the way back guarantees you normal speeds again someday (it took me 2 weeks to clean up that ZFS mess again).

 

Link to comment
Just now, 905jay said:

 

2 parity drives isn't overkill. My use case for unRAID is much more than downloaded Linux torrents :)

I have moved off of OneDrive /Google Drive /Google Photos. This server has my life on it and irreplaceable memories and documents.  

Note that having 2 parity drives does not protect you from data loss - it just provides protection against drives failing. There are lots of other ways to lose data. You still need a robust backup strategy for anything important. It may be that the second parity drive is better used as an off-array backup.

Link to comment
1 minute ago, MAM59 said:

You have done nothing wrong. And these crawl speeds will continue even after the transfer is finished someday.

"Isn't a viable option" isn't a viable option 🙂 Only the way back guarantees you normal speeds again someday (it took me 2 weeks to clean up that ZFS mess again).

 

 

Even with XFS my speeds were only around 30 MB/s.

This was the reason I moved to ZFS to begin with: the supposed speed gains.

 

So if XFS was giving me 30 MB/s
and ZFS is giving me under 20 MB/s

 

What is the actual solution here?

 

Not saying you're wrong, but I don't feel the advice is entirely accurate.

Link to comment
6 minutes ago, itimpi said:

Note that having 2 parity drives does not protect you from data loss - it just provides protection against drives failing. There are lots of other ways to lose data. You still need a robust backup strategy for anything important. It may be that the second parity drive is better used as an off-array backup.

 

I completely understand that and I do have a backup strategy in place. 

But unless there is a compelling reason to remove one, I see no reason to do so.

I don't want to mess with the parity or the array in that regard. 
I was previously using 2 WD Red 6TB parity disks with 2 Red 6TB, 2 Red 4TB, and 2 Red 3TB data disks, and 2 of them failed at the exact same time. WD took close to a year to send me the warranty replacements, and it caused me HUGE headaches.

 

Long winding story, but I have experience losing 2 disks at the same time, and it was a nightmare for me. 

And knowing that it's happened once, I know it can happen again. 

I edited the note above because, after rereading it, I realized it sounded snippy on my part without the additional context (wasn't trying to be a dick) :)

Edited by 905jay
added context to the initial message
Link to comment
1 minute ago, 905jay said:

But unless there is a compelling reason to remove one, I see no reason to do so.

One reason is speed: write speed is cut down by ~50% per parity disk, so you end up with around 25% of the maximum with 2 parity disks.
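Taking that ~50%-per-parity rule of thumb at face value (a simplification - the real cost depends on whether Unraid is doing read-modify-write or reconstruct write), the ceiling works out roughly as:

```python
def effective_write_speed(raw_mb_s: float, parity_disks: int) -> float:
    """Rule of thumb from this thread: each parity disk roughly
    halves sustained array write speed (a simplified model)."""
    return raw_mb_s / (2 ** parity_disks)

print(effective_write_speed(280, 1))  # 140.0
print(effective_write_speed(280, 2))  # 70.0
```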

 

The next reason is that parity and ZFS don't go well together. ZFS does not know anything about the parity disks. The worst case is many data drives each with its own single-disk ZFS file system (not a ZFS array, but every disk with its own ZFS file system). These ZFS drives have no clue about each other (and neither do the parities, of course). ZFS allocates new blocks quite randomly (not sequentially), so it happens that D1 steps to sector 3444, D2 to sector 9999, D3 to 123123, and so on. The poor parities have to follow these steps to keep up with the data. This results in an enormous amount of head movement (watch the temperature of your parity drives; you will see a difference from the data drives). Of course this slows down the data writes greatly too, even down to zero if the I/O queue is overrun.

The cleverness of ZFS and the parity of Unraid make a really bad couple.
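The stepping effect can be illustrated with a toy model (made-up numbers, not the actual ZFS allocator): compare the total head travel when writes land on consecutive sectors versus scattered ones.

```python
import random

def total_seek(sectors):
    """Total head travel for a sequence of sector writes."""
    return sum(abs(b - a) for a, b in zip(sectors, sectors[1:]))

random.seed(0)
n, disk_sectors = 1000, 8_000_000
sequential = list(range(n))                                     # appending writes
scattered = [random.randrange(disk_sectors) for _ in range(n)]  # toy "random allocator"

print(total_seek(sequential))  # 999 unit steps
print(total_seek(scattered))   # orders of magnitude more travel
```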

 

OK, I still don't know why you were down to 30 MB/s with XFS. Jorge should take a look at your diagnostics, I think; he is the driver/disk/controller specialist here.

 

 

Link to comment
2 minutes ago, MAM59 said:

One reason is speed: write speed is cut down by ~50% per parity disk, so you end up with around 25% of the maximum with 2 parity disks.

 

The next reason is that parity and ZFS don't go well together. ZFS does not know anything about the parity disks. The worst case is many data drives each with its own single-disk ZFS file system (not a ZFS array, but every disk with its own ZFS file system). These ZFS drives have no clue about each other (and neither do the parities, of course). ZFS allocates new blocks quite randomly (not sequentially), so it happens that D1 steps to sector 3444, D2 to sector 9999, D3 to 123123, and so on. The poor parities have to follow these steps to keep up with the data. This results in an enormous amount of head movement (watch the temperature of your parity drives; you will see a difference from the data drives). Of course this slows down the data writes greatly too, even down to zero if the I/O queue is overrun.

The cleverness of ZFS and the parity of Unraid make a really bad couple.

 

OK, I still don't know why you were down to 30 MB/s with XFS. Jorge should take a look at your diagnostics, I think; he is the driver/disk/controller specialist here.

 

 

 

I unfortunately don't have any diagnostics from before I converted all the array disks to ZFS. 

 

So will these quirks be fixed in a later release, in terms of parity and ZFS playing nicely together?

Or is this just the nature of the beast, and I should most certainly move back to XFS?

Would it be better to format the array disks as BTRFS instead of XFS?

 

Link to comment
1 minute ago, 905jay said:

So will these quirks be fixed in a later release, in terms of parity and ZFS playing nicely together?

Or is this just the nature of the beast, and I should most certainly move back to XFS?

Dunno, I have no connection to Limetech. I just fell into the same trap as you did (but my speed was ~80 MB/s, down from the normal 280 MB/s).

 

The stepping problem cannot be fixed unless they find a way to get the free-running ZFS disks synced somehow. But that would mean you have to build a RAIDZ array and can forget about Unraid's parity completely (and lose the freedom to add disks of any size, up to the size of the parity drive). So this would kill their own business.

 

My hope for ZFS was its read cache, but that does not help if everything else slows down to a slimy snail...

 

What works is ZFS in a pool device (but again, there is no real benefit, as pools usually live on SSDs/NVMes that are fast enough without tricks).

 

BTRFS is an option, but some reports here give it a quite bad reputation (data loss). ZFS was introduced to replace BTRFS in Unraid.

 

So XFS is the only real and safe harbour currently (my data is valuable too; I don't take risks with it).

 

Maybe Jorge finds the knot to untie your machine.

 

Link to comment
1 minute ago, MAM59 said:

Dunno, I have no connection to Limetech. I just fell into the same trap as you did (but my speed was ~80 MB/s, down from the normal 280 MB/s).

 

The stepping problem cannot be fixed unless they find a way to get the free-running ZFS disks synced somehow. But that would mean you have to build a RAIDZ array and can forget about Unraid's parity completely (and lose the freedom to add disks of any size, up to the size of the parity drive). So this would kill their own business.

 

My hope for ZFS was its read cache, but that does not help if everything else slows down to a slimy snail...

 

What works is ZFS in a pool device (but again, there is no real benefit, as pools usually live on SSDs/NVMes that are fast enough without tricks).

 

BTRFS is an option, but some reports here give it a quite bad reputation (data loss). ZFS was introduced to replace BTRFS in Unraid.

 

So XFS is the only real and safe harbour currently (my data is valuable too; I don't take risks with it).

 

Maybe Jorge finds the knot to untie your machine.

 

@JorgeB would you be able to provide any input on top of the information provided by @MAM59 ?

If the painful move back to XFS is a necessary evil, I'll start the process right away.

 

Please validate the order of operations for my own sanity, clarity and consistency:

  1. Remove 1 parity disk
  2. Re-run a parity sync (is this step necessary?)
  3. Allocate the removed parity disk to the array as Disk 5 with XFS
  4. Evacuate ZFS Disk1 to XFS Disk5
  5. Format Disk 1 to XFS
  6. Continue the disk evacuation and format process until all array disks are back to XFS

Am I missing anything? 
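For what it's worth, the idea behind steps 4-6 as a toy Python sketch (hypothetical paths; in practice you'd use something like `rsync -avX /mnt/disk1/ /mnt/disk5/` or the unBALANCE plugin, and this sketch only verifies the top level of the tree):

```python
import filecmp
import shutil
from pathlib import Path

def evacuate(src: Path, dst: Path) -> bool:
    """Copy everything from src to dst, then check that nothing is
    missing or differs before the source disk gets reformatted."""
    shutil.copytree(src, dst, dirs_exist_ok=True)  # Python 3.8+
    cmp = filecmp.dircmp(src, dst)
    # a clean evacuation leaves nothing only-in-src and nothing differing
    return not cmp.left_only and not cmp.diff_files

# e.g. evacuate(Path("/mnt/disk1"), Path("/mnt/disk5"))
```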

 

 

 

 

Link to comment
36 minutes ago, 905jay said:

This was the reason I moved to ZFS to begin with: the supposed speed gains.

 

The speed gains come when it is used in a multi-drive pool outside the main array. If you are interested in this, then it is quite likely that you would create a pool of HDDs using ZFS and not bother with the main Unraid array. Although this will perform faster, there are downsides, such as having all drives always spinning and less flexibility for expansion.

 

Have you tried setting turbo write for the array? That would normally get you far better speeds than the 30 MB/s that you quote (albeit at the expense of having all drives spun up).

 

Link to comment
40 minutes ago, 905jay said:

This was the reason I moved to ZFS to begin with: the supposed speed gains.

This is incorrect; ZFS is faster with pools. With the array it should be the same speed as the other filesystems - due to the existing issue it's currently slower, but it will never be faster.

Link to comment
1 minute ago, itimpi said:

 

The speed gains come when it is used in a multi-drive pool outside the main array. If you are interested in this, then it is quite likely that you would create a pool of HDDs using ZFS and not bother with the main Unraid array. Although this will perform faster, there are downsides, such as having all drives always spinning and less flexibility for expansion.

 

Have you tried setting turbo write for the array? That would normally get you far better speeds than the 30 MB/s that you quote (albeit at the expense of having all drives spun up).

 

 

Yes, I have enabled turbo write and set it to reconstruct write.

Link to comment
13 minutes ago, JorgeB said:

This is incorrect; ZFS is faster with pools. With the array it should be the same speed as the other filesystems - due to the existing issue it's currently slower, but it will never be faster.

Thanks @JorgeB and @MAM59

 

 

Initially, with the previous setup (below), my transfer speeds were around 30 MB/s when moving data around/between the disks:

2 Ironwolf 8TB Parity

4 Ironwolf 8TB Array (XFS)

 

This was the reason I moved to ZFS to begin with: those speeds were utterly unacceptable. Now, realizing that this was in fact a mistake, I can begin the long process of reverting all the array disks to XFS.

 

So, moving all the array disks back to XFS and dropping down to 1 parity disk, what else do I need to do, in your collective opinion, to get something close to proper transfer speeds?

 

Does anything here look off to you folks? Is there something I need to tune, or perhaps something I changed at some point in the past that's messing with my speeds?

image.thumb.png.e90520ad6eb2cb645805379997e9a18f.png

Edited by 905jay
added screenshot of disk settings
Link to comment
52 minutes ago, MAM59 said:

one reason is speed. write speed is cut down by ~50% per parity disk. So you end up with around 25% of the max with 2 parity disks

 

The next reason is "parity + zfs dont go well together". Zfs does not know anything about the parity disk, worst case are many data drives with single zfs each (not an zfs array but every disk with its own zfs file system). These ZFS drives have no clue about each other (and the parities too of course). ZFS is allocating new Blocks quite randomly (not sequential), so it happens that D1 steps to sector 3444, D2 to sector 9999, D3 to 123123 and so on. Poor Parities have to follow these steps to keep up with the data. This results in an enourmous amount of stepping (watch the temperature of your parity, you will see a difference to the data drives). Of course this will slow down the data writes greatly too, even down to 0% if io is overrun.

The cleverness of ZFS and the parity of UNRAID give a real bad couple.

 

Ok, still not knowing why you were down to 30/Mb with xfs. Jorge should take a look at your diagnostics I think, he is the driver/disk/controller specialist here

 

 

 

I just wanted to mention that, in terms of temperatures, I don't see much difference, even during the current data move.

image.thumb.png.27a00ec7ac050fe49c47f3dc021591ef.png

 

image.thumb.png.8d3f3f458f0e8cbcc3374706c7ffc550.png

Link to comment

what are those 2 "cache" drives good for?

For "cache" you should use fast (as fast as possible) drives, not ones slower than those in the main array. (Your main disks spin at 7200 RPM, whereas your cache drives are slower at 5640 RPM; that's just pain and no gain.)

 

So it's "slow down" instead of "speed up".

 

Stop the copy, turn off the cache, restart the copy (using the same parameters as before; it will skip already-moved files and just continue from where it stopped) and see if there is a speed improvement.

 

Update: the above may be a bit misleading. The slow cache won't help you with your copy. Unraid uses the cache only for writing NEW files from the network. Once Mover has done its evil ( 🙂 ) job, they reside on the array. The idea is that the cache is fast enough to keep up with LAN speed even for huge files. Your cache is the opposite.

 

Edited by MAM59
Link to comment
6 minutes ago, MAM59 said:

what are those 2 "cache" drives good for?

For "cache" you should use fast (as fast as possible) drives, not ones slower than those in the main array. (Your main disks spin at 7200 RPM, whereas your cache drives are slower at 5640 RPM; that's just pain and no gain.)

 

So it's "slow down" instead of "speed up".

 

Stop the copy, turn off the cache, restart the copy (using the same parameters as before; it will skip already-moved files and just continue from where it stopped) and see if there is a speed improvement.

 

 

The two WD Red 6TB drives are used for my Proxmox backups, which are shipped to Unraid over a 10G DAC.

They aren't used in Unraid in any cache capacity, other than as a place for data to sit until backups kick off at 3am.

 

my appdata and downloads sit on a Samsung 980 Pro (not pictured)  

Link to comment
1 minute ago, 905jay said:

The two WD Red 6TB drives are used for my Proxmox backups, which are shipped to Unraid over a 10G DAC.

They aren't used in Unraid in any cache capacity, other than as a place for data to sit until backups kick off at 3am.

 

my appdata and downloads sit on a Samsung 980 Pro (not pictured)  

Ah, OK, so I was misled by the graphics and the naming scheme.

 

980 Pro is fast enough for 10G, good choice!

Link to comment
