Poor copy disk speed.




The copy speed on my servers is generally quite poor.

 

For Server1 in my signature: copying a single file from 10 x SATA SSD in raidz2 to 2 x NVMe SSD in btrfs RAID 0 should be significantly faster than the below 400 MB/s I am seeing. That's the speed of one single SATA SSD. To my understanding the speed should be: 10 x 550 MB/s x 50% = 2750 MB/s. Right or wrong?
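To isolate whether the read side or the write side is the limit, each pool could be tested on its own. A rough sketch with example paths (not my actual share names):

# Read speed of the raidz2 pool alone (writes go to /dev/null)
dd if=/mnt/kjell_ern/some/large/file of=/dev/null bs=1M status=progress

# Write speed of the NVMe btrfs pool alone (reads come from /dev/zero, so no disk reads)
# Note: if compression is enabled on the destination, writing zeros will overstate the speed.
dd if=/dev/zero of=/mnt/nvme_pool/testfile bs=1M count=20000 oflag=direct status=progress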

 

It would be very nice to get some views on this topic. //

 

(Screenshot 2024-03-05 at 03:01:41 attached.)

 

 

 

kjell-diagnostics-20240305-1820.zip


Can anybody shed some light on this topic, please? If the speed above is what's to be expected, well, then I know that, and there is no need to chase improvements.

 

Cheers,

On 3/5/2024 at 6:06 PM, frodr said:

Copying a single file from 10 x SATA SSD in raidz2 to 2 x NVMe SSD in btrfs RAID 0

Not clear if you are talking locally or over SMB? This looks local but the title mentions SMB.

 

On 3/5/2024 at 6:06 PM, frodr said:

To my understanding the speed should be: 10 x 550 MB/s x 50% = 2750 MB/s. Right or wrong?

It's never a straight calculation like that; it can vary with a lot of things, including the devices used, what you are using to make the copy, and the CPU, which can have a large impact. Single copy operations are single-threaded, so with a recent fast CPU you should see between 1 and 2 GB/s.
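As a quick sanity check, it can help to confirm whether the copy really is pegging a single core. A rough sketch, assuming the copy is done with rsync and that pidstat (from the sysstat package) is available:

# Show per-second CPU usage of the running rsync process.
# If it sits near 100% (one full core), the copy is CPU-bound rather than disk-bound.
pidstat -u -p "$(pgrep -x rsync | head -n1)" 1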

1 hour ago, JorgeB said:

Not clear if you are talking locally or over SMB? This looks local but the title mentions SMB.

 

It's never a straight calculation like that; it can vary with a lot of things, including the devices used, what you are using to make the copy, and the CPU, which can have a large impact. Single copy operations are single-threaded, so with a recent fast CPU you should see between 1 and 2 GB/s.

 

Sorry, it's local speed, not over SMB.

 

The single-threaded performance of the i7-13700 is quite good. Can you see/suggest any reason why this performance is well below 1-2 GB/s?

38 minutes ago, frodr said:

The single-threaded performance of the i7-13700 is quite good.

It is; I would expect 2 GB/s+ without any device bottlenecks.

 

Note that in my experience btrfs raid0 does not scale very well past a certain number of devices. I would try copying from one NVMe device to another, assuming they are good fast devices, and use pv instead; rsync is not built for speed:

 

pv /path/to/large/file > /path/to/destination
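If repeating the test, it may also be worth dropping the Linux page cache first so a cached read does not inflate the numbers (a sketch; note that the ZFS ARC is separate and is not cleared by this):

sync
echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes (run as root)
pv /path/to/large/file > /path/to/destination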

 

 

3 hours ago, JorgeB said:

It is; I would expect 2 GB/s+ without any device bottlenecks.

 

Note that in my experience btrfs raid0 does not scale very well past a certain number of devices. I would try copying from one NVMe device to another, assuming they are good fast devices, and use pv instead; rsync is not built for speed:

 

pv /path/to/large/file > /path/to/destination

 

 

 

Ok, the test shows 420-480 MB/s from 2 x Kingston KC3000 M.2 2280 NVMe SSD 2TB in ZFS mirror to 2 x WD Black SN850P NVMe 1TB in btrfs RAID 0. I tested both directions.

What I forgot to tell you, and to remember myself, is that the NVMe drives sit on an HBA, a HighPoint Rocket 1508. It is a PCIe 4.0 HBA with 8 M.2 ports. Well, now I know the penalty for being able to populate 8 NVMe drives with good cooling on a W680 chipset mobo.

 

Thanks for following along. My use case isn't that dependent on max NVMe speed. (But I would quickly change the mobo if Intel included an iGPU in higher-I/O CPUs such as the Xeon 2400/3400.)

 

LnkSta:    Speed 16GT/s, Width x4

 

The link for all NVMe devices is not reporting as downgraded, but the HBA will have to share the bandwidth of an x16 slot, since 8 NVMe devices at x4 would require 32 lanes. I suggest installing a couple of devices in the board's M.2 slots and retesting, in case the HBA is not working correctly.
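For reference, the negotiated link of each device can be read with lspci; a rough sketch, where the bus address is only an example (the real addresses show up with the first command):

lspci | grep -i 'non-volatile'                  # list NVMe controllers and their bus addresses
lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'  # replace 01:00.0 with an address from above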

1 hour ago, JorgeB said:
LnkSta:    Speed 16GT/s, Width x4

 

The link for all NVMe devices is not reporting as downgraded, but the HBA will have to share the bandwidth of an x16 slot, since 8 NVMe devices at x4 would require 32 lanes. I suggest installing a couple of devices in the board's M.2 slots and retesting, in case the HBA is not working correctly.

 

The HBA should run at x8 (slot 7), not x16, as I have a PCIe card in the connected PCIe slot 4. Removing the PCIe card from the connected slot (slot 4), the HBA in slot 7 should run at x16. But doing so, the mobo startup stops before the BIOS settings with code 94. I will try to solve this issue.

43 minutes ago, frodr said:

The HBA should run at x8 (slot 7), not x16

Either way it's not possible to have 32 lanes; still, the performance should be much better if the x8 link were the only issue. It could still be HBA related, so it would be good to test with a couple of devices using the M.2 slots.

34 minutes ago, JorgeB said:

Either way it's not possible to have 32 lanes; still, the performance should be much better if the x8 link were the only issue. It could still be HBA related, so it would be good to test with a couple of devices using the M.2 slots.

 

I will test moving the M.2s around. Strange thing: the mobo hangs on code 94 if the HBA is in slot 7 and nothing is in the connected slot 4. When adding a NIC in slot 4, the mobo does not hang. I tried resetting the CMOS, no change. Also tried a few BIOS settings without luck. I will have to contact Supermicro Support.

Quote

0f:00.0 Mass storage controller: Broadcom / LSI PEX880xx PCIe Gen 4 Switch (rev b0)

Is this your Rocket 1508? It shows full Gen4 16x connection:

Quote

        LnkCap:    Port #247, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <4us, L1 <32us
        LnkSta:    Speed 16GT/s, Width x16

From what I understand, since a PEX switch (a bifurcation controller) is used, all the drives share x16, so in theory 4 drives will operate at full x4 speed, but more drives operating simultaneously will share the x16 bandwidth and will slow down, since you can't fit x32 worth of bandwidth into an x16-wide bus.
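If it helps, the topology can be confirmed with the tree view of lspci, and the upstream link of the switch can be checked directly (0f:00.0 is the address from the quote above). As a rough rule of thumb, PCIe Gen4 carries about 2 GB/s per lane, so an x16 uplink tops out around 32 GB/s shared by everything behind the switch:

lspci -tv                                        # tree view: the NVMe drives should appear behind the PEX880xx switch
lspci -vv -s 0f:00.0 | grep -E 'LnkCap|LnkSta'   # upstream link of the switch itself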

 

Did you try to boot your machine with a live ubuntu or debian and see if you get the same performance?

Did you try removing all other pcie cards and test with the rocket solo?

How many drives have you populated in the controller? If not 8 drives, did you try to move the drives around to see if you get any better performance? i.e. slot 1 and 2, 2 and 3, etc...

16 hours ago, shpitz461 said:

Is this your Rocket 1508? It shows full Gen4 16x connection:

From what I understand, since a PEX switch (a bifurcation controller) is used, all the drives share x16, so in theory 4 drives will operate at full x4 speed, but more drives operating simultaneously will share the x16 bandwidth and will slow down, since you can't fit x32 worth of bandwidth into an x16-wide bus.

 

Did you try to boot your machine with a live ubuntu or debian and see if you get the same performance?

Did you try removing all other pcie cards and test with the rocket solo?

How many drives have you populated in the controller? If not 8 drives, did you try to move the drives around to see if you get any better performance? i.e. slot 1 and 2, 2 and 3, etc...

 

Yes, Rocket 1508. Great that it's running at x16. I thought that this mobo's slot 7 only runs at x8 when slot 4 is populated.

 

I will try Ubuntu or Debian and move the M.2s around when a ZFS scrub test is done, tomorrow it seems.

I have 6 x M.2 in the HBA, as when tested.

 

Slot 7 is the only slot available for x16. (Dear Intel, why can't we have a proper HEDT motherboard with iGPU CPUs?)


Ok, I know a little bit more.

 

The HBA only works with a PCIe card in the slot that is connected to the x16 slot. The x16 slot then only runs at x8:

LnkSta:    Speed 16GT/s, Width x8 (downgraded)

 

Removing the PCIe card from the switched slot, the mobo hangs at code 94 pre-BIOS. I have talked to Supermicro Support (which, by the way, responds very quickly these days); HighPoint products are not validated, but they are consulting a BIOS engineer.

 

This means that the HBA is running 6 x M.2s at x8, and there is some switching between the slots as well. The copy test was done between 4 of the M.2s; the last 2 are only sitting as unassigned devices.

 

  • Solution

I removed the HBA card and mounted the drives in the three M.2 slots on the mobo to get a baseline.

 

cache:        2 x mirror ZFS nvme ssd

tempdrive:  1 x btrfs nvme ssd

kjell_ern:     10 x sata ssd in raidz2

 

(Screenshot 2024-03-12 at 21:28:00 attached, showing the baseline test results.)

 

The NVMe drives seem to be ok.

 

The reason for introducing the HBA in the first place was to rebuild the storage pool "kjell_ern" (where the media library lives) to include NVMe SSDs as a metadata cache to improve write speeds. Unless Supermicro Support comes up with a fix, I will abandon the idea of an HBA, and save 30 W on top.
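For the record, the metadata-cache idea would roughly have been a mirrored NVMe special vdev added to the pool. A sketch only, using the command line rather than the GUI (device names are placeholders, and on a raidz pool a special vdev cannot be removed again once added):

zpool add kjell_ern special mirror /dev/nvme0n1 /dev/nvme1n1   # placeholder device names
zfs set special_small_blocks=64K kjell_ern                     # optional: also send small blocks to the NVMe mirror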

 

We can kinda call this case closed. Thanks for holding my hand.

 
