frodr Posted March 5 (edited)

The copy speed on my servers is generally quite poor. For Server1 in my signature, copying a single file from 10 x SATA SSD in raidz2 to 2 x NVMe SSD in btrfs RAID 0 should be significantly better than the below 400 MB/s. That's the speed of one single SATA SSD. To my understanding the speed should be: 10 x 550 MB/s x 50% = 2750 MB/s. Right/wrong? Very nice to get views on this topic. //

kjell-diagnostics-20240305-1820.zip
frodr Posted March 8

Can anybody shed some light on this topic, please? If the speed above is what's to be expected, well, then I know that, and no need chasing improvements. Cheers,
JorgeB Posted March 8

On 3/5/2024 at 6:06 PM, frodr said:
Copy a single file from 10 x sata ssd raidz2 to 2 x NVMe ssd in btrfs raid 0

Not clear if you are talking locally or over SMB? This looks local, but the title mentions SMB.

On 3/5/2024 at 6:06 PM, frodr said:
To my understanding the speed should be: 10 x 550 MB/s x 50% = 2750 MB/s. Right/wrong?

It's never a straight calculation like that; it can vary with a lot of things, including the devices used, what you are using to make the copy, and the CPU can have a large impact. Single copy operations are single-threaded; with a recent fast CPU you should see between 1 and 2 GB/s.
frodr Posted March 8

1 hour ago, JorgeB said:
Not clear if you are talking locally or over SMB? This looks local, but the title mentions SMB.

Sorry, it's local speed, not over SMB. The single-threaded performance of the i7-13700 is quite good. Can you see/suggest any reason this performance is well below 1-2 GB/s?
JorgeB Posted March 8

38 minutes ago, frodr said:
The single-threaded performance of the i7-13700 is quite good.

It is, I would expect 2 GB/s+ without any device bottlenecks. Note that in my experience btrfs raid0 does not scale very well past a certain number of devices. I would try copying from one NVMe device to another, assuming they are good fast devices, and use pv instead; rsync is not built for speed:

pv /path/to/large/file > /path/to/destination
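For a quick sanity check of the copy path itself, the same pattern can be exercised on a throwaway file first — a minimal sketch, with /tmp paths as placeholders for the real pool mounts, and dd as a fallback in case pv is not installed:

```shell
# Self-contained throughput sanity check with a throwaway file.
# NOTE: /dev/zero data compresses to nothing on pools with compression
# enabled, so for real pool numbers use a large incompressible file.
dd if=/dev/zero of=/tmp/testfile bs=1M count=256 status=none
if command -v pv >/dev/null 2>&1; then
    pv /tmp/testfile > /tmp/testcopy          # live throughput readout
else
    dd if=/tmp/testfile of=/tmp/testcopy bs=1M status=progress
fi
cmp -s /tmp/testfile /tmp/testcopy && echo "copy OK"
rm -f /tmp/testfile /tmp/testcopy
```

Redirecting the pv output to /dev/null instead of a file is also a handy way to measure the read side of the source pool on its own.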
frodr Posted March 8

3 hours ago, JorgeB said:
It is, I would expect 2 GB/s+ without any device bottlenecks. Note that in my experience btrfs raid0 does not scale very well past a certain number of devices. I would try copying from one NVMe device to another, assuming they are good fast devices, and use pv instead; rsync is not built for speed:
pv /path/to/large/file > /path/to/destination

Ok, the test shows 420-480 MB/s from 2 x Kingston KC3000 M.2 2280 NVMe SSD 2TB in ZFS mirror to 2 x WD Black SN850P NVMe 1TB in btrfs RAID 0. Tested both directions. What I forgot to tell you, and to remember myself, is that the NVMe drives sit on an HBA, a HighPoint Rocket 1508. It is a PCIe 4.0 HBA with 8 M.2 ports. Well, I now know the penalty for being able to populate 8 NVMe drives with good cooling on a W680 chipset mobo. Thanks for following along. My use case isn't that dependent on max NVMe speed. (But I would quickly change the mobo if Intel included an iGPU in higher-I/O CPUs like the Xeon 2400/3400.)
JorgeB Posted March 9

That's much lower than I would expect, possibly something else going on. Type:

lspci -vvv > /boot/lspci.txt

Then attach that file here.
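Once the dump exists, grepping for the link lines makes downgrades easy to spot. The snippet below demonstrates the filter on a fabricated sample; on the server you would run the same grep against the real /boot/lspci.txt:

```shell
# Demo on a made-up snippet of lspci -vvv output; on the server, point
# the grep at the real /boot/lspci.txt instead.
cat > /tmp/lspci_sample.txt <<'EOF'
01:00.0 Non-Volatile memory controller: Sample NVMe SSD
        LnkCap: Port #0, Speed 16GT/s, Width x4
        LnkSta: Speed 16GT/s, Width x4
02:00.0 Mass storage controller: Sample PCIe switch
        LnkCap: Port #247, Speed 16GT/s, Width x16
        LnkSta: Speed 16GT/s, Width x8 (downgraded)
EOF
# LnkCap is what the device can do, LnkSta is what it negotiated;
# any mismatch shows up with a "(downgraded)" marker.
grep -E 'LnkCap:|LnkSta:' /tmp/lspci_sample.txt
rm -f /tmp/lspci_sample.txt
```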
frodr Posted March 9

3 hours ago, JorgeB said:
That's much lower than I would expect, possibly something else going on.

lspci.txt
frodr Posted March 9

I might have an idea. Intel Core/W680 supports 20 PCIe lanes, right? In the server there is an HBA (x16), a NIC (x4) and a SATA card (x4). I guess the HBA might drop down to x8 effectively.
JorgeB Posted March 10

LnkSta: Speed 16GT/s, Width x4

The link for all NVMe devices is not reporting as downgraded, but the HBA will have to share the bandwidth of a x16 slot, since 8 NVMe devices at x4 would require 32 lanes. I suggest installing a couple of devices in the board M.2 slots and retesting, in case the HBA is not working correctly.
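To put rough numbers on that sharing — a back-of-envelope sketch, my figures, assuming PCIe Gen4 (16 GT/s per lane with 128b/130b encoding); theoretical ceilings only, not benchmark results:

```python
# Rough PCIe Gen4 bandwidth budget for an 8-drive HBA behind a x16 slot.
GT_PER_S = 16e9                 # Gen4 raw rate per lane (16 GT/s)
ENCODING = 128 / 130            # 128b/130b line-encoding efficiency
lane_gbps = GT_PER_S * ENCODING / 8 / 1e9   # ~1.97 GB/s per lane

uplink_lanes = 16               # HBA uplink at full x16
drives = 8
lanes_per_drive = 4

uplink_bw = uplink_lanes * lane_gbps             # ~31.5 GB/s, shared
demand = drives * lanes_per_drive * lane_gbps    # ~63 GB/s if all drives max out

print(f"per lane: {lane_gbps:.2f} GB/s")
print(f"uplink:   {uplink_bw:.1f} GB/s shared by {drives} drives")
print(f"demand:   {demand:.1f} GB/s -> {demand / uplink_bw:.0f}:1 oversubscription")
```

So with all 8 drives busy at once the uplink is 2:1 oversubscribed, but any 4 drives (or fewer) can in principle still run at full x4 speed.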
frodr Posted March 10

1 hour ago, JorgeB said:
The link for all NVMe devices is not reporting as downgraded, but the HBA will have to share the bandwidth of a x16 slot, since 8 NVMe devices at x4 would require 32 lanes. I suggest installing a couple of devices in the board M.2 slots and retesting, in case the HBA is not working correctly.

The HBA should run at x8 (slot 7), not x16, as I have a PCIe card in the connected PCIe slot 4. Removing the PCIe card from the connected slot (slot 4), the HBA in slot 7 should run at x16. Doing so, the mobo startup stops before the BIOS settings with code 94. I will try to solve this issue.
JorgeB Posted March 10

43 minutes ago, frodr said:
The HBA should run at x8 (slot 7), not x16

Either way it's not possible to have 32 lanes. Still, the performance should be much better if the x8 link was the only issue, but it could still be HBA related; it would be good to test with a couple of devices using the M.2 slots.
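For context, even a downgraded x8 Gen4 link sits far above the observed ~450 MB/s — a back-of-envelope sketch (my figures, theoretical ceiling only):

```python
# x8 Gen4 link ceiling vs. the ~450 MB/s seen in the copy test.
lane_gbps = 16e9 * (128 / 130) / 8 / 1e9   # ~1.97 GB/s per Gen4 lane
x8_ceiling = 8 * lane_gbps                  # ~15.8 GB/s for a x8 link
observed = 0.45                             # GB/s, from the copy test

print(f"x8 ceiling: {x8_ceiling:.1f} GB/s")
print(f"observed:   {observed} GB/s ({observed / x8_ceiling:.1%} of ceiling)")
```

At roughly 3% of the link ceiling, the x8 downgrade alone cannot explain the numbers, which points at something else in the path.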
frodr Posted March 10

34 minutes ago, JorgeB said:
Either way it's not possible to have 32 lanes. Still, the performance should be much better if the x8 link was the only issue, but it could still be HBA related; it would be good to test with a couple of devices using the M.2 slots.

I will test moving the M.2s around. Strange thing: the mobo hangs on code 94 with the HBA in slot 7 and nothing in the connected slot 4. When adding a NIC in slot 4, the mobo does not hang. Tried resetting CMOS, no change. Also tried a few BIOS settings without luck. I will have to contact Supermicro Support.
shpitz461 Posted March 11

Quote:
0f:00.0 Mass storage controller: Broadcom / LSI PEX880xx PCIe Gen 4 Switch (rev b0)

Is this your Rocket 1508? It shows a full Gen4 x16 connection:

Quote:
LnkCap: Port #247, Speed 16GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <4us, L1 <32us
LnkSta: Speed 16GT/s, Width x16

From what I understand, since a PEX switch is used, all the drives share the x16 uplink, so in theory 4 drives will operate at full x4 speed, but more drives operating simultaneously will share the x16 bandwidth and will slow down, since you can't fit x32 worth of bandwidth in a x16-wide bus.

Did you try to boot your machine with a live Ubuntu or Debian and see if you get the same performance?
Did you try removing all other PCIe cards and testing with the Rocket solo?
How many drives have you populated in the controller? If not 8 drives, did you try to move the drives around to see if you get any better performance? i.e. slots 1 and 2, 2 and 3, etc.
frodr Posted March 11

16 hours ago, shpitz461 said:
Is this your Rocket 1508? It shows a full Gen4 x16 connection.

Yes, Rocket 1508. Great that it's running x16. I thought mobo slot 7 only runs x8 when slot 4 is populated. I will try Ubuntu or Debian and move the M.2s around when a ZFS scrub test is done, tomorrow it seems. I have 6 x M.2 in the HBA, as when tested. Slot 7 is the only slot available for x16. (Dear Intel, why can't we have a proper HEDT motherboard with iGPU CPUs?)
frodr Posted March 12

Ok, I know a little bit more. The HBA only works with a PCIe card in the slot that is connected to the x16 slot. This slot then only runs at x8:

LnkSta: Speed 16GT/s, Width x8 (downgraded)

Removing the PCIe card from the switched slot, the mobo hangs at code 94 pre-BIOS. I have talked to Supermicro Support, which by the way responds very quickly these days; HighPoint products are not validated, but they are consulting a BIOS engineer. This means that the HBA is running 6 x M.2 at x8, and there is some switching between the slots as well. The copy test was done between 4 of the M.2s; the last 2 are only sitting as unassigned devices.
frodr Posted March 12 (Solution)

I removed the HBA card and mounted the M.2s in the three M.2 slots on the mobo to get a baseline:

cache: 2 x NVMe SSD in ZFS mirror
tempdrive: 1 x NVMe SSD in btrfs
kjell_ern: 10 x SATA SSD in raidz2

The NVMe drives seem to be ok. The reason for introducing the HBA in the first place was to rebuild the storage pool "kjell_ern" (where the media library lives) to include NVMe SSDs as a metadata cache to improve write speeds. Unless Supermicro Support comes up with a fix, I will abort the idea of an HBA, and save 30 W on top. We can kinda call this case closed. Thanks for holding my hand.