Unoid Posted February 4

Unraid 6.12.6, EPYC 7302 (16c, 3.3 GHz), 256 GB DDR4-3200 ECC.

Pool in question: 4x 4TB TeamGroup MP34 NVMe (PCIe 3.0, TLC NAND with DRAM) on a 4x4x4x4 bifurcation card, raidz1, 32 GB RAM cache, about 50% used, compression=on.

Scrub speeds range from 90 MB/s to 250 MB/s. Shouldn't scrub speeds be a good bit faster? These TeamGroup MP34s do about 3000 MB/s sequential read and about 2400 MB/s sequential write. The scrub took 5 hours 11 minutes for 6.3 TB of data used on an 11.8 TB pool:

      pool: speedteam
     state: ONLINE
      scan: scrub repaired 0B in 05:11:58 with 0 errors on Thu Feb 1 06:11:59 2024
    config:

        NAME                STATE     READ WRITE CKSUM
        speedteam           ONLINE       0     0     0
          raidz1-0          ONLINE       0     0     0
            /dev/nvme0n1p1  ONLINE       0     0     0
            /dev/nvme1n1p1  ONLINE       0     0     0
            /dev/nvme2n1p1  ONLINE       0     0     0
            /dev/nvme3n1p1  ONLINE       0     0     0

    errors: No known data errors
JorgeB Posted February 5

That is low; for comparison, this is what I get during a scrub on an EPYC 7232P.
Unoid Posted February 5

I have compression=on. I'm running a scrub again to gather metrics; I don't see any extra logs to give me more insight. NVMe temps are all under 50C (I just added heatsinks to them). I captured the first 5 minutes of the scrub AFTER it does the initial "indexing" or whatever that operation is:

      pool: speedteam
     state: ONLINE
      scan: scrub in progress since Mon Feb 5 10:30:29 2024
        7.92T scanned at 0B/s, 188G issued at 421M/s, 7.92T total
        0B repaired, 2.32% done, 05:20:51 to go

Disk IO: writes are all under 150 KB/s; reads are in the screenshot. The CPU is barely being taxed, mostly sitting around 3-10% across all threads with occasional spikes to 50-100%; total CPU average is under 6%.
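For anyone digging into a slow scrub like this, per-device throughput can help isolate a single lagging drive. A minimal sketch using the pool name from this thread (`zpool iostat` is a standard OpenZFS command):

```shell
# Show per-vdev-member read/write bandwidth, refreshed every 5 seconds.
# During a scrub, one device reading far slower than its siblings stands out.
zpool iostat -v speedteam 5
```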
JorgeB Posted February 5

On "I have compression=on": that should not have a big impact, if any, but I'm not sure what the issue could be. Is the pool fast when copying, for example, from within the pool?
Unoid Posted February 5

Last I tested, it was doing many GB/s writes copying large sequential files. I just ran a test copying a 9 GB Blu-ray rip (h265) I have. rsync reported:

    sent 8,868,009,936 bytes  received 35 bytes  311,158,244.60 bytes/sec

Grafana only showed a peak of 129 MB/s write for the rsync. Running

    dd if=/dev/zero of=/mnt/user/speedtest oflag=direct bs=128k count=32k

reported 685 MB/s in the terminal, but Grafana showed 2.03 GB/s written to one NVMe. A VM with a vdisk on the same zpool share, running KDiskMark, showed speeds similar to the screenshot in the original post.
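A caveat on the dd test above: with compression=on, a /dev/zero payload compresses to almost nothing, so the drives see far less data than dd reports and the numbers can swing wildly. A small self-contained illustration (using gzip as a stand-in for the pool's compressor; the temp files are made up for the demo):

```shell
# Zeros are pathologically compressible; random bytes are not.
# This is why /dev/zero is a poor benchmark payload on a compressed pool.
tmpdir=$(mktemp -d)
dd if=/dev/zero    of="$tmpdir/zeros.bin"  bs=1M count=16 2>/dev/null
dd if=/dev/urandom of="$tmpdir/random.bin" bs=1M count=16 2>/dev/null
gzip -k "$tmpdir/zeros.bin" "$tmpdir/random.bin"
ls -l "$tmpdir"   # zeros.bin.gz is tiny; random.bin.gz is still ~16 MB
```

fio fills its buffers with pseudo-random data by default, which is one reason it is a better fit here than dd with /dev/zero.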
JorgeB Posted February 5

dd is not a very good test; try copying one or more large files inside the pool. You can use Windows Explorer, for example: the data will be copied locally and won't traverse the LAN. Typically speeds should be around 1 to 2 GB/s.
Unoid Posted February 5

I copied a 27 GB file to and from the same zpool mount. My desktop's SMB mount to it is limited to 5 Gb/s, and this result is eerily like reading and writing over the network, maxing out the 5 Gb/s link. These speeds should instead be bottlenecked by the pool's write speed of around 2 GB/s...
JorgeB Posted February 5

On the SMB mount being the limit: if you use Windows Explorer (with Windows 8 or newer), the data won't use the network; the copy is made locally with Samba server-side copy, like this. I'm currently using a gigabit LAN connection, so the network limit would be 115 MB/s max:
Unoid Posted February 5

I did the same on a Windows 11 gaming desktop; same speed as I showed.
JorgeB Posted February 5

So that suggests the pool itself may just be slow, not just scrubs; it could be device related. You'd need to test with different devices if possible.
Unoid Posted February 5

JorgeB: thank you for walking me through these troubleshooting steps. At this point I'm going to set the shares to send data to my main HDD array and run mover, then remake the zpool and run more tests. Extended SMART test: 0 errors. The PCIe link is correct at 8 GT/s x4 lanes (PCIe 3.0 NVMe drives in a carrier card in a PCIe 4.0 x16 slot). I can't tell what the issue may be. Once the data is moved I'll try disk benchmarks on each NVMe separately.
Unoid Posted February 6

Random Unraid question about the ZFS tunables in /sys/module/zfs/parameters/*: am I able to set each of them in /boot/modprobe.d/zfs.conf? I'm thinking of changing settings like ashift and the default recordsize. Also, can I use the CLI as root to run zfs create instead of using the GUI?
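One distinction worth making here: /boot/modprobe.d/zfs.conf can only carry ZFS module parameters (the knobs exposed under /sys/module/zfs/parameters), while ashift and recordsize are pool/dataset properties set with zpool/zfs, not module options. A sketch of the file format, with an assumed ARC cap as the example value:

```
# /boot/modprobe.d/zfs.conf
# Format: "options <module> <parameter>=<value>"
# Example only: cap the ARC at 32 GiB (34359738368 bytes); pick your own value.
options zfs zfs_arc_max=34359738368
```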
JorgeB Posted February 6

The default ashift should be fine. recordsize can be changed at any time; it will only affect newly written data:

    zfs set recordsize=1M pool_name

I use 1M for all my pools; in my tests it performs better with large files, especially with raidz.
Unoid Posted February 6

I've been doing a LOT of reading on ZFS on NVMe. My drives only expose a 512 B sector size, not 4K, which seems weird for a newish PCIe 3.0 4 TB device. From what I've read, ashift=9 corresponds to that (2^9 = 512). I have a spreadsheet of tests to run in different configurations; hopefully I'll find out what is slowing these NVMes down so horribly.

JorgeB, if I create ZFS vdevs/pools with various options in bash, does the OS on /boot know how to persist what I did? That's why I asked whether I need to modify the zfs.conf file on /boot.
JorgeB Posted February 6

Correct, but Unraid uses ashift=12 by default, which is 4K and the current default recommendation for ZFS. The ashift used when creating the pool is always the one used; it cannot be changed afterwards. recordsize can be changed at any time (for new data).
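For the CLI route discussed above, pool creation is where ashift gets fixed. A hedged sketch using the device and pool names from this thread; double-check the flags against your `zpool-create` man page before running, since this destroys any existing pool on those devices:

```shell
# ashift is fixed per vdev at creation time; -o ashift=12 means 4K sectors.
zpool create -o ashift=12 speedteam raidz1 \
    /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1
zpool get ashift speedteam        # confirm the value stuck
zfs set recordsize=1M speedteam   # changeable any time; applies to new writes
```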
Unoid Posted February 6

May I ask what topology you have in your NVMe zpool? raidz1? Mirrored vdevs? How many disks?
JorgeB Posted February 6

I have one with a 5-device raidz1 and another with 4 devices, also raidz1, both using all default settings except recordsize=1M.
Unoid Posted February 6

Are they 4K-sector disks?
JorgeB Posted February 6

They are NVMe devices; they report 512B sectors, but the current recommendation, AFAIK, is to always use ashift=12 with flash-based devices.
JorgeB Posted February 6

Even if the devices really were 512B, a larger ashift should not hurt performance, though it could waste a little more space.
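For reference, ashift is just log2 of the sector size ZFS will assume, so the values discussed in this thread map as below; a trivial shell computation:

```shell
# ashift = log2(sector size): 512 B -> 9, 4 KiB -> 12, 8 KiB -> 13.
for size in 512 4096 8192; do
  awk -v s="$size" 'BEGIN { printf "%5dB -> ashift=%d\n", s, log(s)/log(2) + 0.5 }'
done
```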
Unoid Posted February 7

I ran tests after backing up the data to HDD. I only varied the raid type [raid0, raidz1, 2-vdev mirror]; I left ashift at 12 since that's the default the GUI uses, and changed recordsize=[16K, 128K, 512K, 1M].

I noticed that, per raid type, the first fio run happens at the pool default recordsize of 128K and takes a while to write the 8 job blocks, but every later test with a different recordsize set on the pool still used the 8 x 10 GB job chunks written at the initial 128K recordsize. That introduces error into these results. The fio command was taken from this article on benchmarking NVMe and keeping ARC from fudging the numbers: https://pv-tech.eu/posts/common-pitfall-when-benchmarking-zfs-with-fio/

Sharing the results anyway. fio command:

    fio --rw=read --bs=1m --direct=1 --ioengine=libaio --size=10G --group_reporting \
        --filename=/mnt/user/speedtest/bucket \
        --name=job1 --offset=0G  --name=job2 --offset=10G --name=job3 --offset=20G \
        --name=job4 --offset=30G --name=job5 --offset=40G --name=job6 --offset=50G \
        --name=job7 --offset=60G --name=job8 --offset=70G

4x 4TB TeamGroup MP34, type_(recordsize, ashift), io=80 GiB in every run:

    config                 READ          WRITE
    r0      (16K,  12)     3049 MiB/s     778 MiB/s
    r0      (128K, 12)     3057 MiB/s    6693 MiB/s
    r0      (512K, 12)     3063 MiB/s    3902 MiB/s
    r0      (1M,   12)     3059 MiB/s    3969 MiB/s
    z1      (16K,  12)     3050 MiB/s     410 MiB/s
    z1      (128K, 12)     2984 MiB/s    5873 MiB/s
    z1      (512K, 12)     2990 MiB/s    1596 MiB/s
    z1      (1M,   12)     1086 MiB/s    1949 MiB/s
    mirror  (16K,  12)     3091 MiB/s    1521 MiB/s
    mirror  (128K, 12)     3085 MiB/s    4421 MiB/s
    mirror  (512K, 12)     3090 MiB/s    3486 MiB/s
    mirror  (1M,   12)     3104 MiB/s    3579 MiB/s

I then deleted the fio bucket file and re-ran the 1M test, in case the bucket having been written on the first default 128K run was skewing the result:

    READ 3258 MiB/s, WRITE 4440 MiB/s

^^^ A significant difference, confirming that running this test without deleting the fio bucket file between runs affects the measured speed.

I want to gather data points for ashift=[9, 12, 13]; however, this isn't exposed in the GUI at zpool creation. I may get time to create the pool in bash and set ashift there, then do the format and mount (unsure whether the GUI can pick it up if I do it via the CLI).

edit: I remade the pool in my desired style of raidz1 and immediately set recordsize=1M. Check out the fio bucket written at 128K vs 1M:

    z1 (1M, 12), 128K bucket:  READ 1086 MiB/s, WRITE 1949 MiB/s
    z1 (1M, 12), 1M bucket:    READ 3221 MiB/s, WRITE 6124 MiB/s

Makes me wish I had deleted the fio bucket after every run. I'm settling on raidz1 with recordsize=1M; I may still try ashift changes.
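Following the lesson learned above, a benchmark loop that recreates the bucket after each recordsize change would look roughly like this (pool name and paths from this thread; the fio invocations are abbreviated to a single job for brevity, so restore the 8-job list from the command above for a full run):

```shell
# recordsize only applies to newly written blocks, so the test file must be
# rewritten after each change, or every run measures the first recordsize.
for rs in 16K 128K 512K 1M; do
  zfs set recordsize="$rs" speedteam
  rm -f /mnt/user/speedtest/bucket
  echo "=== recordsize=$rs ==="
  fio --rw=write --bs=1m --direct=1 --ioengine=libaio --size=10G \
      --group_reporting --filename=/mnt/user/speedtest/bucket --name=prep
  fio --rw=read --bs=1m --direct=1 --ioengine=libaio --size=10G \
      --group_reporting --filename=/mnt/user/speedtest/bucket --name=seqread
done
```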
Unoid Posted February 9

I loaded 800 GB of movies (which love the 1M recordsize) onto the zpool. Scrubs now average over 11 GB/s reads (4 NVMes at ~2.9 GB/s each). I'm curious whether the last zpool, loaded to 55% capacity, could really have slowed the scrub down to 50-100 MB/s?
JorgeB Posted February 10

Seems unlikely to me; ZFS pools do slow down after 90% capacity, but AFAIK that does not affect scrub speed.
Unoid Posted February 10

Update after loading 6.5 TB of movies back onto the zpool:

      pool: speedteam
     state: ONLINE
      scan: scrub in progress since Sat Feb 10 12:50:25 2024
        8.40T scanned at 0B/s, 355G issued at 2.01G/s, 8.40T total
        0B repaired, 4.13% done, 01:08:28 to go