Slow read speeds from the array



Hi all,

 

I'm a new Unraid user and I'm very happy so far, except for my read speed from the array. I ran a lot of tests and I'm very confused about the transfer rates.

This is the situation:

- 5 Seagate 4TB HDDs (2 parity).

- 2 Samsung 256GB cache disks

- 2x1 Gigabit link aggregation network

 

- The best read transfer rate from the array is around 50 MB/s: small files reach 35 MB/s, large files 50 MB/s.

smb_array.JPG

 

From the cache disk I get the full 100 MB/s.

 

smb_ssd_cache.JPG

 

I also expected 100 MB/s from the array.

- diskspeed.sh (version 2.6.4) gave the following results:


/dev/sdb (Cache): 565 MB/sec avg
/dev/sdc (Cache 2): 565 MB/sec avg
/dev/sdd (Disk 1): 163 MB/sec avg
/dev/sde (Disk 2): 152 MB/sec avg
/dev/sdf (Disk 3): 163 MB/sec avg
/dev/sdg (Parity): 152 MB/sec avg
/dev/sdh (Parity 2): 149 MB/sec avg
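
As a rough cross-check of these figures, here is a minimal Python sketch (an illustration added here, not part of diskspeed.sh) that compares each disk's average against the raw line rate of a single gigabit link. It assumes one file copy travels over one link of the aggregated pair, and it ignores protocol overhead, so 125 MB/s is an upper bound rather than a realistic SMB figure.

```python
# Raw line rate of one gigabit link in MB/s (protocol overhead ignored).
GIGABIT_MB_PER_S = 1_000_000_000 / 8 / 1_000_000  # 125.0

# Averages reported by diskspeed.sh in the list above.
disk_mb_per_s = {
    "Cache": 565,
    "Cache 2": 565,
    "Disk 1": 163,
    "Disk 2": 152,
    "Disk 3": 163,
    "Parity": 152,
    "Parity 2": 149,
}


def slower_than_link(speeds, link=GIGABIT_MB_PER_S):
    """Return the names of disks whose average is below the link rate."""
    return [name for name, mb_s in speeds.items() if mb_s < link]


print("Disks slower than one gigabit link:", slower_than_link(disk_mb_per_s))
```

None of the disks falls below the link rate, so on paper any single drive could fill one gigabit link when read sequentially.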
 

Now it gets weirder. I ran CrystalDiskMark against a mounted network volume (a share that does not use the cache) and the results seem to be okay.

crystaldiskmark.JPG

So there shouldn't be a transfer issue with my hard drives. But where is the bottleneck?

It can't be the Samba protocol or the network, because reads from my cache disk are very fast over Samba. (FTP shows the same transfer rates.)

 

Why do I get a maximum transfer speed of only 50 MB/s instead of 100 MB/s?

 

Please find the attached screenshots.

I hope someone has an explanation. Thanks for your help and sorry for my bad English.

 

Edited by Dexter84
Link to comment

I had an issue with slow read speeds on Windows 7 computers. I found that the cause was old RealTek drivers on the Windows 7 machines. I updated them to the latest drivers and the issue disappeared. (By the way, I always go to the site of my motherboard manufacturer (or computer manufacturer) rather than ever using MS generic drivers to update a working system. Back in the Win 95 days I had an MS update for a modem that semi-bricked the modem. Luckily, I had made a tape backup image of the drive before I started...)

 

Link to comment

Hi guys,

thanks a lot for your answers! The solution was:

 

Quote

To fix the issue add to "Samba extra configuration" on Settings -> SMB:


max protocol = SMB2_02

and

Quote

2-Go to Settings -> Global Share Settings -> Tunable (enable direct IO): set to Yes

 

This increased my read speed to 60-100 MB/s. It's not always 100 MB/s (I don't know why) but it's significantly faster than before.

The speed changes from file to file, very strange

BTW: my writing speed has also increased :)

Edited by Dexter84
Link to comment
  • 10 months later...
On 6/4/2017 at 11:14 AM, Dexter84 said:

Hi guys,

thanks a lot for your answers! The solution was:

 

and

 

This increased my read speed to 60-100 MB/s. It's not always 100 MB/s (I don't know why) but it's significantly faster than before.

The speed changes from file to file, very strange

BTW: my writing speed has also increased :)

 

I'm tired of my home Unraid system. I've had it for around 3 years. In general it has been slow and has never come close to saturating gigabit, but with some versions I remember read/write speeds of up to 60 MB/s, which would be fine for me. Nowadays (6.5.0) reads are about 15-30 MB/s (mostly at the lower end) and writes 5-15 MB/s. In other words, ridiculous performance. I tried the two tips above with no change; if anything it became worse. The hardware has always been the same (not great), which is why I'm not going to give a full list here but will try to focus on the comparison between different Unraid software versions. There is a cache drive, but it is not used for caching, so reads and writes go directly to the array. The clients are two PCs with Windows 10, both showing the same speeds.

Edited by nid
Link to comment

I've been using unRAID for around 8 years, and until lately I haven't had speed issues I couldn't fix. I have poor read speeds as well now, and I don't know whether it started on 6.4, 6.5 or earlier. The problem seems to come and go: sometimes I get 160 MB/s on my 10G NIC and other times it's 10 MB/s. This is on great hardware, a quad-core Xeon with 16GB of DDR3-1333.

Link to comment

@nid I see that Disk 1 had some sort of problem in the past. You seem to have noticed, because you ran a short SMART self-test, which wouldn't really show anything. I'd run an extended self-test and see if that reveals a problem. Your cache disk has a high UDMA error count, which tends to indicate a bad SATA cable, though this may be historical. If you've replaced the cable and the errors don't increase, you've already fixed it. Your disks spin up and down quite a few times during the course of a day. It might be worth increasing the spin-down delay. Otherwise I don't see anything out of the ordinary. Maybe leave it running for a few days before grabbing diagnostics next time.

Link to comment

Thanks for the info. I'm not using the cache disk for the user shares at all. It is just there as an always-on disk for some appdata and torrent traffic. Today I tried a laptop with Windows 7 and the results are the same. A large test file is copied from the server at an average rate of 20 MB/s (it starts at 30 and falls to 20). The same file used to copy at 57 MB/s a few months ago. I don't remember exactly when it changed. I always update to new Unraid releases right away.

 

I should add that the speed (both read and write) is very unstable on large files like movies: big rises and big falls, in a smooth rhythm rather than abrupt jumps.

Edited by nid
Link to comment

iperf3 (fortunately) confirms the behavior. The NIC in the Unraid server is a PCIe Intel 82574L. I remember that 1-2 years ago the iperf3 results from Unraid were fine, close to 1000 Mbit/s.

 

 

 

FROM WINDOWS MACHINE A TO WINDOWS MACHINE B:

 

PS C:\Users\Admin> C:\Users\Admin\Desktop\iperf-3.1.3-win64\iperf-3.1.3-win64\iperf3.exe -c 192.168.1.2
Connecting to host 192.168.1.2, port 5201
[  4] local 192.168.1.1 port 57064 connected to 192.168.1.2 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   112 MBytes   938 Mbits/sec
[  4]   1.00-2.00   sec   112 MBytes   940 Mbits/sec
[  4]   2.00-3.00   sec   112 MBytes   944 Mbits/sec
[  4]   3.00-4.00   sec   112 MBytes   944 Mbits/sec
[  4]   4.00-5.00   sec   112 MBytes   941 Mbits/sec
[  4]   5.00-6.00   sec   112 MBytes   943 Mbits/sec
[  4]   6.00-7.00   sec   112 MBytes   942 Mbits/sec
[  4]   7.00-8.00   sec   112 MBytes   942 Mbits/sec
[  4]   8.00-9.00   sec   112 MBytes   943 Mbits/sec
[  4]   9.00-10.00  sec   112 MBytes   944 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec                  sender
[  4]   0.00-10.00  sec  1.10 GBytes   942 Mbits/sec                  receiver

iperf Done.

 

 

FROM UNRAID TO WINDOWS MACHINE A:

 

root@Tower:/# iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  4] local 192.168.1.250 port 53680 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  31.8 MBytes   266 Mbits/sec    0   97.0 KBytes
[  4]   1.00-2.03   sec  30.0 MBytes   246 Mbits/sec    0   97.0 KBytes
[  4]   2.03-3.01   sec  28.8 MBytes   244 Mbits/sec    6   67.0 KBytes
[  4]   3.01-4.01   sec  58.5 MBytes   494 Mbits/sec    0   85.5 KBytes
[  4]   4.01-5.01   sec  69.7 MBytes   581 Mbits/sec    0   92.7 KBytes
[  4]   5.01-6.02   sec  70.0 MBytes   586 Mbits/sec    0   92.7 KBytes
[  4]   6.02-7.01   sec  67.1 MBytes   568 Mbits/sec    0   98.4 KBytes
[  4]   7.01-8.02   sec  71.2 MBytes   592 Mbits/sec    0   98.4 KBytes
[  4]   8.02-9.02   sec  61.7 MBytes   515 Mbits/sec    0   99.8 KBytes
[  4]   9.02-10.00  sec  65.4 MBytes   560 Mbits/sec    0    120 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.00  sec   554 MBytes   465 Mbits/sec    6             sender
[  4]   0.00-10.00  sec   554 MBytes   465 Mbits/sec                  receiver

iperf Done.

 

FROM UNRAID TO WINDOWS MACHINE B:


root@Tower:/# iperf3 -c 192.168.1.2
Connecting to host 192.168.1.2, port 5201
[  4] local 192.168.1.250 port 50404 connected to 192.168.1.2 port 5201
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.03   sec  36.4 MBytes   297 Mbits/sec    0   64.2 KBytes
[  4]   1.03-2.03   sec  35.0 MBytes   293 Mbits/sec    0   64.2 KBytes
[  4]   2.03-3.02   sec  35.0 MBytes   295 Mbits/sec    0   64.2 KBytes
[  4]   3.02-4.03   sec  32.2 MBytes   269 Mbits/sec    0   64.2 KBytes
[  4]   4.03-5.02   sec  35.0 MBytes   295 Mbits/sec    0   69.9 KBytes
[  4]   5.02-6.03   sec  30.0 MBytes   250 Mbits/sec    0   69.9 KBytes
[  4]   6.03-7.02   sec  35.0 MBytes   296 Mbits/sec    0   69.9 KBytes
[  4]   7.02-8.03   sec  35.0 MBytes   292 Mbits/sec    0   69.9 KBytes
[  4]   8.03-9.01   sec  48.8 MBytes   416 Mbits/sec    0   69.9 KBytes
[  4]   9.01-10.01  sec  75.0 MBytes   627 Mbits/sec    0   69.9 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-10.01  sec   397 MBytes   333 Mbits/sec    0             sender
[  4]   0.00-10.01  sec   397 MBytes   333 Mbits/sec                  receiver

iperf Done.
 

 

FROM WINDOWS MACHINE A TO UNRAID:

 

PS C:\Users\Admin> C:\Users\Admin\Desktop\iperf-3.1.3-win64\iperf-3.1.3-win64\iperf3.exe -c 192.168.1.250
Connecting to host 192.168.1.250, port 5201
[  4] local 192.168.1.1 port 57062 connected to 192.168.1.250 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec   101 MBytes   844 Mbits/sec
[  4]   1.00-2.00   sec   103 MBytes   862 Mbits/sec
[  4]   2.00-3.00   sec  90.6 MBytes   761 Mbits/sec
[  4]   3.00-4.00   sec  97.1 MBytes   815 Mbits/sec
[  4]   4.00-5.00   sec  94.6 MBytes   794 Mbits/sec
[  4]   5.00-6.00   sec  59.1 MBytes   495 Mbits/sec
[  4]   6.00-7.00   sec  58.6 MBytes   492 Mbits/sec
[  4]   7.00-8.00   sec  58.6 MBytes   492 Mbits/sec
[  4]   8.00-9.00   sec  53.4 MBytes   447 Mbits/sec
[  4]   9.00-10.00  sec  56.4 MBytes   473 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   772 MBytes   647 Mbits/sec                  sender
[  4]   0.00-10.00  sec   772 MBytes   647 Mbits/sec                  receiver

iperf Done.
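
To compare these results with the copy speeds mentioned earlier, the iperf3 summary lines can be converted from Mbit/s to MB/s. A minimal Python sketch, using only the averages printed above (the path labels are just shorthand for the four runs):

```python
def mbits_to_mbytes(mbits_per_s):
    """Convert Mbit/s (as printed by iperf3) to MB/s (as shown by file copies)."""
    return mbits_per_s / 8


# Average bandwidth from the summary line of each iperf3 run above.
iperf_summary = {
    "Windows A -> Windows B": 942,
    "Unraid -> Windows A": 465,
    "Unraid -> Windows B": 333,
    "Windows A -> Unraid": 647,
}

for path, mbits in iperf_summary.items():
    print(f"{path}: {mbits} Mbit/s = {mbits_to_mbytes(mbits):.1f} MB/s")
```

Seen this way, the two runs sent from Unraid top out at roughly 42-58 MB/s before Samba or the disks are involved at all.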

 

 

 

 

 

Link to comment

So it's not the disk subsystem but the networking that's the issue. Are you seeing dropped packets? Is it a bad switch port or cable? Have you changed any of the NIC's settings? The Tips and Tweaks plugin can be used to turn optimisations on and off. Maybe try a new NIC?

 

Link to comment

Ping reports 0 lost packets after a prolonged time, so all good there. In Tips and Tweaks I had enabled "Disable NIC Flow Control" and "Disable NIC Offload", which are the plugin's recommended settings. I just turned both off. The Flow Control status remained off and the NIC Offload status changed to on. iperf3 now shows healthy values and both read and write speeds went up to a stable 50-70 MB/s! Strange though, as I remember having had those two options enabled for a long time.

 

Next I restarted the server, and after that it was the same sluggishness again: slow and unstable, from 10 to 50 MB/s in either direction. The Flow Control status remained off and the NIC Offload status remained on, so those two NIC settings survived the reboot, at least according to the Tips and Tweaks report. Strangely, this time iperf3 shows good results (close to gigabit), but copies suffer more or less the same as before.

 

I then re-enabled "Disable NIC Flow Control". Of course the Flow Control status remained off, but since then I have stable reads from the server at about 47 MB/s and writes to the server at about 60-70 MB/s. All these values were measured with different random large files in both directions.

 

I hoped there was an easy pattern to this but now I've lost it...

Link to comment
15 hours ago, nid said:

Ping reports 0 lost packets after a prolonged time, so all good there.

 

The way to check for lost packets is normally to look at the ifconfig printout.

 

Ping defaults to sending tiny packets which have a quite low probability of packet loss unless the network is very wacky. If you want to test with ping, then you should specify at least a 1kB packet size.
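
If you want to script such a test, something like the following Python sketch could build the command. The 1400-byte payload and the count of 100 are assumed example values (anything of 1 kB or more that stays below a 1500-byte MTU would do), and the helper name `build_ping_command` is hypothetical. It relies on the standard `-s`/`-c` options of Linux ping and `-l`/`-n` of Windows ping.

```python
import platform


def build_ping_command(host, payload_bytes=1400, count=100, system=None):
    """Build a ping command that sends larger-than-default packets.

    payload_bytes and count are example values, not recommendations
    taken from the thread.
    """
    system = system or platform.system()
    if system == "Windows":
        # Windows ping: -n = number of echo requests, -l = payload size
        return ["ping", "-n", str(count), "-l", str(payload_bytes), host]
    # Linux ping: -c = number of echo requests, -s = payload size
    return ["ping", "-c", str(count), "-s", str(payload_bytes), host]


# Show the command for the Unraid server address used in this thread.
print(" ".join(build_ping_command("192.168.1.250")))
```

The resulting list can be passed to `subprocess.run` to run the test, and the packet loss percentage is printed in ping's own summary.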

Link to comment

OK, ifconfig shows 5 dropped packets:

        RX packets 12685671  bytes 15279444400 (14.2 GiB)
        RX errors 0  dropped 5  overruns 0  frame 0
        TX packets 16707800  bytes 23779571302 (22.1 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

This is after 18 hours online. I guess it's nothing to worry about?

Edited by nid
Link to comment

Far too few lost packets to be the reason for your slow transfer speed.
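
To put that in numbers, here is a small Python sketch (an illustration, with the ifconfig lines from the post above pasted in as sample text) that works out the fraction of dropped packets:

```python
import re

# Counter lines copied from the ifconfig output posted above.
IFCONFIG_SAMPLE = """\
        RX packets 12685671  bytes 15279444400 (14.2 GiB)
        RX errors 0  dropped 5  overruns 0  frame 0
        TX packets 16707800  bytes 23779571302 (22.1 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
"""


def drop_ratio(text, direction):
    """Return dropped/total packets for direction 'RX' or 'TX'."""
    packets = int(re.search(rf"{direction} packets (\d+)", text).group(1))
    dropped = int(
        re.search(rf"{direction} errors \d+\s+dropped (\d+)", text).group(1)
    )
    return dropped / packets


for direction in ("RX", "TX"):
    print(f"{direction} drop ratio: {drop_ratio(IFCONFIG_SAMPLE, direction):.2e}")
```

That works out to roughly 4 dropped packets per 10 million received, and none on the transmit side.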

 

So it's more likely that the different transfer speeds are caused by differences in how well the machines handle latency (how quickly each can send back acknowledgements) compared to how many outstanding packets the sending side will allow. With a large window size (many outstanding packets), you can get high bandwidth even if the receiving side is slow to acknowledge. With a small window size (only a few outstanding packets allowed), the sender stalls until a new acknowledgement arrives and allows it to send more packets.

 

The Windows machines may be tweaked to allow more outstanding packets, letting them manage a higher transfer rate even if the other Windows machine is slow to acknowledge.
unRAID, on the other hand, might allow fewer outstanding packets, so it often has to stop and wait for acknowledgements.
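
The relationship can be sketched as throughput <= window / round-trip time. A minimal Python illustration follows. The 64.2 KBytes window is the Cwnd value iperf3 printed for the Unraid to Windows B run; the round-trip times are purely hypothetical, since none were measured in this thread, and the real limit depends on the smaller of the sender's congestion window and the receiver's advertised window.

```python
def max_throughput_mbit(window_bytes, rtt_ms):
    """Upper bound for one TCP connection: one window per round trip, in Mbit/s."""
    return window_bytes * 8 / (rtt_ms * 1000)


# Cwnd from the iperf3 output earlier in the thread (KBytes -> bytes).
CWND_BYTES = 64.2 * 1024

# Hypothetical round-trip times; none were measured in this thread.
for rtt_ms in (0.5, 1.0, 2.0):
    limit = max_throughput_mbit(CWND_BYTES, rtt_ms)
    print(f"RTT {rtt_ms} ms -> at most {limit:.0f} Mbit/s")
```

With a window of this size, the bound drops below gigabit as soon as a round trip takes more than roughly half a millisecond.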

 

Ping normally cannot resolve time intervals as small as the ones we are talking about on a local network. But tcpdump, or some other tool that records all packets in and out over the network, should be able to capture the actual packet flow and show whether one side has to stall.

Link to comment

I don't know how to interpret the tcpdump printout. Here I copy/paste a portion of it, captured while copying a large file. By the way, while tcpdump is running the copy speed drops by 10-20 MB/s. I don't know if that says something.

 

 

23:00:34.133753 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 115894:115986, ack 309468225, win 21072, length 92 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.134509 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309468225:309468353, ack 115986, win 2864, length 128 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.136465 IP 192.168.1.250.http > 192.168.1.2.59383: Flags [P.], seq 379293:379506, ack 1, win 69, length 213: HTTP
23:00:34.139331 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 115986:116270, ack 309468353, win 21072, length 284 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.141518 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309468353:309468717, ack 116270, win 2864, length 364 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.142216 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 116270:116378, ack 309468717, win 21070, length 108 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.142619 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309468717:309468857, ack 116378, win 2864, length 140 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.143149 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 116378:116470, ack 309468857, win 21070, length 92 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.143609 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309468857:309468985, ack 116470, win 2864, length 128 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.145928 IP 192.168.1.250.http > 192.168.1.2.59383: Flags [P.], seq 379506:379710, ack 1, win 69, length 204: HTTP
23:00:34.146451 IP 192.168.1.2.59383 > 192.168.1.250.http: Flags [.], ack 379710, win 2051, length 0
23:00:34.155464 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 116470:116778, ack 309468985, win 21069, length 308 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.156380 IP 192.168.1.250.http > 192.168.1.2.59383: Flags [P.], seq 379710:379860, ack 1, win 69, length 150: HTTP
23:00:34.157881 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309468985:309469349, ack 116778, win 2864, length 364 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.158617 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 116778:116886, ack 309469349, win 21068, length 108 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.159093 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309469349:309469489, ack 116886, win 2864, length 140 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.160613 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 116886:116978, ack 309469489, win 21067, length 92 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.161230 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309469489:309469617, ack 116978, win 2864, length 128 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.166520 IP 192.168.1.250.http > 192.168.1.2.59383: Flags [P.], seq 379860:379977, ack 1, win 69, length 117: HTTP
23:00:34.167062 IP 192.168.1.2.59383 > 192.168.1.250.http: Flags [.], ack 379977, win 2050, length 0
23:00:34.172370 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 116978:117102, ack 309469617, win 21073, length 124 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.172789 IP 192.168.1.250.microsoft-ds > 192.168.1.2.59261: Flags [P.], seq 309469617:309469797, ack 117102, win 2864, length 180 SMB-over-TCP packet:(raw data or continuation?)

23:00:34.173442 IP 192.168.1.2.59261 > 192.168.1.250.microsoft-ds: Flags [P.], seq 117102:117210, ack 309469797, win 21072, length 108 SMB-over-TCP packet:(raw data or continuation?)

^C
21570 packets captured
26361 packets received by filter
4776 packets dropped by kernel

 

 

Link to comment
56 minutes ago, pwm said:

Alas, the tcpdump data can't help - the machine can't keep up with the load so the capture doesn't represent the real situation  :(

 

I know the machine sucks but it's not doing much except for copying data from/to it and running Rutorrent and Crashplan. It was built with low consumption in mind and it's pretty good at that. Right now (after disabling those two settings in Tips and Tweaks) I think it is mostly as good as it used to be. It is just not consistent. I keep testing randomly once every now and then and sometimes speeds are ok sometimes they are not. This can also be seen with smaller files, for instance photo folders where you can see how fast or slow the Windows thumbnails generation is getting done. The image files are between 1 and 15 MB.

 

I can say on average terms we are OK here. It is a (cheap) home server after all. It is just a bit annoying and disappointing when it gets slow at moments when you need it, for example when my wife wants to pick and edit photos from a folder with many files. She complains why it is now so slow, it used to be faster etc. But it was never super fast. Not Gigabit fast. If it was a read from a share from a Windows machine I believe it would be snappier and more consistent. And the disappointing fact is that Unraid is a Linux based system which ought to be a great performer. I would have built just another Windows system as a file server if I knew that it would perform like that. Can my motherboard with the weak AMD E240 be the only reason for not being able to achieve optimal performance (close to Gigabit speeds)?

Link to comment

The network card itself is the biggest factor - the amount of hardware acceleration the card can manage greatly affects how much work the CPU will have to do.

 

A gbit network card has to handle a huge number of packets/second. A full gbit/s transfer would mean around 75000 packets/second. This is an extreme load for the CPU. One improvement the card can help with is something called Interrupt Coalescing - the card waits until it has multiple packets received or until there is a pause in the transfer before issuing an interrupt. This saves a lot of CPU resources because the CPU can then pick up multiple packets at the same time. Better cards can also compute the packet checksums in hardware.
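
For reference, the packet rate can be estimated with a few lines of Python. This sketch assumes full-size frames with a 1500-byte payload and the usual 38 bytes of Ethernet overhead per frame (header, checksum, preamble and inter-frame gap); smaller packets would push the rate higher.

```python
LINK_BITS_PER_S = 1_000_000_000  # one gigabit link


def frames_per_second(payload_bytes, overhead_bytes=38):
    """Frames per second at line rate for a given payload size.

    overhead_bytes = 14 (header) + 4 (FCS) + 8 (preamble) + 12 (inter-frame gap).
    """
    return LINK_BITS_PER_S / ((payload_bytes + overhead_bytes) * 8)


print(f"Full-size frames: {frames_per_second(1500):,.0f} per second")
```

For full-size frames this gives a little over 81,000 frames per second, the same order of magnitude as the figure above.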

 

So if the CPU is weak, then it might help to install a better network card.

Link to comment
