woolooloo Posted September 16, 2010

Below is my original post, written when I couldn't figure out why I was getting the same performance writing to my cache drive as to the array. That appears to have resolved itself. My current problem is that I am only getting 20MB/s when I pull a file off the array over gigabit Ethernet. I get better performance writing to the array (32MB/s) than I do reading. I have tested across multiple machines; they all get slow speeds pulling files off the array and the cache drive.

*****************************

I recently upgraded some systems in my house and used some of the leftover hardware to upgrade my unRAID today. I ran into a few headaches, but I think I've got them sorted, and my performance seems to have improved, so I am pretty happy. Parity check speed has increased from about 25MB/s to about 42MB/s. I ended up still having a couple of disks on a Promise TX4 PCI card because 2 of the SATA ports on the Asus board weren't actually usable SATA ports; I guess they are reserved for some special Asus utility. Hopefully things will improve a little more when I completely get rid of the PCI drives.

My write performance is still pretty terrible. Writing to my cache drive (a 500GB 7200RPM Samsung SATA drive), I'm getting around 32MB/s, which is the same as I get if I write to one of the protected drives. By contrast, copying the same file directly to my HTPC gets roughly 120MB/s. I understand that 32MB/s to the array is reasonable, but I don't understand why my cache drive performance is so bad. It is on the Promise PCI card, but it still seems awfully slow. Any ideas?

Also, if anyone knows of a good PCI-e 1x SATA card, I would appreciate you letting me know. I thought my Adaptec card was 1x, but it is a 4x card, and my one 16x slot is going to go to the Supermicro 8xSATA card I am ordering. I'm still going to need at least two more SATA ports and only have PCI-e 1x and PCI slots left on the MB.

Major miscalculation on my part. Losing the 2 ports on the MB and not being able to use the Adaptec 4x card like I thought I could really hurt.
jazzysmooth Posted September 16, 2010

How many drives are attached to the Promise controller? They all have to share the bandwidth, so the more drives there are, the less each one gets. Are any of the others being written to while you write to the cache drive?
SSD Posted September 16, 2010

The bandwidth is shared only if multiple drives are being accessed at the same time. This is therefore a big performance issue for parity checks, but not typically for normal array usage.
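The shared-bandwidth point above can be put into rough numbers. This is only a sketch: it assumes the Promise TX4 sits on a conventional 32-bit, 33 MHz PCI bus (theoretical peak of about 133 MB/s, noticeably less in practice), and the function name is made up for illustration.

```python
# Sketch: theoretical per-drive share of a conventional PCI bus.
# Assumption: 32-bit, 33.33 MHz PCI; real-world throughput is lower.

PCI_BUS_WIDTH_BYTES = 4      # 32-bit bus moves 4 bytes per clock
PCI_CLOCK_MHZ = 33.33        # standard PCI clock


def per_drive_share_mb_s(active_drives: int) -> float:
    """Upper bound in MB/s per drive when `active_drives` transfer at once."""
    if active_drives < 1:
        raise ValueError("need at least one active drive")
    bus_peak = PCI_BUS_WIDTH_BYTES * PCI_CLOCK_MHZ  # about 133 MB/s
    return bus_peak / active_drives


if __name__ == "__main__":
    for n in (1, 2, 3):
        share = per_drive_share_mb_s(n)
        print(f"{n} active drive(s): at most {share:.0f} MB/s each")
```

With a single active drive the whole bus is available, which is why a lone write to the cache drive should not be limited by the other two drives on the card; a parity check, which touches all three at once, would be.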
woolooloo Posted September 16, 2010

I've got two other drives connected to that controller, but they were not being accessed at the time I did my performance testing sending to the cache drive.
woolooloo Posted September 16, 2010

It really seemed odd that the write performance to the cache drive is exactly what it is to an array drive, so I just copied another file and watched the drive statistics to make sure none of the other drives were being accessed. They weren't, just the cache drive.

I was also wondering if there is a way to bump up the NIC receive/transmit buffers on the unRAID. I recently did that on my Win7 machines and got about a 50% bump in network performance for large files (not exactly sure what it did to small files, but I'm mostly concerned about the large ones).
Rajahal Posted September 16, 2010

This is a great little 2 port PCIe x1 card that I've used in two different servers with success: SATA2 Serial ATA II PCI-Express RAID Controller Card (Silicon Image SIL3132)

Very odd results with your cache drive, I can't explain that. Maybe run SMART on the drive to see if it is faulty? You could also try copying some files directly to the cache drive via the cache share and see if the speed is the same there.
woolooloo Posted September 16, 2010

"This is a great little 2 port PCIe x1 card that I've used in two different servers with success: SATA2 Serial ATA II PCI-Express RAID Controller Card (Silicon Image SIL3132)"

Cool, just what I need and a great price, thanks for the tip!

"Very odd results with your cache drive, I can't explain that. Maybe run SMART on the drive to see if it is faulty? You could also try copying some files directly to the cache drive via the cache share and see if the speed is the same there."

That is actually how I'm testing, sending straight to the cache share. I just started up a SMART long test; I'll see if there are any issues there.
woolooloo Posted September 16, 2010

Ok, it just got much worse :'( I just got a new WD TV Live HD media player and was testing Blu-ray playback by streaming Avatar to it, and it stuttered like crazy. I searched their forums, and most people who have the problem are running on a wireless network; I'm running on a wired gigabit network (note the player is only 10/100). In any case, I did a quick test to see what kind of speeds I get reading off of my unRAID. The result - 20MB/s! That's right, I can write to the protected array and to my cache drive at 32MB/s, but I can only read at 20MB/s.

I repeated the test unRAID<--->Desktop and unRAID<--->HTPC, same results. The unRAID and HTPC share the same switch, the desktop is on its own. I can transfer Desktop<--->HTPC at > 120MB/s, so I don't think it is the network hardware!

I really don't think this is a result of my hardware upgrade from yesterday. The reason I bothered to upgrade was that the performance had been slow to begin with. Unfortunately I do not have benchmarks from before the upgrade, so I don't have numerical data, but it still "feels" as slow as before. The MB that went into the unRAID (Asus P5Q) has been in my desktop the past couple of years and has had great network performance swapping files with the HTPC.

Just found something that seems off. I'm getting ready to do some Googling, but if anyone has any ideas, I would greatly appreciate them:

Output from ifconfig eth0

    eth0      Link encap:Ethernet  HWaddr 00:22:15:14:c4:90
              inet addr:192.168.0.12  Bcast:192.168.0.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:11695978 errors:0 dropped:10731 overruns:10731 frame:10731
              TX packets:10733065 errors:0 dropped:0 overruns:0 carrier:2
              collisions:0 txqueuelen:1000
              RX bytes:880307210 (839.5 MiB)  TX bytes:1146210435 (1.0 GiB)
              Interrupt:29

Note that I did drop the RX and TX buffers on my desktop NIC back to defaults and it didn't make a difference in the speed to/from unRAID. I am not using jumbo frames, and I have not made any other changes to the NIC settings on my desktop. I haven't made any changes to the NIC on the unRAID.
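To judge how serious the dropped-packet count in the ifconfig output above is, it helps to express it as a fraction of all received packets. The sketch below does that for the RX line exactly as posted; the helper names are illustrative, and it assumes the older "name:value" ifconfig layout shown in this thread.

```python
import re

# RX line copied from the ifconfig output in this post.
RX_LINE = "RX packets:11695978 errors:0 dropped:10731 overruns:10731 frame:10731"


def parse_counters(line: str) -> dict:
    """Turn the 'name:value' pairs of an ifconfig RX/TX line into a dict of ints."""
    return {key: int(value) for key, value in re.findall(r"(\w+):(\d+)", line)}


def drop_ratio(counters: dict) -> float:
    """Fraction of received packets that were dropped."""
    return counters["dropped"] / counters["packets"]


counters = parse_counters(RX_LINE)
print(f"dropped {counters['dropped']} of {counters['packets']} packets "
      f"({drop_ratio(counters):.4%})")
```

The ratio works out to roughly 0.09% of received packets. These are receive-side counters, so on their own they say nothing about the slow transmit (read) direction.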
Spritzup Posted September 16, 2010

Can you try running a tracert on your network? Also, what's the latency on your ping times? Are you sure that everything on your network adapter is set to default (i.e., jumbo frames are turned off, etc.)? Also, try forcing the speed of your NIC to full gigabit (instead of auto-negotiate).

Finally, if you could post the output of ping and tracert, that would be great!
woolooloo Posted September 16, 2010

First, it looks like all the dropped packets were from HTPC-->unRAID. The TX buffers on that NIC were pretty high (1024); I've scaled that back to 128 and it has substantially reduced them, but not eliminated them. I do not see any dropped packets from HTPC<--->Desktop. Even with the dropped packets between those two machines, the results I am seeing are consistent with the results from my desktop.

My desktop definitely does not have jumbo frames enabled, just double-checked. My HTPC is an Intel board with an integrated Intel NIC; I don't see any settings for jumbo frames at all, unless they name it something else and nothing is jumping out at me.

Can you give me instructions on how to run a tracert? I don't see tracert or traceroute anywhere.

Ping unRAID-->HTPC:

    PING gambit (192.168.0.108) 56(84) bytes of data.
    64 bytes from 192.168.0.108: icmp_seq=1 ttl=128 time=0.322 ms
    64 bytes from 192.168.0.108: icmp_seq=2 ttl=128 time=0.255 ms
    64 bytes from 192.168.0.108: icmp_seq=3 ttl=128 time=0.267 ms
    64 bytes from 192.168.0.108: icmp_seq=4 ttl=128 time=0.212 ms
    64 bytes from 192.168.0.108: icmp_seq=5 ttl=128 time=0.219 ms
    ^C
    --- gambit ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 4036ms
    rtt min/avg/max/mdev = 0.212/0.255/0.322/0.039 ms

Ping HTPC-->unRAID:

    Pinging watchtower [192.168.0.12] with 32 bytes of data:
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64

    Ping statistics for 192.168.0.12:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 0ms, Maximum = 0ms, Average = 0ms

Ping unRAID-->Desktop:

    PING beast (192.168.0.109) 56(84) bytes of data.
    64 bytes from 192.168.0.109: icmp_seq=1 ttl=128 time=0.321 ms
    64 bytes from 192.168.0.109: icmp_seq=2 ttl=128 time=0.290 ms
    64 bytes from 192.168.0.109: icmp_seq=3 ttl=128 time=0.303 ms
    64 bytes from 192.168.0.109: icmp_seq=4 ttl=128 time=0.311 ms
    64 bytes from 192.168.0.109: icmp_seq=5 ttl=128 time=0.275 ms
    64 bytes from 192.168.0.109: icmp_seq=6 ttl=128 time=0.289 ms
    64 bytes from 192.168.0.109: icmp_seq=7 ttl=128 time=0.302 ms
    64 bytes from 192.168.0.109: icmp_seq=8 ttl=128 time=0.311 ms
    64 bytes from 192.168.0.109: icmp_seq=9 ttl=128 time=0.305 ms
    64 bytes from 192.168.0.109: icmp_seq=10 ttl=128 time=0.302 ms
    64 bytes from 192.168.0.109: icmp_seq=11 ttl=128 time=0.289 ms
    64 bytes from 192.168.0.109: icmp_seq=12 ttl=128 time=0.301 ms
    ^C
    --- beast ping statistics ---
    12 packets transmitted, 12 received, 0% packet loss, time 11098ms
    rtt min/avg/max/mdev = 0.275/0.299/0.321/0.026 ms

Ping Desktop--->unRAID:

    Pinging watchtower [192.168.0.12] with 32 bytes of data:
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.12: bytes=32 time<1ms TTL=64

    Ping statistics for 192.168.0.12:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 0ms, Maximum = 0ms, Average = 0ms

I've got to run and pick up my daughter, but I'll set the NIC to full gigabit when I get back. Thanks.
woolooloo Posted September 17, 2010

I found traceroute6 in /usr/bin, but can't seem to get it to work; it keeps telling me "unknown host" even though I can ping the host (my desktop) using the same name. I also found tracepath in there, and it is apparently somewhat equivalent. Here are the results:

    root@WatchTower:/usr/bin# tracepath -n beast
     1:  192.168.0.12      0.057ms pmtu 1500
     1:  192.168.0.109     0.307ms reached
         Resume: pmtu 1500 hops 1 back 1

I did change my desktop's NIC to 1000 full duplex instead of auto and, interestingly, my cache disk now appears to be running at full speed - 80-90MB/s. Yea! Also interestingly, changing it back to auto still has the cache disk running quickly, so I'm not sure if changing that setting somehow got things rolling or what, but that problem appears to have resolved itself.

My read speed still seems stuck at 20MB/s, regardless of auto-negotiation or manually setting it. I couldn't get tracepath to complete until I disabled the Norton firewall on my desktop, but that does not appear to have affected the read speed. I've tried reading off of the cache drive, different disks, and user shares; they all settle around 20MB/s.
Spritzup Posted September 17, 2010

If I had to hazard a guess, I would say that there is an issue with your network somewhere. For shits and giggles, try running a tracert from your desktop to your WD Live and post those results. Also, try running a tracert from your desktop to Google, and then run the same thing from your unRAID box. Finally, if it's not too much trouble, could you describe how your network is hooked up, please?

I'm a bit rusty with my networking, but I will do my best to help you. If it turns out to be an unRAID problem, I'm sure one of the gurus will pop in and send me scurrying back to my n00b hole.
woolooloo Posted September 17, 2010

I guess I'm just not sure why I can send desktop<--->HTPC at ~120MB/s even though the HTPC sits on the same switch as the unRAID. Unless it is a network issue on the unRAID itself.

First off, I'm attaching my syslog. I'm also attaching a picture of my network topology: 3 gigabit switches, one central and connected to my router, and two in other locations.

I can't think of any BIOS settings that affect the onboard NIC. I guess it could be something with the driver. I don't have anything in my go script that would change anything to do with the NIC. I normally run with blockdev --setra 2048 for all the drives, but I've tried disabling that without effect. I've disabled cache_dirs just to rule it out.

I guess I mostly don't understand how I can send to the cache drive at 80-90MB/s, but I can only read at 20MB/s. It seems like if it were a bad cable or switch or NIC, it would limit the speed in both directions...

Here is Desktop-->WD Live

    Tracing route to WDTVLIVE [192.168.0.44]
    over a maximum of 30 hops:

      1    <1 ms    <1 ms    <1 ms  WDTVLIVE [192.168.0.44]

    Trace complete.

Here is unRAID-->WD Live

    root@WatchTower:/usr/bin# tracepath 192.168.0.44
     1:  192.168.0.12 (192.168.0.12)      0.060ms pmtu 1500
     1:  192.168.0.44 (192.168.0.44)      0.417ms reached
         Resume: pmtu 1500 hops 1 back 1

Here is Desktop-->Google

    Tracing route to www.l.google.com [74.125.65.105]
    over a maximum of 30 hops:

      1    <1 ms    <1 ms    <1 ms  192.168.0.1
      2     1 ms    <1 ms    <1 ms  192.168.1.254
      3    20 ms    20 ms    20 ms  99-173-172-3.lightspeed.chrlnc.sbcglobal.net [99.173.172.3]
      4    20 ms    20 ms    20 ms  99.133.204.6
      5     *        *        *     Request timed out.
      6     *        *        *     Request timed out.
      7    20 ms    20 ms    20 ms  70.159.208.248
      8    26 ms    25 ms    25 ms  12.81.34.58
      9    26 ms    25 ms    25 ms  ixc00int-pos-6-0-0.bellsouth.net [65.83.239.67]
     10    25 ms    26 ms    25 ms  12.81.46.1
     11    26 ms    27 ms    25 ms  12.81.56.4
     12    25 ms    25 ms    25 ms  12.81.56.15
     13    32 ms    32 ms    32 ms  74.175.192.94
     14    33 ms    32 ms    33 ms  cr1.rlgnc.ip.att.net [12.123.152.10]
     15    33 ms    33 ms    33 ms  cr2.attga.ip.att.net [12.122.30.82]
     16    32 ms    32 ms    32 ms  12.123.22.129
     17     *        *        *     Request timed out.
     18    68 ms    60 ms    65 ms  72.14.233.56
     19    59 ms    60 ms    59 ms  209.85.254.249
     20    59 ms     *        *     209.85.253.209
     21    60 ms    60 ms    60 ms  gx-in-f105.1e100.net [74.125.65.105]

    Trace complete.

Here is unRAID-->Google

    root@WatchTower:/usr/bin# tracepath www.google.com
     1:  192.168.0.12 (192.168.0.12)                                    0.052ms pmtu 1500
     1:  192.168.0.1 (192.168.0.1)                                      0.269ms
     2:  192.168.1.254 (192.168.1.254)                                  1.320ms
     3:  99-173-172-3.lightspeed.chrlnc.sbcglobal.net (99.173.172.3)   23.346ms
     4:  99.133.204.6 (99.133.204.6)                                   23.550ms
     5:  no reply
     6:  no reply
     7:  70.159.208.248 (70.159.208.248)                    asymm  8   23.198ms
     8:  12.81.34.58 (12.81.34.58)                          asymm 17   29.067ms
     9:  ixc00int-pos-6-0-0.bellsouth.net (65.83.239.67)    asymm 16   29.090ms
    10:  12.81.46.1 (12.81.46.1)                            asymm 15   29.184ms
    11:  12.81.56.4 (12.81.56.4)                            asymm 14   28.744ms
    12:  12.81.56.15 (12.81.56.15)                          asymm 13   28.685ms
    13:  74.175.192.94 (74.175.192.94)                      asymm 18   35.497ms
    14:  cr1.rlgnc.ip.att.net (12.123.152.10)               asymm 18   36.546ms
    15:  cr2.attga.ip.att.net (12.122.30.82)                asymm 17   36.584ms
    16:  12.123.22.129 (12.123.22.129)                      asymm 17   48.327ms
    17:  no reply
    ...
    31:  no reply
         Too many hops: pmtu 1500
         Resume: pmtu 1500

One other test I did was an internal copy from one of the disks to the cache drive. I had to time it myself, but it seemed to be about 70MB/s, so the problem doesn't appear to be disk limited; it does appear to be network limited.

My syslog and a jpg of my network topology are attached.

syslog.txt
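The hand-timed disk-to-cache copy mentioned above can be measured more repeatably with a small script. This is a minimal sketch, assuming Python 3 is available on the machine doing the copy (it is not part of a stock unRAID install); `timed_copy` and the temporary demo files are hypothetical names used only for illustration, so substitute real source and destination paths when testing.

```python
import os
import shutil
import tempfile
import time


def timed_copy(src: str, dst: str, block_size: int = 1024 * 1024) -> float:
    """Copy src to dst in block_size chunks and return throughput in MB/s."""
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, block_size)
        fout.flush()
        os.fsync(fout.fileno())  # wait until the data is on disk, not just cached
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return os.path.getsize(dst) / 1e6 / elapsed


if __name__ == "__main__":
    # Demo on a throwaway 8 MB file; far too small for a meaningful benchmark.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "source.bin")
        dst = os.path.join(tmp, "copy.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(8 * 1024 * 1024))
        print(f"{timed_copy(src, dst):.1f} MB/s")
```

For a fair comparison with the network numbers in this thread, the test file should be several gigabytes, like the 4-8GB ISOs used elsewhere, so that caching does not dominate the result.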
Chris Pollard Posted September 17, 2010

Make sure everything is set to auto and make sure all of your cables, connectors, sockets etc. are Cat6.

Another test to rule out the network would be to connect the WD back to back with unRAID using a crossover (x-over) cable.
Spritzup Posted September 17, 2010

I'm assuming that your gigabit switches are unmanaged. I wouldn't worry about all your cables being Cat6, so long as they're good quality Cat5e (and they're not in an area of high electrical interference). Also, you're probably better off forcing everything that can run as gigabit to gigabit.

I would try switching the port that your unRAID server is connected to and see if that resolves the issue. If it doesn't, try using the same port and cable that your HTPC system uses, but connect it to your unRAID box. If that doesn't resolve the issue, then the problem lies with your network configuration and/or the network card on your unRAID box.

Isn't troubleshooting fun!?
woolooloo Posted September 17, 2010

I haven't had much time to spend on it, but I've done a couple of things - I changed the cable that connects the unRAID to the switch, and I used a different port on the switch. Doesn't seem to have helped. I've recently used the new cable on my laptop to transfer some large files and it was working properly.

I read a few other threads on here from people with poor network performance that turned out to be the cable, but in those cases it was also negotiating at a lower rate, and mine seems to be negotiating at 1000 full. I need to look at how to turn off auto-negotiation on the unRAID and then I'll do that. Still, 20MB/s is more than a 100Mbps link can carry, so it implies that the link is running at 1000Mbps, although obviously restrained somehow.

I still don't understand how only the transmit speed is slow while the receive speed is 4x faster and probably disk limited. Is there a way to increase the transmit buffers like I can on my Win7 NICs? Can I upgrade or change the driver for the NIC in unRAID?

BTW, my entire house was freshly wired 2 years ago with Cat5e. I've since run some new lines and used Cat6 for those, but in my previous house I never had problems running gigabit over Cat5e.
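The reasoning above, that 20MB/s cannot be a 100Mbps link, is simple unit arithmetic. The sketch below spells it out, assuming 1 MB = 10^6 bytes and ignoring protocol overhead (which only makes the real requirement higher); the function names are made up for illustration.

```python
# Sketch: which standard Ethernet speed is needed for a given file-copy rate.
# Assumptions: 1 MB = 10**6 bytes, 8 bits per byte, no protocol overhead.

LINK_SPEEDS_MBIT = (10, 100, 1000)


def mb_per_s_to_mbit_per_s(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * 8


def slowest_link_that_fits(mb_per_s: float) -> int:
    """Return the lowest standard link speed (Mbit/s) that can carry the rate."""
    needed = mb_per_s_to_mbit_per_s(mb_per_s)
    for speed in LINK_SPEEDS_MBIT:
        if needed <= speed:
            return speed
    raise ValueError(f"{needed:.0f} Mbit/s exceeds gigabit Ethernet")


if __name__ == "__main__":
    for rate in (20, 32, 120):  # read, write, and desktop<->HTPC rates from the thread
        print(f"{rate} MB/s = {mb_per_s_to_mbit_per_s(rate):.0f} Mbit/s, "
              f"needs a {slowest_link_that_fits(rate)} Mbit/s link")
```

A 100Mbps link tops out at 12.5MB/s, so the observed 20MB/s read speed confirms the link itself negotiated at gigabit.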
Rajahal Posted September 18, 2010

Is there any other traffic on your network? You should be performing these tests with no other network traffic.
woolooloo Posted September 18, 2010

No, I don't have any BitTorrent transfers or anything going. My wife has been at work or out during most of this, so there isn't even any web browsing going on at the time. Also, I've repeated the test probably dozens of times while testing various (hopeful) solutions and don't see any variation; it is always 18-20MB/s moving my test files (random 4-8GB ISOs). I've tried pulling from various disks, including ones from each controller (MB, PCIe, and PCI), same result. Very consistent.
woolooloo Posted September 18, 2010

Here's some new info. Reading through other threads trying to get ideas, I kept seeing TeraCopy mentioned, so I downloaded a copy, and using that I am getting read performance of ~41MB/s, double what I am seeing copying the file through Windows Explorer. Ok, so I know Windows Explorer isn't the most efficient file copy method, but I just copied a 9GB TV show off my HTPC using Windows Explorer in the same fashion and it was 80-90MB/s, even though the HTPC is currently recording and using its disk. So:

1) What accounts for the performance difference? If there is a network hardware problem, it seems like it should affect TeraCopy as well.

2) Is 41MB/s read considered reasonable? I would have thought it would be higher, at least 60. I can write to my cache drive almost twice as fast as this; it seems like I should be able to read at an equivalent speed.

And here is where it gets weird (at least to me). If I try to copy that same 9GB TV show from HTPC-->desktop, which was 80-90MB/s using Windows Explorer, it is 20MB/s using TeraCopy! If TeraCopy is supposed to increase copy performance, why is it getting < 1/4 the performance of Windows Explorer when going from Win7 to Win7?
Archived
This topic is now archived and is closed to further replies.