Joe L. Posted August 3, 2011

> Thanks Joe... follow-up question on Parity drive, if I may. Parity is a WD20EARS drive, jumperless, and was installed factory fresh into the array. Did not preclear it prior to activating it as the parity drive. Now that I'm adding 2 more jumperless WD20EARS drives, which are being precleared with the -A option as I write this, I have to rebuild my parity. Should I take this opportunity and change the parity drive to be 4K aligned, or just leave well enough alone?

1. When you add a pre-cleared drive to an array it does NOT have to rebuild parity.
2. If performance is sufficient for you, leave well enough alone.

You can type:

    fdisk -lu /dev/sdX

(where sdX = the three letter device for your parity drive) on your parity drive to see if the partition starts on sector 64 or sector 63. The starting point is by default on sector 64 on the later 5.X series, and by default on sector 63 on the 4.7 and prior.
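
As a rough aid for reading the start sector that fdisk -lu reports: assuming 512-byte logical sectors, a partition sits on a 4K boundary when its start sector is a multiple of 8, which is why sector 64 counts as aligned and sector 63 does not. The helper name below is made up for illustration and is not part of unRAID or the preclear script.

```shell
#!/bin/bash
# Sketch: decide whether a partition start sector (counted in 512-byte
# sectors, as printed by "fdisk -lu") falls on a 4 KiB boundary.
# is_4k_aligned is a hypothetical helper name.
is_4k_aligned() {
    local start_sector=$1
    # 4096 / 512 = 8, so aligned start sectors are multiples of 8
    (( start_sector % 8 == 0 ))
}

# The two default start sectors mentioned in the post above.
for s in 63 64; do
    if is_4k_aligned "$s"; then
        echo "start sector $s: 4K-aligned"
    else
        echo "start sector $s: not 4K-aligned"
    fi
done
```

The start sector itself still has to be read off the fdisk output by hand; the sketch only covers the arithmetic.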

montery Posted August 3, 2011

> 1. When you add a pre-cleared drive to an array it does NOT have to rebuild parity.
> 2. If performance is sufficient for you, leave well enough alone.

Whoops! I had fumbled around and started a parity re-build when the new drives were installed, prior to getting the preclear started. I think my parity is pretty much requiring a full rebuild now.

And performance is never sufficient! Although realistically I've got 8 SATA II drives (5400 rpm or better) hanging off two PCI SATA cards, so I'm not sure how much more performance I can squeeze out of this beast. But that's a topic for another thread.

Thanks again Joe!

Blade (Author) Posted August 4, 2011

What is with the thread jacking? You should be starting your own thread for this.

cyrnel Posted August 4, 2011

Some Intel NICs do default to a power-saving mode. Can you show us the output from "ethtool -e eth0"?

lionelhutz Posted August 4, 2011

> What is with the thread jacking. You should be starting your own thread for this.

AGREED!

Blade (Author) Posted August 4, 2011

> Some Intel nics do default to a power-saving mode. Can you show us the output from "ethtool -e eth0"?

There is an output of that on the first page. Here it is again:

    root@Tower3:~# ethtool eth0
    Settings for eth0:
            Supported ports: [ TP ]
            Supported link modes:   10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Supports auto-negotiation: Yes
            Advertised link modes:  10baseT/Half 10baseT/Full
                                    100baseT/Half 100baseT/Full
                                    1000baseT/Full
            Advertised auto-negotiation: Yes
            Speed: 1000Mb/s
            Duplex: Full
            Port: Twisted Pair
            PHYAD: 0
            Transceiver: internal
            Auto-negotiation: on
            Supports Wake-on: umbg
            Wake-on: g
            Current message level: 0x00000007 (7)
            Link detected: yes

cyrnel Posted August 4, 2011

I think we're missing the "-e" part. Output should be a hex dump from the eeprom.

Blade (Author) Posted August 4, 2011

My bad... I will need to do that once I get home around 3:15 PM EST.

Thx

Blade (Author) Posted August 4, 2011

Here is the output:

    root@Tower3:/boot/scripts# ethtool -e eth0
    Offset          Values
    ------          ------
    0x0000          00 1b 21 ab 59 ca 10 02 ff ff 00 10 ff ff ff ff
    0x0010          02 c8 05 35 0b 64 76 13 86 80 7c 10 86 80 84 b2
    0x0020          dd 20 55 55 00 00 90 2f 00 32 12 00 20 1e 12 00
    0x0030          20 1e 12 00 20 1e 12 00 20 1e 09 00 00 02 00 00
    0x0040          0c 00 a6 93 0b 28 00 00 00 04 ff ff ff ff ff ff
    0x0050          ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 06
    0x0060          00 01 00 40 16 12 07 40 ff ff ff ff ff ff ff ff
    0x0070          ff ff ff ff ff ff ff ff ff ff ff ff ff ff 45 34

cyrnel Posted August 4, 2011

It does look like that's a susceptible version (82573E-based) and power-saving is enabled. No promises, but their fix - disabling default power-saving mode - means running this script against eth0:

    #!/bin/bash

    # Require the interface name as the only argument.
    if [ -z "$1" ]; then
        echo "Usage: $0 <interface>"
        echo "       i.e. $0 eth0"
        exit 1
    fi

    # Bail out if the interface does not exist.
    if ! ifconfig $1 > /dev/null; then
        exit 1
    fi

    # Build the device/vendor identifier from the EEPROM row at offset 0x0010.
    dev=$(ethtool -e $1 | grep 0x0010 | awk '{print "0x"$13$12$15$14}')

    case $dev in
        0x108b8086)
            echo "$1: is a \"82573V Gigabit Ethernet Controller\""
            ;;
        0x108c8086)
            echo "$1: is a \"82573E Gigabit Ethernet Controller\""
            ;;
        0x109a8086)
            echo "$1: is a \"82573L Gigabit Ethernet Controller\""
            ;;
        *)
            echo "No appropriate hardware found for this fixup"
            exit 1
            ;;
    esac

    echo "This fixup is applicable to your hardware"

    # Read the byte at offset 0x1e and turn an even low digit into the next odd one.
    var=$(ethtool -e $1 | grep 0x0010 | awk '{print $16}')
    new=$(echo ${var:0:1}`echo ${var:1} | tr '02468ace' '13579bdf'`)
    if [ ${var:0:1}${var:1} == $new ]; then
        echo "Your eeprom is up to date, no changes were made"
        exit 2
    fi

    echo "executing command: ethtool -E $1 magic $dev offset 0x1e value 0x$new"
    ethtool -E $1 magic $dev offset 0x1e value 0x$new

    echo "Change made. You *MUST* reboot your machine before changes take effect!"

If you haven't done this kind of thing (or, pardon if it's obvious):

1) copy everything in the code section above
2) open a shell
3) cd /boot
4) cat > intelscript.sh
5) paste
6) ctrl-d
7) ./intelscript.sh eth0

It should tell you what it's doing and that you need to reboot before testing.
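
To make the value computation in the script above easier to follow, here is a small sketch of just that step. It does not touch any hardware. The function name set_low_bit is made up for illustration; the tr mapping is the one from the script, and as far as I can tell it amounts to setting the lowest bit of the byte at offset 0x1e.

```shell
#!/bin/bash
# Sketch of the value computation used by the fixup script above.
# set_low_bit is a hypothetical helper name; it takes a two-digit hex byte.
set_low_bit() {
    local var=$1
    # Keep the first hex digit, map an even second digit to the next odd
    # digit (the same result as OR-ing the byte with 0x01).
    echo "${var:0:1}$(echo "${var:1}" | tr '02468ace' '13579bdf')"
}

set_low_bit 84   # the byte at offset 0x1e in the dump posted earlier
set_low_bit 85   # a byte whose low bit is already set stays the same
```

When the input and output match, the script reports that the eeprom is up to date and exits without writing anything.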

dgaschk Posted August 4, 2011

You may need "chmod +x intelscript.sh" between steps 6 and 7.

Blade (Author) Posted August 5, 2011

Thank you. I will run this once my 2 drives are done pre-clearing.

Blade (Author) Posted August 6, 2011

When I run the script I get this message:

    root@Tower3:/boot# ./intelscript.sh eth0
    No appropriate hardware found for this fixup

I ran the one command at the start of the script and this is what it showed:

    root@Tower3:/boot# ethtool -e $1 | grep 0x0010 | awk '{print "0x"$13$12$15$14}'
    Cannot get driver information: No such device
    root@Tower3:/boot#

cyrnel Posted August 6, 2011

That should have been:

    ethtool -e eth0 | grep 0x0010 | awk '{print "0x"$13$12$15$14}'

Since the line wasn't run from within the script the $1 didn't contain anything, so there was no interface for ethtool to query. Make sense?

In any case, looking at what you provided before, the card's identifier should be "0x107c8086" which is not one of those affected. I have no idea why I read 7c as 8c. That's why we have computers do this stuff.

Time to take a fresh look at this from the start. Too many similar problems... I have to take off now but will be back after dinner.
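
For anyone who wants to check the identifier without the card in hand, the same awk expression can be run against the 0x0010 row from the dump posted earlier. This is only a sketch: the row is pasted in as a string, and eeprom_id is a made-up function name.

```shell
#!/bin/bash
# The 0x0010 row copied from the "ethtool -e eth0" output posted earlier.
row='0x0010 02 c8 05 35 0b 64 76 13 86 80 7c 10 86 80 84 b2'

# Same awk expression as the script. Field 1 is the offset column, so
# fields 12-13 and 14-15 are the byte pairs at offsets 0x1a-0x1b and
# 0x1c-0x1d; each pair is printed high byte first.
# eeprom_id is a hypothetical helper name.
eeprom_id() {
    awk '{print "0x"$13$12$15$14}'
}

echo "$row" | eeprom_id
```

On the live system the equivalent would be ethtool -e eth0 | grep 0x0010 piped into the same expression, as in the corrected command above.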

Blade (Author) Posted August 6, 2011

I appreciate it. This NIC is killing me. It is acting very slow.

cyrnel Posted August 6, 2011

Reading from the top, the link errors and behavior you describe, especially the sleepy interactive (putty) response, make me think of problems like I mentioned before: Cable length, quality, where it's routed, switch ports, etc.

The Intel cards try to be very smart and adapt to cable quality and conditions. The way it's negotiating Gb yet dropping connections and packets suggests to me the conditions are borderline or could be changing during the tests.

I know you changed the cable. How long are they? CAT 5 or what?

You already tried a different port at the switch. Is it a different switch from the clients you're testing from? Ten computers suggests there might be more than one involved and therefore inter-switch thangs. If so, can you move to another switch?

It might be a pain, but can you move the system to the location of a known-working client or server? That'd put to rest my paranoid images of 500M cables and whatever interference.

dgaschk Posted August 6, 2011

Hook up the server directly to a client. Enable DHCP on the client.

cyrnel Posted August 6, 2011

Getting my head out of it a bit, can you show us:

    ethtool -k eth0

and

    cat /proc/interrupts

Blade (Author) Posted August 6, 2011

The system is right beside the other 2 towers that are working great. The cable is only 5 feet long or so.

    root@Tower3:~# ethtool -k eth0
    Offload parameters for eth0:
    rx-checksumming: on
    tx-checksumming: on
    scatter-gather: on
    tcp segmentation offload: on
    udp fragmentation offload: off
    generic segmentation offload: on
    root@Tower3:~#

    root@Tower3:~# cat /proc/interrupts
               CPU0
      0:         27   IO-APIC-edge      timer
      1:          2   IO-APIC-edge      i8042
      7:          0   IO-APIC-edge      parport0
      9:          0   IO-APIC-fasteoi   acpi
     12:          3   IO-APIC-edge      i8042
     17:       2293   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
     18:          4   IO-APIC-fasteoi   ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
     21:  118455550   IO-APIC-fasteoi   eth0
     26:   84908275   PCI-MSI-edge      ahci
    NMI:          0   Non-maskable interrupts
    LOC:   15159755   Local timer interrupts
    SPU:          0   Spurious interrupts
    PMI:          0   Performance monitoring interrupts
    PND:          0   Performance pending work
    RES:          0   Rescheduling interrupts
    CAL:          0   Function call interrupts
    TLB:          0   TLB shootdowns
    TRM:          0   Thermal event interrupts
    THR:          0   Threshold APIC interrupts
    MCE:          0   Machine check exceptions
    MCP:        506   Machine check polls
    ERR:          0
    MIS:          0
    root@Tower3:~#

cyrnel Posted August 6, 2011

Nothing to do there. You could try disabling TSO though it's a reach.

    ethtool -K eth0 tso off

Verify it takes with the same ethtool -k eth0.

We haven't looked at the other systems. Are you using jumbo frames? You could enable them for e1000 with:

    ifconfig eth0 mtu 9000 up

(Using your preferred mtu in place of 9000)

If none of this helps then, short of another adapter, I'm fresh out of ideas.

You could try the onboard Realtek to see how it behaves. Some 8111Es have worked for some people, though "worked" has meant better than nothing. The comparison to your existing Intel card might be useful.
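
The "verify it takes" step can be sketched as a small filter over the ethtool -k output. The function name tso_state is made up for illustration. The pattern accepts the spaced label shown in this thread and also a hyphenated spelling; the hyphenated form is an assumption about other ethtool versions, not something seen here.

```shell
#!/bin/bash
# Sketch: read "ethtool -k <iface>" output on stdin and print the TSO state.
# tso_state is a hypothetical helper name.
tso_state() {
    awk -F': *' '/^tcp[- ]segmentation[- ]offload:/ {print $2; exit}'
}

# Sample copied from the ethtool -k output posted earlier in the thread.
sample='Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on'

echo "$sample" | tso_state
```

On the server itself the input would come from ethtool -k eth0 instead of the sample text, run once before and once after ethtool -K eth0 tso off.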

Blade (Author) Posted August 6, 2011

I am not using jumbo frames on any other machine. The onboard Realtek was causing unRAID to hang after a while. I have no issue with buying a different card. Can you tell me which one I should get?

cyrnel Posted August 6, 2011

Intel is the go-to for these things. Yours is the first I've seen act like this where there wasn't other broken hardware involved (cabling, switch/router).

Have you tried another switch/router, just to test? It's feasible the card's adaptation sees something it doesn't like and is being too smart for your own good. Or, directly connect a client as dgaschk suggested.

If you eliminate this I suppose I'd go for another Intel card but from another source. Tech problems often travel in lots.

Blade (Author) Posted August 6, 2011

Yes, I find it strange as well since it is plugged into the same switch as the other 2 towers. I even tried changing ports on the switch. It looks like I am going to have to go with another card.

You did say that power saving is on on my card. Is there any way to turn it off?

cyrnel Posted August 7, 2011

Your card isn't supposed to be affected by the power-saving problem. I suppose you could try the script we used before, disabling the identifier test to see if the ethtool -E made any difference. I haven't tried it.

Did you try disabling TSO from a couple messages ago?

At least try to eliminate that switch. There can be compatibility problems. Intel's cards can, by attempting to compensate, have problems with borderline hardware where dumber cards work fine.

Blade (Author) Posted August 7, 2011

OK I just disabled the TSO and I will see what happens.
Archived
This topic is now archived and is closed to further replies.