
[SOLVED] 4K aligned vs not



Thanks Joe... follow-up question on Parity drive, if I may.

 

Parity is a WD20EARS drive, jumperless, and was installed factory fresh into the array.  Did not preclear it prior to activating it as the parity drive.

 

Now that I'm adding 2 more jumperless WD20EARS drives, which are being precleared with the -A option as I write this, I have to rebuild my parity.  Should I take this opportunity and change the parity drive to be 4K aligned, or just leave well enough alone?

 

 

1.  When you add a pre-cleared drive to an array it does NOT have to rebuild parity. 

2.  If performance is sufficient for you, leave well enough alone.

 

You can type:

fdisk -lu /dev/sdX

(where sdX = the three letter device for your parity drive)

on your parity drive to see if the partition starts on sector 64 or sector 63.  The starting point is by default on sector 64 on the later 5.X series, and by default on sector 63 on the 4.7 and prior.
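
As a rough sketch of why the start sector matters: a 4K physical sector spans eight 512-byte logical sectors, so a partition is 4K aligned only when its start sector is a multiple of 8. The function name check_alignment is made up for this illustration; pass it the Start value that fdisk -lu reports for the partition.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): report whether a partition start
# sector, counted in 512-byte sectors, falls on a 4K boundary.
check_alignment() {
    start=$1
    # 4096 / 512 = 8, so aligned starts are multiples of 8.
    if [ $(( start % 8 )) -eq 0 ]; then
        echo "sector $start: 4K aligned"
    else
        echo "sector $start: not 4K aligned"
    fi
}

check_alignment 63
check_alignment 64
```

Sector 64 divides evenly by 8 and sector 63 does not, which is the whole difference between the two defaults.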

Link to comment
1.  When you add a pre-cleared drive to an array it does NOT have to rebuild parity.

2.  If performance is sufficient for you, leave well enough alone.

 

Whoops!  I had fumbled around and started a parity rebuild when the new drives were installed, before getting the preclear started.  I think my parity pretty much requires a full rebuild now.

 

And performance is never sufficient!  Although realistically, with eight SATA II drives (5400 rpm or better) hanging off two PCI SATA cards, I'm not sure how much more performance I can squeeze out of this beast.  But that's a topic for another thread.

 

Thanks again Joe!

Link to comment

Some Intel nics do default to a power-saving mode. Can you show us the output from "ethtool -e eth0"?

 

There is an output of that on the first page. Here it is again:

 

root@Tower3:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: umbg
        Wake-on: g
        Current message level: 0x00000007 (7)
        Link detected: yes

 

 

 

 

Link to comment

Here is the output

 

root@Tower3:/boot/scripts# ethtool -e eth0
Offset          Values
------          ------
0x0000          00 1b 21 ab 59 ca 10 02 ff ff 00 10 ff ff ff ff
0x0010          02 c8 05 35 0b 64 76 13 86 80 7c 10 86 80 84 b2
0x0020          dd 20 55 55 00 00 90 2f 00 32 12 00 20 1e 12 00
0x0030          20 1e 12 00 20 1e 12 00 20 1e 09 00 00 02 00 00
0x0040          0c 00 a6 93 0b 28 00 00 00 04 ff ff ff ff ff ff
0x0050          ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 06
0x0060          00 01 00 40 16 12 07 40 ff ff ff ff ff ff ff ff
0x0070          ff ff ff ff ff ff ff ff ff ff ff ff ff ff 45 34

 

Link to comment

It does look like that's a susceptible version (82573E-based) and power-saving is enabled.

 

No promises, but their fix - disabling default power-saving mode - means running this script against eth0:

 

#!/bin/bash
# Disables the default power-saving mode on Intel 82573 NICs by setting
# bit 0 of the EEPROM byte at offset 0x1e.

if [ -z "$1" ]; then
    echo "Usage: $0 <interface>"
    echo "       e.g. $0 eth0"
    exit 1
fi

# Bail out if the interface does not exist.
if ! ifconfig "$1" > /dev/null; then
    exit 1
fi

# Build the device/vendor identifier from the EEPROM bytes at 0x1a-0x1d.
dev=$(ethtool -e "$1" | grep 0x0010 | awk '{print "0x"$13$12$15$14}')

case $dev in
0x108b8086)
    echo "$1: is a \"82573V Gigabit Ethernet Controller\""
    ;;
0x108c8086)
    echo "$1: is a \"82573E Gigabit Ethernet Controller\""
    ;;
0x109a8086)
    echo "$1: is a \"82573L Gigabit Ethernet Controller\""
    ;;
*)
    echo "No appropriate hardware found for this fixup"
    exit 1
    ;;
esac

echo "This fixup is applicable to your hardware"

# Read the byte at offset 0x1e and force its low hex digit to be odd (bit 0 set).
var=$(ethtool -e "$1" | grep 0x0010 | awk '{print $16}')
new="${var:0:1}$(echo "${var:1}" | tr '02468ace' '13579bdf')"

# Nothing to do if the bit is already set.
if [ "$var" = "$new" ]; then
    echo "Your eeprom is up to date, no changes were made"
    exit 2
fi

echo "executing command: ethtool -E $1 magic $dev offset 0x1e value 0x$new"
ethtool -E "$1" magic "$dev" offset 0x1e value "0x$new"

echo "Change made. You *MUST* reboot your machine before changes take effect!"

 

If you haven't done this kind of thing before (pardon me if it's obvious):

 

1) copy everything in the code section above

2) open a shell

3) cd /boot

4) cat > intelscript.sh

5) paste

6) ctrl-d

7) ./intelscript.sh eth0

 

It should tell you what it's doing and that you need to reboot before testing.
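
For anyone wanting to see what value the script would write before letting it touch the EEPROM, here is a minimal dry-run sketch of just the calculation. It assumes a two-digit hex byte, as in the ethtool -e dump; set_low_bit is a name invented for this example, and it uses POSIX parameter expansion instead of the script's bash substrings, but the tr mapping is the same.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): given a two-digit hex byte,
# return the same byte with bit 0 set, without writing anything.
set_low_bit() {
    high=${1%?}   # first hex digit (all but the last character)
    low=${1#?}    # second hex digit (all but the first character)
    # Map every even hex digit to the next odd one; odd digits are unchanged.
    echo "${high}$(echo "$low" | tr '02468ace' '13579bdf')"
}

set_low_bit 84   # byte at offset 0x1e in the dump posted earlier
set_low_bit 85   # already odd, so unchanged
```

If the input and output match, the script reports that the EEPROM is already up to date and writes nothing.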

 

 

Link to comment

When I run the script I get this message:

 

root@Tower3:/boot# ./intelscript.sh eth0

No appropriate hardware found for this fixup

 

 

I ran the one command at the start of the script and this is what it showed:

 

root@Tower3:/boot# ethtool -e $1 | grep 0x0010 | awk '{print "0x"$13$12$15$14}'

Cannot get driver information: No such device

root@Tower3:/boot#

 

Link to comment

That should have been:

 

  ethtool -e eth0 | grep 0x0010 | awk '{print "0x"$13$12$15$14}'

 

Since the line wasn't run from within the script the $1 didn't contain anything, so there was no interface for ethtool to query. Make sense?
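
A tiny sketch of the same effect, using a shell function in place of the script (show_iface is just a name for this example): the first positional parameter is only set when an argument is actually passed.

```shell
#!/bin/sh
# Illustrative only: $1 inside the function behaves like $1 inside a script.
show_iface() {
    echo "interface: '$1'"
}

show_iface         # no argument, so $1 expands to nothing
show_iface eth0    # $1 is eth0
```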

 

In any case, looking at what you provided before, the card's identifier should be "0x107c8086" which is not one of those affected. I have no idea why I read 7c as 8c. That's why we have computers do this stuff.
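
To double-check that by machine rather than by eye, here is a small sketch that runs the script's own awk expression against the 0x0010 line from the dump posted earlier, with no hardware access. The is_affected helper is a hypothetical name; the three identifiers in it are the ones the script tests for.

```shell
#!/bin/sh
# The 0x0010 row copied from the ethtool -e output earlier in the thread.
line='0x0010          02 c8 05 35 0b 64 76 13 86 80 7c 10 86 80 84 b2'

# Same field order as the script: device ID bytes swapped, then vendor ID bytes swapped.
dev=$(echo "$line" | awk '{print "0x"$13$12$15$14}')

# Illustrative helper (hypothetical name): is this one of the IDs the fixup targets?
is_affected() {
    case $1 in
        0x108b8086|0x108c8086|0x109a8086) echo "yes" ;;
        *) echo "no" ;;
    esac
}

echo "identifier: $dev"
echo "affected: $(is_affected "$dev")"
```

The low half, 8086, is Intel's vendor ID; the high half is the device ID, here 107c.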

 

Time to take a fresh look at this from the start. Too many similar problems...

 

I have to take off now but will be back after dinner.

Link to comment

Reading from the top, the link errors and behavior you describe, especially the sleepy interactive (putty) response, make me think of problems like I mentioned before: Cable length, quality, where it's routed, switch ports, etc. The Intel cards try to be very smart and adapt to cable quality and conditions. The way it's negotiating Gb yet dropping connections and packets suggests to me the conditions are borderline or could be changing during the tests.

 

I know you changed the cable. How long are they? CAT 5 or what?

 

You already tried a different port at the switch. Is it a different switch from the clients you're testing from? Ten computers suggests there might be more than one involved and therefore inter-switch thangs. If so, can you move to another switch?

 

It might be a pain, but can you move the system to the location of a known-working client or server? That'd put to rest my paranoid images of 500M cables and whatever interference.

Link to comment

The system is right beside the other 2 towers that are working great. The cable is only 5 feet long or so.

 

root@Tower3:~# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on
root@Tower3:~#

 

 

root@Tower3:~# cat /proc/interrupts
          CPU0
  0:        27  IO-APIC-edge      timer
  1:          2  IO-APIC-edge      i8042
  7:          0  IO-APIC-edge      parport0
  9:          0  IO-APIC-fasteoi  acpi
12:          3  IO-APIC-edge      i8042
17:      2293  IO-APIC-fasteoi  ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
18:          4  IO-APIC-fasteoi  ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
21:  118455550  IO-APIC-fasteoi  eth0
26:  84908275  PCI-MSI-edge      ahci
NMI:          0  Non-maskable interrupts
LOC:  15159755  Local timer interrupts
SPU:          0  Spurious interrupts
PMI:          0  Performance monitoring interrupts
PND:          0  Performance pending work
RES:          0  Rescheduling interrupts
CAL:          0  Function call interrupts
TLB:          0  TLB shootdowns
TRM:          0  Thermal event interrupts
THR:          0  Threshold APIC interrupts
MCE:          0  Machine check exceptions
MCP:        506  Machine check polls
ERR:          0
MIS:          0
root@Tower3:~#

 

Link to comment

Nothing to do there. You could try disabling TSO though it's a reach.

 

  ethtool -K eth0 tso off

 

Verify it takes with the same ethtool -k eth0.
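
If you'd rather not eyeball the whole listing, a sketch like this pulls out just the TSO line. It is shown here against the output you posted so it runs without a NIC; tso_state is a made-up name, and the pattern allows for ethtool versions that print the label with dashes instead of spaces, which is an assumption about other versions rather than something seen in this thread.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): read "ethtool -k" output on stdin
# and print the state of TCP segmentation offload.
tso_state() {
    awk -F': ' '/^tcp.segmentation.offload/ {print $2}'
}

# Sample taken from the ethtool -k output posted above.
sample='Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on'

printf '%s\n' "$sample" | tso_state
```

On the live system the equivalent would be piping ethtool -k eth0 into tso_state after running the -K command; it should then print off.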

 

We haven't looked at the other systems. Are you using jumbo frames? You could enable them for e1000 with:

 

  ifconfig eth0 mtu 9000 up

 

(Using your preferred mtu in place of 9000)
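
A hedged sketch for checking that jumbo frames really pass end to end: the largest ICMP payload that fits in one unfragmented frame is the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header). max_ping_payload is a hypothetical helper, and the ping command in the comment assumes the iputils ping, where -M do forbids fragmentation; other ping builds may not accept that option.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name): largest ICMP echo payload that
# fits in a single IPv4 frame for a given MTU.
max_ping_payload() {
    # 20-byte IPv4 header + 8-byte ICMP header = 28 bytes of overhead.
    echo $(( $1 - 28 ))
}

max_ping_payload 1500
max_ping_payload 9000

# Assumed usage with iputils ping (replace <client> with a real host):
#   ping -M do -s "$(max_ping_payload 9000)" <client>
```

If every device in the path handles the larger MTU, a ping at that size should get replies; if something in between is still at 1500, it should fail while a normal ping works.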

 

If none of this helps then, short of another adapter, I'm fresh out of ideas. You could try the onboard Realtek to see how it behaves. Some 8111Es have worked for some people, though "worked" has meant better than nothing. The comparison to your existing Intel card might be useful.

Link to comment

Intel is the go-to for these things. Yours is the first I've seen act like this where there wasn't other broken hardware (cabling, switch/router) involved. Have you tried another switch/router, just to test? It's feasible the card's adaptation sees something it doesn't like and is being too smart for its own good. Or, directly connect a client as dgaschk suggested.

 

If you eliminate this I suppose I'd go for another Intel card but from another source. Tech problems often travel in lots.

Link to comment

Yes I find it strange as well since it is plugged into the same switch as the other 2 towers. I even tried changing ports on the switch.

It looks like I am going to have to go with another card.

 

You did say that power saving is on on my card. Is there any way to turn it off?

 

Link to comment

Your card isn't supposed to be affected by the power-saving problem. I suppose you could try the script we used before, disabling the identifier test to see if the ethtool -E made any difference. I haven't tried it.

 

Did you try disabling TSO from a couple messages ago?

 

At least try to eliminate that switch. There can be compatibility problems. Intel's cards can, by attempting to compensate, have problems with borderline hardware where dumber cards work fine.

Link to comment

Archived

This topic is now archived and is closed to further replies.

