[SOLVED] 4K aligned vs not

Joe L. · August 3, 2011

Thanks Joe... follow-up question on Parity drive, if I may.

Parity is a WD20EARS drive, jumperless, and was installed factory fresh into the array. Did not preclear it prior to activating it as the parity drive.

Now that I'm adding 2 more jumperless WD20EARS drives, which are being precleared with the -A option as I write this, I have to rebuild my parity. Should I take this opportunity and change the parity drive to be 4K aligned, or just leave well enough alone?

1. When you add a pre-cleared drive to an array it does NOT have to rebuild parity.

2. If performance is sufficient for you, leave well enough alone.

You can type:

fdisk -lu /dev/sdX

(where sdX = the three letter device for your parity drive)

on your parity drive to see if the partition starts on sector 64 or sector 63. The starting point is by default on sector 64 on the later 5.X series, and by default on sector 63 on the 4.7 and prior.
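
As a rough illustration of what that check tells you: fdisk -lu reports the start in 512-byte sectors, so a partition begins on a 4K boundary only when its start sector is a multiple of 8. A minimal sketch, with made-up helper names that are not part of unRAID or fdisk:

```shell
#!/bin/bash
# Hypothetical helpers for illustration only.

# Succeeds when a start sector (512-byte units) lies on a 4096-byte boundary.
is_4k_aligned() {
    [ $(( $1 % 8 )) -eq 0 ]
}

describe_start() {
    if is_4k_aligned "$1"; then
        echo "start sector $1: 4K aligned"
    else
        echo "start sector $1: not 4K aligned"
    fi
}

describe_start 63
describe_start 64
```

Feed it the Start value that fdisk -lu prints for the partition.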

montery · August 3, 2011

1. When you add a pre-cleared drive to an array it does NOT have to rebuild parity.
2. If performance is sufficient for you, leave well enough alone.

Whoops! I had fumbled around and started a parity rebuild when the new drives were installed, before getting the preclear started. I think my parity pretty much requires a full rebuild now.

And performance is never sufficient! Although realistically, with eight 5400 rpm (or better) SATA II drives hanging off two PCI SATA cards, I'm not sure how much more performance I can squeeze out of this beast. But that's a topic for another thread.

Thanks again Joe!

Blade · August 4, 2011

What is with the thread jacking.

You should be starting your own thread for this.

cyrnel · August 4, 2011

Some Intel NICs do default to a power-saving mode. Can you show us the output from "ethtool -e eth0"?

lionelhutz · August 4, 2011

What is with the thread jacking.

You should be starting your own thread for this.

AGREED!

Blade · August 4, 2011

Some Intel NICs do default to a power-saving mode. Can you show us the output from "ethtool -e eth0"?

There is an output of that on the first page. Here it is again:

root@Tower3:~# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: umbg
        Wake-on: g
        Current message level: 0x00000007 (7)
        Link detected: yes

cyrnel · August 4, 2011

I think we're missing the "-e" part. Output should be a hex dump from the eeprom.

Blade · August 4, 2011

My bad... I will need to do that once I get home around 3:15 PM EST

Thx

Blade · August 4, 2011

Here is the output

root@Tower3:/boot/scripts# ethtool -e eth0
Offset  Values
------  ------
0x0000  00 1b 21 ab 59 ca 10 02 ff ff 00 10 ff ff ff ff
0x0010  02 c8 05 35 0b 64 76 13 86 80 7c 10 86 80 84 b2
0x0020  dd 20 55 55 00 00 90 2f 00 32 12 00 20 1e 12 00
0x0030  20 1e 12 00 20 1e 12 00 20 1e 09 00 00 02 00 00
0x0040  0c 00 a6 93 0b 28 00 00 00 04 ff ff ff ff ff ff
0x0050  ff ff ff ff ff ff ff ff ff ff ff ff ff ff 02 06
0x0060  00 01 00 40 16 12 07 40 ff ff ff ff ff ff ff ff
0x0070  ff ff ff ff ff ff ff ff ff ff ff ff ff ff 45 34

cyrnel · August 4, 2011

It does look like that's a susceptible version (82573E-based) and power-saving is enabled.

No promises, but their fix - disabling default power-saving mode - means running this script against eth0:

#!/bin/bash

# Require an interface name as the only argument.
if [ -z "$1" ]; then
    echo "Usage: $0 <interface>"
    echo "       i.e. $0 eth0"
    exit 1
fi

# Bail out if the interface does not exist.
if ! ifconfig "$1" > /dev/null; then
    exit 1
fi

# Build "0x<device id><vendor id>" from the EEPROM dump line at offset 0x0010.
dev=$(ethtool -e "$1" | grep 0x0010 | awk '{print "0x"$13$12$15$14}')

case $dev in
    0x108b8086)
        echo "$1: is a \"82573V Gigabit Ethernet Controller\""
        ;;
    0x108c8086)
        echo "$1: is a \"82573E Gigabit Ethernet Controller\""
        ;;
    0x109a8086)
        echo "$1: is a \"82573L Gigabit Ethernet Controller\""
        ;;
    *)
        echo "No appropriate hardware found for this fixup"
        exit 1
        ;;
esac

echo "This fixup is applicable to your hardware"

# Read the byte at offset 0x1e and force its lowest bit to 1.
var=$(ethtool -e "$1" | grep 0x0010 | awk '{print $16}')
new="${var:0:1}$(echo "${var:1}" | tr '02468ace' '13579bdf')"

if [ "$var" == "$new" ]; then
    echo "Your eeprom is up to date, no changes were made"
    exit 2
fi

echo "executing command: ethtool -E $1 magic $dev offset 0x1e value 0x$new"
ethtool -E "$1" magic "$dev" offset 0x1e value "0x$new"

echo "Change made. You *MUST* reboot your machine before changes take effect!"

If you haven't done this kind of thing before (pardon me if it's obvious):

1) copy everything in the code section above

2) open a shell

3) cd /boot

4) cat > intelscript.sh

5) paste

6) ctrl-d

7) ./intelscript.sh eth0

It should tell you what it's doing and that you need to reboot before testing.

dgaschk · August 4, 2011

You may need "chmod +x intelscript.sh" between steps 6 and 7.
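
The same sequence can be tried harmlessly with a throwaway script first. A minimal sketch, assuming a scratch directory from mktemp and a made-up hello.sh standing in for /boot and intelscript.sh:

```shell
#!/bin/bash
# The scratch directory and hello.sh are placeholders, not the real script.
dir=$(mktemp -d)

# Same idea as "cat > file", paste, ctrl-d, but non-interactive.
cat > "$dir/hello.sh" <<'EOF'
#!/bin/sh
echo "called with interface: $1"
EOF

# Set the execute bit so the file can be run directly by path.
chmod +x "$dir/hello.sh"

result=$("$dir/hello.sh" eth0)
echo "$result"
rm -rf "$dir"
```

If the chmod is skipped on an ordinary filesystem, running the file by path normally fails with "Permission denied".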

Blade · August 5, 2011

Thank you.

I will run this once my 2 drives are done pre-clearing

Blade · August 6, 2011

When I run the script I get this message:

root@Tower3:/boot# ./intelscript.sh eth0

No appropriate hardware found for this fixup

I ran the one command at the start of the script and this is what it showed:

root@Tower3:/boot# ethtool -e $1 | grep 0x0010 | awk '{print "0x"$13$12$15$14}'

Cannot get driver information: No such device

root@Tower3:/boot#

cyrnel · August 6, 2011

That should have been:

ethtool -e eth0 | grep 0x0010 | awk '{print "0x"$13$12$15$14}'

Since the line wasn't run from within the script the $1 didn't contain anything, so there was no interface for ethtool to query. Make sense?
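
A small sketch of the difference, using a made-up function (inside a function, $1 behaves the way it does inside a script):

```shell
#!/bin/bash
# show_iface is a hypothetical name used only for this demonstration.
show_iface() {
    echo "interface argument: '${1}'"
}

show_iface eth0    # argument supplied, as in "./intelscript.sh eth0"
show_iface         # no argument: $1 expands to an empty string
```

At an interactive prompt with no positional parameters set, $1 is empty in the same way, so ethtool is handed no interface name.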

In any case, looking at what you provided before, the card's identifier should be "0x107c8086" which is not one of those affected. I have no idea why I read 7c as 8c. That's why we have computers do this stuff.
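
For anyone following along, the identifier can be worked out from the dump posted earlier without touching the card. A sketch that runs the script's own awk expression against that 0x0010 line (the variable names are just for this example):

```shell
#!/bin/bash
# The 0x0010 line exactly as posted earlier in this thread.
line='0x0010 02 c8 05 35 0b 64 76 13 86 80 7c 10 86 80 84 b2'

# Field 1 is the offset column, so fields 12-15 are bytes 10-13 of the row.
# Each byte pair is swapped: device ID first, then vendor ID (8086 is Intel).
dev=$(echo "$line" | awk '{print "0x"$13$12$15$14}')

echo "$dev"
```

This prints 0x107c8086, which is not one of the three values in the script's case statement.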

Time to take a fresh look at this from the start. Too many similar problems...

I have to take off now but will be back after dinner.

Blade · August 6, 2011

I appreciate it.

This NIC is killing me. It is acting very slow.

cyrnel · August 6, 2011

Reading from the top, the link errors and behavior you describe, especially the sleepy interactive (putty) response, make me think of problems like I mentioned before: Cable length, quality, where it's routed, switch ports, etc. The Intel cards try to be very smart and adapt to cable quality and conditions. The way it's negotiating Gb yet dropping connections and packets suggests to me the conditions are borderline or could be changing during the tests.

I know you changed the cable. How long are they? CAT 5 or what?

You already tried a different port at the switch. Is it a different switch from the clients you're testing from? Ten computers suggests there might be more than one involved and therefore inter-switch thangs. If so, can you move to another switch?

It might be a pain, but can you move the system to the location of a known-working client or server? That'd put to rest my paranoid images of 500M cables and whatever interference.

dgaschk · August 6, 2011

Hook up the server directly to a client. Enable DHCP on the client.

cyrnel · August 6, 2011

Getting my head out of it a bit, can you show us:

ethtool -k eth0

and

cat /proc/interrupts

Blade · August 6, 2011

The system is right beside the other 2 towers that are working great. The cable is only 5 feet long or so.

root@Tower3:~# ethtool -k eth0
Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on
root@Tower3:~#

root@Tower3:~# cat /proc/interrupts
           CPU0
  0:         27   IO-APIC-edge      timer
  1:          2   IO-APIC-edge      i8042
  7:          0   IO-APIC-edge      parport0
  9:          0   IO-APIC-fasteoi   acpi
 12:          3   IO-APIC-edge      i8042
 17:       2293   IO-APIC-fasteoi   ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
 18:          4   IO-APIC-fasteoi   ohci_hcd:usb4, ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
 21:  118455550   IO-APIC-fasteoi   eth0
 26:   84908275   PCI-MSI-edge      ahci
NMI:          0   Non-maskable interrupts
LOC:   15159755   Local timer interrupts
SPU:          0   Spurious interrupts
PMI:          0   Performance monitoring interrupts
PND:          0   Performance pending work
RES:          0   Rescheduling interrupts
CAL:          0   Function call interrupts
TLB:          0   TLB shootdowns
TRM:          0   Thermal event interrupts
THR:          0   Threshold APIC interrupts
MCE:          0   Machine check exceptions
MCP:        506   Machine check polls
ERR:          0
MIS:          0
root@Tower3:~#
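
In the output above eth0 has IRQ 21 to itself. A sketch for pulling just that line out, using a made-up helper that reads /proc/interrupts-style text on stdin (it only matches a device that is the last field on its line and only prints the first CPU column):

```shell
#!/bin/bash
# irq_for is a hypothetical helper, not a standard tool.
irq_for() {
    awk -v dev="$1" '$NF == dev { sub(/:$/, "", $1); print "irq " $1 " count " $2 }'
}

# Three lines copied from the output posted in this thread.
sample=' 17: 2293 IO-APIC-fasteoi ehci_hcd:usb1, ehci_hcd:usb2, ehci_hcd:usb3
 21: 118455550 IO-APIC-fasteoi eth0
 26: 84908275 PCI-MSI-edge ahci'

echo "$sample" | irq_for eth0
# On a live system: irq_for eth0 < /proc/interrupts
```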

cyrnel · August 6, 2011

Nothing to do there. You could try disabling TSO though it's a reach.

ethtool -K eth0 tso off

Verify it takes with the same ethtool -k eth0.
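
To make the verification less eyeball-dependent, a sketch that filters the "ethtool -k" output down to the TSO line. The helper name is made up, and the pattern is an assumption meant to cover both the spelling shown in this thread and the hyphenated one some ethtool versions print:

```shell
#!/bin/bash
# tso_state is a hypothetical helper: reads "ethtool -k" output on stdin, prints on/off.
tso_state() {
    awk -F': *' '/tcp[- ]segmentation[- ]offload:/ { print $2 }'
}

# Sample copied from the output posted in this thread (before the change).
sample='Offload parameters for eth0:
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: on'

echo "$sample" | tso_state
# On a live system: ethtool -k eth0 | tso_state
```

After "ethtool -K eth0 tso off" has taken effect, the live command should print off.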

We haven't looked at the other systems. Are you using jumbo frames? You could enable it for e1000 with:

ifconfig eth0 mtu 9000 up

(Using your preferred mtu in place of 9000)

If none of this helps then, short of another adapter, I'm fresh out of ideas. You could try the onboard Realtek to see how it behaves. Some 8111Es have worked for some people, though "worked" has meant better than nothing. The comparison to your existing Intel card might be useful.

Blade · August 6, 2011

I am not using jumbo frames on any other machine.

The onboard Realtek was causing unRAID to hang after a while.

I have no issue with buying a different card. Can you tell me which one I should get?

cyrnel · August 6, 2011

Intel is the go-to for these things. Yours is the first I've seen act like this where there wasn't other broken hardware involved (cabling, switch/router). Have you tried another switch/router, just to test? It's feasible the card's adaptation sees something it doesn't like and is being too smart for its own good. Or, directly connect a client as dgaschk suggested.

If you eliminate this I suppose I'd go for another Intel card but from another source. Tech problems often travel in lots.

Blade · August 6, 2011

Yes I find it strange as well since it is plugged into the same switch as the other 2 towers. I even tried changing ports on the switch.

It looks like I am going to have to go with another card.

You did say that power saving is on for my card. Is there any way to turn it off?

cyrnel · August 7, 2011

Your card isn't supposed to be affected by the power-saving problem. I suppose you could try the script we used before, disabling the identifier test to see if the ethtool -E made any difference. I haven't tried it.

Did you try disabling TSO from a couple messages ago?

At least try to eliminate that switch. There can be compatibility problems. Intel's cards can, by attempting to compensate, have problems with borderline hardware where dumber cards work fine.
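
Before bypassing the identifier test, it may help to see what the script's last step would actually write. A dry-run sketch against the 0x0010 line posted earlier; it never calls ethtool -E, and whether the change would help this card is untested:

```shell
#!/bin/bash
# Dry run only: nothing here touches the EEPROM.
line='0x0010 02 c8 05 35 0b 64 76 13 86 80 7c 10 86 80 84 b2'

# Field 16 is the byte at offset 0x1e (0x10 + 14).
var=$(echo "$line" | awk '{print $16}')

# Keep the high nibble and force the low nibble odd (sets bit 0),
# which is what the script's tr call does.
high=${var%?}
low=$(echo "${var#?}" | tr '02468ace' '13579bdf')
new="$high$low"

echo "offset 0x1e: current 0x$var, script would write 0x$new"
```

For this dump the current byte is 0x84 and the computed value is 0x85.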

Blade · August 7, 2011

OK I just disabled the TSO and I will see what happens.
