Was: Re: Positive feedback on performance (RC6-test2)


ix400

Hi,

 

Unfortunately, I have problems with my unRAID server. I have an 11-disc array based on a Supermicro C2SEE and a SASLP-MV8. It seems that the server is no longer fast enough, resulting in XBMC buffering events. The last disc I added seems to be especially affected. However, that disc shows no SMART errors (it's a 20EARS).

 

I also have the feeling that the buffering only occurs in the first minutes of a movie.

 

What can I do? I checked the cables, connected the HDD to a different port, tried older release candidates (right now I'm back on RC6), and upgraded to 4 GB of RAM, but nothing helped.

 

Can anyone help me?

 

Chris

Link to comment

You did not mention the version of unRAID you are running (there are two "rc6" versions), nor did you supply a system log for analysis.

 

Without those, nobody can offer much guidance.  I have the same C2SEE in my newer unRAID server, and it is plenty fast enough. In fact, I have an old Intel Celeron on a PCI-based Intel motherboard that has been in service for about 5 years now, and it is plenty fast enough for XBMC.  Something else is involved, and it is likely networking related.  What specific switches/routers are involved?  Do you use factory-made cables, or do you attach the ends yourself?

 

Is there a wireless link involved?  Are you connecting at 100 Mb/s or 1000 Mb/s?  Are there errors on the LAN?

These commands:

ifconfig eth0

ethtool eth0

will give some clues.

 

When you installed the EARS disk, did you partition it to start the partition on sector 63, or sector 64?

Did you install a hardware jumper on the drive?

 

This will tell you how it is partitioned:

fdisk -lu /dev/sdX

(where sdX = the three letter drive designation)
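
The reason the starting sector matters can be shown with a little arithmetic. This is only a sketch: it assumes the EARS drive uses 4096-byte physical sectors behind 512-byte logical sectors, and the function name is made up for illustration.

```python
# Sketch: check whether a partition's starting sector lines up with
# 4 KiB physical sectors. Both sector sizes are assumptions about an
# Advanced Format drive, not values read from this server.
LOGICAL_SECTOR_BYTES = 512
PHYSICAL_SECTOR_BYTES = 4096


def is_aligned(start_sector: int) -> bool:
    """Return True if the partition's byte offset is a multiple of 4096."""
    return (start_sector * LOGICAL_SECTOR_BYTES) % PHYSICAL_SECTOR_BYTES == 0


for start in (63, 64):
    offset = start * LOGICAL_SECTOR_BYTES
    print(f"start sector {start}: byte offset {offset}, aligned={is_aligned(start)}")
```

Under those assumptions a start on sector 64 is aligned and a start on sector 63 is not, which is why the fdisk output is worth checking.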

 

 

Link to comment

Hi Joe,

 

I'm currently running version 5.0-rc6-r8168-test. However, the problem remained when I rolled back to rc3 and beta14. Hence, it does not seem to be directly related to the unRAID version.

 

I forgot to mention that the files (BD rips) play nicely when I use my iMac as the server, which is connected to the same hub as my unRAID server. I also forgot to mention that I can play very high bit rate files from my unRAID server smoothly (no buffering at all), but these files are not located on the problematic disc in the array.

 

Here are some more details:

 

root@Neon:~# ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:30:48:b2:0f:e6 

          inet addr:192.168.1.120  Bcast:192.168.1.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:1527363 errors:0 dropped:7 overruns:0 frame:0

          TX packets:1527042 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:100914677 (96.2 MiB)  TX bytes:157228331 (149.9 MiB)

          Interrupt:43 Base address:0xa000

 

root@Neon:~# ethtool eth0

Settings for eth0:

Supported ports: [ TP ]

Supported link modes:  10baseT/Half 10baseT/Full

                        100baseT/Half 100baseT/Full

                        1000baseT/Full

Supports auto-negotiation: Yes

Advertised link modes:  10baseT/Half 10baseT/Full

                        100baseT/Half 100baseT/Full

                        1000baseT/Full

Advertised pause frame use: Symmetric Receive-only

Advertised auto-negotiation: Yes

Speed: 1000Mb/s

Duplex: Full

Port: Twisted Pair

PHYAD: 0

Transceiver: internal

Auto-negotiation: on

MDI-X: Unknown

Supports Wake-on: pumbg

Wake-on: g

Current message level: 0x00000033 (51)

Link detected: yes

 

 

I will provide the fdisk -lu /dev/sdX information later, since I'm running a filesystem check right now and I don't want to disturb this process:

 

root@Neon:~# reiserfsck /dev/md10

reiserfsck 3.6.21 (2009 www.namesys.com)

 

*************************************************************

** If you are using the latest reiserfsprogs and  it fails **

** please  email bug reports to [email protected], **

** providing  as  much  information  as  possible --  your **

** hardware,  kernel,  patches,  settings,  all reiserfsck **

** messages  (including version),  the reiserfsck logfile, **

** check  the  syslog file  for  any  related information. **

** If you would like advice on using this program, support **

** is available  for $25 at  www.namesys.com/support.html. **

*************************************************************

 

Will read-only check consistency of the filesystem on /dev/md10

Will put log info to 'stdout'

 

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

###########

reiserfsck --check started at Sun Aug 12 13:55:57 2012

###########

Replaying journal: Done.

Reiserfs journal '/dev/md10' in blocks [18..8211]: 0 transactions replayed

Checking internal tree.. \/  8 (of  11-/ 83 (of 135//156 (of 170/

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (877) to the block (488378624)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (878) to the block (488378625)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (879) to the block (488378626)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (880) to the block (488378627)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (881) to the block (488378628)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (882) to the block (488378629)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (883) to the block (488378630)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (884) to the block (488378631)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (885) to the block (488378632)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (886) to the block (488378633)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (887) to the block (488378634)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (888) to the block (488378635)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (889) to the block (488378636)

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (890) to the block (488378637)

finished                             

Comparing bitmaps..finished

Checking Semantic tree:

 

 

 

Thanks for your help,

 

Chris

Link to comment

Hi,

 

I fixed the errors with the following command:

 

 

reiserfsck --fix-fixable /dev/md10

reiserfsck 3.6.21 (2009 www.namesys.com)

 

*************************************************************

** If you are using the latest reiserfsprogs and  it fails **

** please  email bug reports to [email protected], **

** providing  as  much  information  as  possible --  your **

** hardware,  kernel,  patches,  settings,  all reiserfsck **

** messages  (including version),  the reiserfsck logfile, **

** check  the  syslog file  for  any  related information. **

** If you would like advice on using this program, support **

** is available  for $25 at  www.namesys.com/support.html. **

*************************************************************

 

Will check consistency of the filesystem on /dev/md10

and will fix what can be fixed without --rebuild-tree

Will put log info to 'stdout'

 

Do you want to run this program?[N/Yes] (note need to type Yes if you do):Yes

###########

reiserfsck --fix-fixable started at Sun Aug 12 16:05:53 2012

###########

Replaying journal: Done.

Reiserfs journal '/dev/md10' in blocks [18..8211]: 0 transactions replayed

Checking internal tree.. \/  8 (of  11-/ 83 (of 135//156 (of 170/

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (877) to the block (488378624) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (878) to the block (488378625) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (879) to the block (488378626) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (880) to the block (488378627) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (881) to the block (488378628) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (882) to the block (488378629) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (883) to the block (488378630) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (884) to the block (488378631) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (885) to the block (488378632) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (886) to the block (488378633) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (887) to the block (488378634) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (888) to the block (488378635) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (889) to the block (488378636) - zeroed

bad_indirect_item: block 309821473: The item [256 310 0x549ae001 IND (1)] has the bad pointer (890) to the block (488378637) - zeroed

finished                             

Comparing bitmaps..finished

Checking Semantic tree:

finished                                                                     

No corruptions found

There are on the filesystem:

Leaves 290442

Internal nodes 1750

Directories 260

Other files 310

Data block pointers 293862538 (0 of them are zero)

Safe links 1

###########

reiserfsck finished at Sun Aug 12 17:17:13 2012

 

... it seems that this has helped. I observe less stuttering now.

 

Here's the info on the block assignment:

 

 

root@Neon:~# fdisk -lu /dev/sdd

 

WARNING: GPT (GUID Partition Table) detected on '/dev/sdd'! The util fdisk doesn't support GPT. Use GNU Parted.

 

 

Disk /dev/sdd: 2000.4 GB, 2000398934016 bytes

1 heads, 63 sectors/track, 62016336 cylinders, total 3907029168 sectors

Units = sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk identifier: 0x00000000

 

  Device Boot      Start        End      Blocks  Id  System

/dev/sdd1              64  3907029167  1953514552  83  Linux

Partition 1 does not end on cylinder boundary.

 

... is there something else wrong with that disc that might have caused the file system errors?

 

How can the file system errors cause stuttering video in XBMC?

 

Chris

Link to comment

First, unRAID must wait until the hard disk has finally decided that it can't read the data and that its error-correcting schemes can't reconstruct it.  If the disk finally decides that it cannot read the data, unRAID delivers the unreadable data through a parity calculation: it reads all of the other disks in the array and uses that information to compute the missing data.  This may require spinning up all of the other disks in the array.  Hopefully, you can see that all of this is going to take much, much longer than simply reading the data off the data disk.
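
To illustrate the parity calculation, here is a minimal sketch that assumes simple single-disk XOR parity. The block contents and the helper name are hypothetical; this is not unRAID's actual code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical 4-byte blocks standing in for the same stripe on three data disks.
disk1 = bytes([0x12, 0x34, 0x56, 0x78])
disk2 = bytes([0xAB, 0xCD, 0xEF, 0x01])
disk3 = bytes([0x00, 0xFF, 0x10, 0x20])

# Parity is the XOR of all data blocks in the stripe.
parity = xor_blocks([disk1, disk2, disk3])

# If disk2 cannot be read, XOR the parity block with every other data block.
rebuilt_disk2 = xor_blocks([parity, disk1, disk3])
print(rebuilt_disk2 == disk2)
```

Note that rebuilding one block needs a read from the parity disk and from every other data disk, which is where the extra delay comes from.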

Link to comment

... makes sense.

 

Now that reiserfsck has corrected the error, do I still have to run a parity correction?

 

Chris

 

Absolutely.  First, be sure that you can read the files that you previously had issues with.  Next, I would run a non-correcting parity check overnight to see that all of the problems were fixed by reiserfsck.  If it finds any errors, you still have problems with the data disk.  The reason I say non-correcting is that, if necessary, you could replace or rebuild the defective disk using the data on the current parity disk.  If you correct parity with a bad disk, you may find yourself in a situation where the data on the parity disk is 'corrupted' by the bad disk and recovery may now be impossible.

 

If you have no errors in a non-correcting parity check, you are good to go.  If you do, check back and ask for more help.

Link to comment

Thanks. Before the parity check, should I also run reiserfsck on all other discs in the array, including the parity disc? Wouldn't that be the cleanest way?

 

Chris

There is no file system on the parity disk; therefore, reiserfsck is not appropriate on it... ever.

 

If you are running reiserfsck on the /dev/mdX devices that hold your data, then parity is automatically updated as appropriate. You do not need to do anything else.

 

However, if you ran reiserfsck on the physical disk partitions themselves (/dev/sdX1, etc) then you do need to run a correcting parity check to get parity into sync.

Link to comment

Okay, I corrected all drives using the /dev/mdX devices. Now I have started a non-correcting parity check, and after checking about 10% of the array, 156 parity errors have already been found.

 

I stopped the check at that point, since I wanted to ask you whether I should restart it in correcting mode?

 

:'(

 

Hope I can get this back working properly.

 

Chris

Link to comment

I have to add that I also checked the network connection, and I'm not losing packets during transfer. But some movies still buffer, and now I got these parity errors during a non-correcting check. Something is wrong.

 

It would be nice if someone could help me.

 

Chris

Link to comment

... and I wanted to ask if it is okay that the packet transfer test via ifconfig only runs for a second or so. When I type in the corresponding command, I get the result almost immediately. Maybe I used the wrong switch for the command?

 

Chris

Link to comment

... here's what I typed plus the result:

 

 

root@Neon:~# ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:30:48:b2:0f:e6 

          inet addr:192.168.1.120  Bcast:192.168.1.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:261 errors:0 dropped:0 overruns:0 frame:0

          TX packets:278 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:22061 (21.5 KiB)  TX bytes:139535 (136.2 KiB)

          Interrupt:43 Base address:0xa000

 

Is this the right command to check for lost packets during network transfer?

 

Chris

Link to comment

Thanks, I was just wondering why the "measurement" for dropped packets was so fast.

 

I now have repaired the parity in a correcting check. 158 errors have been found and corrected. Here's the log:

 

https://dl.dropbox.com/u/7782292/unRAIND-log.rtf

 

And here's some information about my network setup:

 

root@Neon:~# ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:30:48:b2:0f:e6 

          inet addr:192.168.1.120  Bcast:192.168.1.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:504 errors:0 dropped:0 overruns:0 frame:0

          TX packets:606 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:45450 (44.3 KiB)  TX bytes:520462 (508.2 KiB)

          Interrupt:43

 

root@Neon:~# ethtool eth0                                                                                         

Settings for eth0:

Supported ports: [ TP ]

Supported link modes:  10baseT/Half 10baseT/Full

                        100baseT/Half 100baseT/Full

                        1000baseT/Full

Supports auto-negotiation: Yes

Advertised link modes:  10baseT/Half 10baseT/Full

                        100baseT/Half 100baseT/Full

                        1000baseT/Full

Advertised pause frame use: Symmetric Receive-only

Advertised auto-negotiation: Yes

Speed: 1000Mb/s

Duplex: Full

Port: Twisted Pair

PHYAD: 0

Transceiver: internal

Auto-negotiation: on

MDI-X: Unknown

Supports Wake-on: pumbg

Wake-on: g

Current message level: 0x00000033 (51)

Link detected: yes

root@Neon:~# ping -c 5 google.com

PING google.com (209.85.148.139) 56(84) bytes of data.

64 bytes from fra07s07-in-f139.1e100.net (209.85.148.139): icmp_req=1 ttl=54 time=7.96 ms

64 bytes from fra07s07-in-f139.1e100.net (209.85.148.139): icmp_req=2 ttl=54 time=7.78 ms

64 bytes from fra07s07-in-f139.1e100.net (209.85.148.139): icmp_req=3 ttl=54 time=7.66 ms

64 bytes from fra07s07-in-f139.1e100.net (209.85.148.139): icmp_req=4 ttl=54 time=7.72 ms

64 bytes from fra07s07-in-f139.1e100.net (209.85.148.139): icmp_req=5 ttl=54 time=9.47 ms

 

--- google.com ping statistics ---

5 packets transmitted, 5 received, 0% packet loss, time 4085ms

rtt min/avg/max/mdev = 7.666/8.122/9.478/0.685 ms

 

Parity is okay now, and the network also seems okay.

 

But I still get buffering during video playback once in a while. The video files are not damaged; they can be played without buffering from the local disc.

 

I swapped drive positions within the server, no benefit.

 

I upgraded the RAM from 1 to 4 GB, no effect.

 

I also tried playing around with jumbo frames and different read-ahead cache sizes. Unfortunately with no luck.

 

What else can I do? Playing around with the "disk device I/O scheduler mode" ?

 

Maybe the CPU is too slow; it's a dual-core Celeron at 2.5 GHz.

 

Honestly, I'm about to switch from unRAID to something else. It's frustrating.  :(

 

Chris

Link to comment

Found something:

 

Sometimes when I check "ifconfig eth0" I see one dropped packet (the "dropped:1" in the RX line):

 

root@Neon:~#  ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:30:48:b2:0f:e6 

          inet addr:192.168.1.120  Bcast:192.168.1.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:1877445 errors:0 dropped:1 overruns:0 frame:0

          TX packets:2141480 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:1610836871 (1.5 GiB)  TX bytes:2294939418 (2.1 GiB)

          Interrupt:43 Base address:0x8000

 

Is there a way to extend the dropped packet measurement to a longer time?

 

Is this one dropped packet once in a while responsible for my buffering problems?

 

Chris

Link to comment

Thanks, now I understand. Didn't know that these are counters.

 

What I did now:

 

I restarted the server, played some movie files, had two buffering events, then I ran ifconfig:

 

ifconfig   

eth0      Link encap:Ethernet  HWaddr 00:30:48:b2:0f:e6 

          inet addr:192.168.1.120  Bcast:192.168.1.255  Mask:255.255.255.0

          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

          RX packets:856065 errors:0 dropped:0 overruns:0 frame:0

          TX packets:1684814 errors:0 dropped:0 overruns:0 carrier:0

          collisions:0 txqueuelen:1000

          RX bytes:60593497 (57.7 MiB)  TX bytes:2376662192 (2.2 GiB)

          Interrupt:43 Base address:0x8000

 

... no dropped packets during the buffering.

 

Hence, it doesn't seem to be a network problem.
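
Since the values are cumulative counters, the useful number is the difference between two snapshots. A rough sketch of that comparison follows; the function names are made up, and treating the two RX lines below (copied from outputs earlier in this thread) as belonging to the same uptime is an assumption made only for illustration.

```python
import re

# Matches the RX line of old-style ifconfig output.
RX_LINE = re.compile(r"RX packets:(\d+) errors:(\d+) dropped:(\d+)")


def rx_counters(ifconfig_text: str) -> dict:
    """Extract the cumulative RX counters from ifconfig output."""
    match = RX_LINE.search(ifconfig_text)
    if match is None:
        raise ValueError("no RX counter line found")
    packets, errors, dropped = (int(group) for group in match.groups())
    return {"packets": packets, "errors": errors, "dropped": dropped}


def delta(before: dict, after: dict) -> dict:
    """Difference between two snapshots taken during the same uptime."""
    return {key: after[key] - before[key] for key in before}


# Assumption: both snapshots come from the same boot session.
before = rx_counters("RX packets:261 errors:0 dropped:0 overruns:0 frame:0")
after = rx_counters("RX packets:504 errors:0 dropped:0 overruns:0 frame:0")
print(delta(before, after))
```

The counters start again from zero after a reboot, so a comparison across a restart is meaningless.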

 

?

 

Chris

Link to comment

... worked. Thanks!  :)

 

I think during buffering I observe high %wa values. Goes down again after buffering is complete.

 

But in general, the processor is more than 97% idle during streaming, even in high bit rate sequences.
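
For reference, %wa is the share of CPU time spent waiting for I/O. On Linux it can be derived from two readings of the aggregate "cpu" line in /proc/stat. The sketch below uses hypothetical sample lines, not measurements from this server, and the function name is made up.

```python
def iowait_percent(before: str, after: str) -> float:
    """Share of CPU time spent in iowait between two /proc/stat 'cpu' lines."""
    # Fields after the 'cpu' label: user nice system idle iowait irq softirq steal.
    # Any guest columns beyond those are skipped here.
    first = [int(value) for value in before.split()[1:9]]
    second = [int(value) for value in after.split()[1:9]]
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


# Hypothetical samples for illustration only.
sample_before = "cpu 1000 0 500 8000 100 0 0"
sample_after = "cpu 1010 0 510 8060 120 0 0"
print(f"{iowait_percent(sample_before, sample_after):.1f}% iowait")
```

A high value here while the CPU is otherwise idle points at the disk rather than the processor.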

 

The buffering issue seems to be more frequent directly after starting the movie playback, let's say in the first 2 minutes.

 

Chris

Link to comment

I did some more testing. It seems that the problem occurs only with one disc. The movie data seems to be okay, since the problem is not always reproducible.

 

I guess the hard drive has a problem then. However, the "smart view" page of unMenu doesn't show any problems for this disc.

 

Probably I have to replace it nevertheless.

 

Chris

Link to comment

Archived

This topic is now archived and is closed to further replies.
