unRAID Server release 4.5-beta8 available


limetech


It's all right for you guys; my server is 24 days into level 2 m/b testing, so I have to wait another week before I can try b8. I was 14 days into level 2 testing when b7 shipped, but I was weak and succumbed to temptation.

 

Can't wait to try it.

 

Has anyone tested parity check and parity creation speeds?


None of these tunables have anything to do with the Cache Drive; they are 'tweaks' for the low-level unRAID array driver only.

 

First, you can create a file in the 'config' directory of the Flash called 'extra.cfg'. This file is read when the array is Started and may be used to define the values (or override the default values) of these tunables. Here are the current defaults:

 

md_num_stripes=1280

md_write_limit=768

md_sync_window=288

 

md_num_stripes defines the maximum number of active 'stripes' the driver can maintain. You can think of a 'stripe' as a 4K I/O request. For example, say you are reading a large (many-megabyte) file from the server. The driver will "see" a series of (largely) sequential 4K read requests, determine which disk each read is for, and pass each 4K read request down to the disk driver. At most 1280 such requests can be active at once. If there are already 1280 active requests and another comes in, then the process making that request has to wait until a stripe becomes free (as a result of previous I/O finishing).

 

Bottom line is this: the greater this number, the more I/O can be queued down into the disk drives. However, each 'stripe' requires memory in the amount of (4096 x highest disk number in array). So if you have 20 disks, each stripe will require 81920 bytes of memory; multiplied by 1280, that's over 104MB. The default value was chosen to maximize performance in systems with only 512MB of RAM. If you have more RAM, then you can experiment with higher values. If you go too high and the system starts running out of memory, what will happen is 'random' processes will start getting killed (not good).
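
As a quick sanity check, here is that arithmetic as a shell one-liner (20 disks and the default 1280 stripes are just the example figures above; substitute your own):

echo $(( 4096 * 20 * 1280 / 1000000 ))MB    # 4096 bytes x highest disk number x md_num_stripes -> prints 104MB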

 

md_write_limit specifies the maximum number of stripes which may be used for writes. This limit exists because if you start writing a large file, all 'md_num_stripes' requests can easily be queued immediately, which causes reads to different disks to suffer greatly. So you want to pick a number here that is large but which also leaves some stripes available for reads. You will find there is a point of diminishing returns: if you set the number low, say 32, writes will be very slow; if you leave it at the default, writes will be much faster; if you set it to, say, 1000, writes will be a "little bit" faster still. By increasing both md_num_stripes and md_write_limit you might get 10% more performance than the default values. The best you can get is going to be something less than 50% of the raw speed of the disk - if you get above 33%, let me know :)

 

md_sync_window specifies the maximum number of parity-sync stripes active during a parity-sync or parity-check operation. Again, the larger this number, the faster parity-sync will run, with diminishing returns at some point (due mainly to saturating PCI and/or PCIe buses).

 

You want to make sure that md_write_limit + md_sync_window < md_num_stripes, so that reads do not get starved if you start writing a large file while a parity-sync/check is in progress.
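
For example, on a machine with plenty of RAM, a hypothetical extra.cfg might look like this (illustrative values only; note that 1500 + 400 < 2000, so reads keep some headroom even during a parity check):

md_num_stripes=2000
md_write_limit=1500
md_sync_window=400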

 

Hopefully I will be able to write up a more thorough document on this subject, along with some of the theory behind it.

 

Note: if you change the values of these tunables via the 'extra.cfg' file, you must Stop the array, then Start the array for the values to take effect. md_write_limit and md_sync_window may also be changed dynamically by typing these commands in a telnet session:

 

mdcmd set md_write_limit <value>

mdcmd set md_sync_window <value>
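
For example, to try a larger sync window on the fly during a parity check (512 here is purely an illustrative value):

mdcmd set md_sync_window 512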

 

 

Any suggestions as to where in the Wiki this post should be copied?

 


Anyone else having issues with Realtek 8169s running gigabit in this (or maybe a previous) beta release?

 

I started out using an 8139 (100mbit) and had no troubles, so I bought an 8169. Connected to the same test hub (100mbit half duplex), I have no troubles and it's rock solid. But if I pull the cable from my cheap hub (coming from my Dell gigabit switch) and plug it straight into the 8169, I'm good for about 4GB of transfer and then - poof. It's not there. The server thinks it's connected - the link and activity lights are active - but there is no response from remote connections. Disconnecting and reconnecting resets everything. Going back to the 100mbit link, it returns to stability.

 

I'll see if I can't trap something good in the syslog.  But I noted that Beta7 swapped the 8169 driver so I thought I'd ask.

 

Well my Realtek NIC (on-board PCIe 8168) was working OK at gigabit speed with 4.5b6, but now after upgrading to 4.5b8, it will not negotiate a gigabit connection. Seems limited to 100mbps. ethtool -s eth0 speed 1000 wouldn't force a gigabit connection either.

 

Went back to 4.5b6 and NIC negotiates gigabit connection as before.

(Maybe "buggy" is not the right word. But certainly far from "optimized")

There's still more to be desired of the unRAID driver, especially its CPU usage.

 

I'd prefer to think unRAID was previously optimized for low-spec machines and is now optimized for more modern hardware. At least, that is what the marketing dept said.  8)


Tom,

 

can I request that Joe's excellent cache drive script be added to beta 9? I'd love to have it properly beta-trialled, to see if it works as well for the masses as it does for me.

 

We can always remove it in beta 10 if it proves unreliable for the complete unRAID userbase... that's the beauty of betas :)

 

FYI, the beta 8 upgrade was seamless, and so far I've seen none of the weird samba dropouts I was seeing before.

 

Do you want a 5.x release that provides smoother integration of user-contributed plug-ins? Then I have to stop adding features to the 4.5 series and get it done.


Tom, with these new configuration capabilities, would it be worthwhile for some of us to upgrade RAM? I have always had a decent amount in my machines, but RAM is fairly cheap these days - any advantage to, say, 4GB with appropriate config changes?

 

Well, all of my testing has been mainly with 512MB of RAM, since it has to work with that. I noticed that once the size of the driver request pool goes beyond 2000 requests (md_num_stripes=2000) and the write limit goes past around 1500 (md_write_limit=1500), the performance gain is minimal. Perhaps with a very fast processor and more RAM (1GB or more) you can squeak out more performance - that is what I'm looking for from the community.


Has anyone tested parity check and parity creation speeds?

 

I performed a parity check and posted the results in this thread. It was roughly the same speed as beta 6 and beta 7.

 

I have not tested parity creation speeds.

 

That is correct. With a small number of drives you can increase md_sync_window and get close to the raw speed of the slowest drive. However, once the number of drives in the array gets over a certain value, which depends on your h/w, parity-sync/check will start to saturate the various buses on your motherboard. For example, if you have 4 drives on a 4-port PCI controller, parity sync/check will be limited to around 33MB/sec, because the 133MB/sec PCI bus is being shared by 4 devices. Similar bottlenecks exist in the northbridge/southbridge link.
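
Spelled out, that ceiling is just the bus bandwidth divided by the number of drives sharing it (133 and 4 being the example figures above):

echo $(( 133 / 4 ))MB/sec    # -> 33MB/sec per drive on a shared 133MB/sec PCI bus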


Well my Realtek NIC (on-board PCIe 8168) was working OK at gigabit speed with 4.5b6, but now after upgrading to 4.5b8, it will not negotiate a gigabit connection. Seems limited to 100mbps. ethtool -s eth0 speed 1000 wouldn't force a gigabit connection either.

 

Went back to 4.5b6 and NIC negotiates gigabit connection as before.

 

I'll look into this.


Thanks much. I know Realtek has been updating the drivers for Vista very frequently; apparently they have trouble optimizing them. I suspect this may also be the case for Linux, and this kernel may just have a bad one. I appreciate anything you can do so I can take advantage of the latest improvements in unRAID without having to purchase a new NIC. (Leaves more funds for me to upgrade to Plus.   :) )


What motherboard are you using? From the syslog it appears you have a Marvell NIC, not Realtek.


DUH! :-[  Asus A8R32MVP Deluxe, which does have Marvell network controllers. I have a Realtek in my workstation's motherboard!

 

Got myself confused.

 

Sorry to be an idiot customer who complicates a problem more than necessary! ???


No worries. That m/b has two Marvell NICs. I guess you have one of them disabled - can you switch over to the other one as a test?


Yes. The secondary one is PCI not PCIe, but I'll go ahead and test it with 4.5-b8 if that will provide you with a useful data point...

 

Well, the secondary NIC (888001 PCI Gigabit) would not negotiate a gigabit connection with beta 6, but it does with beta 8!

 

Of course I'd rather be using the primary NIC (88E8053 PCIe Gigabit), which displays the opposite behavior. (Edit: I especially don't want to use the secondary NIC since I discovered that WOL does not seem to work with it.) :(

 

Nothing like complicating a problem, is there? ???


But I noted that Beta7 swapped the 8169 driver so I thought I'd ask.

I don't believe there were any changes to the r8169 driver in v4.5-beta7.  Did you perhaps mean v4.5-beta8?

 

Well my Realtek NIC (on-board PCIe 8168) was working OK at gigabit speed with 4.5b6, but now after upgrading to 4.5b8, it will not negotiate a gigabit connection. Seems limited to 100mbps. ethtool -s eth0 speed 1000 wouldn't force a gigabit connection either.

 

Went back to 4.5b6 and NIC negotiates gigabit connection as before.

Your NIC info indicated a dropped packet and a frame error, and this was at the slower speed of 100 Mbps. Are you sure of the quality of your network cable (Cat5e or Cat6)? It may be that the newer driver is more reliable at determining which speed will be safe for you; gigabit negotiation may have found too many errors that the earlier version allowed to pass.
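
To double-check, something like the following in a telnet session will show the negotiated speed and those error counters (assuming your interface is eth0):

ethtool eth0 | grep Speed
ifconfig eth0 | grep -E 'errors|dropped'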


RobJ -

 

I may have caused the confusion here. I thought I had read in the release notes for Beta7 that the Realtek driver had been replaced with a driver from the manufacturer. Of course, being Monday morning, I can't find that reference now. I apologize for any confusion. (It does appear that Realtek has been problematic in the past.)

 

Rob


I think the dropped packet and frame error occurred when I issued ethtool -s eth0 speed 1000 in the middle of a file transfer. Normally there have been no network issues at 1000Mbps with beta 6, so I think the cables (one is Cat5e and the rest Cat6) are fine. The failure to negotiate 1000Mbps only occurs with beta 8.

 

Here is the info from the secondary NIC connected at 1000Mbps. It does work under beta 8. Unfortunately, it is not PCIe, and I also find that WOL does not want to work with this NIC, even though it looks like it should.

NIC info (from ethtool)

Settings for eth0:
Supported ports: [ TP ]
Supported link modes:   10baseT/Half 10baseT/Full 
                        100baseT/Half 100baseT/Full 
                        1000baseT/Half 1000baseT/Full 
Supports auto-negotiation: Yes
Advertised link modes:  10baseT/Half 10baseT/Full 
                        100baseT/Half 100baseT/Full 
                        1000baseT/Half 1000baseT/Full 
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: pg
Wake-on: g
Current message level: 0x00000037 (55)
Link detected: yes

NIC driver info (from ethtool -i)

driver: skge
version: 1.13
firmware-version: N/A
bus-info: 0000:01:14.0

Ethernet config info (from ifconfig)

eth0      Link encap:Ethernet  HWaddr 00:17:31:25:bb:ee  
          inet addr:192.168.15.250  Bcast:192.168.15.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:45 errors:0 dropped:0 overruns:0 frame:0
          TX packets:77 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:4700 (4.5 KiB)  TX bytes:65711 (64.1 KiB)
          Interrupt:21 

 

Summary of Performance and Driver Info Obtained So Far

NIC                      4.5-Beta6                 4.5-Beta8
                         1000Mb/s   Driver/Vs      1000Mb/s   Driver/Vs
Marvell 88E8053 PCIe     Yes        sky2/1.22      No*        sky2/1.23
Marvell 888001 PCI       No         skge/1.13      Yes        skge/1.13

NOTE: The driver is unchanged from beta 6 to beta 8 for the secondary NIC, yet beta 8 enables gigabit for this NIC while beta 6 does not.

Must be some other factor in play.

 

*Edit: Determined that only with both NICs enabled will the primary NIC negotiate a 1000Mb/s connection with beta 8.


I have a Realtek on-board NIC and I am having the 100meg-on-a-1000meg-port problem. The cable is a manufactured cable (not home-made); it is Cat5 or 5e, I can't tell, but it was working at gigabit with beta 6 (not sure about 7). I have attached a syslog; you can see the link go up/down several times, but only during initial boot. I ordered a new Cat6 cable just in case that turns out to be the consensus.

 

Thanks!


Background: I've been using unRAID for 3 or so months. I had a drive failure three weeks ago, spent a week backing my data up [off of parity], and then waited around for my replacement drive to come in. It has. I was using beta6 before, but switched to beta8 this morning hoping that the issue below would have found its way out.

 

I am a plus license holder.

 

I'm now in the process of moving everything back onto the array.

 

I haven't reported an issue before, so please forgive me if I'm not reporting the info correctly.

 

Current configuration:

Motherboard: Intel DQ45CB

Processor: Celeron 430

SATA Controller: Rosewill RC-218 [http://www.newegg.com/Product/Product.aspx?Item=N82E16816132018]

Data Drives: WDC_WD20EADS, 2@ ST31500341AS

Parity Drive: HDS722020ALA330 (currently not enabled as I am migrating my data back onto the array.  Once the data has been transferred, I will be enabling this parity drive)

 

Cabling: All drives are hanging off of the RC-218.

unRAID config: vanilla, no hdparm or kernel params.

 

Errors, repeated literally every second, on this ATA port:

Nov 10 10:09:38 soundwave kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x100000 action 0x6 frozen

Nov 10 10:09:38 soundwave kernel: ata2.00: edma_err_cause=00000020 pp_flags=00000001, SError=00100000

Nov 10 10:09:38 soundwave kernel: ata2: SError: { Dispar }

Nov 10 10:09:38 soundwave kernel: ata2.00: cmd 35/00:a0:f7:6b:2f/00:01:47:00:00/e0 tag 0 dma 212992 out

Nov 10 10:09:38 soundwave kernel:          res d0/00:a0:f7:6b:2f/00:01:47:00:00/e0 Emask 0x12 (ATA bus error)

Nov 10 10:09:38 soundwave kernel: ata2.00: status: { Busy }

Nov 10 10:09:38 soundwave kernel: ata2: hard resetting link

Nov 10 10:09:39 soundwave kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Nov 10 10:09:39 soundwave kernel: ata2.00: configured for UDMA/33

Nov 10 10:09:39 soundwave kernel: ata2: EH complete

 

{ beta8 appears to be about 60% faster on this device than beta6 was, but it's currently hovering around ~20Mbps write speeds from USB (to other drives, I get the expected 35-40Mbps). [beta6 was around ~12Mbps] -- but this may just be coincidence }

 

I'd like a little guidance regarding my next steps [once I'm home && the current copy has finished]:

1) Try a different SATA cable

2) Put the "problem" drive directly on the mobo

3) Try a different mobo

 

i.e., I'm wondering if there is a step 0 before the above - some kernel parameter/hdparm setting/etc. that I haven't tried.

 

Oh, and obviously, this type of I/O conflict is causing load to stay through the roof -- a single file copy to this drive causes load in excess of 4.

 

Help/advice, please?


Cache drive working now, but when I restart, smb doesn't start. I can telnet in and start it manually, and it works fine. Syslog attached.

 

Is this repeatable - that is, do you see this every time you reboot the server? If so, please hook up a monitor, reboot, and tell me if you see any error messages output near the end of the boot process (there are certain messages that only show up on the console and not in the system log).


Errors, repeated literally every second -- this ata :

Nov 10 10:09:38 soundwave kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x100000 action 0x6 frozen

Nov 10 10:09:38 soundwave kernel: ata2.00: edma_err_cause=00000020 pp_flags=00000001, SError=00100000

Nov 10 10:09:38 soundwave kernel: ata2: SError: { Dispar }

Nov 10 10:09:38 soundwave kernel: ata2.00: cmd 35/00:a0:f7:6b:2f/00:01:47:00:00/e0 tag 0 dma 212992 out

Nov 10 10:09:38 soundwave kernel:          res d0/00:a0:f7:6b:2f/00:01:47:00:00/e0 Emask 0x12 (ATA bus error)

Nov 10 10:09:38 soundwave kernel: ata2.00: status: { Busy }

Nov 10 10:09:38 soundwave kernel: ata2: hard resetting link

Nov 10 10:09:39 soundwave kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Nov 10 10:09:39 soundwave kernel: ata2.00: configured for UDMA/33

Nov 10 10:09:39 soundwave kernel: ata2: EH complete

 

The first thing to do is see if the problem follows the hard drive or stays with a controller port. You can do this by swapping the hard drive on the 'bad' port with one on a 'good' port. When you boot the server, unRAID OS will notice the two drives have swapped around - you can just click Start and it will record their new positions and start the array. Now check the system log to see if the problem stays with the hard drive or the port.
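
For example, after the swap you could watch for the same errors with something like this in a telnet session (ata2 being the port from the log above, and assuming the stock unRAID log location of /var/log/syslog):

grep 'ata2' /var/log/syslog | tail -n 20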

 

If it follows the hard drive, then you have your answer. If it stays with the port, then try replacing the cable. If you still get errors, try moving the controller to a different slot. If you still get errors, it's probably a bad controller.

 

One other possible culprit: you didn't mention what power supply you are using. If the above troubleshooting does not reveal where the problem lies, the next step would be to change the PSU.

