unRAID Server Release 5.0-rc13 Available



And, you should REBOOT with the old release before upgrading.

 

My machines are off during night, they were booted just some minutes ago. These two plugins, ok, my fault. I thought between RC12a and RC13 it would work.

 

I need to wait for the parity check (50 hours to go). So in three days I can reboot with RC12a and try again without these two plugins...

 

Regards

 

50 hours for a parity check  :o  WOW!

I thought mine were bad at 18.

Link to comment

Has this release added an additional data drive to the PLUS version? I now have the option for 1 Parity, 6 Data and 1 Cache drive.

LOL yep, that snuck through; it will be corrected in the next release. Explanation: this release, -rc13, had all the code in place to support a "cache pool", that is, you can assign a storage device (HDD or SSD) to either the "array" or the "pool". In addition, you could configure the "pool" to use btrfs as the file system. So if you want, you could assign all disks to the array, all disks to the pool, or some combination of both.

Unfortunately, this resulted in quite a few changes to the webGui which broke a number of plugins, particularly Simple Features.  So I've ripped all that out until I can coordinate with plugin authors.  This was an "executive decision" that delayed the release by at least a week.

 

You aren't going to remove the 6th data drive are you? I hope not, I've just spent the last 2 hours putting a new drive into my server and neatly arranging all of the cabling.

Link to comment

Hey Tom,

 

Can you please speak to this change in the change log:

 

- shfs: fixed removexattr() for directories to look at all disks even if xattr not present on some

 

Thanks

 

This applies to the operation of the removexattr() function and "setfattr -x" command.

 

What the operation is supposed to do is remove the given extended attribute from the file/directory.

 

There was a bug in shfs (user share file system) that applied to removing an extended attribute from a directory (removing from file works ok).

 

To remove an extended attribute from a directory, what shfs has to do is remove the extended attribute from each same-named directory on all the disks.

 

The bug was that as soon as shfs encountered a directory that didn't have that extended attribute in the first place, it would immediately terminate the operation, not looking at any other disks for that directory.

 

The fix was to detect this condition, not exit the loop because of ENOATTR, and continue on to check for the directory on the other disks.
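
The loop behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the shfs source: each disk is modelled as a dict mapping a directory path to its xattrs, the function names `remove_xattr_buggy` and `remove_xattr_fixed` are hypothetical, and the return value when no disk has the attribute is a guess.

```python
import errno

# Hypothetical model: each "disk" maps a directory path to its xattr dict.
# Linux defines ENOATTR as ENODATA, so errno.ENODATA stands in for it here.
ENOATTR = errno.ENODATA


def remove_xattr_buggy(disks, path, name):
    """Old behaviour: stop at the first disk whose directory lacks the xattr."""
    for disk in disks:
        if path not in disk:
            continue  # directory does not exist on this disk
        if name not in disk[path]:
            return -ENOATTR  # bug: bails out, later disks are never visited
        del disk[path][name]
    return 0


def remove_xattr_fixed(disks, path, name):
    """Fixed behaviour: a missing xattr on one disk does not end the loop."""
    removed = False
    for disk in disks:
        if path not in disk:
            continue
        if name in disk[path]:
            del disk[path][name]
            removed = True
        # xattr absent on this disk (ENOATTR): keep checking the other disks
    # Assumption: report ENOATTR only if no disk had the attribute at all.
    return 0 if removed else -ENOATTR
```

With three disks where only the first and third copies of a directory carry the attribute, the buggy version leaves the third copy untouched, while the fixed version clears both.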

Link to comment

Second main issue this addresses is NFS stale file handles.  If you want to use NFS with user shares, then you need to do some set up explained here:

http://lime-technology.com/wiki/index.php?title=Plugin/webGui/NFS

 

There is a better fix for this which will go into a future release.

 

I am suddenly experiencing what I assume is the stale NFS file handles issue when I never did with any previous betas/RCs.

 

I upgraded to RC13 and both of my OpenELEC boxes can not access my Movies or TV shares (shared via NFS).  I tried the fix described above (extra.cfg) but it did not help.  I reverted back to RC12a and both OpenELEC boxes can once again reach my shares.

 

I have attached both RC12a and RC13 syslogs.  I have NO plugins installed.

 

John

RC12a_syslog.txt

Link to comment


And the RC13 syslog (couldn't attach both to the same post)...

 

RC13_syslog.txt

Link to comment


Thanks,

 

So is it possible (or maybe you have already confirmed in your testing) that, in prior releases, running AFP via unRAID and moving, say, an ".AppleDB" or ".AppleDouble" folder from one disk to another could have caused, or did cause, corruption or damage (for lack of a better word) of extended attributes (e.g. user.org.netatalk.supports-eas.1qmk2g",0) failed: Exec format error)? As I understand it, the move should have removed (deleted) the xattr from the source disk and written a new xattr to the destination disk (the latter it does)?

 

I am still chasing down the AFP issues in my spare time. And have learned quite a bit in the journey. This is the last piece to the puzzle, so AFP too can have a write up (sort of a do's and dont's) like you did for NFS.

 

 

 

Link to comment

I upgraded from rc12a to rc13 in my second unRAID setup around half an hour ago. I've been transferring content to it from my first unRAID (which is running 4.7). I noticed that the network transfer rates seem to be more consistent than they were with rc12a. I'm using a PCI Express network card, a TP-Link TG-3468.

 

I had switched to that instead of using the motherboard network port but my results from each were similar with rc12a. I guess I should try the motherboard network port again and see if the speeds with rc13 are also more consistent with the MB network port like they are with the PCI express NIC.

Link to comment

I am receiving the same errors as peter_sm when trying to compile headers to make a VirtualBox package. The errors occur at the "make oldconfig" command. I have attached a log showing each step from a fresh boot up to "make oldconfig && make" in the same thread VirtualBox for unRAID.

 

Was able to successfully run "make oldconfig && make" after some help from piotrasd & nars (& have now compiled VirtualBox for RC13). For anyone needing to compile headers for RC13 (or any linux kernel 3.9 and greater), bc-1.06.95 is now required.

 

Not sure what other programs need to compile special headers, but I know there are some and wanted to share the fix (extended details of what I did can be found in the VirtualBox for unRAID thread). I also updated the wiki page Installing_VirtualBox_in_unRAID with this information.
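
For anyone scripting a header build, a pre-flight check along these lines might save a failed "make oldconfig && make". It is a minimal sketch: the helper names are hypothetical, the 3.9 threshold is taken from the post above, and it only checks that some `bc` is on the PATH, not that it is version 1.06.95.

```python
import re
import shutil


def kernel_needs_bc(release):
    """Return True for kernel release strings of 3.9 or newer."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= (3, 9)


def bc_available():
    """Return True if a 'bc' executable can be found on the PATH."""
    return shutil.which("bc") is not None


def ready_to_build(release):
    """Only insist on bc when the kernel version calls for it."""
    return bc_available() or not kernel_needs_bc(release)
```

On the build machine, the release string could be taken from `platform.release()`.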

Link to comment

Tom, had this posted in 5.0rc forum & was instructed to post over here...

 

Alright, quick question for the gurus. I'm doing a new build with old server-grade equipment. The build is an old HP ProLiant DL380 G5. The onboard network is a dual-port Broadcom NetXtreme.

Here's the issue: the network works fine under build 4.7 but does not get an IP address under either 5.12 or 5.13. I have tried ifconfig for both eth0 and eth1, disabling both ports individually in the BIOS, and disabling USB 2.0.

I'm attaching syslogs for all three builds.

Thanks in advance

Jason

 

Here is it loading the driver:

 

Code:

Jun  4 15:22:43 Tower kernel: bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.2.3 (June 27, 2012)

 

 

And here's where the problem occurs:

Code:

Jun  4 15:22:43 Tower logger: /etc/rc.d/rc.inet1:  /sbin/ifconfig lo 127.0.0.1
Jun  4 15:22:43 Tower logger: /etc/rc.d/rc.inet1:  /sbin/route add -net 127.0.0.0 netmask 255.0.0.0 lo
Jun  4 15:22:43 Tower logger: /etc/rc.d/rc.inet1:  Polling for DHCP server on interface eth0:
Jun  4 15:22:43 Tower logger: /etc/rc.d/rc.inet1:  /sbin/dhcpcd -t 10 -h Tower -L eth0
Jun  4 15:22:43 Tower dhcpcd[1082]: version 5.2.7 starting
Jun  4 15:22:43 Tower kernel: bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.3.fw"
Jun  4 15:22:43 Tower dhcpcd[1082]: eth0: up_interface: No such file or directory
Jun  4 15:22:43 Tower dhcpcd[1082]: forked to background, child pid 1084
Jun  4 15:22:43 Tower dhcpcd[1084]: eth0: waiting for carrier

 

 

Looks like a kernel issue based on this: www.spinics.net/lists/netdev/msg219373.html
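
If you want to check your own syslog for the same symptom, something like the following could pull out the relevant lines. It is a rough sketch: the pattern is based only on the single kernel message quoted above, other drivers may word the message differently, and `missing_firmware` is a made-up helper name.

```python
import re

# Matches kernel lines of the form seen in the excerpt above, e.g.
#   ... kernel: bnx2: Can't load firmware file "bnx2/....fw"
FIRMWARE_ERROR = re.compile(
    r"kernel: (\w+): Can't load firmware file \"([^\"]+)\""
)


def missing_firmware(syslog_lines):
    """Return (driver, firmware_file) pairs for failed firmware loads."""
    found = []
    for line in syslog_lines:
        match = FIRMWARE_ERROR.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found
```

Feeding it the lines of a saved syslog file (for example `open("syslog5-13.txt")`) should list every driver that failed to load its firmware.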

 

Post this in the RC-13 thread so Tom sees it and maybe he can look into it further.

syslog4-7.txt

syslog5-12.txt

syslog5-13.txt

Link to comment

Similar problem to hawihoney's: when issuing a stop command, the array just hangs and the web interface locks up.

 

syslog attached

syslog2 - from telnet as the stop command was issued

screen.jpg - photo from attached monitor

 

I don't believe that I have any plugins on the system.

 

All was fine on 5-0 beta 12a

 

Syslog shows that there was an unclean shutdown previously, with the added issue that the flash drive file system appears to be corrupt.  Take it to another machine where you can run Check Disk on it.  As others have said, it is best to have a clean shut down, before upgrading.

 

If you upgraded the files on the flash drive on a different machine, you may have pulled it before the file system was completely closed.  Always make sure it is safe by using the appropriate 'Safe to remove device' tool.

Link to comment

So far so good. Not that I'm running anything high maintenance, but I'm seeing 130+ MB/s on the parity check through 12%. That is at least as good as rc12 (honestly can't remember) and likely better. My tunables are default. I have a built-in Realtek but I'm running an Intel NIC right now. Anyway, just adding to the count of positive responses. Oh yeah, and I'm not using NFS either, so I can't comment on that :(

Link to comment
Please follow directions here,

 

Had already been done (back in March).

 

... make "stale file handle" happen, and then post your system log.

 

syslog attached.  The mkvmerge was performed at 18:15 - 18:16.  The stale file handle was reported when I attempted to display my Movies share, immediately afterwards.  At 18:18 I started a telnet session to display the mounts.  I can't see anything in the syslog which is going to help you.

 

I would be prepared to accept that this is a fault in ubuntu/mkvmerge .... except that this sequence of events works without any problem when running rc4 - rc10.  The problem only occurs with versions prior to rc4 and newer versions since rc10.

 

As a matter of interest, by this morning, the Movies share can be accessed without problems - I guess that something had performed an automatic umount in the meantime.

 

Note: I forgot to mention in these instructions to stop/start the array after creating the extra.cfg file in order for the change to take effect.

 

The system had been rebooted to enable the rc13 update.

syslog.zip

Link to comment

As a matter of interest, by this morning, the Movies share can be accessed without problems - I guess that something had performed an automatic umount in the meantime.

What's the final status then?  Now working?

Probably the sequence that needs to happen is something like this:

1. un-mount all NFS shares on each client machine

2. create the extra.cfg file with the line shfsExtra="-o noforget"

3. stop/start array for step 2 to take effect

4. re-connect (re-mount) shares on client machines.

 

Now as long as server is not reset, shutdown, or array stop/started, NFS should not get stale file handles.  But if server is reset/shutdown/stopped, then you should also un-mount/re-mount shares on client side as well.  Probably should use 'soft' mount option on clients as well.
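
Step 2 of the sequence above could be scripted roughly as follows. This is a sketch under assumptions: the location of extra.cfg is not given in this thread, so the path is passed in by the caller, and `ensure_noforget` is a hypothetical helper. The array still has to be stopped and started afterwards (step 3) for the change to take effect.

```python
from pathlib import Path

# The exact line quoted in step 2 above.
SETTING = 'shfsExtra="-o noforget"'


def ensure_noforget(cfg_path):
    """Append the shfsExtra line to extra.cfg unless one is already set.

    Returns True if the file was created or changed, False otherwise.
    """
    path = Path(cfg_path)
    lines = path.read_text().splitlines() if path.exists() else []
    if any(line.strip().startswith("shfsExtra=") for line in lines):
        return False  # leave an existing shfsExtra setting alone
    lines.append(SETTING)
    path.write_text("\n".join(lines) + "\n")
    return True
```

The helper deliberately does not overwrite an existing shfsExtra line, so a hand-edited setting is never silently replaced.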

Link to comment

Tom#1,

 

Has the issue where disks could not be added to the array unless precleared been resolved in this release?

(I saw no mention of that fix in the release notes)

 

Is there a reason you did not set the oom_score_adj for emhttp to -1000 in rc13?  I saw it is still set to "0", making emhttp as likely a candidate as any to be killed in an OOM situation. 
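
For reference, the setting in question lives in /proc/<pid>/oom_score_adj, and -1000 exempts a process from the OOM killer. Below is a minimal sketch of reading and writing it; the helper names are hypothetical, how the emhttp PID is found is left out, and lowering the value on a real process normally needs root.

```python
from pathlib import Path

OOM_DISABLE = -1000  # lowest accepted value; exempts the process from OOM kills


def oom_adj_path(pid):
    """Build the /proc path that holds the adjustment for a given PID."""
    return Path("/proc") / str(pid) / "oom_score_adj"


def read_oom_score_adj(path):
    """Return the current adjustment as an integer."""
    return int(Path(path).read_text().strip())


def set_oom_score_adj(path, value=OOM_DISABLE):
    """Write a new adjustment; the kernel accepts -1000 through 1000."""
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    Path(path).write_text(f"{value}\n")
```

For example, `set_oom_score_adj(oom_adj_path(pid))` would apply -1000 to the process with that PID.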

 

Joe L.

Link to comment

As a matter of interest, by this morning, the Movies share can be accessed without problems - I guess that something had performed an automatic umount in the meantime.

What's the final status then?  Now working?

 

Working ... until I do another mkvmerge, whereupon the destination directory (my user share) disappears from directory listings of tower/mnt/user and direct attempts to mount it result in a stale nfs file handle error again.  Several seconds pass between the commands to open the directory, and the error message being displayed.  At some later time, without any deliberate action on my part, the user share becomes accessible once again.

 

Edit to add:

The Movies folder still appears when I do "ls /net/tower/mnt/user" from command line, but still produces the stale file handle error when I do "ls /net/tower/mnt/user/Movies".  The reports above relate to using nautilus as the file/directory viewer.

 

Edit 2:

I've come back to my Ubuntu desktop thirty minutes later (after eating lunch) and the "ls /net/tower/mnt/user/Movies" now succeeds.

Link to comment

Now after a fresh install of RC13 on two original LimeTech machines (MD1500/LL and MD1510/LL) I can confirm that RC13 does not stop my array anymore. Fresh reboot, no plugins, not even unMENU - nothing.

 

Whenever I stop the array the following happens in that order (on both machines):

 

* I hit STOP on the WebGUI

* WebGUI shows some messages (spinup, sync, unmounting)

* On unmounting this process ends, there's no further action

* If I try to refresh the page (with the browser) the WebGUI is not found

* If I start PuTTY at that point I cannot enter my ID/password. The machine responds with the first line but there's no prompt to enter my user ID.

* --> At that point I can't do anything. No WebGUI, no Telnet.

* Both machines are still running so I have to hard reset with the power button. This leads to a parity check.

 

I canceled parity check, copied RC12a back to both USB sticks and tried to stop the array. Same procedure as shown above.

 

Now after a second (or third) reboot RC12a is back in the game again and I can start and stop the array without any problems.

 

Sorry, no logs. I can't get my hands on unRAID as soon as I hit STOP in the WebGUI.

 

Regards

 

Link to comment

Now after a fresh install of RC13 on two original LimeTech machines (MD1500/LL and MD1510/LL) I can confirm that RC13 does not stop my array anymore. Fresh reboot, no plugins, not even unMENU - nothing.

[...]

Sorry, no logs. I can't get my hands on unRAID as soon I hit STOP in the WebGUI.

 

...looks like the root-fs gets unmounted too.

What about establishing a telnet session before you hit the stop button?...maybe it will continue to work.

..or enter "tail -f <path-to-logfile>" in that telnet session and at least take a screenshot of the last lines, or enable trace/logging in your telnet client (damn, I can't remember where the syslog resides... /var/syslog, /var/messages, ... I haven't had unRAID running for a long time ;-)

Link to comment

What about establishing a telnet session before you hit the stop button?...maybe it will continue to work.

 

I wrote that in my first post: I once had an open Telnet session during shutdown. The first machine had a problem with the WebGUI and Telnet and was unresponsive. So I tried to shut down the second machine with /sbin/powerdown. This window was flooded with spindown messages scrolling through my Telnet session, so it became unresponsive too.

 

As always I'm kicking my own ass that I didn't take screenshots, etc.. But whenever one of my two machines has a problem my brain is off because I do have 0.0% knowledge of Linux, Hardware, etc..

 

Seems that I'm stuck on RC12a. It's working happily since months. It's ok for me.

 

Link to comment
