unRAID Server Release 5.0-rc13 Available


Recommended Posts

Without getting into this too much, I'll just list the reason I use NFS on my OpenELEC box: the CPU utilization is noticeably lower and the throughput higher. Doesn't matter on most OpenELEC machines, but is important for the ones with weaker CPUs such as the ARM-based ones. This is also true of many media streamers that use ARM CPUs.

 

 

These are also excellent points. I also don't think telling people to just point directly to the disks when using NFS is a very good answer when user shares are one of the product's main selling points.

Link to comment

I think we are going to need to start a separate discussion about NFS...

 

Fair enough ... where should we take it?

 

What I'm going to have to do is disable NFS access for user shares (but not disk shares).

 

Provided that data integrity is not compromised, I don't see that it's really necessary to completely disable NFS access to user shares.

 

In a nutshell the issue is that NFS requires a persistent "handle" associated with every file system object that is unique and doesn't change as long as that object exists

 

Agreed, so far.

 

, and is never re-used once the object is deleted, and persists across server resets.

 

I cannot believe that all implementations of NFS meet these criteria.  For instance, my Ubuntu desktop machines share files via NFS and I'm pretty certain that they don't have a persistent database of used file handles.  I would absolutely NOT expect a share to be accessible across a server reset.  Logically, the share vanishes when the server is stopped - why should it magically reappear when the server restarts?  This is precisely one of the situations which the 'stale file handle' error is there for.  In such a circumstance, the client should expect to discard the mount of the remote share, and remount it.  The only real problem here is how to ensure that where the server re-uses a file handle, a client isn't still expecting to refer to a different file.

 

The 'noforget' option talked about meets these requirements, at the cost of an incrementally increasing memory footprint, with the exception of "persistence across server resets", which is a killer, unfortunately.

 

But this is, demonstrably, not true.  I am definitely not resetting my server between starting my mkvmerge process and attempting to re-display the share directory.

 

PeterB, I know you are going to ask, "why does it work with -rc10"?  The answer is, "by accident".  Eventually it won't work, or worse (meaning the client tries to read fileA and gets the contents of fileB instead).

 

Okay, now I understand (in principle, if not in detail)!

 

How to work around it?  Use SMB instead of NFS for user share access.  SMB (actually SMB2, soon SMB3) is pretty good these days, so I would ask, why use NFS with OpenELEC?  I use SMB for all media clients, including OpenELEC, without any issues.

 

Actually, I've never had a problem with OpenELEC/NFS (until rc13, where something seems to be broken), but have only ever seen a stale NFS file handle error when writing to the share from my Ubuntu desktop.

 

[Actually for OpenELEC and XBMC you could associate all your disk shares with Movies or TV, etc if you're really keen on NFS.]

 

Yes, the xbmc/OpenELEC sources file could be modified in this way, but would have to be amended on each client whenever a new drive is added to the share.  But, as I say, xbmc has never been a problem for me, so please don't disable NFS for user shares.

 

Can this issue be fixed? Of course, all issues can be fixed and all features can be implemented - it's just a matter of time and resources.  For example, look at AFP.  AFP is very similar to NFS at the protocol level.  Instead of "file handles", AFP uses "CNIDs", but it's the exact same concept (actually it's even far more restrictive than NFS handles).  How does netatalk solve this?  By using a full-blown database to store CNID-to-filename mappings.  Can I do this with user shares to support NFS?  Sure, but does it make sense for me to put off all other development to implement this feature?  Hard to justify it.

 

Yes, I understand the constraints on your time.  However, as another has suggested - making existing features work acceptably should have priority over adding new features.

 

There are a couple other possible solutions:

a) use NFSv4, which has the concept of a "volatile" file handle.  Read about it here:

http://docs.oracle.com/cd/E19082-01/819-1634/rfsrefer-137/index.html

 

If NFSv4 'fixes' the problem, what is involved in getting v4 working in unRAID?  I have found that I have to force my Ubuntu machines to use v3 for the unRAID shares in order to make them work at all.

 

b) use a user-space NFS server.  There are a few user-space NFS server projects out there, without a lot of traction though.

 

I would think that is totally unnecessary, and would only introduce other headaches for you.

 

Here's what I ask.  After reading this, go outside and kick something and then let's talk about this in an objective manner.  Probably the solution is to use NFSv4.

 

No need to kick anything ... I am perfectly calm about this issue, only keen to understand and assist in any way possible.

 

I think that you are suggesting that it is unwise to carry on using rc10 just because it appears to work, so I will move to rc12 and accept that I have to work around the stale handle problem for now (either by waiting, or forcing umount/mount).  However, I do have a serious plea - can you ensure that OpenELEC can read NFS user shares before releasing 5.0 final?

Link to comment

UnRAID 4.7 is the "official" release even today, and is a rock-solid product as a storage server, which is what it's sold as.   

 

Unless you want 3TB+ support.....

As for issues with various plugins and add-ons ... NOT your problem !!    Folks will always modify technology gadgets ... but doing so is at THEIR risk -- not the provider's.    Consider how many phones, routers, game consoles, etc. have developed entire communities dedicated to modifying them to do other things !!    Nobody expects to go back to the manufacturer to resolve any problems these mods produce  :)

 

Add-ons from version 5 onward are supported. No third-party software needs to be installed to add the ability to install add-ons. Hell, one was even made by Tom. Everything you mentioned is modified to install and run core unsupported code. (Also remember, a lot of the time the mods fix problems the manufacturer neglects to fix.) This product also gets compared to NAS devices where plugins are supported, which directly compete with unRAID. If anything, unRAID should try to cater to add-on developers and help ensure the add-ons they produce are compatible (which sounds like it is happening, i.e. Simple Features). It's extremely fortunate that the community has created all these add-ons to greatly enhance the usefulness of unRAID. Without these add-ons it would be a significantly less valuable product. I would venture to guess most people here use add-ons, if not in unRAID itself then in a VM alongside or within unRAID.

Link to comment

Tom#1,

 

Is there a reason you did not set the oom_score_adj for emhttp to -1000 in rc13?  I saw it is still set to "0", making emhttp as likely a candidate as any to be killed in an OOM situation. 

 

Joe L.

 

Didn't see a response to this one...Tom?

That's trivial to put into your 'go' script and/or implement via a plugin, not necessarily something I want to put in as a 'hardcoded' policy.  I guess I could be talked into it.
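For reference, a hedged sketch of what such a go-script one-liner does: it writes a value into /proc/&lt;pid&gt;/oom_score_adj. The Python below only reads the current process's own score to show the interface; the emhttp PID lookup and the -1000 value come from Joe L.'s request, everything else is illustrative.

```python
import os

# The go-script one-liner being discussed would amount to (run as root):
#   echo -1000 > /proc/$(pidof emhttp)/oom_score_adj
# (the pidof lookup is illustrative, not from the unRAID distribution)
#
# Lowering a score to -1000 (exempt from the OOM killer) requires root,
# so here we just read the current process's own score to show the file.
with open(f"/proc/{os.getpid()}/oom_score_adj") as f:
    score = int(f.read().strip())

# The kernel accepts values from -1000 (never kill) to +1000 (kill first).
assert -1000 <= score <= 1000
```

Writing it from the go script (which runs as root at boot) is exactly the "plugin or one-liner" option Tom describes.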

Link to comment

UnRAID 4.7 is the "official" release even today, and is a rock-solid product as a storage server, which is what it's sold as.   

 

Unless you want 3TB+ support.....

As for issues with various plugins and add-ons ... NOT your problem !!    Folks will always modify technology gadgets ... but doing so is at THEIR risk -- not the providers.    Consider how many phones, routers, game consoles, etc. that have developed entire communities dedicated to modifying them to do other things !!    Nobody expects to go back to the manufacturer to resolve any problems these mods produce  :)

 

Add-ons from version 5 onward are supported. No third-party software needs to be installed to add the ability to install add-ons. Hell, one was even made by Tom. Everything you mentioned is modified to install and run core unsupported code. (Also remember, a lot of the time the mods fix problems the manufacturer neglects to fix.) This product also gets compared to NAS devices where plugins are supported, which directly compete with unRAID. If anything, unRAID should try to cater to add-on developers and help ensure the add-ons they produce are compatible (which sounds like it is happening, i.e. Simple Features). It's extremely fortunate that the community has created all these add-ons to greatly enhance the usefulness of unRAID. Without these add-ons it would be a significantly less valuable product. I would venture to guess most people here use add-ons, if not in unRAID itself then in a VM alongside or within unRAID.

I agree completely with the 'add-on' policy that they should be supported 100%.  I only ask that in this never-fricking-ending -rc phase one should disable plugins, run the new release to verify basic stuff works for you, then start enabling plugins.  This way, if there is an issue, it is easier to identify where it starts happening.

Link to comment

Tom#1,

 

Is there a reason you did not set the oom_score_adj for emhttp to -1000 in rc13?  I saw it is still set to "0", making emhttp as likely a candidate as any to be killed in an OOM situation. 

 

Joe L.

Didn't see a response to this one...Tom?

That's trivial to put into your 'go' script and/or implement via a plugin, not necessarily something I want to put in as a 'hardcoded' policy.  I guess I could be talked into it.

Since emhttp cannot be re-started without core-dumping in the 5.X series (or, at least, not without un-mounting all the disks), it is best if emhttp is not a prime candidate to be killed in an out-of-memory condition.

Once emhttp is killed, there is no easy way for most unRAID users to stop the array cleanly.

 

If an add-on used all the memory, best let the add-on be killed first.

 

Yes, it can be a plugin to be optionally installed, or even a line in the config/go script.

(I can accept either solution since it does not need to be compiled into emhttp.)

Link to comment

I cannot believe that all implementations of NFS meet these criteria.  For instance, my Ubuntu desktop machines share files via NFS and I'm pretty certain that they don't have a persistent database of used file handles.

They do, they're called "inodes".  An inode is an on-disk data structure that (for file objects) lists the set of blocks comprising a file, along with other info such as permission bits.  For directory objects, an inode lists the set of file names that are in that directory.  Each file name entry in a directory inode also includes the "inode number" of the file it refers to.  Given an inode number, the kernel can rapidly read that inode into memory and thus know what blocks comprise the file.  As you traverse various directories, the directory and file inodes get cached in the Linux inode cache for quick access.  When NFS looks up a file, the kernel builds a "handle" that includes the file's inode number.  This way, when the kernel receives a request to access a certain file by file handle, it can just use the inode number stored in the handle to quickly find information about the file on disk.  [By the way, what if a client takes an NFS handle, changes the part where the inode number is stored, replaces it with a different inode number, and then asks the server for that file - what happens?  Well, this is a well-known security issue with NFS - the kernel may indeed return the file contents of the spoofed inode number, but I digress...]
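The inode numbers described above are visible from userspace via stat(). A toy Python demonstration (the temp files here are my own, nothing unRAID-specific) of the two properties an NFS handle relies on: the number is stable over time, and it belongs to the object rather than to the name.

```python
import os
import tempfile

# Create a file and look at its inode number - the on-disk identifier
# an NFS server embeds in the file handle it gives to clients.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "example.txt")
with open(path, "w") as f:
    f.write("hello")

ino1 = os.stat(path).st_ino
ino2 = os.stat(path).st_ino
# The inode number is stable for the life of the object...
assert ino1 == ino2

# ...and it belongs to the object, not the name: a hard link to the
# same file reports the same inode number.
link = os.path.join(tmpdir, "alias.txt")
os.link(path, link)
assert os.stat(link).st_ino == ino1
```

This is exactly the name-to-inode mapping a directory inode stores: two directory entries, one inode.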

  I would absolutely NOT expect a share to be accessible across a server reset.  Logically, the share vanishes when the server is stopped - why should it magically reappear when the server restarts?

Actually that's not true.  If a server resets while a client still has an NFS mount active, and then comes back before the client starts accessing files again, everything just works - the client doesn't even know the server reset.  This is one of the design features of NFS.

This is precisely one of the situations which the 'stale file handle' error is there for.

"Stale file handles" are meant to handle the case where one client, say, deletes a file for which another client still has a cached file handle on the client side.  It's more-or-less the equivalent of "file no longer exists".

In such a circumstance, the client should expect to discard the mount of the remote share, and remount it.  The only real problem here is how to ensure that where the server re-uses a file handle, a client isn't still expecting to refer to a different file.

Re-mounting occurs because of buggy NFS client/server code; it should not have to happen if everything is working according to the spec.  The handle-reuse problem is solved via a "generation" number, also passed in the handle.
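The generation-number mechanism can be modelled in a few lines: the server issues (inode, generation) handles, bumps the generation when an inode number is recycled, and an old handle then fails validation with ESTALE instead of silently reading the new file's contents. This is an illustrative toy, not the kernel's actual data structures:

```python
class ToyHandleServer:
    """Minimal model of NFS handle validation via generation numbers."""

    def __init__(self):
        self.generation = {}   # inode number -> current generation
        self.contents = {}     # inode number -> file data

    def create(self, inode, data):
        # Recycling an inode number bumps its generation, invalidating
        # any handle issued for the previous occupant.
        self.generation[inode] = self.generation.get(inode, 0) + 1
        self.contents[inode] = data
        return (inode, self.generation[inode])   # the "file handle"

    def read(self, handle):
        inode, gen = handle
        if self.generation.get(inode) != gen:
            raise OSError("ESTALE: stale file handle")
        return self.contents[inode]

srv = ToyHandleServer()
h_old = srv.create(42, "fileA")
h_new = srv.create(42, "fileB")   # inode 42 deleted and re-used
assert srv.read(h_new) == "fileB"
# The client holding the old handle gets ESTALE, never fileB's contents.
try:
    srv.read(h_old)
    stale = False
except OSError:
    stale = True
assert stale
```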

The 'noforget' option talked about meets these requirements, at the cost of an incrementally increasing memory footprint, with the exception of "persistence across server resets", which is a killer, unfortunately.

 

But this is, demonstrably, not true.  I am definitely not resetting my server between starting my mkvmerge process and attempting to re-display the share directory.

Well I haven't been able to reproduce exactly the symptoms you see, which is what makes this issue a little baffling.  I did make a change around -rc10 or so, where I use the upper bits of the 'st_ino' field returned by a stat() call as a bitmask of disks that the object is on (to identify duplicates).  My theory is that somewhere in your code stack there is something that is using this field to construct a file handle, but I haven't spent the time to verify this.
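To make the suspected failure mode concrete, here is a toy model of that change. The 48-bit split and the helper names are assumptions for illustration, not the actual shfs layout; the point is only that any client code treating the whole 64-bit st_ino value as a stable handle sees it change when the duplicate-disk bitmask changes:

```python
INO_BITS = 48  # assumed split; the real layout in shfs isn't documented here

def pack(inode, disk_mask):
    # Stash a bitmask of disks holding the object in the upper bits.
    return (disk_mask << INO_BITS) | inode

def unpack(st_ino):
    return st_ino & ((1 << INO_BITS) - 1), st_ino >> INO_BITS

# Same file, but a duplicate appears on disk 3: the packed value changes
# even though the underlying inode number did not.
before = pack(123456, 0b0001)   # copy on disk 1 only
after  = pack(123456, 0b0101)   # now also on disk 3
assert unpack(before)[0] == unpack(after)[0] == 123456
assert before != after   # a handle built from the raw value goes stale
```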

 

So I'm not ready to give up on NFS yet, sorry for my little rant earlier.  To elaborate a little bit about the actual issue... refer back to the explanation above about inodes.  The user share file system (shfs) is not a "real" file system in the sense that ext3 or reiserfs or ntfs is a "real" file system.  shfs is a "stacked" file system, meaning it "looks like" a file system to the Linux kernel, but it doesn't maintain any on-disk data structures of its own.  As you traverse directories in /mnt/user, "inodes" are created on-the-fly in memory and left in memory (this is done by FUSE).  When NFS needs a handle, FUSE returns the in-memory inode number.  But FUSE will age these "inode" structures and discard them over time.  This is what causes the "stale file handle": if an NFS request comes in for an in-memory FUSE inode that has been discarded, FUSE has no choice but to say, "damn, that inode is gone".  The "noforget" option tells FUSE not to discard the in-memory inodes.  But suppose there's a server reset?  Well, now all the inodes are gone.

Link to comment

the webgui becomes inaccessible after waking from sleep mode.

Wake from sleep IS NOT a supported feature of unRAID.

Huh? If S3 sleep is a well-supported feature in Slackware, then why did you decide that it is not supported in unRAID?  And forgive me for going a little off topic, but there is something that I've been meaning to ask you for a long time: why do you feel so strongly that you have to shoot down any hint of a possible bug in unRaid?  If you were getting paid to do that, then I would understand.  Otherwise it's just strange.  The thing is, you aren't doing unRaid any service that way.

S3 sleep in the past was problematic because of "buggy" motherboard BIOSes.  I didn't want to get into a situation of having to write all kinds of specialized modules to support every motherboard out there.  These days the situation is much better and I agree that S3 sleep support should be "standard".

Link to comment

My drives will not spin down. I did a manual spin down and my syslog is attached. In addition, the webgui becomes inaccessible after waking from sleep mode.

Wake from sleep IS NOT a supported feature of unRAID.

 

Yes I know, but it is still a regression. Also, I feel it is not as big of an issue as the drives not spinning down.

 

I should point out that unRAID itself is still accessible after waking from sleep. I can access user shares and telnet in. This one could also be a bug in the webGUI that is unrelated to the unRAID core.

Link to comment

Huh? If S3 sleep is a well-supported feature in Slackware, then why did you decide that it is not supported in unRAID?  And forgive me for going a little off topic, but there is something that I've been meaning to ask you for a long time: why do you feel so strongly that you have to shoot down any hint of a possible bug in unRaid?  If you were getting paid to do that, then I would understand.  Otherwise it's just strange.  The thing is, you aren't doing unRaid any service that way.

I don't much pay attention to Slackware proper, but I know from the unRAID side that sleep has never been a supported function.  If it were, I would imagine that it would be somewhere in the webGUI so that it could be easily accessed.  I am not shooting down a bug, since that functionality has never been supported.  Any sleep functionality is something that a user has played with and managed to make work for them at that point in time.

 

I had no intention of making my reply come across as me yelling. The caps for those couple of words were to emphasize the "is not" part of the statement.  If I were yelling, the entire sentence would have been in caps.

Link to comment

My drives will not spin down. I did a manual spin down and my syslog is attached. In addition, the webgui becomes inaccessible after waking from sleep mode.

Wake from sleep IS NOT a supported feature of unRAID.

 

Yes I know, but it is still a regression. Also, I feel it is not as big of an issue as the drives not spinning down.

It may be a regression on your particular motherboard but it may still be working fine on others.

 

I played with sleep on my dev/test server and could never get it to work correctly.  I gave up after a while because I had better things to mess with, and if I really wanted to save money, shutting the server down was the easier thing to do.  I got tired of turning the thing off, and my dev/test server runs all the time now.

Link to comment

Tom#1,

 

Is there a reason you did not set the oom_score_adj for emhttp to -1000 in rc13?  I saw it is still set to "0", making emhttp as likely a candidate as any to be killed in an OOM situation. 

 

Joe L.

 

Didn't see a response to this one...Tom?

That's trivial to put into your 'go' script and/or implement via a plugin, not necessarily something I want to put in as a 'hardcoded' policy.  I guess I could be talked into it.

 

Agreed, it's in my go script. But I think it's more to help out users that aren't aware, and it will provide a bit of protection like Joe mentioned. Coming already set would be better than having users add this themselves. Maybe just have it as part of the default go script? Though is there really any negative to having it hardcoded?

Link to comment

Huh? If S3 sleep is a well-supported feature in Slackware, then why did you decide that it is not supported in unRAID?  And forgive me for going a little off topic, but there is something that I've been meaning to ask you for a long time: why do you feel so strongly that you have to shoot down any hint of a possible bug in unRaid?  If you were getting paid to do that, then I would understand.  Otherwise it's just strange.  The thing is, you aren't doing unRaid any service that way.

I don't much pay attention to Slackware proper, but I know from the unRAID side that sleep has never been a supported function.  If it were, I would imagine that it would be somewhere in the webGUI so that it could be easily accessed.  I am not shooting down a bug, since that functionality has never been supported.  Any sleep functionality is something that a user has played with and managed to make work for them at that point in time.

 

I had no intention of making my reply come across as me yelling. The caps for those couple of words were to emphasize the "is not" part of the statement.  If I were yelling, the entire sentence would have been in caps.

 

My question to you was meant to be more general, not just about that S3 issue.  I've been reading your posts for quite a while now.  You have this desire to "defend" unRaid at all costs, and to fight and defeat any mention of shortcomings.  Don't take it too hard, I am sure that you are an intelligent person who can take a hint.  So leave it at that.

 

Forum posts and email are notorious for not conveying nuances and intentions correctly.  If I can just get most of these nagging problems solved then most of this goes away.  It's like a sports team - when a team is losing every negative thing is magnified, some deservedly, some not.  But when the team is winning lots of stuff is overlooked.  Sorry the limetech "team" has had a bad streak, but rest assured we are trying to take steps to turn it around.

Link to comment

The amount of communication and detailed responses lately have helped greatly, I'm sure. Personally I was never upset with the lack of communication, but not knowing what was going on made it feel like progress was slow or non-existent. Things are definitely looking up, though personally I haven't had major problems with unRAID.

 

Keep up the good work Tom x2!

Link to comment

... lesson on use of inodes ...

 

Thanks for your patience in correcting my understanding.  Clearly, if I had nothing else to do, I should go away and read all the NFS RFCs!

 

 

The 'noforget' option talked about meets these requirements, at the cost of an incrementally increasing memory footprint, with the exception of "persistence across server resets", which is a killer, unfortunately.

 

But this is, demonstrably, not true.  I am definitely not resetting my server between starting my mkvmerge process and attempting to re-display the share directory.

Well I haven't been able to reproduce exactly the symptoms you see, which is what makes this issue a little baffling.  I did make a change around -rc10 or so, where I use the upper bits of the 'st_ino' field returned by a stat() call as a bitmask of disks that the object is on (to identify duplicates).  My theory is that somewhere in your code stack there is something that is using this field to construct a file handle, but I haven't spent the time to verify this.

 

Okay, if that change was definitely between rc10 and rc11, then that could be an explanation.

In your efforts to reproduce, are you actually running mkvmerge?  Would it help if I described, in precise detail, exactly what I do?

 

So I'm not ready to give up on NFS yet, sorry for my little rant earlier.

 

Hey, no problem.

 

To elaborate a little bit about the actual issue...

 

[snip]

 

The "noforget" option tells fuse to not discard the in-memory inodes.  But suppose there's a server reset?  Well now all the inodes are gone.

 

Well, provided that the expectation that inodes are preserved doesn't pose a risk of data corruption, I really can't get too worked up about problems when the server gets reset.  What does frustrate me is when I access a share, perform an operation, and then can no longer access the share.

 

Any comments on the difficulty of adding NFSv4 support to unRAID?

 

Any comments on the inability of OpenELEC to access user shares on rc13?  At least two of us have this problem.

Link to comment

Any comments on the difficulty of adding NFSv4 support to unRAID?

I'm looking at that now.

 

Any comments on the inability of OpenELEC to access user shares on rc13?  At least two of us have this problem.

Tomorrow I'll set up an OpenELEC client and try out NFS.  Most of my usual testing is just using the Linux command line to connect, as in:

 

mount 192.168.1.10:/mnt/user/Movies /data/Movies

 

No other options.. all works fine (with 'noforget' set).

 

Also, in that 'extra.cfg' file, you could use this instead:

 

shfsExtra="-o remember=xxx"

 

where "xxx" is a timeout in seconds.  After no inode access for "timeout" seconds, FUSE will mark the inode for possible reuse.  So in the last -rc it was set to 330, which is 5 1/2 minutes.  Why this number?  Typical NFS clients will cache NFS handles for no longer than 5 minutes.  You can also set 'xxx' to -1, which is exactly equivalent to using "-o noforget".
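The interplay between the client's handle cache and FUSE's 'remember' timeout can be modelled with a fake clock: as long as the server's timeout exceeds the client's (roughly 300-second) cache lifetime, a cached handle is re-validated in time; with a shorter timeout the server forgets first and the client gets ESTALE. A toy model under those assumptions:

```python
class ToyFuseTable:
    """Model of FUSE's in-memory inode table with a 'remember' timeout."""

    def __init__(self, remember):
        self.remember = remember   # seconds, like -o remember=xxx
        self.last_access = {}      # inode number -> last access time

    def lookup(self, inode, now):
        self.last_access[inode] = now

    def access_by_handle(self, inode, now):
        seen = self.last_access.get(inode)
        if seen is None or now - seen > self.remember:
            raise OSError("ESTALE: FUSE already forgot this inode")
        self.last_access[inode] = now

# Client caches a handle for up to ~300 s; server remembers for 330 s.
table = ToyFuseTable(remember=330)
table.lookup(7, now=0)
table.access_by_handle(7, now=299)   # within both windows: fine

# With a shorter remember value the same access pattern fails:
short = ToyFuseTable(remember=250)
short.lookup(7, now=0)
try:
    short.access_by_handle(7, now=299)
    ok = True
except OSError:
    ok = False
assert not ok
```

This is why 330 (a little over the typical 5-minute client cache) works and a smaller value would not; remember=-1 simply disables the aging entirely.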

Link to comment

My drives will not spin down. I did a manual spin down and my syslog is attached. In addition, the webgui becomes inaccessible after waking from sleep mode.

Wake from sleep IS NOT a supported feature of unRAID.

 

Yes I know, but it is still a regression. Also, I feel it is not as big of an issue as the drives not spinning down.

It may be a regression on your particular motherboard but it may still be working fine on others.

 

I played with sleep on my dev/test server and could never get it to work correctly.  I gave up after a while because I had better things to mess with, and if I really wanted to save money, shutting the server down was the easier thing to do.  I got tired of turning the thing off, and my dev/test server runs all the time now.

 

Yes of course, it most likely is specific to my motherboard due to some kernel problem or something outside of Tom's control.

Link to comment

Tom#1,

 

Is there a reason you did not set the oom_score_adj for emhttp to -1000 in rc13?  I saw it is still set to "0", making emhttp as likely a candidate as any to be killed in an OOM situation. 

 

Joe L.

 

Didn't see a response to this one...Tom?

That's trivial to put into your 'go' script and/or implement via a plugin, not necessarily something I want to put in as a 'hardcoded' policy.  I guess I could be talked into it.

 

Agreed, it's in my go script. But I think it's more to help out users that aren't aware, and it will provide a bit of protection like Joe mentioned. Coming already set would be better than having users add this themselves. Maybe just have it as part of the default go script? Though is there really any negative to having it hardcoded?

 

Right, so IF Tom adds the one-liner to the go script, it would be a hardcoding in the go file that comes with the source distribution, and if anyone wanted to comment it out because they were testing something, they could. But otherwise it's pre-set in the go script and not buried and hardcoded in unRAID itself. Would that make sense / be acceptable to all?

Link to comment

Huh? If S3 sleep is a well-supported feature in Slackware, then why did you decide that it is not supported in unRAID?  And forgive me for going a little off topic, but there is something that I've been meaning to ask you for a long time: why do you feel so strongly that you have to shoot down any hint of a possible bug in unRaid?  If you were getting paid to do that, then I would understand.  Otherwise it's just strange.  The thing is, you aren't doing unRaid any service that way.

I don't much pay attention to Slackware proper, but I know from the unRAID side that sleep has never been a supported function.  If it were, I would imagine that it would be somewhere in the webGUI so that it could be easily accessed.  I am not shooting down a bug, since that functionality has never been supported.  Any sleep functionality is something that a user has played with and managed to make work for them at that point in time.

 

I had no intention of making my reply come across as me yelling. The caps for those couple of words were to emphasize the "is not" part of the statement.  If I were yelling, the entire sentence would have been in caps.

 

My question to you was meant to be more general, not just about that S3 issue.  I've been reading your posts for quite a while now.  You have this desire to "defend" unRaid at all costs, and to fight and defeat any mention of shortcomings.  Don't take it too hard, I am sure that you are an intelligent person who can take a hint.  So leave it at that.

 

Forum posts and email are notorious for not conveying nuances and intentions correctly.  If I can just get most of these nagging problems solved then most of this goes away.  It's like a sports team - when a team is losing every negative thing is magnified, some deservedly, some not.  But when the team is winning lots of stuff is overlooked.  Sorry the limetech "team" has had a bad streak, but rest assured we are trying to take steps to turn it around.

This is not aimed at you, Tom, just a general statement so I don't have to pick on only prostuff1's posting. It's because of quick swatting remarks, without thought to the fact that there are various scenarios important to each person here: power consumption because it's expensive in their town/country, limited space, a particular protocol, a small array, the largest array possible, no add-ons wanted, a bunch of add-ons wanted, physical/virtual unRaid, and the list goes on. So that is why it goes a longggg way that on these sensitive requests, complaints, questions, etc. either yourself or T2 answer, or a link to your written support page is provided by a mod or die-hard. The people need to hear from the king/president, not the army all the time  ;)

 

I am sure a common ground can be found. I know everyone appreciates the help they receive on the support threads from the experienced members and mods. But new issues, etc. need your acknowledgement as valid or not, even if you don't plan on making a change to something.

Link to comment

and the screen capture

 

 

I'm having this same issue sporadically when stopping the array. The browser interface and console become non-responsive and I get the same type of info on the console as in this screen capture. I am forced to power off. Upon reboot, the browser interface reports an unclean shutdown of course. Info from the console indicates that volume sda1 (the flash drive) was not properly unmounted and some data may be corrupt. I have included my syslog report.

syslog_2013-06-06_01.09.07.txt

Link to comment
