unRAID Server Release 5.0-rc13 Available



Working ... until I do another mkvmerge, whereupon the destination directory (my user share) disappears from directory listings of tower/mnt/user, and direct attempts to mount it result in a stale NFS file handle error again.  Several seconds pass between issuing the command to open the directory and the error message being displayed.  At some later time, without any deliberate action on my part, the user share becomes accessible once again.

If you can somehow elaborate on this I would appreciate it. For example, can you provide the mkvmerge command being executed, showing all input & output file paths?
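
Something along these lines, with the real paths filled in (the ones below are made up), is the level of detail that would help:

  mkvmerge -o /mnt/user/Movies/output.mkv /mnt/user/Movies/input.mkv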


Tom#1,

 

Has the issue where disks could not be added to the array unless precleared been resolved in this release?

(I saw no mention of that fix in the release notes)

What are you referring to here?

 

Several folks have been unable to add new drives to v5 RC12a -- the clear won't start correctly; it just loops back to the screen where it's ready to start.    But if they pre-clear the drives first (using Joe L.'s script), they add with no problem.    (The usual pre-clear invocation is sketched after the links below.)  Here are a few examples (there are many more scattered throughout the forum):

 

http://lime-technology.com/forum/index.php?topic=27142.0

 

http://lime-technology.com/forum/index.php?topic=27123.0

 

http://lime-technology.com/forum/index.php?topic=27214.0
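
For reference, the pre-clear workaround is normally run from the console with something like this (the device name below is only an example -- triple-check it before running, since the script wipes the disk):

  preclear_disk.sh /dev/sdX    # Joe L.'s pre-clear script, run from wherever it was copied (often /boot)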


@limetech:

 

Tom, may I suggest something that probably won't take more than 10 minutes of your time?

 

I am sure you've automated yourself with a script that builds you a kernel. As a test, can you build one targeting 64-bit, without changing anything in your driver? If it builds, just slip it in with the same bzroot, with all the 32-bit stuff there without any changes, and pack it.  I'm sure it will boot unRAID up well.  We all may be seriously surprised what difference it will make for everybody who has more than 4GB on their system.  But if you think it will take you more than 10 minutes then forget about it.

Not that simple, because not only does the kernel need to be 64-bit, but every package installed in the system needs to be as well.  There is a library out there that lets 32-bit packages run on a 64-bit platform, but it has its own config issues.  That said, I have set up a 64-bit system and it works pretty well.  It's on my roadmap (which I will be posting on the wiki).
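
To illustrate why the whole userland matters, the standard file utility shows whether a given binary was built 32-bit or 64-bit (the path below is just an example):

  file /bin/bash
  # current builds report something like: ELF 32-bit LSB executable, Intel 80386 ...
  # a 64-bit userland would report:       ELF 64-bit LSB executable, x86-64 ...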


Tom#1,

 

Has the issue where disks could not be added to the array unless precleared been resolved in this release?

(I saw no mention of that fix in the release notes)

What are you referring to here?

Several folks have been unable to add new drives to v5 RC12a -- the clear won't start correctly; it just loops back to the screen where it's ready to start.    But if they pre-clear the drives first (using Joe L.'s script), they add with no problem.    Here are a few examples (there are many more scattered throughout the forum):

 

http://lime-technology.com/forum/index.php?topic=27142.0

 

http://lime-technology.com/forum/index.php?topic=27123.0

 

http://lime-technology.com/forum/index.php?topic=27214.0

 

Thank you, I'll take a look at those.


Now, after a fresh install of RC13 on two original LimeTech machines (MD1500/LL and MD1510/LL), I can confirm that RC13 can no longer stop my array. Fresh reboot, no plugins, not even unMENU - nothing.

 

Whenever I stop the array the following happens in that order (on both machines):

 

* I hit STOP on the WebGUI

* WebGUI shows some messages (spinup, sync, unmounting)

* At the unmounting step the process stops; there's no further action

* If I try to refresh the page (with the browser) the WebGUI is not found

* If I start PuTTY at that point I cannot enter my ID/password. The machine responds with the first line, but there's no prompt to enter my user ID.

* --> At that point I can't do anything. No WebGUI, no Telnet.

* Both machines are still running so I have to hard reset with the power button. This leads to a parity check.

 

I canceled the parity check, copied RC12a back onto both USB sticks and tried to stop the array. The same thing happened as described above.

 

Now after a second (or third) reboot RC12a is back in the game again and I can start and stop the array without any problems.

 

Sorry, no logs. I can't get my hands on unRAID as soon as I hit STOP in the WebGUI.

 

Regards

 

I am having the same issues.  I'm currently reverting to RC12a.


 

 

GFOviedo, can you please help us get to the bottom of that problem?

If so, here's what you can do:

 

1. Download the little script that's attached to my signature, and run it on your server (a rough sketch of what such a script does is at the end of this post).

2. Try to reproduce the problem on your server.

3. Once your server crashes, and you reboot it, look for the syslog saved in the "syslogs" folder on your USB flash disk.

4. Post that syslog here so we can inspect it and see what the problem may have been.

 

Thanks.
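
For the curious, the idea behind such a script is roughly this (an illustration only, not the actual script attached to the signature):

  # Periodically copy the live syslog to the flash drive so a copy survives a crash
  mkdir -p /boot/syslogs
  while true; do
      cp /var/log/syslog /boot/syslogs/syslog-latest.txt
      sleep 60
  done &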

 

Too late, I already reverted back to RC12a, and I'm currently running a parity check.  My wife wasn't very happy about this, since I was at work on one of the days the family and friends wanted to use the server.  Luckily she found the hard copy of the movie.


I know this has been brought up before... 

Why not install a bug tracker system (Bugzilla and Mantis come to mind)?

Following open issues in this thread is becoming ridiculous, not only for users but also, and especially, for Tom.

 

Sent from my GT-I9100 using Tapatalk 2

 

 


I know this has been brought up before... 

Why not install a bug tracker system (Bugzilla and Mantis come to mind)?

Following open issues in this thread is becoming ridiculous, not only for users but also, and especially, for Tom.

 

Sent from my GT-I9100 using Tapatalk 2

 

 

I have to agree. I meant to post this earlier. Trying to use a forum for bug tracking just doesn't work. I know Tom started a forum section that only he and mods can start new topics in, but I think a real bug tracking system would help keep issues more organized, for Tom's own sanity.


I know that this is way too late in the day to comment, but it may be a sensible move for the future.

 

Rather than treating this beta and RC software as final, it would help a lot if users actually helped with the bug tracking / solving, rather than just saying "me too".

 

To that end, even something simple like including the following:

 

 

1. Download the little script that's attached to my signature, and run it on your server.

2. Try to reproduce the problem on your server.

3. Once your server crashes, and you reboot it, look for the syslog saved in the "syslogs" folder on your USB flash disk.

4. Post that syslog here so we can inspect it and see what the problem may have been.

 

Thanks.

 

in the beta / RC releases would be helpful. That way it would only be a case of posting the file after the problem had happened, rather than having to recreate it.

 

We all choose to run the beta software; it is our responsibility to give Tom and the team as much information as possible to solve the problems.

 

Don't get me wrong, this isn't a complaint about anybody, and the 'me too's' do help insofar as they give some indication of the scale of the problem. It's just that the more information is given, the easier the problem is to track down.

 

Good luck Tom, I don't envy you.

 


I'm using High Water as the method to fill the drives in my array. Isn't that supposed to fill each drive halfway? When I was on RC12a my 3TB drives would go down to 1.47TB, 1.46TB free, etc. But since putting RC13 on, my 2TB drives have been getting data put on them, but instead of going down to 1TB they are only going down to 1.47TB, 1.46TB as well before moving on to putting data on the next drive.

 

Also, I did get a chance to shut down my unRAID box that is on RC13 again. This time, instead of powering down from the WebGUI, I pushed the power button on the case to initiate a shutdown. It shut down properly and came back up without any issues.


I'm using High Water as the method to fill the drives in my array. Isn't that supposed to fill each drive halfway? When I was on RC12a my 3TB drives would go down to 1.47TB, 1.46TB free, etc. But since putting RC13 on, my 2TB drives have been getting data put on them, but instead of going down to 1TB they are only going down to 1.47TB, 1.46TB as well before moving on to putting data on the next drive.

 

Also, I did get a chance to shut down my unRAID box that is on RC13 again. This time, instead of powering down from the WebGUI, I pushed the power button on the case to initiate a shutdown. It shut down properly and came back up without any issues.

 

Split level also impacts which drive gets written to.

 

Sent from a phone, sorry for any typos

 

 


I'm using High Water as the method to fill the drives in my array. Isn't that supposed to fill each drive halfway? When I was on RC12a my 3TB drives would go down to 1.47TB, 1.46TB free, etc. But since putting RC13 on, my 2TB drives have been getting data put on them, but instead of going down to 1TB they are only going down to 1.47TB, 1.46TB as well before moving on to putting data on the next drive.

As I understand it, the switch point is half the size of the LARGEST drive (and then subsequent halves of that value), not of each individual drive, so you are getting expected behaviour since you have 3TB drives in the system.    Having said that, I would agree that if it worked on each drive individually, so that they all ended up the same % filled, it would make a lot of sense.
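
A rough sketch of that behaviour (my own illustration, not unRAID's actual allocation code), with one 2TB disk alongside two 3TB disks:

  largest_gb=3000                          # largest disk in the array (3TB)
  switch_gb=$(( largest_gb / 2 ))          # first switch point = 1500GB free
  for disk_gb in 3000 3000 2000; do
      written_gb=$(( disk_gb - switch_gb ))
      echo "${disk_gb}GB disk: filled until ~${switch_gb}GB remains free (~${written_gb}GB written)"
  done
  echo "Once every disk is down to ${switch_gb}GB free, the switch point halves to $(( switch_gb / 2 ))GB."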


I now have a problem with AFP.  In short, after a clean reboot, I cannot connect to my unRAID server from my iMac via AFP.  This worked fine in the past.  I have found that if I disable AFP and then enable it again, it starts working and I can connect.  It seems to work until I reboot, at which point I have to disable and enable it again.

 

I don't have time to grab the syslog right now, but wanted to see if anyone else has this same problem.

 

Running the OS X maintenance script fixed this for me. In my case it was a caching issue. Paste the command below into Terminal; it will require your admin password:

 

sudo periodic daily weekly monthly

 

I assume I run this on my Mac and not the unRAID server, correct?


Guys FYI on the NFS Stale handles issue .... I think there is a tangent going on here ..... most of the

 

"emhttp: get_filesystem_status: statfs: /mnt/user/ftp Transport endpoint is not connected"

 

error messages are reported by *plex* media center plugin users!

 

The problem occurs whether you run NFS or not!

 

I'm wondering whatever made you think that there is a connection between the NFS stale file handles and the transport endpoint problems?


I now have a problem with AFP.  In short, after a clean reboot, I cannot connect to my unRAID server from my iMac via AFP.  This worked fine in the past.  I have found that if I disable AFP and then enable it again, it starts working and I can connect.  It seems to work until I reboot, at which point I have to disable and enable it again.

 

I don't have time to grab the syslog right now, but wanted to see if anyone else has this same problem.

 

Running the OS X maintenance script fixed this for me. In my case it was a caching issue. Paste the command below into Terminal; it will require your admin password:

 

sudo periodic daily weekly monthly

 

I assume I run this on my Mac and not the unRAID server, correct?

 

Yes. Run it on the Mac. It will have no effect if run on the server.


Guys FYI on the NFS Stale handles issue .... I think there is a tangent going on here ..... most of the

 

"emhttp: get_filesystem_status: statfs: /mnt/user/ftp Transport endpoint is not connected"

 

error messages are reported by *plex* media center plugin users!

 

The problem occurs whether you run NFS or not!

 

I'm wondering whatever made you think that there is a connection between the NFS stale file handles and the transport endpoint problems?

 

To be fair ... I was thinking of some Plex forum posts that tried to link the unRAID Plex Media Server crash with NFS/fuse-related issues and accessing large numbers of files.

 

In retrospect these *could* be two unrelated issues with fuse that result in loss of NFS access:

 

1) Accessing user share folders with large numbers of files via NFS => fuse triggers stale file handles => loss of NFS access.

2) Accessing user share folders with large numbers of files (via remote Plex clients or the local HTTP console), which triggers Plex Media Server running in unRAID user space to access the files locally => fuse goes la-la and NFS mounts to the user shares become unavailable as a side effect.

 

I can trigger the problem with the following load in 5 min 58 sec, where I believe PID 15887 was Plex Media Server (a note on how such counts can be generated follows the list):

 

  21178 shfs_getattr: ,PID: 15887

  20206 shfs_write:         ,PID: 19517

  20206 shfs_getxattr: ,PID: 19517

  18418 shfs_getxattr: ,PID: 15887

  8900 shfs_read:         ,PID: 15887

  7292 shfs_release: ,PID: 0

  5640 shfs_getattr: ,PID: 19490

  4914 shfs_getattr: ,PID: 19344

  4560 shfs_open:         ,PID: 15887

  4555 shfs_flush:         ,PID: 15887

  3517 shfs_getattr: ,PID: 19350

  3154 shfs_getattr: ,PID: 15942

  2376 shfs_read:         ,PID: 19490

  2284 shfs_open:         ,PID: 19490

  2284 shfs_flush:         ,PID: 19490

  2208 shfs_readdir: ,PID: 15887

    812 shfs_readdir: ,PID: 19490
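
In case anyone wants to pull the same numbers out of their own capture, counts like those above can be generated from a syslog with something along these lines (call type only; the per-PID split depends on the exact log format, so I've left it out):

  grep -o 'shfs_[a-z]*' /var/log/syslog | sort | uniq -c | sort -rn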

 

I have emailed the log file to Tom (thanks for your help, m8, it's appreciated).

 

I'll download the source for fuse and start to educate myself about it.

 

One thing I have spotted is that just before fuse appears to barf, there are a lot of the following syslog entries:

 

Jun  7 11:52:38 unRAID shfs/user: assign_disk_high_water: disk1 size 122092910 free 68001250

Jun  7 11:52:38 unRAID shfs/user: assign_disk_high_water: disk2 size 122092910 free 93972738

Jun  7 11:52:38 unRAID shfs/user: assign_disk: disk_path: /mnt/disk1

 

#####################################

UPDATE: SOLVED ?   

 

It looks like it's *not* Plex Media Server's access to the folders with large numbers of files in them that triggers the plex/fuse problem.

 

It's Plex Media Server's "own" access to its own local cache that triggers the problem, when accessing the PhotoTranscoder cache:

 

plex/tmp/Library/Application Support/Plex Media Server/Cache/PhotoTranscoder/

 

It seems to be a problem when Plex tries to write to the fuse user share:

 

"shfs_create: real_path"

 

This is what happens just before the NFS/fuse loss of access and the Plex hang/crash; all this activity happened in 1 second in the Plex transcoder cache folder (user share):

 

    10 shfs/user:,shfs_create:,pid:

    10 shfs/user:,shfs_create:,real_path:

    16 shfs/user:,shfs_flush:,pid:

    31 shfs/user:,shfs_getattr:,lookup:

    64 shfs/user:,shfs_getattr:,pid:

      9 shfs/user:,shfs_getxattr:,getxattr:

      9 shfs/user:,shfs_getxattr:,pid:

    10 shfs/user:,shfs_open:,pid:

    16 shfs/user:,shfs_release:,pid:

      6 shfs/user:,shfs_rename:,pid:

    10 shfs/user:,shfs_truncate:,pid:

      9 shfs/user:,shfs_write:,pid:

 

>> When I configure Plex Media Server's library and temp folders at the native disk mount:

 

/mnt/disk1/plex/tmp/Library                instead of /mnt/user/plex/tmp/Library

/mnt/disk1/plex/tmp                            instead of /mnt/user/plex/tmp

 

I can no longer recreate the problem of fuse user shares going la-la and killing the NFS mounts! Yay :-)

 

All I have to do now is run fuse in debug mode and see why Plex is able to kill the user share with seemingly so little activity!

 

I also ditched the Realtek-supplied r8168/9 drivers again

 

Can you elaborate on this?  Does this mean the drivers that were added to support older Realtek devices, such as mine, are once again not supported?

 

Back when, I had an issue where the server would simply stop responding and require a hard reset.  This was eventually tracked to the NIC, and since then it has been OK.  However, I was planning to remove the add-in card and return to the on-board Realtek, to eliminate the need for the extra card.  Now I am confused about whether or not that is a good idea.


I also ditched the Realtek-supplied r8168/9 drivers again

 

Can you elaborate on this?  Does this mean the drivers that were added to support older Realtek devices, such as mine, are once again not supported?

 

Back when, I had an issue where the server would simply stop responding and require a hard reset.  This was eventually tracked to the NIC, and since then it has been OK.  However, I was planning to remove the add-in card and return to the on-board Realtek, to eliminate the need for the extra card.  Now I am confused about whether or not that is a good idea.

 

Sure.  The Realtek-written drivers supplied here (http://code.google.com/p/r8168/) do not compile against the 3.9 kernel tree.  Realtek needs to correct the compilation bug and issue a new release, but I can't wait for that.  In looking at the kernel change logs between 3.4.x and 3.9.x there are several Realtek NIC driver changes, so I went back to the linux-written drivers.  The best way to determine if your NIC still works is to try it.
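
If you want to check which driver actually binds your NIC after upgrading, something like this will show it (eth0 assumed; ethtool may or may not be present on your build):

  ethtool -i eth0        # "driver: r8169" means the in-kernel driver is in use
  lsmod | grep r816      # shows whether an r8168 and/or r8169 module is loaded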


I'm using High Water as the method to fill the drives in my array. Isn't that supposed to fill each drive halfway? When I was on RC12a my 3TB drives would go down to 1.47TB, 1.46TB free, etc. But since putting RC13 on, my 2TB drives have been getting data put on them, but instead of going down to 1TB they are only going down to 1.47TB, 1.46TB as well before moving on to putting data on the next drive.

As I understand it, the switch point is half the size of the LARGEST drive (and then subsequent halves of that value), not of each individual drive, so you are getting expected behaviour since you have 3TB drives in the system.    Having said that, I would agree that if it worked on each drive individually, so that they all ended up the same % filled, it would make a lot of sense.

 

OK, thanks. I was thinking it was half of each individual drive; I didn't realize it was based on the largest drive. And this is the first time I've ever used anything over 2TB.

