unRAID Server Release 5.0-rc6-r8168-test Available


hi Joe,

 

So would you recommend that I move to rc6 TEST?  If that release fixed issues for you and you have the same mb as me, then perhaps I should use that build.

 

So that I understand, when you did have the issue, your network connection would actually drop?  I don't think I've ever had that issue where my server dropped off the network.

 

Thanks

Link to comment

The network connection would fail and whatever I was copying would stop.

 

Yes, I'd use the new "5.0-rc6-r8168-test" version if I were you.

Link to comment

I upgraded to this version today in the hopes that I would be able to use NFS instead of AFP from my Mac Mini, since I'm having issues with AFP: http://lime-technology.com/forum/index.php?topic=21155.0 ...  previously I was getting the "stale file handle" message all the time trying to use NFS.

 

Now I'm seeing some even more bizarre behavior with NFS, where directories become inaccessible or disappear for no obvious reason.  In this example, "content" is the name of my Mac Mini and "thedisk" is the name of the unRAID box.  I'm running this from the command line.  Here, I see the (correct) list of subdirectories when I do a standard "ls", but if I then try to cd into one, it says "No such file or directory":

 

content:ExampleShow mmccurdy$ pwd
/Volumes/thedisk-nfs/TV/ExampleShow
content:ExampleShow mmccurdy$ ls
Season 1	Season 2	Season 3
content:ExampleShow mmccurdy$ cd Season\ 1
-bash: cd: Season 1: No such file or directory
content:ExampleShow mmccurdy$ ls -l
ls: Season 1: No such file or directory
ls: Season 2: No such file or directory
total 1
drwxr-xr-x  1 1000  _lpoperator  112 Jul 18 05:05 Season 3
content:ExampleShow mmccurdy$ ls
Season 1	Season 2	Season 3

 

I am using automount to mount "TV," here is the line item from "mount" on the Mac Mini:

 

192.168.1.97:/mnt/user/TV on /Volumes/thedisk-nfs/TV (nfs, nodev, nosuid, automounted, nobrowse)

 

... and here is the line item from /etc/exports on the unRAID box:

 

/mnt/user/TV -async,no_subtree_check,fsid=102 *(rw,insecure,anonuid=1000,anongid=100,all_squash)

 

Everything will be working fine one minute, then suddenly I'll do something like try to enter (cd into) some directory, and I get the "No such file or directory" error.

 

Here is how the directories appear on the unRAID box itself:

root@thedisk:/mnt/user/TV/ExampleShow# ls -l
total 0
drwxrwsr-x 1 mmccurdy users  80 2012-07-11 12:07 Season\ 1/
drwxrwsr-x 1 mmccurdy users  80 2012-07-11 12:10 Season\ 2/
drwxr-xr-x 1 mmccurdy users 112 2012-07-18 05:05 Season\ 3/

 

There is nothing at all suspicious in the syslog when this is happening.  At least I'm no longer getting stale file handles I guess.  If no one else is having this problem, I guess it could be a config issue on my end, but it just seems odd that it would work fine one second and then not the next.
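Since the failure is intermittent, it may help to log exactly when a listed name stops being reachable. The following is a minimal sketch, not an NFS tool: `check_entries` is a hypothetical helper that walks the names a directory listing returns and reports any that cannot be stat'ed, which is the symptom in the session above. It ignores hidden files.

```shell
# check_entries DIR
# Prints a timestamped line for every listed name under DIR that cannot be
# stat'ed, and returns 1 if any were found. Hypothetical helper name.
check_entries() {
    dir=$1
    missing=0
    for path in "$dir"/*; do
        # An unmatched glob stays literal (empty directory); skip it.
        [ "$path" = "$dir/*" ] && continue
        if [ ! -e "$path" ]; then
            printf '%s cannot stat: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$path"
            missing=1
        fi
    done
    return "$missing"
}

# Example (path taken from the post above), polling every 10 seconds:
#   while :; do check_entries /Volumes/thedisk-nfs/TV/ExampleShow; sleep 10; done
```

Comparing the timestamps against the server's syslog might show whether the dropouts line up with anything on the unRAID side.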

Link to comment

Ok, I'm going to move to this version and I would like to use a new flash drive.  I have a Kingston 64 (super crazy read / write speeds) and I thought I would do a mostly fresh installation.  Allow me to define mostly. 

 

I want to use the make_bootable.bat and start with all fresh files from the AiO.  The only thing I want to take with me is the config directory, as it has my share, disk, and other config files.  I'm already running v5 rc5.  I just thought I would introduce a new flash drive for the new release.  My existing flash drive is from 2009 and has been in my unRAID server the entire time.
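Carrying over only the config directory could be sketched as below, assuming both flash drives are mounted on a Linux or macOS machine. `copy_config` and the mount points in the example are hypothetical; note that a later reply in this thread says the registration key is matched to the old flash drive, so copying the config alone may not carry the license over.

```shell
# copy_config OLD_FLASH NEW_FLASH
# Copies OLD_FLASH/config (including hidden files) into NEW_FLASH/config.
# Hypothetical helper; the arguments are mount points you supply.
copy_config() {
    old=$1
    new=$2
    if [ ! -d "$old/config" ]; then
        echo "no config directory in $old" >&2
        return 1
    fi
    mkdir -p "$new/config" &&
        cp -Rp "$old/config/." "$new/config/"
}

# Example with placeholder mount points:
#   copy_config /Volumes/OLDFLASH /Volumes/UNRAID
```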

 

I formatted the flash drive as instructed, leaving the format at the default, which is exFAT.  I then copied all the files from the AiO to the flash and set the volume name to UNRAID.  I went to a command prompt (yes, I ran it as Admin).  When I run the batch file, it confirms everything and issues the syslinux.exe command.  I then get an error that says "This doesn't look like a Valid FAT file system", and it errors out and ends.  Did I miss something?

 

 

Link to comment

ExFAT is not FAT. FAT is required.
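One way to catch this before running the bootable script is to look at the filesystem type the OS reports for the flash partition. A minimal sketch for a Linux shell: `is_fat_type` is a hypothetical helper that only classifies a type string, and `/dev/sdX1` in the example is a placeholder device.

```shell
# is_fat_type TYPE
# Returns 0 if TYPE names a FAT variant, 1 otherwise (for example exfat, ntfs).
# Hypothetical helper; it does not touch any device.
is_fat_type() {
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        vfat|msdos|fat|fat12|fat16|fat32) return 0 ;;
        *) return 1 ;;
    esac
}

# Example on Linux, where blkid reports "vfat" for FAT and "exfat" for exFAT:
#   is_fat_type "$(blkid -o value -s TYPE /dev/sdX1)" || echo "reformat as FAT32 first"
```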

Link to comment

Ah, I didn't remember that.  Man, I set up my original flash drive back in 2009 and have drunk way too many beers since then...LOL

 

Ok, so my options in the Win7 64-bit OS for formatting this flash drive are exFAT or NTFS.  No FAT option exists.  I expected to see FAT32 listed as an option at least, wtf.  I guess I will need to find another system that will allow me to format in FAT.
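The missing option is probably down to the drive's size: as far as I know, the built-in Windows format dialog does not offer FAT32 for volumes larger than 32 GB. On a Mac or Linux machine the drive can be formatted as FAT32 from the command line. The sketch below only prints the command rather than running it, because formatting erases the device; `print_fat32_format_cmd` is a hypothetical helper and the device names are placeholders you must replace.

```shell
# print_fat32_format_cmd OS DEVICE
# Prints (does not run) a FAT32 format command that labels the volume UNRAID.
# OS is the output of `uname -s`. Hypothetical helper.
print_fat32_format_cmd() {
    os=$1
    device=$2
    case $os in
        # diskutil takes the whole disk, e.g. /dev/disk2
        Darwin) printf 'diskutil eraseDisk FAT32 UNRAID MBRFormat %s\n' "$device" ;;
        # mkfs.vfat takes a partition, e.g. /dev/sdX1
        Linux)  printf 'mkfs.vfat -F 32 -n UNRAID %s\n' "$device" ;;
        *)      printf 'unsupported OS: %s\n' "$os" >&2; return 1 ;;
    esac
}

# Example:
#   print_fat32_format_cmd "$(uname -s)" /dev/sdX1
```

Double-check the device name before running whatever it prints.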

Link to comment

Just a quick note to let Tom know that I recently upgraded from 4.7 to 5.0-RC6-R8168-Test, and I've been up a week and have had zero issues.  My build is in my sig. 

 

For those that care, Parity Check dropped from ~53 MB/s to ~48 MB/s.

 

Thank you Tom.

Link to comment

I don't have the HP Format tool.

 

Ok, how about FAT32?  I was able to use my iMac to format it as FAT32.  If that won't work, I'll go find the HP tool.

 

Thanks

 

update:  the make_bootable batch file worked with it being FAT32.  I've copied everything and as soon as my parity check is complete, I'll reboot using this flash and this new release.  Almost there....

Link to comment

I'm running v5 rc6 test now, no issues with the new build.

 

I did have a weird thing happen when I tried to use my new flash drive.  As described above, I prepared the flash drive and then copied all the files from my existing flash drive to the new one.  I then replaced the two main files, bzroot and bzimage, with the files from v5 rc6 test.  The system booted normally, but once the server was running the array was not started and it only saw 2 of my 9 drives.  So I did a shutdown, put the old flash drive back in (with those files updated to v5 rc6), and it came up fine: the array was running and it sees all drives.  The new flash drive should have worked, unless there is something weird about having a 64 GB FAT32 drive.  I even made sure that my config files were the ones from my old (working) flash drive.

 

I'd like to understand why the new flash drive didn't work. 

 

Syslog attached.

 

Thanks!

syslog-20120724-184538_newflash.zip

Link to comment

Build: exact replica of Raj's Greenleaf-Technology.com's 20-drive budget tower: 2xAOC-SASLP-MV8, Realtek 8111DL

 

Up on r8168-test for well over a week with no troubles (except the drive swap issue mentioned earlier, which may or may not be r8168-related) until last night.  I've been transferring files almost non-stop since the system went live, parity speeds are normal, and I hadn't hit any of the reported errors.  Then I tried transferring another block of 500 GB or so and the transfer hung ("Windows cannot access file [on r8168-machine side, not transferring side]"), at which point I closed the error windows and checked the webgui.  Webgui seemed to lock, but finally refreshed after about 8 minutes.  I decided to wait until morning to mess with the system, and just tried stopping the array about 30 minutes ago; webgui is still locked in a loading pattern.

 

Unless I see a recommendation otherwise, I'm going for a hard shutdown at the 90-minute wait mark, and I'll provide a log as soon as I have everything up and running again.

 

EDIT: Syslog attached via Telnet - that's still responsive...

syslog.zip

Link to comment

Jul 25 14:34:35 Tower2 kernel: flush-9:4: page allocation failure: order:1, mode:0x8020
Jul 25 14:34:35 Tower2 kernel: Pid: 21691, comm: flush-9:4 Tainted: G          O 3.4.4-unRAID #1
Jul 25 14:34:35 Tower2 kernel: Call Trace:
Jul 25 14:34:35 Tower2 kernel:  [<c1062e3a>] warn_alloc_failed+0xbd/0xcf
Jul 25 14:34:35 Tower2 kernel:  [<c10636b2>] __alloc_pages_nodemask+0x47c/0x4a5
Jul 25 14:34:35 Tower2 kernel:  [<c1005858>] dma_generic_alloc_coherent+0x64/0xcc
Jul 25 14:34:35 Tower2 kernel:  [<c107f030>] ? T.603+0x23/0x126
Jul 25 14:34:35 Tower2 kernel:  [<c107f0c0>] T.603+0xb3/0x126
Jul 25 14:34:35 Tower2 kernel:  [<c10057f4>] ? dma_set_mask+0x37/0x37
......
Jul 25 14:34:35 Tower2 kernel: mvsas 0000:02:00.0: mvsas prep failed[0]!
Jul 25 14:34:35 Tower2 kernel: flush-9:4: page allocation failure: order:1, mode:0x8020
Jul 25 14:34:35 Tower2 kernel: Pid: 21691, comm: flush-9:4 Tainted: G          O 3.4.4-unRAID #1
Jul 25 14:34:35 Tower2 kernel: Call Trace:
Jul 25 14:34:35 Tower2 kernel:  [<c1062e3a>] warn_alloc_failed+0xbd/0xcf
Jul 25 14:34:35 Tower2 kernel:  [<c10636b2>] __alloc_pages_nodemask+0x47c/0x4a5
Jul 25 14:34:35 Tower2 kernel:  [<c1005858>] dma_generic_alloc_coherent+0x64/0xcc
Jul 25 14:34:35 Tower2 kernel:  [<c107f030>] ? T.603+0x23/0x126
Jul 25 14:34:35 Tower2 kernel:  [<c107f0c0>] T.603+0xb3/0x126
Jul 25 14:34:35 Tower2 kernel:  [<c10057f4>] ? dma_set_mask+0x37/0x37

 

Running a preclear as well?

Jul 25 15:40:44 Tower2 preclear_disk-diff[29666]: == invoked as: ./preclear_disk.sh -A /dev/sdl
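To see how often this happens and which processes are hitting it, the syslog can be summarised per process. A minimal sketch, assuming log lines shaped like the excerpt above; `alloc_failures_by_process` is a hypothetical helper and `/var/log/syslog` in the example is an assumed log location.

```shell
# alloc_failures_by_process LOGFILE
# Prints "process: count" for every process named in a
# "page allocation failure" line, sorted by process name.
# Hypothetical helper; assumes "... kernel: NAME: page allocation failure ..." lines.
alloc_failures_by_process() {
    awk -F'kernel: ' '
        /page allocation failure/ {
            sub(/: page allocation failure.*/, "", $2)   # keep only the process name
            count[$2]++
        }
        END { for (p in count) print p ": " count[p] }
    ' "$1" | sort
}

# Example:
#   alloc_failures_by_process /var/log/syslog
```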

Link to comment

I'd given that a shot already, but received "drive is busy" errors, so figured hard power was my only option - is there a way to manually override whatever process is causing the "busy" message?  Or are the "drive is busy" errors indicative of something even more serious?

 

EDIT: Should probably mention I've received the errors after "fuser -mvk" & "kill PID" attempts, in case anyone was going to suggest I try those steps.

Link to comment

"lsof /mnt" gives me nothing - just bumps me back to the command line, no spaces or anything...  What's weird is when I go back to "fuser -mvk" after the "lsof /mnt", it still shows the "busy" process, but when I type "kill [PID]," it comes back with "No such process".

 

EDIT: Hard rebooted, running a non-modification parity check now; if anything turns out strange, I'll let everyone know.

Link to comment

Also check for any processes preventing disks from unmounting.

To identify processes holding a disk busy you can type:

fuser -mv /mnt/disk* /mnt/user/*

 

To terminate processes holding a disk busy you can type (example is for disk1):

fuser -mvk /mnt/disk1

 

or you can terminate individual process IDs by typing

kill PID

(where PID = the numeric process ID as printed by the prior fuser -mv command)
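The steps above can be combined into a gentler sequence: ask each process to exit with SIGTERM, wait briefly, and only then fall back to SIGKILL. This is a minimal sketch, not part of unRAID; `terminate_gracefully` is a hypothetical helper and the 5-second grace period is an arbitrary choice. It relies on `fuser -m` (without `-v`) printing only the PIDs on standard output. A process stuck in uninterruptible I/O will not respond to either signal.

```shell
# terminate_gracefully PID...
# Sends SIGTERM to each PID, waits up to 5 seconds for it to exit,
# then sends SIGKILL if it is still there. Hypothetical helper.
terminate_gracefully() {
    for target in "$@"; do
        # If the signal cannot be delivered, the process is already gone.
        kill -TERM "$target" 2>/dev/null || continue
        tries=0
        while kill -0 "$target" 2>/dev/null && [ "$tries" -lt 5 ]; do
            sleep 1
            tries=$((tries + 1))
        done
        if kill -0 "$target" 2>/dev/null; then
            kill -KILL "$target" 2>/dev/null
        fi
    done
    return 0
}

# Example for disk1 (the PID list is left unquoted on purpose so it splits):
#   terminate_gracefully $(fuser -m /mnt/disk1 2>/dev/null)
```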

Link to comment

I've got the same page allocation failure; I was preclearing one drive.

On the earlier unRAID version 4.7, I was preclearing 5 drives at the same time with no issue.

 

 

Jul 24 10:59:55 Tower kernel: Call Trace:
Jul 24 10:59:55 Tower kernel:  [<c1062e3a>] warn_alloc_failed+0xbd/0xcf
Jul 24 10:59:55 Tower kernel:  [<c10636b2>] __alloc_pages_nodemask+0x47c/0x4a5
Jul 24 10:59:55 Tower kernel:  [<c103937e>] ? up+0x2b/0x2f
Jul 24 10:59:55 Tower kernel:  [<c1005858>] dma_generic_alloc_coherent+0x64/0xcc
Jul 24 10:59:55 Tower kernel:  [<c107f030>] ? T.603+0x23/0x126
Jul 24 10:59:55 Tower kernel:  [<c107f0c0>] T.603+0xb3/0x126
Jul 24 10:59:55 Tower kernel:  [<c10057f4>] ? dma_set_mask+0x37/0x37
Jul 24 10:59:55 Tower kernel:  [<c107f186>] dma_pool_alloc+0x53/0x109
Jul 24 10:59:55 Tower kernel:  [<c103b171>] ? ttwu_do_wakeup+0xf/0xaa
Jul 24 10:59:55 Tower kernel:  [<f849ca92>] mvs_task_prep+0x1b2/0x373 [mvsas]
Jul 24 10:59:55 Tower kernel:  [<c105fde7>] ? mempool_alloc_slab+0xe/0x10
Jul 24 10:59:55 Tower kernel:  [<f849cc91>] mvs_task_exec+0x3e/0x93 [mvsas]
Jul 24 10:59:55 Tower kernel:  [<f849d39a>] mvs_queue_command+0x26/0x35 [mvsas]
Jul 24 10:59:55 Tower kernel:  [<f84814b0>] sas_ata_qc_issue+0x1a3/0x1fa [libsas]
Jul 24 10:59:55 Tower kernel:  [<c123b46f>] ata_qc_issue+0x273/0x291
Jul 24 10:59:55 Tower kernel:  [<c123eb41>] ata_scsi_translate+0xbf/0xed
Jul 24 10:59:55 Tower kernel:  [<c1240a66>] ? ata_scsiop_mode_sense+0x257/0x257
Jul 24 10:59:55 Tower kernel:  [<c124135e>] ata_sas_queuecmd+0x17e/0x1ac
Jul 24 10:59:55 Tower kernel:  [<f848085d>] sas_queuecommand+0x79/0x1c4 [libsas]
Jul 24 10:59:55 Tower kernel:  [<c122bdaf>] scsi_dispatch_cmd+0xfa/0x125
Jul 24 10:59:55 Tower kernel:  [<c12300a0>] scsi_request_fn+0x269/0x384
Jul 24 10:59:55 Tower kernel:  [<c1190aa3>] __blk_run_queue+0x14/0x16
Jul 24 10:59:55 Tower kernel:  [<c119222b>] queue_unplugged+0x2d/0x39
Jul 24 10:59:55 Tower kernel:  [<c119238b>] blk_flush_plug_list+0x154/0x160
Jul 24 10:59:55 Tower kernel:  [<c11923a4>] blk_finish_plug+0xd/0x28
Jul 24 10:59:55 Tower kernel:  [<c10652ba>] read_pages+0x9d/0xa7
Jul 24 10:59:55 Tower kernel:  [<c10653a5>] __do_page_cache_readahead+0xe1/0xfa
Jul 24 10:59:55 Tower kernel:  [<c10653d5>] ra_submit+0x17/0x1c
Jul 24 10:59:55 Tower kernel:  [<c1065600>] ondemand_readahead+0x17c/0x188
Jul 24 10:59:55 Tower kernel:  [<c103f501>] ? check_preempt_wakeup+0xba/0x16f
Jul 24 10:59:55 Tower kernel:  [<c1065660>] page_cache_async_readahead+0x54/0x5f
Jul 24 10:59:55 Tower kernel:  [<c105f90f>] T.925+0xf7/0x39d
Jul 24 10:59:55 Tower kernel:  [<c105fd74>] generic_file_aio_read+0x1bf/0x1ef
Jul 24 10:59:55 Tower kernel:  [<c11f2bb4>] ? tty_wakeup+0x49/0x51
Jul 24 10:59:55 Tower kernel:  [<c1084900>] do_sync_read+0x8d/0xc8
Jul 24 10:59:55 Tower kernel:  [<c11a5e30>] ? rb_erase+0xed/0xf5
Jul 24 10:59:55 Tower kernel:  [<c131e4bd>] ? __schedule+0x40d/0x485
Jul 24 10:59:55 Tower kernel:  [<c10a5a98>] ? block_llseek+0xb9/0xc5
Jul 24 10:59:55 Tower kernel:  [<c1085223>] vfs_read+0x88/0xfa
Jul 24 10:59:55 Tower kernel:  [<c1084873>] ? do_sync_write+0xc8/0xc8
Jul 24 10:59:55 Tower kernel:  [<c108532c>] sys_read+0x3b/0x60
Jul 24 10:59:55 Tower kernel:  [<c131efed>] syscall_call+0x7/0xb
Jul 24 10:59:55 Tower kernel: Mem-Info:
Jul 24 10:59:55 Tower kernel: DMA per-cpu:
Jul 24 10:59:55 Tower kernel: CPU    0: hi:    0, btch:  1 usd:  0
Jul 24 10:59:55 Tower kernel: CPU    1: hi:    0, btch:  1 usd:  0
Jul 24 10:59:55 Tower kernel: Normal per-cpu:
Jul 24 10:59:55 Tower kernel: CPU    0: hi:  186, btch:  31 usd: 120
Jul 24 10:59:55 Tower kernel: CPU    1: hi:  186, btch:  31 usd:  27
Jul 24 10:59:55 Tower kernel: HighMem per-cpu:
Jul 24 10:59:55 Tower kernel: CPU    0: hi:  186, btch:  31 usd: 155
Jul 24 10:59:55 Tower kernel: CPU    1: hi:  186, btch:  31 usd:  22

 

 

See my post http://lime-technology.com/forum/index.php?topic=21634.0

 

 

Link to comment

UPDATE:

 

Parity check performed at solid speeds (50-80 MB/s), and came back with 0 errors.

 

I decided to try preclearing a couple of drives that may have gone bad in my other tower, and performed a proper power down on the r8168-test tower.

 

Inserted the two "dead" drives in the r8168-test tower, powered up, and unRAID booted normally... until I tried accessing the webgui.  Timeout.

 

Telnet's still alive; syslog attached.

syslog.txt

Link to comment

I think this needs a separate thread (it's unrelated to the beta releases). 

 

Did you get a new key file from Limetech for your new flash drive?  If not, then that is why you only see two data drives.  The web UI will report unRAID Basic rather than the version that you expect because your current key file is matched to your old flash drive.

Link to comment

Ok, I wasn't sure if this issue was related to the new rc6 release or not.  Which forum should it be in then?

 

I set up my server back in 2009 and I don't recall ever having to do anything with a key.  That said, your reply makes sense to me.  I just need to confirm now that this was the issue.

 

Thanks for your reply.

Link to comment
