unRAID OS version 6.3.2 Stable Release Available


limetech


On 3/1/2017 at 11:49 PM, RobJ said:

 

You are loading a huge amount of stuff, some of it clear memory hogs, like multiple java apps and Plex and more.  You would think that with 64GB you should not be having memory issues, but as you said, they started about 24 hours after booting.  At that point, page allocations were numerous but very slow, ranging from 10 to 20 seconds per process and, some time later, 10 to 50 seconds, an obvious latency issue.  I believe this was caused by garbage collection efforts as the memory filled up, the attempt to reorganize memory chunks to satisfy the requests.  The stalls were clearly getting longer and longer on average, so it was only a matter of time before an allocation request could not be satisfied.  The OOM (Out Of Memory) was the final straw.  While the system did carry on valiantly for quite a while, you probably should have rebooted when the allocation issues first began.

 

My guess is, you have something with a serious memory leak.  I can't say what it is, but prime suspects would be java itself, a java app, makemkv, a Plex component, or a corrupted btrfs file system.  I would look for updates for all apps, then check the file systems on all drives formatted with BTRFS.  If at all possible, stop loading anything you aren't actually using.  For example, do you really need so many of the NerdPack packages?

 

If the problem continues, and I suspect it will, you will have to run without selected apps, trying different combinations, to figure out which apps are using up the memory.  You do have Cadvisor; perhaps it could be used to monitor all resource usage and show what grows too large and never shrinks back.  Right now, the java processes are enormous, and there are a number of them, so they could be suspect.

 

This is a support issue and will probably be moved to the support board.
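
The advice above about watching which processes grow and never shrink back can also be sketched without Cadvisor. This is a minimal sketch, assuming a Linux `/proc` file system; `top_rss` is a hypothetical helper name, and the optional second argument (an alternative proc root) exists only so the function can be tried on sample data.

```bash
#!/bin/bash
# Hypothetical helper: list the N processes with the largest resident memory.
# Reads the Name, Pid and VmRSS fields from /proc/<pid>/status.
# Output format: "<rss_kB> <pid> <name>", largest first.
top_rss() {
    local n="${1:-10}" root="${2:-/proc}" status
    for status in "$root"/[0-9]*/status; do
        # Kernel threads have no VmRSS line and are skipped.
        awk '/^Name:/  {name = $2}
             /^Pid:/   {pid = $2}
             /^VmRSS:/ {rss = $2}
             END       {if (rss) print rss, pid, name}' "$status" 2>/dev/null
    done | sort -rn | head -n "$n"
}

# Example: take a timestamped snapshot every 10 minutes and compare them later.
#   while true; do date; top_rss 10; sleep 600; done >> /tmp/rss.log
```

A process whose first column keeps climbing across snapshots, and never drops after its workload finishes, is the kind of candidate described above.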

 

 

I closed down all VMs and dockers and ran another diagnostics, which is attached.

 

Yesterday I rebooted and did not start Crashplan or Dropbox. Today, about 12 hours after reboot, my problems started again.

 

Mar  3 12:08:51 Tower kernel: btrfs-transacti: page allocation stalls for 11007ms, order:2, mode:0x2404040(GFP_NOFS|__GFP_COMP)
Mar  3 12:08:51 Tower kernel: CPU: 9 PID: 14971 Comm: btrfs-transacti Not tainted 4.9.10-unRAID #1

 

 

I want to go back to 6.2.4. I tried this yesterday by swapping out the bz files on the flash.

 

6.2.4 booted ok, but my dockers did not start, and only one of 1 VMs showed up.

 

The settings/dockers page said I needed to recreate my docker image.

 

libvirt.log showed the following.

 

3+0000: 16742: info : libvirt version: 1.3.1
2017-03-02 23:52:46.633+0000: 16742: info : hostname: Tower
2017-03-02 23:52:46.633+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.633+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.634+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.634+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.635+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.635+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id

 

How do I go back to 6.2.4 and fix the VMs and dockers?

 

Thanks

 

 

tower-diagnostics-20170302-2313.zip
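
The libvirt.log errors above say that libvirt 1.3.1 does not accept the HyperV `vendor_id` feature, which suggests that some VM definitions saved under the newer release contain that element. A minimal sketch for listing the affected definitions follows. `list_vendor_id_domains` is a hypothetical helper name, and the directory holding the domain XML files is an assumption that has to be supplied by the caller.

```bash
#!/bin/bash
# Hypothetical helper: print the domain XML files in a directory that
# contain a <vendor_id .../> element (the feature libvirt 1.3.1 rejects).
list_vendor_id_domains() {
    local dir="$1"
    # grep -l prints only the names of matching files; "|| true" keeps the
    # function from failing when nothing matches.
    grep -l '<vendor_id' "$dir"/*.xml 2>/dev/null || true
}

# Example (the path is an assumption, adjust to where the VM XML lives):
#   list_vendor_id_domains /etc/libvirt/qemu
```

This only identifies the files. Whether removing the element is enough for the older libvirt to load them is not confirmed anywhere in this thread.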


After upgrading from 6.3.1 I'm seeing some odd behavior where I can boot into GUI mode, but when I try to boot into regular (OS) mode the server freezes after it loads /bzImage and is not pingable. I run headless most of the time with my monitor plugged into a different GUI card for VMs, so this isn't ideal. Anything I can try to diagnose this?

57 minutes ago, thither said:

After upgrading from 6.3.1 I'm seeing some odd behavior where I can boot into GUI mode, but when I try to boot into regular (OS) mode the server freezes after it loads /bzImage and is not pingable. I run headless most of the time with my monitor plugged into a different GUI card for VMs, so this isn't ideal. Anything I can try to diagnose this?

Quite a few people have reported that!  In most cases it seems to occur for SuperMicro motherboards - what do you have?

On 03/03/2017 at 6:06 PM, al_uk said:

 

 

I closed down all VMs and dockers and ran another diagnostics, which is attached.

 

Yesterday I rebooted and did not start Crashplan or Dropbox. Today, about 12 hours after reboot, my problems started again.

 


Mar  3 12:08:51 Tower kernel: btrfs-transacti: page allocation stalls for 11007ms, order:2, mode:0x2404040(GFP_NOFS|__GFP_COMP)
Mar  3 12:08:51 Tower kernel: CPU: 9 PID: 14971 Comm: btrfs-transacti Not tainted 4.9.10-unRAID #1

 

 

I want to go back to 6.2.4. I tried this yesterday by swapping out the bz files on the flash.

 

6.2.4 booted ok, but my dockers did not start, and only one of 1 VMs showed up.

 

The settings/dockers page said I needed to recreate my docker image.

 

libvirt.log showed the following.

 


3+0000: 16742: info : libvirt version: 1.3.1
2017-03-02 23:52:46.633+0000: 16742: info : hostname: Tower
2017-03-02 23:52:46.633+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.633+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.634+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.634+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.635+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.635+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id

 

How do I go back to 6.2.4 and fix the VMs and dockers?

 

Thanks

 

 

tower-diagnostics-20170302-2313.zip

 

 

Today I tried just having the VMs powered up. No dockers were started.

 

Within 8 hours I was getting "tainted" problems again.

 

Any suggestions on what to try next? I have started a separate thread on how to roll back to 6.2.4.
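
To see how the stalls develop between reboots, the `page allocation stalls` messages quoted above can be summarised per process. This is a minimal sketch that assumes the message format shown in this thread; `summarize_stalls` is a hypothetical helper name, and the syslog path in the example is an assumption.

```bash
#!/bin/bash
# Hypothetical helper: read syslog lines on stdin and report, per process,
# how many "page allocation stalls" messages were logged and the longest
# stall in milliseconds.
summarize_stalls() {
    # Reduce each matching line to "<process>|<milliseconds>".
    sed -n 's/.*kernel: \(.*\): page allocation stalls for \([0-9]*\)ms.*/\1|\2/p' |
        awk -F'|' '{
                 count[$1]++
                 if ($2 + 0 > max[$1]) max[$1] = $2 + 0
             }
             END {
                 for (p in count)
                     printf "%s stalls=%d max_ms=%d\n", p, count[p], max[p]
             }'
}

# Example:
#   summarize_stalls < /var/log/syslog
```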


Hi,

 

Background: 

I upgraded yesterday to the new 6.3.2 version (and upgraded to a new motherboard - x10SLL-F-O - at the same time) apparently w/o issues (Thank you!).

However, after a short time I started receiving plugin errors (missing files, and an endless loop from a tenacious plugin wanting to send statistics from my system).

Reading also in your note:

Limetech quote:

Plugin Authors: as posted earlier, your plugin may not function properly depending on how POST requests are handled, see:

http://lime-technology.com/forum/index.php?topic=55986.0

 

Plugin Users: please post issues you find in the appropriate Plugin Support topic.

 

And to reiterate: true plugins (not Docker containers) run as the root user and have full access to everything on your server: Install 3rd party plugins at your own risk.

End Limetech quote.

 

I have decided to remove all the plugins from my server, as I don't have time to hunt for compatibility issues and resolve strange behaviors. I'm now running the bare-bones system, and want to use docker or VMs for additional functionality.

 

Now here is my question: 

Regarding plugins: I'm assuming there are no Limetech-endorsed, version-controlled plugins. Is that right?

 

Among those I had installed were a few which made sense to have close by  (all were strictly tools, one to create and verify checksums, another adding an expanded tool set, etc) - perhaps you can create limetech optional add-on packages which make sense to run as root, and are safe to run (version controlled).

Please comment. Thanks again.

12 hours ago, sambo said:

Hello,

Is it possible to add spinup information to dmesg in the next release, like we have for spindown?

Thanks for all this good work!

Although I would like to see such information I suspect it is not available.    I think the Spindown messages relate to specific events within unRAID, while the Spinups are likely to happen automatically when an access is made to the drive (without an explicit Spinup command being issued).  

 

The closest I could see is adding a message to the log on the periodic drive checks when the Spin state is found to be different to the last one logged.   Although this may mean the log message is delayed from the actual event happening it would still be useful information.
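
The idea above, logging a message on a periodic check only when the spin state differs from the last one logged, can be sketched as follows. This is an illustration under stated assumptions, not how unRAID implements its drive checks: `report_spin_change` and the state directory are hypothetical, and querying the state with `hdparm -C` is one possible way to obtain it.

```bash
#!/bin/bash
# Hypothetical helper: print a message only when the drive's spin state
# differs from the one recorded on the previous call.
#   $1 = device name (e.g. sdb), $2 = current state, $3 = state directory
report_spin_change() {
    local dev="$1" state="$2" dir="$3" last=""
    if [ -f "$dir/$dev" ]; then
        last=$(cat "$dir/$dev")
    fi
    if [ "$state" != "$last" ]; then
        echo "$dev: spin state changed: ${last:-unknown} -> $state"
        # Remember the new state for the next periodic check.
        printf '%s\n' "$state" > "$dir/$dev"
    fi
}

# Example periodic check (assumes hdparm is installed and /tmp/spinstate exists):
#   state=$(hdparm -C /dev/sdb | awk '/drive state/ {print $NF}')
#   report_spin_change sdb "$state" /tmp/spinstate | logger
```

As noted above, a message produced this way is delayed until the next check after the actual event.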

On 04/03/2017 at 7:04 PM, thither said:

After upgrading from 6.3.1 I'm seeing some odd behavior where I can boot into GUI mode, but when I try to boot into regular (OS) mode the server freezes after it loads /bzImage and is not pingable. I run headless most of the time with my monitor plugged into a different GUI card for VMs, so this isn't ideal. Anything I can try to diagnose this?

 

On 04/03/2017 at 8:02 PM, itimpi said:

Quite a few people have reported that!  In most cases it seems to occur for SuperMicro motherboards - what do you have?

 

This is happening to me also after upgrade, only able to boot to GUI mode. Again, not ideal as I also run headless. MB is ASRock Z270 Pro4. 

9 minutes ago, JonUKRed said:

This is happening to me also after upgrade, only able to boot to GUI mode. Again, not ideal as I also run headless. MB is ASRock Z270 Pro4.

 

Please confirm for me no corruption of the 'bzroot' file has occurred.  From console or telnet/ssh please type this:

md5sum /boot/bzroot

Should return this for 6.3.2 release:

c1a14a522656426fb9e20b66a5968d1a  /boot/bzroot
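
The manual comparison can also be scripted so that a mismatch is flagged rather than eyeballed. This is a minimal sketch; `check_md5` is a hypothetical helper name, not an unRAID tool, and the expected values are the 6.3.2 checksums posted in this thread.

```bash
#!/bin/bash
# Hypothetical helper: compare a file's MD5 checksum with an expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
check_md5() {
    local expected="$1" file="$2" actual
    if [ ! -r "$file" ]; then
        echo "MISSING $file"
        return 2
    fi
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file (got $actual)"
        return 1
    fi
}

# Example calls for the 6.3.2 release files:
#   check_md5 5a4d270d192c0573bb78af92220e149b /boot/bzimage
#   check_md5 c1a14a522656426fb9e20b66a5968d1a /boot/bzroot
#   check_md5 f65c0917efe04edf5b91528c3c7eb1d1 /boot/bzroot-gui
```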

 

2 minutes ago, limetech said:

 

Please confirm for me no corruption of the 'bzroot' file has occurred.  From console or telnet/ssh please type this:


md5sum /boot/bzroot

Should return this for 6.3.2 release:


c1a14a522656426fb9e20b66a5968d1a  /boot/bzroot

 

 

Hi there.  Yes, it does indeed return the above.

 

root@IronCloud:~# md5sum /boot/bzroot
c1a14a522656426fb9e20b66a5968d1a  /boot/bzroot

 

Thanks, Jon

4 minutes ago, limetech said:

I guess for completeness can check 'em all:


5a4d270d192c0573bb78af92220e149b  bzimage
c1a14a522656426fb9e20b66a5968d1a  bzroot
f65c0917efe04edf5b91528c3c7eb1d1  bzroot-gui

 

 

root@IronCloud:~# md5sum /boot/bzroot
c1a14a522656426fb9e20b66a5968d1a  /boot/bzroot
root@IronCloud:~# md5sum /boot/bzimage
5a4d270d192c0573bb78af92220e149b  /boot/bzimage
root@IronCloud:~# md5sum /boot/bzroot-gui
f65c0917efe04edf5b91528c3c7eb1d1  /boot/bzroot-gui

 
Yes - all present and correct.  It isn't the end of the world - I just thought it very odd...

Hey guys, just want to say that I was having this problem with an Asus Z-170 Maximus viii hero.  Tried everything and finally fixed it by updating to the latest BIOS on my board.  Not sure if that will help others but it's worth a shot for those of us on z-170.

10 hours ago, limetech said:

 

Please post your syslinux.cfg file.

 

OK here is my syslinux.cfg file.

default /syslinux/menu.c32
menu title Lime Technology, Inc.
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
label unRAID OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
label unRAID OS Safe Mode (no plugins, no GUI)
  kernel /bzimage
  append initrd=/bzroot unraidsafemode
label Memtest86+
  kernel /memtest

 

10 hours ago, G2-91305 said:

Hey guys, just want to say that I was having this problem with an Asus Z-170 Maximus viii hero.  Tried everything and finally fixed it by updating to the latest BIOS on my board.  Not sure if that will help others but it's worth a shot for those of us on z-170.

 

OK - issue fixed.  I have a Z270 MB and after reading the above I thought I would try something.  I knew I was running the latest BIOS as it was updated very recently and both the support site and MB told me so.  Anyhow, I reset the MB to its default settings and lo and behold I can now boot straight to unRAID OS without any issue.  So it wasn't that I was running an outdated BIOS but a setting within the BIOS that was causing the problem.

 

I will run some trial and error on MB settings to try and recreate the problem and see whether I can isolate it - If I find it I will let you all know.

 

Thanks for your help! Jon.

17 hours ago, limetech said:

I guess for completeness can check 'em all:


5a4d270d192c0573bb78af92220e149b  bzimage
c1a14a522656426fb9e20b66a5968d1a  bzroot
f65c0917efe04edf5b91528c3c7eb1d1  bzroot-gui

 

Just to confirm, I also see these same checksums on my Asus Z170 board, and my syslinux.cfg is the same as the one @JonUKRed posted above (and I'm also not able to boot into non-GUI mode). Don't have time for a BIOS upgrade now but I'll try it sometime in the next few days and report back.

7 hours ago, JonUKRed said:

 

OK here is my syslinux.cfg file.


default /syslinux/menu.c32
menu title Lime Technology, Inc.
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
label unRAID OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
label unRAID OS Safe Mode (no plugins, no GUI)
  kernel /bzimage
  append initrd=/bzroot unraidsafemode
label Memtest86+
  kernel /memtest

 

 

OK - issue fixed.  I have a Z270 MB and after reading the above I thought I would try something.  I knew I was running the latest BIOS as it was updated very recently and both the support site and MB told me so.  Anyhow, I reset the MB to its default settings and lo and behold I can now boot straight to unRAID OS without any issue.  So it wasn't that I was running an outdated BIOS but a setting within the BIOS that was causing the problem.

 

I will run some trial and error on MB settings to try and recreate the problem and see whether I can isolate it - If I find it I will let you all know.

 

Thanks for your help! Jon.

Glad that helped a bit man!

 

 

On 20/02/2017 at 0:18 AM, sakh1979 said:

After I upgraded from 6.2.4 -> 6.3.2 I started seeing this error message in the log:

 

 


Feb 18 20:53:08 Tower emhttp: err: handleRequest: getpeername: Transport endpoint is not connected
Feb 18 20:53:08 Tower emhttp: err: handleRequest: getpeername: Transport endpoint is not connected
 

 

I only see them when I connect to unRAID from a Windows 10 laptop; connecting to unRAID with any other OS (Linux or OSX) does not give me this error message.

 

Is there something I can do to prevent this error from showing up?

 

I am having the same error; please see my post here: 

Have you had any progress on this issue?


Can anyone from LT (or anyone else) look at this thread? It's the second time I've seen this issue: when assigning a disk as parity, the partition appears to be created successfully, but right afterwards there's an invalid partition error and the array won't start:

 

Mar 13 16:13:19 Tower emhttp: writing GPT on disk (sde), with partition 1 offset 64, erased: 0
Mar 13 16:13:19 Tower emhttp: shcmd (585): sgdisk -Z /dev/sde &> /dev/null
Mar 13 16:13:19 Tower kernel: sde: sde1
Mar 13 16:13:20 Tower emhttp: shcmd (586): sgdisk -o -a 64 -n 1:64:0 /dev/sde |& logger
Mar 13 16:13:21 Tower root: Creating new GPT entries.
Mar 13 16:13:21 Tower root: The operation has completed successfully.
Mar 13 16:13:21 Tower kernel: sde: sde1
Mar 13 16:13:21 Tower emhttp: shcmd (587): udevadm settle
Mar 13 16:13:21 Tower emhttp: invalid partition(s)

 


I just ran through a successful upgrade from a very stable 6.1.9 version to 6.3.2 via the "Plugin" update method.  Smooth as silk!

 

This machine has no dockers currently configured (though dockers are enabled) and no VMs, since the hardware can't handle them: the CPU is a quad-core Atom with 4 GiB RAM. There are 19 devices in the array, a single parity drive (actually a hardware RAID 1 in an enclosure off an eSATA port), and bonded ethernet for fault tolerance. The drives are mostly re-purposed laptop drives, to reduce power consumption.

 

System logs look clean (as in no errors), and all shares appear to be present and functioning nominally when accessed from Win 10 machines and Linux machines.

 

The only "oddity" I've noticed is with the report of the last parity check in the Main tab and the Dashboard tab.  I checked this right before performing the upgrade, and it reported 0 errors from the prior parity check which completed yesterday.  

 

Right after coming up in 6.3.2 and starting the array the UI reports this:

Last checked on Sun 12 Mar 2017 07:09:49 PM CDT (yesterday), finding errors.
Duration: 21 hours, 9 minutes, 48 seconds. Average speed: 26.3 MB/s

That is the usual parity check time and speeds for this tiny box (motherboard SATA (6 drives) + eSATA (parity) and LSI HBA SATA controller on PCIe X8 (12 drives) + 4 cache SSD's on the LSI controller also), it varies by minutes +/- every week like clockwork. Data = xfs, cache = btrfs.

 

I checked the /config/parity-checks.log file and the last entry from the last parity check is: 

Mar 12 19:09:49|76188|26.3 MB/s|0

 

I'm assuming that there have been some changes to the format of the entries in parity-checks.log that explain the odd phrasing in the UI, but wanted to make sure before I rely on the integrity of the box again.

 

Great work guys!

 

 


 

4 minutes ago, BoHiCa said:

 

 

I'm assuming that there have been some changes to the format of the entries in parity-checks.log that explain the odd phrasing in the UI, but wanted to make sure before I rely on the integrity of the box again.

 

I ran a Correcting Parity Check yesterday and this is the report on the Array Operation tab:

 

Last check completed on Sun 12 Mar 2017 03:09:01 PM EDT (yesterday), finding 0 errors.
Duration: 7 hours, 13 minutes, 18 seconds. Average speed: 115.4 MB/sec

Not really sure why you are seeing truncated report, but this is what I found in the parity-checks.log file:

 

2017 Mar 12 15:09:01|25998|115.4 MB/s|0|0

Did you by any chance terminate it before it completed?
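
The two log entries quoted in this exchange can be compared mechanically. In both, the second `|`-separated field matches the reported duration when read as seconds (76188 s = 21 h 9 m 48 s, and 25998 s = 7 h 13 m 18 s), while the number of fields differs. This is a minimal sketch; `describe_parity_entry` is a hypothetical helper name, and no meaning is assumed for the trailing fields.

```bash
#!/bin/bash
# Hypothetical helper: report how many "|"-separated fields a
# parity-checks.log entry has, and convert its second field
# (duration in seconds) to hours, minutes and seconds.
describe_parity_entry() {
    local line="$1" nfields secs
    nfields=$(printf '%s\n' "$line" | awk -F'|' '{print NF}')
    secs=$(printf '%s\n' "$line" | cut -d'|' -f2)
    printf '%d fields, duration %dh %dm %ds\n' \
        "$nfields" $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
}

# Example with the two entries from this thread:
#   describe_parity_entry 'Mar 12 19:09:49|76188|26.3 MB/s|0'
#   describe_parity_entry '2017 Mar 12 15:09:01|25998|115.4 MB/s|0|0'
```

The first entry has four fields and no year, the second has five fields and a year, so an entry written in the older layout may simply be read differently by the 6.3.2 UI. That is a guess from the two samples, not something confirmed in this thread.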
