unRAID OS version 6.3.2 Stable Release Available

al_uk · March 3, 2017

On 3/1/2017 at 11:49 PM, RobJ said:

You are loading a huge amount of stuff, some of it clear memory hogs, like multiple java apps and Plex and more. You would think that with 64GB you should not be having memory issues, but they started as you said about 24 hours after booting. At that point, page allocations were numerous but very slow, ranging from 10 to 20 seconds per process, to some time later 10 to 50 seconds, an obvious latency issue. I believe they were because of garbage collection efforts as the memory filled up, the attempt to reorganize memory chunks to satisfy the requests. They clearly were averaging longer and longer, so it was only a matter of time before they were going to fail to satisfy an allocation request. The OOM (Out Of Memory) was the final straw. While the system did carry on valiantly for quite a while, you probably should have rebooted when the allocation issues first began.

My guess is, you have something with a serious memory leak. I can't say what it is; prime suspects would be java itself, a java app, makemkv, a Plex component, or a corrupted btrfs file system. I would look for updates for all apps, then check the disk file systems on all drives formatted with BTRFS. If at all possible, stop loading anything you aren't actually using. For example, do you really need so many of the NerdPack packages?

If the problem continues, and I suspect it will, you will have to run without selected apps, trying different combinations, and figure out which apps are using up the memory. You do have Cadvisor; perhaps it could be used to monitor all resource usage, to see what grows too large and never shrinks back. Right now, the java processes are enormous, and there are a number of them, so they could be suspect.

This is a support issue, will probably be moved to the support board.

I closed down all VMs and dockers and ran another diagnostics, which is attached.

Yesterday I rebooted and did not start Crashplan or Dropbox. Today, about 12 hours after reboot, my problems started again.

Mar  3 12:08:51 Tower kernel: btrfs-transacti: page allocation stalls for 11007ms, order:2, mode:0x2404040(GFP_NOFS|__GFP_COMP)
Mar  3 12:08:51 Tower kernel: CPU: 9 PID: 14971 Comm: btrfs-transacti Not tainted 4.9.10-unRAID #1

I want to go back to 6.2.4. I tried this yesterday by swapping out the bz files on the flash.

6.2.4 booted ok, but my dockers did not start, and only one of 1 VMs showed up.

The settings/dockers page said I needed to recreate my docker image.

libvirt.log showed the following.

3+0000: 16742: info : libvirt version: 1.3.1
2017-03-02 23:52:46.633+0000: 16742: info : hostname: Tower
2017-03-02 23:52:46.633+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.633+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.634+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.634+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.635+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id
2017-03-02 23:52:46.635+0000: 16742: error : virDomainDefParseXML:15455 : unsupported configuration: unsupported HyperV Enlightenment feature: vendor_id

How do I go back to 6.2.4 and fix the VMs and dockers?

Thanks

tower-diagnostics-20170302-2313.zip

richardsim7 · March 4, 2017

On 02/03/2017 at 8:17 PM, richardsim7 said:

I did a quick search but couldn't find anything:

I upgraded from 6.2.4 to 6.3.2, and now my Windows 10 VM won't boot. SeaBIOS just says "No bootable device"

Any ideas?

nas-diagnostics-20170302-2017.zip

Rolled back to 6.2.4 and the VM boots again. Any ideas why 6.3.2 isn't working?

thither · March 4, 2017

After upgrading from 6.3.1 I'm seeing some odd behavior where I can boot into GUI mode, but when I try to boot into regular (OS) mode the server freezes after it loads /bzImage and is not pingable. I run headless most of the time with my monitor plugged into a different GUI card for VMs, so this isn't ideal. Anything I can try to diagnose this?

itimpi · March 4, 2017

57 minutes ago, thither said:

After upgrading from 6.3.1 I'm seeing some odd behavior where I can boot into GUI mode, but when I try to boot into regular (OS) mode the server freezes after it loads /bzImage and is not pingable. I run headless most of the time with my monitor plugged into a different GUI card for VMs, so this isn't ideal. Anything I can try to diagnose this?

Quite a few people have reported that! In most cases it seems to occur for SuperMicro motherboards - what do you have?

thither · March 4, 2017

44 minutes ago, itimpi said:

Quite a few people have reported that! In most cases it seems to occur for SuperMicro motherboards - what do you have?

I've got an ASRock Z170 Extreme+ - this one.

al_uk · March 4, 2017


Today I tried just having the VMs powered up. No dockers were started.

Within 8 hours I was getting "tainted" problems again.

Any suggestions on what to try next? I have started a separate thread on how to roll back to 6.2.4.
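To see whether the stalls really are getting longer over time, as RobJ described, it may help to pull the durations out of the syslog. This is only a minimal sketch: the function name stall_times is made up for illustration, the pattern is modelled on the "page allocation stalls" lines quoted in this thread, and the log path in the usage comment is an assumption.

```shell
# Print "<milliseconds> <process>" for every page-allocation stall read on stdin.
# stall_times is a hypothetical helper; the pattern follows the syslog
# lines quoted in this thread and may need adjusting for other kernels.
stall_times() {
    sed -n 's/.*kernel: \(.*\): page allocation stalls for \([0-9][0-9]*\)ms.*/\2 \1/p'
}

# Example (log path is an assumption; adjust to wherever your syslog lives):
#   stall_times < /var/log/syslog | sort -n | tail -5
```

Sorting the output numerically shows the longest stalls and which processes hit them, which may help narrow down when the trouble starts after a reboot.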

hgeorges · March 5, 2017

Hi,

Background:

I upgraded yesterday to the new 6.3.2 version (and upgraded to a new motherboard - x10SLL-F-O - at the same time), apparently w/o issues (Thank you!).

However, after a short time I started receiving plugin errors (missing files, and an endless loop from a tenacious plugin wanting to send statistics from my system).

Reading also in your note:

Limetech quote:

Plugin Authors: as posted earlier, your plugin may not function properly depending on how POST requests are handled, see:

http://lime-technology.com/forum/index.php?topic=55986.0

Plugin Users: please post issues you find in the appropriate Plugin Support topic.

And to reiterate: true plugins (not Docker containers) run as the root user and have full access to everything on your server: Install 3rd party plugins at your own risk.

End Limetech quote.

I have decided to remove all the plugins from my server, as I don't have time to hunt for compatibility issues and resolve strange behaviors. I'm now running the bare-bones system, and want to use docker or VMs for additional functionality.

Now here is my question:

Regarding plugins: I'm assuming there are no Limetech-endorsed, version-controlled plugins? Is that right?

Among those I had installed were a few which made sense to have close by (all were strictly tools, one to create and verify checksums, another adding an expanded tool set, etc) - perhaps you can create limetech optional add-on packages which make sense to run as root, and are safe to run (version controlled).

Please comment. Thanks again.

sambo · March 7, 2017

Hello,

Is it possible to add spin-up information to dmesg in the next release, like we have for spin-down?

Thanks for all this good work!

itimpi · March 8, 2017

12 hours ago, sambo said:

Hello,

Is it possible to add spin-up information to dmesg in the next release, like we have for spin-down?

Thanks for all this good work!

Although I would like to see such information I suspect it is not available. I think the Spindown messages relate to specific events within unRAID, while the Spinups are likely to happen automatically when an access is made to the drive (without an explicit Spinup command being issued).

The closest I could see is adding a message to the log on the periodic drive checks when the Spin state is found to be different to the last one logged. Although this may mean the log message is delayed from the actual event happening it would still be useful information.
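As a rough illustration of that idea (not how unRAID does it), a periodic check could remember the last state it saw for each drive and only emit a message when the state differs. The helper name report_spin_change and the state directory are made up for this sketch, and it assumes hdparm is installed and that its -C option reports the drive's power state.

```shell
# Hypothetical helper: report a drive's spin state only when it changes.
#   $1 = short device name (e.g. sdb), $2 = current state,
#   $3 = directory used to remember the last state seen per device
report_spin_change() {
    dev=$1
    state=$2
    state_dir=$3
    last_file="$state_dir/$dev.state"
    last=$(cat "$last_file" 2>/dev/null)
    if [ "$state" != "$last" ]; then
        echo "spin state of $dev changed: ${last:-unknown} -> $state"
        printf '%s\n' "$state" > "$last_file"
    fi
}

# Example of one polling pass (assumes hdparm is available):
#   state=$(hdparm -C /dev/sdb | awk '/drive state/ {print $NF}')
#   report_spin_change sdb "$state" /tmp/spinstate | logger -t spincheck
```

As noted above, the message would lag the actual spin-up by up to one polling interval, but it would still give a rough timeline in the log.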

sambo · March 8, 2017

Since unRAID is able to spin down a disk after some inactivity time, I think it should be possible; at least I hope so.

JonUKRed · March 8, 2017

On 04/03/2017 at 7:04 PM, thither said:

After upgrading from 6.3.1 I'm seeing some odd behavior where I can boot into GUI mode, but when I try to boot into regular (OS) mode the server freezes after it loads /bzImage and is not pingable. I run headless most of the time with my monitor plugged into a different GUI card for VMs, so this isn't ideal. Anything I can try to diagnose this?

On 04/03/2017 at 8:02 PM, itimpi said:

Quite a few people have reported that! In most cases it seems to occur for SuperMicro motherboards - what do you have?

This is happening to me also after upgrade, only able to boot to GUI mode. Again, not ideal as I also run headless. MB is ASRock Z270 Pro4.

limetech · March 8, 2017

9 minutes ago, JonUKRed said:

This is happening to me also after upgrade, only able to boot to GUI mode. Again, not ideal as I also run headless. MB is ASRock Z270 Pro4.

Please confirm for me no corruption of the 'bzroot' file has occurred. From console or telnet/ssh please type this:

md5sum /boot/bzroot

Should return this for 6.3.2 release:

c1a14a522656426fb9e20b66a5968d1a  /boot/bzroot

JonUKRed · March 8, 2017

2 minutes ago, limetech said:
Please confirm for me no corruption of the 'bzroot' file has occurred. From console or telnet/ssh please type this:
md5sum /boot/bzroot
Should return this for 6.3.2 release:
c1a14a522656426fb9e20b66a5968d1a  /boot/bzroot

Hi there. Yes, it does indeed return the above:

root@IronCloud:~# md5sum /boot/bzroot
c1a14a522656426fb9e20b66a5968d1a /boot/bzroot

Thanks, Jon

limetech · March 8, 2017

I guess for completeness can check 'em all:

5a4d270d192c0573bb78af92220e149b  bzimage
c1a14a522656426fb9e20b66a5968d1a  bzroot
f65c0917efe04edf5b91528c3c7eb1d1  bzroot-gui
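For anyone who would rather check all three in one go, something along these lines should work from the console. It is only a sketch: the function names check_md5 and check_unraid_632 are made up for illustration, the checksums are the 6.3.2 values listed above, and /boot is assumed to be where the flash is mounted.

```shell
# Hypothetical helper: compare one file's MD5 sum against an expected value.
check_md5() {
    file=$1
    expected=$2
    actual=$(md5sum "$file" 2>/dev/null | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file"
    else
        echo "MISMATCH $file (got ${actual:-nothing})"
        return 1
    fi
}

# Check the three 6.3.2 boot files; the directory defaults to /boot (assumption).
check_unraid_632() {
    dir=${1:-/boot}
    rc=0
    check_md5 "$dir/bzimage"    5a4d270d192c0573bb78af92220e149b || rc=1
    check_md5 "$dir/bzroot"     c1a14a522656426fb9e20b66a5968d1a || rc=1
    check_md5 "$dir/bzroot-gui" f65c0917efe04edf5b91528c3c7eb1d1 || rc=1
    return $rc
}

# Usage: check_unraid_632          (or: check_unraid_632 /path/to/flash)
```

A non-zero return means at least one file is missing or does not match the expected sum.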

JonUKRed · March 8, 2017

4 minutes ago, limetech said:
I guess for completeness can check 'em all:
5a4d270d192c0573bb78af92220e149b  bzimage
c1a14a522656426fb9e20b66a5968d1a  bzroot
f65c0917efe04edf5b91528c3c7eb1d1  bzroot-gui

root@IronCloud:~# md5sum /boot/bzroot
c1a14a522656426fb9e20b66a5968d1a /boot/bzroot
root@IronCloud:~# md5sum /boot/bzimage
5a4d270d192c0573bb78af92220e149b /boot/bzimage
root@IronCloud:~# md5sum /boot/bzroot-gui
f65c0917efe04edf5b91528c3c7eb1d1 /boot/bzroot-gui

Yes - all present and correct. It isn't the end of the world - I just thought it very odd...

Edited March 8, 2017 by JonUKRed

JonathanM · March 8, 2017

30 minutes ago, limetech said:

I guess for completeness can check 'em all:

Would it hurt anything or be worth the labor to do a checksum early in the boot process and log the results in syslog?

limetech · March 8, 2017

1 hour ago, JonUKRed said:

Yes - all present and correct. It isn't the end of the world - I just thought it very odd...

Please post your syslinux.cfg file.

G2-91305 · March 8, 2017

Hey guys, just want to say that I was having this problem with an Asus Z-170 Maximus viii hero. Tried everything and finally fixed it by updating to the latest BIOS on my board. Not sure if that will help others, but it's worth a shot for those of us on Z-170.

JonUKRed · March 9, 2017

10 hours ago, limetech said:

Please post your syslinux.cfg file.

OK, here is my syslinux.cfg file.

default /syslinux/menu.c32
menu title Lime Technology, Inc.
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
label unRAID OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
label unRAID OS Safe Mode (no plugins, no GUI)
  kernel /bzimage
  append initrd=/bzroot unraidsafemode
label Memtest86+
  kernel /memtest

10 hours ago, G2-91305 said:

Hey guys, just want to say that I was having this problem with an Asus Z-170 Maximus viii hero. Tried everything and finally fixed it by updating to the latest BIOS on my board. Not sure if that will help others, but it's worth a shot for those of us on Z-170.

OK - issue fixed. I have a Z270 MB and after reading the above I thought I would try something. I knew I was running the latest BIOS, as it was updated very recently and both the support site and the MB told me so. Anyhow, I reset the MB to its default settings and, lo and behold, I can now boot straight to unRAID OS without any issue. So it wasn't that I was running an outdated BIOS, but a setting within the BIOS that was causing the problem.

I will run some trial and error on MB settings to try and recreate the problem and see whether I can isolate it - If I find it I will let you all know.

Thanks for your help! Jon.

thither · March 9, 2017

17 hours ago, limetech said:
I guess for completeness can check 'em all:
5a4d270d192c0573bb78af92220e149b  bzimage
c1a14a522656426fb9e20b66a5968d1a  bzroot
f65c0917efe04edf5b91528c3c7eb1d1  bzroot-gui

Just to confirm, I also see these same checksums on my Asus Z170 board, and my syslinux.cfg is the same as the one @JonUKRed posted above (and I'm also not able to boot into non-GUI mode). Don't have time for a BIOS upgrade now but I'll try it sometime in the next few days and report back.

G2-91305 · March 9, 2017

7 hours ago, JonUKRed said:
OK, here is my syslinux.cfg file.
default /syslinux/menu.c32
menu title Lime Technology, Inc.
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
label unRAID OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
label unRAID OS Safe Mode (no plugins, no GUI)
  kernel /bzimage
  append initrd=/bzroot unraidsafemode
label Memtest86+
  kernel /memtest
OK - issue fixed. I have a Z270 MB and after reading the above I thought I would try something. I knew I was running the latest BIOS, as it was updated very recently and both the support site and the MB told me so. Anyhow, I reset the MB to its default settings and, lo and behold, I can now boot straight to unRAID OS without any issue. So it wasn't that I was running an outdated BIOS, but a setting within the BIOS that was causing the problem.

I will run some trial and error on MB settings to try and recreate the problem and see whether I can isolate it - If I find it I will let you all know.

Thanks for your help! Jon.

Glad that helped a bit man!

Enver · March 13, 2017

On 20/02/2017 at 0:18 AM, sakh1979 said:
After I upgraded from 6.2.4 -> 6.3.2 I started seeing this error message in the log:
Feb 18 20:53:08 Tower emhttp: err: handleRequest: getpeername: Transport endpoint is not connected
Feb 18 20:53:08 Tower emhttp: err: handleRequest: getpeername: Transport endpoint is not connected
I only see them if I am connecting to unRAID via a Windows 10 laptop; connecting to unRAID with any other OS (Linux or OSX) does not give me this error message.

Is there something I can do to prevent this error from showing up?

I am having the same error; please see my post here:

Have you had any progress on this issue?

JorgeB · March 14, 2017

Can anyone from LT (or anyone else) look at this thread? It's the second time I've seen this issue: when assigning a disk as parity, it looks like the partition is successfully created, but right after there's an invalid partition error and the array won't start:

Mar 13 16:13:19 Tower emhttp: writing GPT on disk (sde), with partition 1 offset 64, erased: 0
Mar 13 16:13:19 Tower emhttp: shcmd (585): sgdisk -Z /dev/sde &> /dev/null
Mar 13 16:13:19 Tower kernel: sde: sde1
Mar 13 16:13:20 Tower emhttp: shcmd (586): sgdisk -o -a 64 -n 1:64:0 /dev/sde |& logger
Mar 13 16:13:21 Tower root: Creating new GPT entries.
Mar 13 16:13:21 Tower root: The operation has completed successfully.
Mar 13 16:13:21 Tower kernel: sde: sde1
Mar 13 16:13:21 Tower emhttp: shcmd (587): udevadm settle
Mar 13 16:13:21 Tower emhttp: invalid partition(s)

BoHiCa · March 14, 2017

I just ran through a successful upgrade from a very stable 6.1.9 version to 6.3.2 via the "Plugin" update method. Smooth as silk!

This machine has no dockers currently configured (but dockers are enabled) and no VMs, since the hardware can't handle them: the CPU is a quad-core Atom with 4 GiB RAM, there are 19 devices in the array, a single parity drive (actually a hardware RAID 1 in an enclosure off an eSATA port), and bonded ethernet for fault tolerance. The drives are mostly re-purposed laptop drives, to reduce power consumption.

System logs look clean (as in no errors), and all shares appear to be present and functioning nominally when accessed from Win 10 machines and Linux machines.

The only "oddity" I've noticed is with the report of the last parity check in the Main tab and the Dashboard tab. I checked this right before performing the upgrade, and it reported 0 errors from the prior parity check which completed yesterday.

Right after coming up in 6.3.2 and starting the array the UI reports this:

Last checked on Sun 12 Mar 2017 07:09:49 PM CDT (yesterday), finding errors.
Duration: 21 hours, 9 minutes, 48 seconds. Average speed: 26.3 MB/s

That is the usual parity check time and speed for this tiny box (motherboard SATA (6 drives) + eSATA (parity), an LSI HBA SATA controller on PCIe X8 (12 drives), plus 4 cache SSDs also on the LSI controller); it varies by minutes +/- every week like clockwork. Data = xfs, cache = btrfs.

I checked the /config/parity-checks.log file and the last entry from the last parity check is:

Mar 12 19:09:49|76188|26.3 MB/s|0

I'm assuming that there have been some changes to the format of the entries in parity-checks.log that explain the odd phrasing in the UI, but wanted to make sure before I rely on the integrity of the box again.

Great work guys!

Frank1940 · March 14, 2017

4 minutes ago, BoHiCa said:

I'm assuming that there have been some changes to the format of the entries in parity-checks.log that explain the odd phrasing in the UI, but wanted to make sure before I rely on the integrity of the box again.

I ran a Correcting Parity Check yesterday and this is the report on the Array Operation tab:

Last check completed on Sun 12 Mar 2017 03:09:01 PM EDT (yesterday), finding 0 errors.
Duration: 7 hours, 13 minutes, 18 seconds. Average speed: 115.4 MB/sec

Not really sure why you are seeing a truncated report, but this is what I found in the parity-checks.log file:

2017 Mar 12 15:09:01|25998|115.4 MB/s|0|0

Did you by any chance terminate it before it completed?
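For comparing the two log entries side by side, a small sketch like the one below splits each line on the "|" separator and reports how many fields it has, along with the duration converted from seconds. The function name parity_entry_summary is made up for illustration, and the reading of the second field as a duration in seconds is inferred from the two entries quoted above (76188 s and 25998 s match the durations the UI shows); the meaning of the trailing fields is not assumed.

```shell
# Hypothetical helper: summarise parity-checks.log entries read on stdin.
# Field 2 is treated as the duration in seconds and field 3 as the speed,
# based only on the two entries quoted in this thread.
parity_entry_summary() {
    awk -F'|' '{
        secs = $2
        printf "%d fields, duration %dh %dm %ds, speed %s\n",
               NF, int(secs / 3600), int((secs % 3600) / 60), secs % 60, $3
    }'
}

# Usage (path as given earlier in the thread, relative to the flash):
#   parity_entry_summary < /boot/config/parity-checks.log
```

Run against the two entries above, the older-style line comes out with 4 fields and the newer one with 5, which would be consistent with a format change being behind the missing error count in the UI, though that is only a guess.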
