Running out of memory during file copy; invoking oom-killer



Hi guys, I apologize if this has been covered...

 

I'm a new unRAID user and installed 5.0rc5 on a new box.  It has an Intel Core i7-3770 and 16GB RAM.

 

I'm mounting an NTFS disk and attempting to copy all of the files to the array (cp -R), but it runs the system out of memory after a few minutes and the kernel starts killing processes.  At that point, I'm able to hop on the console, shut everything down cleanly, and reboot the box.

 

Syslog shows something like this when memory runs out:

 

Jul  8 19:39:22 Tower kernel: emhttp invoked oom-killer: gfp_mask=0x800d0, order=0, oom_adj=0, oom_score_adj=0

 

It looks like the filesystem cache is eating up all of the free memory (which should be fine), but when the system gets down to around 4GB free memory, it crashes.  I wrote a little shell snippet to continuously dump the cache every 10 seconds:

 

echo -n "        ";free|grep total;while true; do echo -n "before: ";free|grep Mem; sync;echo 3 > /proc/sys/vm/drop_caches; echo -n " after: ";free|grep Mem;sleep 10; done;

 

If I run this while I'm doing the copy, everything works great.  Any ideas what's happening?

 

Any help would be appreciated.  I'm looking forward to being able to use what looks like an awesome product!

 

Thanks,

Vince

syslog_201207081957.txt

Link to comment

After doing a little more research, it looks like I am running out of LOW memory.  I found this post which discusses this:

 

http://www.redhat.com/archives/taroon-list/2007-August/msg00006.html

 

It looks like the problem stems from having a large amount of memory (16GB) and a 32-bit kernel.  The kernel uses low memory to track memory allocations.  Eric recommends (in order of desirability):

 

1) using a 64-bit kernel

2) using a hugemem kernel

3) setting vm.lower_zone_protection to at least 250

4) disabling oom-killer

 

Since options 1 and 2 aren't really doable (easily?), I looked into adjusting the lower_zone_protection tunable.  After taking a peek in /proc/sys/vm, I discovered that it doesn't exist with the 3.x kernels that unRAID 5 uses.  This parameter has been replaced by lowmem_reserve_ratio, but the syntax is a little different.  I found a page (https://bugzilla.redhat.com/show_bug.cgi?id=536734) that recommended setting it to "256 256 250".  My default values were "256 32 32".
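For anyone wanting to try the same change, here is a minimal sketch of reading and writing the tunable. The function name and the RATIO_FILE override are made up for illustration; only the /proc path and the two value strings come from the post above. Values written to /proc do not survive a reboot, so the call would need to be repeated from a startup script (on unRAID that is commonly /boot/config/go, but treat that as an assumption about your setup).

```shell
#!/bin/bash
# Sketch: show and change vm.lowmem_reserve_ratio.
# RATIO_FILE can be overridden to try the function against a scratch file.
RATIO_FILE="${RATIO_FILE:-/proc/sys/vm/lowmem_reserve_ratio}"

# set_lowmem_reserve_ratio "<values>"  (hypothetical helper name)
set_lowmem_reserve_ratio() {
    local new="$1"
    echo "before: $(cat "$RATIO_FILE")"
    # Writing needs root when RATIO_FILE is the real /proc entry.
    echo "$new" > "$RATIO_FILE" || return 1
    echo " after: $(cat "$RATIO_FILE")"
}

# Example (as root):
#   set_lowmem_reserve_ratio "256 256 250"
```

The number of values the kernel expects depends on how many memory zones it was built with, so it is worth checking the "before" line first and keeping the same count.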

 

After making that change, I tried the same copy operation, and I noticed the available memory (and low memory seen by free -l) would decrease as before, but instead of everything getting killed when it got really low, it just seemed to hover there and keep chugging along.
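To watch low memory during a copy without dropping caches, something like the following could be used. It is only a sketch: the function names and the MEMINFO override are invented here, and the LowFree/LowTotal fields only show up in /proc/meminfo on kernels built with high-memory support (i.e. 32-bit), which appears to be the case for unRAID 5.

```shell
#!/bin/bash
# Sketch: log low-memory figures from /proc/meminfo at a fixed interval.
MEMINFO="${MEMINFO:-/proc/meminfo}"

# Print the value (in kB) of one /proc/meminfo field, e.g. LowFree.
# Prints nothing if the field is not present.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "$MEMINFO"
}

# Print LowFree and MemFree every <interval> seconds until interrupted.
watch_lowmem() {
    local interval="${1:-10}"
    while true; do
        echo "$(date '+%H:%M:%S') LowFree=$(meminfo_kb LowFree) kB MemFree=$(meminfo_kb MemFree) kB"
        sleep "$interval"
    done
}

# Example: run in a second console while the copy is going
#   watch_lowmem 10
```

Redirecting the output to a file on the flash drive would leave a record of how low LowFree got just before the OOM killer fired.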

 

So...  Is this the correct "fix?"  Should unRAID detect systems with a large amount of memory and set this for us?  Is it possible to get a 64-bit kernel for unRAID?  Has anybody else seen this issue?  Does anybody have any experience tuning the lowmem_reserve_ratio parameter?

 

Thanks!

-Vince

Link to comment

Well, this didn't fix my problem after all. :(

 

It takes longer to crash now, but it still crashes.  For kicks, I took 12GB of memory out of the system, reducing it to 4GB.  Everything seems to work fine now -- I promise. :)

 

Does anybody have any ideas?  It'd be nice to be able to throw all of this memory in there.

Link to comment

This is an interesting discussion. The general consensus has been that since unRAID supports larger amounts of memory, there was never any need for a 64-bit build. But the days when 512MB of RAM was the norm are long gone. Building a new system with 8GB+ RAM adds a couple of percent to the overall cost and is likely the new norm.

 

I would like to see a 64-bit kernel, but obviously that's a huge deal.

 

Back on topic... have you considered setting up a page file?

Link to comment

I agree.  16GB was $89.  A 64-bit kernel sure seems like it would be a good idea.  Especially once you start adding apps on top of unRAID.

 

I did try creating a 1GB swap file for fun, but it still crashed.  And it hadn't used any of the swap -- at least according to the last couple lines recorded in the syslog...
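For reference, a swap file of that size can be set up roughly as follows. This is a sketch, not the exact commands used in the post: the function names are made up and the path in the example is a hypothetical location, so substitute a disk that is always mounted.

```shell
#!/bin/bash
# Sketch: create and enable a swap file.

# Create a zero-filled file of <size_mb> megabytes, readable by owner only.
create_swapfile() {
    local path="$1" size_mb="$2"
    dd if=/dev/zero of="$path" bs=1M count="$size_mb" 2>/dev/null || return 1
    chmod 600 "$path"
}

# Format the file as swap space and turn it on (needs root).
enable_swapfile() {
    local path="$1"
    mkswap "$path" && swapon "$path"
}

# Example (as root; /mnt/disk1/swapfile is a hypothetical path):
#   create_swapfile /mnt/disk1/swapfile 1024 && enable_swapfile /mnt/disk1/swapfile
#   swapon -s    # the "Used" column shows whether the swap is being touched
```

If `swapon -s` keeps reporting next to nothing in use right up to the crash, that is at least consistent with the low-memory theory, since swap gives the kernel somewhere to put process pages but does not obviously enlarge the low zone.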

Link to comment

I agree, send an e-mail to Tom at lime-tech pointing him to this thread.

 

You could also try disabling the OOM process killer, the last option on that list, in addition to the settings you already tried changing.

It will still fail on the program requesting memory, and it might still crash the server, but it won't be killing off samba or emhttp.
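One way to get that effect on a recent kernel is to exempt specific processes rather than switch the OOM killer off globally. The sketch below assumes the kernel exposes /proc/<pid>/oom_score_adj (the syslog line earlier in the thread prints an oom_score_adj value, which suggests it does); the function names and the PROC_ROOT override are invented for illustration.

```shell
#!/bin/bash
# Sketch: exempt selected processes from the OOM killer.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Write -1000 (the "never kill" value) to a PID's oom_score_adj.
protect_pid() {
    local pid="$1"
    printf '%s\n' -1000 > "$PROC_ROOT/$pid/oom_score_adj"
}

# Protect every running process with the given name.
protect_by_name() {
    local name="$1" pid
    for pid in $(pidof "$name"); do
        protect_pid "$pid" && echo "protected $name (pid $pid)"
    done
}

# Example (as root):
#   protect_by_name emhttp
#   protect_by_name smbd
```

This does not create any memory; if low memory is truly exhausted the kernel will pick another victim or the allocation will fail, so it only changes what gets killed.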

 

My older unRAID server has only 512 MB of RAM.  I can run it out of RAM by running too many processes.

(Compiling ffmpeg can do it if I don't add a swap file.)  The eventual solution needs to take into account that there are a lot of older servers with much less RAM (and where it is not possible, or economically feasible, to add memory).

 

Joe L.

Link to comment

Vince-

Have you tried adding a swap file?

 

Yes, I made a small (1GB) swap file at one point.  When it crashed, it showed that it hadn't used any of it.

 

Right, I see that above now.  I guess I'm not sure how to approach this problem.  To answer your earlier questions...

 

So...  Is this the correct "fix?"  Should unRAID detect systems with a large amount of memory and set this for us?

 

Not sure this will solve your problem, as I think you have verified.

 

Is it possible to get a 64-bit kernel for unRAID?

This is something I'd like to try but it's not going to happen until 5.0 is 'final'.

 

Has anybody else seen this issue?

I have not seen this specifically.  As Joe mentioned, there are times when many apps might be running and cause an OOM condition, but that is solved by setting up a swap file.

 

As for the redhat "huge" kernel - I think that is simply a "PAE" enabled kernel, which is what unRaid uses already.

Link to comment

As for the redhat "huge" kernel - I think that is simply a "PAE" enabled kernel, which is what unRaid uses already.

 

I'm not sure that's what it is...  Reading through comment 10 here: https://bugzilla.redhat.com/show_bug.cgi?id=241314#c10

 

Chris indicates hugemem was a set of RedHat-proprietary patches that made addressing large amounts of memory "reliable."  He indicates 16GB is the maximum that a normal kernel can handle -- which is not what I'm seeing, but it could be that things are different with 3.x...

 

I might try playing around with the mem= boot parameter to find a happy medium.
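When experimenting with mem=, it helps to confirm after each reboot that the kernel actually picked the value up. A small sketch follows; the function name and the CMDLINE override are made up here, and where the flag gets added (the bootloader config on the flash drive) depends on the install.

```shell
#!/bin/bash
# Sketch: report the mem= limit the running kernel was booted with.
CMDLINE="${CMDLINE:-/proc/cmdline}"

# Print the value of the mem= kernel parameter, or nothing if it is not set.
cmdline_mem_limit() {
    tr ' ' '\n' < "$CMDLINE" | sed -n 's/^mem=//p'
}

# Example:
#   echo "mem= limit: $(cmdline_mem_limit)"
#   grep MemTotal /proc/meminfo    # should be at or below the limit
```

Comparing MemTotal against the limit across a few values (4G, 8G, 12G) would show where the crash stops reproducing.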

 

As large-memory configurations become more common, it might be a good idea to have unRAID detect these scenarios and boot with a mem= flag so it won't crash at least.

 

Is anyone else using unRAID with 16GB RAM?

 

Thanks for the insight guys.

 

-Vince

Link to comment

Is anyone else using unRAID with 16GB RAM?

I recently went from 8GB to 12GB when I found 4 2GB sticks in my spares cupboard.

root@The-Vault:~# free -lm
             total       used       free     shared    buffers     cached
Mem:         11417      10687        729          0         79      10197
Low:           790        441        348
High:        10627      10246        380
-/+ buffers/cache:        410      11006
Swap:         3906          0       3906
root@The-Vault:~# egrep 'High|Low' /proc/meminfo
HighTotal:      10882120 kB
HighFree:         309136 kB
LowTotal:         808972 kB
LowFree:          357752 kB
root@The-Vault:~#

I also have swap space set up, which shows minimal usage:

root@The-Vault:~# swapon -s
Filename                                Type            Size    Used    Priority
/dev/sdb2                               partition       4000180 28      -1
root@The-Vault:~#

Cheers.

Link to comment
