unRAID Server Release 6.0-beta3-x86_64 Available


limetech


I have now tried a completely different USB key, which I formatted first. I copied the config directory from the old key and the complete contents of the download, but I still get the strange screen: as soon as bzimage is loaded, the screen changes to the attached picture.

 

Maybe your download is corrupted?  You can check the md5sum of the download zip file against the value on the Downloads page.
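
For anyone without an md5sum binary handy, the check suggested above can be sketched in Python. This is only an illustration: the function names are made up for this example, and the published value has to be copied by hand from the Downloads page.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path, published_md5):
    """Compare the file's digest with a value pasted from the download page."""
    return md5_of_file(path) == published_md5.strip().lower()
```

A match only shows the zip arrived intact; it says nothing about whether the files were copied to the USB key correctly.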

 

Hi Tom, I have now checked the md5 of the download I used for my 2 USB keys, and it matches the value on the Downloads page. Anything I can do? Strange thing: I used the same USB key in another PC and there it booted fine without these funny symbols. Could this motherboard have a problem with Xen?

 

These are the details of the working one:

 

Motherboard: Gigabyte Technology Co., Ltd. - X48-DQ6

CPU: Intel® Core2 Quad CPU

Speed: 2.133 GHz

Cache: 64 kB, 4096 kB

Memory: 6144 MB (max. 4 GB)

Network: eth0: 1000Mb/s - Full Duplex

Connections: Zero

Uptime:

Link to comment

SchoolBusDriver - in http://lime-technology.com/forum/index.php?topic=31653.msg288903#msg288903 you recommend (amongst other things):

allocate a CPU to dom0

don't allocate fixed amount of memory to dom0

 

Just came across this: http://wiki.xen.org/wiki/Xen_Best_Practices which agrees with your first recommendation, but has the opposite view on memory, i.e. DO allocate a fixed amount of memory.

 

Thanks for sharing. That is some great info.

 

The issue here is that Tom has dedicated 2 GB of memory to Dom0 in beta 2. With only 2 GB of RAM assigned, a server running Cache Directories will probably crash (as it did for one user already). Tom could dedicate 4 GB of memory to Dom0 to keep users who run Cache Directories from crashing.

 

1. A lot of us do not have more than 4GB of memory in our servers.

 

2. A lot of users do not use Cache Directories (I don't).

 

3. People who fall into 1 & 2 above wouldn't have any RAM left over to use for VMs because Tom assigned it all to Dom0. (We only use 512MB+ after unRAID boots and never more).

 

4. Most Xen Servers are bare bones and typically have 512mb to 1GB assigned to them. The rest is dedicated to VMs but those servers are vanilla and don't have memory hog apps like Cache Directories running.

 

5. If Tom doesn't dedicate memory to Dom0, it will get all of it. BUT... the VMs can still use it, and if a tug of war over memory were to happen between Dom0 and the VMs... Dom0 would win. The way it is now, Dom0 loses and your server crashes.
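
To put numbers on points 1 to 3, here is a back-of-the-envelope sketch. It assumes a fixed Dom0 allocation and ignores the memory the hypervisor itself needs, so real figures will be somewhat lower; the function name is made up for this example.

```python
def memory_left_for_guests(total_mb, dom0_mb):
    """RAM left for VMs once Dom0 has its fixed share (hypervisor overhead ignored)."""
    return max(total_mb - dom0_mb, 0)

# Compare a 4 GB and an 8 GB server with 2 GB or 4 GB pinned to Dom0.
for total in (4096, 8192):
    for dom0 in (2048, 4096):
        left = memory_left_for_guests(total, dom0)
        print(f"{total} MB total, {dom0} MB for Dom0 -> {left} MB for VMs")
```

On a 4 GB box, a 4 GB Dom0 allocation leaves nothing for VMs, which is the concern in point 3.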

 

Link to comment

The issue here is Tom has dedicated 2 GB of memory in beta 2.

This is very easily changed via syslinux.cfg for experimental purposes and it would be interesting to get reports from people doing that.
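
For anyone who wants to script that experiment, a rough sketch follows. `dom0_mem` is the Xen hypervisor boot option for this; the sample boot line below is hypothetical and is NOT copied from the shipped syslinux.cfg, so compare it with your own file before changing anything.

```python
import re

def set_dom0_mem(append_line, new_value):
    """Return append_line with its dom0_mem=... token replaced by new_value."""
    pattern = re.compile(r"\bdom0_mem=\S+")
    if not pattern.search(append_line):
        raise ValueError("no dom0_mem= option found on this line")
    # A function replacement avoids any backslash handling in new_value.
    return pattern.sub(lambda m: "dom0_mem=" + new_value, append_line, count=1)

# Hypothetical boot line, for illustration only.
example = "append /xen dom0_mem=2048M --- /bzimage --- /bzroot"
print(set_dom0_mem(example, "4096M"))
```

Keep a backup of the original file on the flash drive so you can boot again if the edit goes wrong.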

 

With only 2GB of RAM assigned, running Cache Directories your server will probably crash (as it did for one user already).  Tom could dedicate 4GB of memory to Dom0 to prevent the users who run Cache Directories from crashing.

No way 'cache dirs' caused any OOM condition.  All it does is periodically do a 'find' on the user share file system to try and keep the inodes in memory. The reason I allocated 2GB, as stated before, was to get around a boot issue I was seeing.  The crash (link?) was due to something else.

 

Edit: Yeah cache_dirs can indeed cause OOM with large directories.

 

Link to comment

The issue here is Tom has dedicated 2 GB of memory in beta 2.

This is very easily changed via syslinux.cfg for experimental purposes and it would be interesting to get reports from people doing that.

 

With only 2GB of RAM assigned, running Cache Directories your server will probably crash (as it did for one user already).  Tom could dedicate 4GB of memory to Dom0 to prevent the users who run Cache Directories from crashing.

No way 'cache dirs' caused any OOM condition.  All it does is periodically do a 'find' on the user share file system to try and keep the inodes in memory.  The reason I allocated 2GB, as stated before, was to get around a boot issue I was seeing.  The crash (link?) was due to something else.

 

Hey Tom,

 

Here is a link to the post regarding the crash:

 

http://lime-technology.com/forum/index.php?topic=31653.msg288851#msg288851

 

Ogi

Link to comment

Feb  4 19:06:38 Tower kernel: Out of memory: Kill process 11269 (qemu-dm) score 60 or sacrifice child
Feb  4 19:06:38 Tower kernel: Killed process 11269 (qemu-dm) total-vm:352628kB, anon-rss:78148kB, file-rss:5948kB

Can you make it happen again?  Maybe have a 'tail -f /var/log/syslog' running in a telnet window.
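
If watching a live tail is not practical, a saved syslog can be searched for OOM kills afterwards. A minimal sketch, assuming the kernel message format shown in the two lines quoted above (other kernel versions word it differently):

```python
import re

# Matches the "Out of memory: Kill process PID (name) score N" kernel message.
OOM_KILL = re.compile(r"Out of memory: Kill process (\d+) \(([^)]+)\) score (\d+)")

def find_oom_kills(lines):
    """Return (pid, process name, score) for every OOM kill found in lines."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return kills

sample = [
    "Feb  4 19:06:38 Tower kernel: Out of memory: Kill process 11269 (qemu-dm) score 60 or sacrifice child",
    "Feb  4 19:06:38 Tower kernel: Killed process 11269 (qemu-dm) total-vm:352628kB, anon-rss:78148kB, file-rss:5948kB",
]
print(find_oom_kills(sample))
```

To scan a real log, pass an open file instead of the sample list, for example `find_oom_kills(open("/var/log/syslog"))`.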

Link to comment

I can try. I added more RAM to Dom0 (upped it to 5 GB) and am running cache_dirs now; it's at 2.8 GB and climbing. I will run that command once it gets closer to 5 GB and let it die.

 

xentop - 14:39:58   Xen 4.3.1
3 domains: 2 running, 1 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 8387900k total, 8150848k used, 237052k free    CPUs: 12 @ 1809MHz
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR  VBD_RSECT  VBD_WSECT SSID
    archVM --b---         28    0.0     524288    6.3     525312       6.3     1    1    14234     7458    0        0        0        0          0          0    0
  Domain-0 -----r      27855  249.6    5677428   67.7   no limit       n/a     9    0        0        0    0        0        0        0          0          0    0
  Windows7 -----r      23369   30.1    1832804   21.9    1836032      21.9     2    1 12696548  9313423    0        0        0        0          0          0    0

 

Arch = Mumble server

Windows = uTorrent, Sick Beard, CouchPotato, PIA VPN

Unraid = Plex, cache_dir (I will be moving plex off to Arch once the storage issue is sorted)
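
To see at a glance who holds the memory in a xentop dump like the one above, the MEM(k) column can be totalled. A rough sketch: the rows are trimmed copies of the three domain lines, the total is the "Mem: ... total" figure, and it assumes domain names contain no spaces.

```python
def parse_xentop_mem(rows):
    """Map domain name -> MEM(k), taken from the fifth whitespace-separated column."""
    mem = {}
    for row in rows:
        fields = row.split()
        # Column order: NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) ...
        mem[fields[0]] = int(fields[4])
    return mem

TOTAL_K = 8387900  # "Mem: 8387900k total" from the xentop header
rows = [
    "archVM --b---         28    0.0     524288    6.3",
    "Domain-0 -----r      27855  249.6    5677428   67.7",
    "Windows7 -----r      23369   30.1    1832804   21.9",
]

mem = parse_xentop_mem(rows)
for name, kib in mem.items():
    print(f"{name}: {kib} k ({100 * kib / TOTAL_K:.1f}% of total)")
print("sum over domains:", sum(mem.values()), "k")
```

The three domains add up to a little less than the "used" figure in the header, which also counts memory that belongs to no domain.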

Link to comment

Got it to crash again. Unfortunately I was not running the command you asked for when it failed (however, I was running htop!), as I wasn't expecting it to crash. This is definitely related to cache_dirs: it ran fine for the past 18 hours until I came home from work and started cache_dirs to see if I could get it to crash again....

 

20140205_161405.jpg

 

9dyR7kC.png

 

-----------------------------------------------------------------

The only change to cache_dirs I am using is ulimit -v 30720 (30 MB)

 

It will work on 6.0 if you comment out the "ulimit -v 5000" line.  Use this suggestion at your own risk, there's probably a better solution than completely commenting the line out.

 

Aha! Interesting how attempting to limit cache_dirs to 5MB of virtual memory causes the errors.
EDIT: It seems setting that value to anything less than 15MB (15360) causes things to break.


 

ref: http://lime-technology.com/forum/index.php?topic=4500.msg287629#msg287629
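
A note on the units behind those numbers: bash's `ulimit -v` takes its value in 1024-byte blocks, so 30720 is 30 MB, 15360 is 15 MB and 5000 is a bit under 5 MB. The sketch below does the conversion and shows one way to start a command under the same kind of limit from Python on Linux. `run_with_vm_limit` is a hypothetical helper, not part of cache_dirs.

```python
import resource
import subprocess

def ulimit_v_to_mib(kib):
    """Convert a `ulimit -v` value (1024-byte blocks) to MiB."""
    return kib / 1024

def run_with_vm_limit(cmd, limit_kib):
    """Run cmd with its virtual memory capped, like `ulimit -v limit_kib` would."""
    limit_bytes = limit_kib * 1024

    def apply_limit():
        # Runs in the child process just before cmd is executed.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(cmd, preexec_fn=apply_limit)

for value in (5000, 15360, 30720):
    print(f"ulimit -v {value} -> {ulimit_v_to_mib(value):.2f} MiB")
```

The limit covers virtual address space rather than resident memory, which may be part of why a value that looks generous can still be too small for a 64-bit process.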

syslog.zip

syslog-20140205-053255.zip

Link to comment

Crashed again 3rd time....... output from root@Tower:~# tail -f /var/log/syslog

 

Tower login: root
Linux 3.10.24p-unRAID.
Last login: Wed Feb  5 15:40:57 -0800 2014 on /dev/pts/0 from Office.
root@Tower:~# tail -f /var/log/syslog
Feb  5 15:25:45 Tower kernel: br0: port 2(vif1.0) entered listening state
Feb  5 15:26:00 Tower kernel: br0: port 2(vif1.0) entered learning state
Feb  5 15:26:15 Tower kernel: br0: topology change detected, propagating
Feb  5 15:26:15 Tower kernel: br0: port 2(vif1.0) entered forwarding state
Feb  5 15:32:22 Tower sudo:     root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/etc/rc.d/rc.samba restart > /dev/null 2>&1
Feb  5 15:32:54 Tower shfs/user: shfs_rmdir: rmdir: /mnt/disk1/BIN/vmwin-pc/mnt/user/Data (39) Directory not empty
Feb  5 15:40:55 Tower in.telnetd[5022]: connect from 192.168.2.80 (192.168.2.80)
Feb  5 15:40:57 Tower login[5023]: ROOT LOGIN  on '/dev/pts/0' from 'Office'
Feb  5 15:41:39 Tower in.telnetd[5065]: connect from 192.168.2.80 (192.168.2.80)
Feb  5 15:41:40 Tower login[5066]: ROOT LOGIN  on '/dev/pts/3' from 'Office'
Feb  5 15:48:02 Tower in.telnetd[5523]: connect from 192.168.2.80 (192.168.2.80)
Feb  5 15:48:04 Tower login[5524]: ROOT LOGIN  on '/dev/pts/4' from 'Office'
Feb  5 15:48:24 Tower in.telnetd[5537]: connect from 192.168.2.80 (192.168.2.80)
Feb  5 15:48:26 Tower login[5538]: ROOT LOGIN  on '/dev/pts/4' from 'Office'
Feb  5 17:03:04 Tower kernel: BUG: unable to handle kernel paging request at 000000810000007a
Feb  5 17:03:04 Tower kernel: IP: [<ffffffff810a0adb>] isolate_migratepages_range+0x2cb/0x600
Feb  5 17:03:04 Tower kernel: PGD 0
Feb  5 17:03:04 Tower kernel: Oops: 0000 [#1] SMP
Feb  5 17:03:04 Tower kernel: Modules linked in: tun md_mod w83627hf hwmon_vid xen_netback xen_blkback xen_gntalloc xen_gntdev bridge stp llc sata_mv mperf k10temp hwmon shpchp pci_hotplug forcedeth sata_nv pata_amd [last unloaded: md_mod]
Feb  5 17:03:04 Tower kernel: CPU: 5 PID: 3528 Comm: Plex Media Serv Not tainted 3.10.24p-unRAID #13
Feb  5 17:03:04 Tower kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014  10/22/2009
Feb  5 17:03:04 Tower kernel: task: ffff8801d496e180 ti: ffff8801ce87c000 task.ti: ffff8801ce87c000
Feb  5 17:03:04 Tower kernel: RIP: e030:[<ffffffff810a0adb>]  [<ffffffff810a0adb>] isolate_migratepages_range+0x2cb/0x600
Feb  5 17:03:04 Tower kernel: RSP: e02b:ffff8801ce87d7e0  EFLAGS: 00010002
Feb  5 17:03:04 Tower kernel: RAX: 0000008100000002 RBX: ffff8801ce87d8c0 RCX: 0000000000000002
Feb  5 17:03:04 Tower kernel: RDX: 8000000000000000 RSI: 0000000000000004 RDI: 000000000000000c
Feb  5 17:03:04 Tower kernel: RBP: ffff8801ce87d878 R08: ffff880007fffdc0 R09: 0000000000000001
Feb  5 17:03:04 Tower kernel: R10: 0000000000000000 R11: 0000000000000771 R12: 0000000000100e0e
Feb  5 17:03:04 Tower kernel: R13: ffffffff81688ac0 R14: 0000000000000001 R15: ffffea0004038380
Feb  5 17:03:04 Tower kernel: FS:  00007f23eb63b700(0000) GS:ffff880210d40000(0000) knlGS:0000000000000000
Feb  5 17:03:04 Tower kernel: CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb  5 17:03:04 Tower kernel: CR2: 000000810000007a CR3: 00000001d2654000 CR4: 0000000000000660
Feb  5 17:03:04 Tower kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb  5 17:03:04 Tower kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb  5 17:03:04 Tower kernel: Stack:
Feb  5 17:03:04 Tower kernel:  ffffffff00000001 ffff8801ce87dfd8 ffff8801d496e180 0000000000101000
Feb  5 17:03:04 Tower kernel:  0000000000000403 0000000000000150 0000000000000000 ffffffff81688f40
Feb  5 17:03:04 Tower kernel:  ffff8801ce87d8d0 000000000087d850 ffffea0004032fc0 0000000000000008
Feb  5 17:03:04 Tower kernel: Call Trace:
Feb  5 17:03:04 Tower kernel:  [<ffffffff810a10d7>] compact_zone+0x23f/0x2cf
Feb  5 17:03:04 Tower kernel:  [<ffffffff810a1200>] compact_zone_order+0x99/0xae
Feb  5 17:03:04 Tower kernel:  [<ffffffff810a13dd>] try_to_compact_pages+0x9a/0xe7
Feb  5 17:03:04 Tower kernel:  [<ffffffff81494360>] __alloc_pages_direct_compact+0xa4/0x18e
Feb  5 17:03:04 Tower kernel:  [<ffffffff810915c9>] __alloc_pages_nodemask+0x4a0/0x858
Feb  5 17:03:04 Tower kernel:  [<ffffffff813e2cab>] sk_page_frag_refill+0x74/0x132
Feb  5 17:03:04 Tower kernel:  [<ffffffff8141c510>] tcp_sendmsg+0x404/0xb6e
Feb  5 17:03:04 Tower kernel:  [<ffffffff8143b601>] inet_sendmsg+0x58/0x91
Feb  5 17:03:04 Tower kernel:  [<ffffffff8149ba72>] ? _raw_spin_unlock_irqrestore+0x19/0x1c
Feb  5 17:03:04 Tower kernel:  [<ffffffff813dee1d>] sock_sendmsg+0x6d/0x80
Feb  5 17:03:04 Tower kernel:  [<ffffffff8149afeb>] ? io_schedule+0xae/0xd1
Feb  5 17:03:04 Tower kernel:  [<ffffffff8149976e>] ? __wait_on_bit_lock+0x76/0x85
Feb  5 17:03:04 Tower kernel:  [<ffffffff813dfdc8>] ___sys_sendmsg.part.32+0x163/0x1d1
Feb  5 17:03:04 Tower kernel:  [<ffffffff811ed6c7>] ? fuse_file_aio_read+0x7f/0x88
Feb  5 17:03:04 Tower kernel:  [<ffffffff810cc552>] ? do_sync_read+0x7a/0x9f
Feb  5 17:03:04 Tower kernel:  [<ffffffff8149ba1b>] ? _raw_spin_lock+0x9/0xd
Feb  5 17:03:04 Tower kernel:  [<ffffffff810dd2c7>] ? dput+0xd3/0x175
Feb  5 17:03:04 Tower kernel:  [<ffffffff813e0ab4>] __sys_sendmsg+0x49/0x6a
Feb  5 17:03:04 Tower kernel:  [<ffffffff813e0ade>] SyS_sendmsg+0x9/0xb
Feb  5 17:03:04 Tower kernel:  [<ffffffff8149c729>] system_call_fastpath+0x16/0x1b
Feb  5 17:03:04 Tower kernel: Code: 85 c0 0f 89 39 02 00 00 49 8b 17 4c 89 f8 80 e6 80 74 04 49 8b 47 30 8b 40 1c ff c8 0f 85 1f 02 00 00 49 8b 47 08 48 85 c0 74 0d <48> 8b 40 78 48 c1 e8 1d 83 e0 01 eb 02 31 c0 85 c0 0f 84 ff 01
Feb  5 17:03:04 Tower kernel: RIP  [<ffffffff810a0adb>] isolate_migratepages_range+0x2cb/0x600
Feb  5 17:03:04 Tower kernel:  RSP <ffff8801ce87d7e0>
Feb  5 17:03:04 Tower kernel: CR2: 000000810000007a
Feb  5 17:03:04 Tower kernel: ---[ end trace 654e47442e593b34 ]---

Link to comment

Not sure if cache_dirs was running during the 3rd crash...... now running a swap file (plg) and no cache_dirs (deleted from box).

If this runs OK, re-enable cache_dirs and use a swap file.  I thought you had posted the cache_dirs command and options you were using somewhere, but I can't find it now; please repost if you don't mind.

Link to comment

Not sure if cache_dirs was running during the 3rd crash...... now running a swap file (plg) and no cache_dirs (deleted from box).

If this runs OK, re-enable cache_dirs and use a swap file.  I thought you had posted the cache_dirs command and options you were using somewhere, but I can't find it now; please repost if you don't mind.

 

Pyrater: The only change to cache_dirs I am using is ulimit -v 30720 (30 MB)

 

It will work on 6.0 if you comment out the "ulimit -v 5000" line.  Use this suggestion at your own risk, there's probably a better solution than completely commenting the line out.

 

Aha! Interesting how attempting to limit cache_dirs to 5MB of virtual memory causes the errors.

EDIT: It seems setting that value to anything less than 15MB (15360) causes things to break.

 

I'll let this run tonight with the swap file and without cache_dirs to make sure we are not chasing the wrong thing. Assuming this is stable, I will then re-enable cache_dirs. Currently rocking:

 

Swapfile Plugin local version: v0.5.3

Swap file location and filename: /mnt/cache/swapfile

Swap file size: 4096.06 MB used: 0.03 MB

Link to comment

Maybe not what people want to hear but just don't use cache dirs.

 

It's a great concept, though. After chasing my tail with OOMs in the past, I gave up, reorganized my media, and now use disk shares for sources instead of one share for all video.  Not one OOM since.

 

The whole point is to prevent spin ups.

If you use direct disk paths in xbmc, you scan your paths individually as you add media.

The only times xbmc will cause a spin up apart from actually watching something, is if you force a scan or didn't let it finish building all the artwork from a previous scan.

 

Link to comment

What if you just allocate more memory to the unRAID machine?

This is supposed to be 64bit with an increased capacity for memory and memory management.

 

cache_dirs never worked for me, I always had memory issues. Even just rsyncing huge directories in parallel would do this.

It depends on how many files you have vs how much low memory is available.

I've always had to drop the cache before and after an operation with a huge amount of directories.

 

I thought migration to 64bit would have made this less of an issue.

 

Look at all the Plex Media Server processes. Those could also be competing for RAM.
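
For reference, dropping the cache as described above goes through the kernel's /proc/sys/vm/drop_caches file: 1 frees the page cache, 2 frees dentries and inodes, 3 frees both. A minimal sketch, which needs root on a real system; the function name and the `path` parameter are only there for illustration and dry runs.

```python
import os

def drop_caches(level=3, path="/proc/sys/vm/drop_caches"):
    """Flush dirty data to disk, then ask the kernel to drop clean caches."""
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2 or 3")
    os.sync()  # only clean pages can be dropped, so write dirty ones out first
    with open(path, "w") as fh:
        fh.write(f"{level}\n")
```

Note that level 2 or 3 throws away exactly the directory entries cache_dirs tries to keep in memory, so the next directory scan will hit the disks again.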

Link to comment

I have now tried a completely different USB key, which I formatted first. I copied the config directory from the old key and the complete contents of the download, but I still get the strange screen: as soon as bzimage is loaded, the screen changes to the attached picture.

 

Maybe your download is corrupted?  You can check the md5sum of the download zip file against the value on the Downloads page.

 

Hi Tom, I have now checked the md5 of the download I used for my 2 USB keys, and it matches the value on the Downloads page. Anything I can do? Strange thing: I used the same USB key in another PC and there it booted fine without these funny symbols. Could this motherboard have a problem with Xen?

 

These are the details of the working one:

 

Motherboard: Gigabyte Technology Co., Ltd. - X48-DQ6

CPU: Intel® Core2 Quad CPU

Speed: 2.133 GHz

Cache: 64 kB, 4096 kB

Memory: 6144 MB (max. 4 GB)

Network: eth0: 1000Mb/s - Full Duplex

Connections: Zero

Uptime:

 

I had something similar on a server recently....my (very old) video card was on its way out. I found this after I had sent the motherboard out and paid a "fault not found" penalty :(

 

I replaced the video card and all was good.

Tony

Link to comment

It has been stable for the last 10 hours without cache_dirs, using less than 1400 MB of the 5000 MB of RAM I allocated to it. It is currently using 318 MB of the 4 GB swap file I set up as well. I will continue to monitor it, enable cache_dirs after work tonight, and see if the swap file fixed it.

Link to comment

It's hard for me to believe that any network protocol running between unRaid storage and a domU via a virtual network connection is going to be slower than the speed of a single disk, even an SSD.  If you have a raid5 setup with a large stripe size, then sure, but that's not unRaid.

 

I don't doubt that virtfs is faster, but anecdotally I'm not seeing any speed difference between writing to my SSD cache drive over Samba and directly. A caveat: I was blown away by the 10x speedup with network writes after switching to SSD, so maybe it's 9x versus 10x and I'm not noticing the difference.

 

On a related note, I used to have a separate computer writing 2 OTA 1080p mpeg streams to a dedicated disk, and it would have problems. Now I'm running that in a Windows VM that is writing to a network mount to an unraid share (with cache disk disabled), and it's rock-solid.

Link to comment

hi there all,

 

Today I installed the beta on my production server as I wanted to test Xen. However, as soon as I select the Xen option it just shows a screen with lots of strange numbers. Unfortunately I don't know how to capture it other than with a picture; I have already booted several times. When I choose the non-Xen option it works fine. Below is the mainboard I am using. I have attached the syslog from the normal boot.

 

Motherboard: ASUSTeK Computer INC. - P7H55-M LX

CPU: Intel® Core i3 CPU 540 @ 3.07GHz

Speed: 3.066 GHz

Cache: 128 kB, 512 kB, 4096 kB

Memory: 4096 MB (max. 16 GB)

Network: eth0: 1000Mb/s - Full Duplex

My guess is "virtualization technology" is not turned on in your mobo BIOS settings.

 

Actually the opposite: I had VT-d enabled in the BIOS, but my CPU does not support it. I never had a problem before, but I guess it was not called upon. So now I just enabled VT-x and disabled VT-d and it booted!!!
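
A quick way to check the CPU side of this from a normal (non-Xen) boot is to look at the flags in /proc/cpuinfo: `vmx` means Intel VT-x and `svm` means AMD-V. VT-d is an IOMMU feature and does not appear as a CPU flag, so this sketch cannot tell you about it. The function name is made up for this example.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags (vmx and/or svm) listed in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # Only the plain "flags : ..." lines are inspected.
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            found.update(flag for flag in flags if flag in ("vmx", "svm"))
    return found

# On the server itself:
#   print(virtualization_flags(open("/proc/cpuinfo").read()))
```

An empty result can also mean the feature is switched off in the BIOS, so it is worth checking there before concluding that the CPU lacks it.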

Link to comment

OK, so I am now in the webGui, but I cannot create a share for my newly installed cache drive. Under Main it shows the new drive as Cache, but every time I try to create a share via the GUI and click Add Share, it thinks for a few seconds and then goes back to the Add Share screen. I don't get the option to enable NFS/Samba. I had the same problem with 5.05.

 

system log please

Link to comment
