MrLondon Posted February 5, 2014

I now tried a completely different USB key, which I formatted beforehand. I copied the config directory from the old key and the complete contents of the download, but I still get the strange screen: as soon as the bzimage is loaded, the screen changes to the attached picture.

Maybe your download is corrupted? You can check the md5sum of the download zip file against the value on the Downloads page.

Hi Tom, I have now checked the md5 of the download I used for my two USB keys, and it matches the value on the Downloads page. Is there anything I can do?

Strange thing: I used the same USB key in another PC and it booted fine there, without these funny symbols. Could this motherboard have a problem with Xen? These are the details of the working one:

Motherboard: Gigabyte Technology Co., Ltd. - X48-DQ6
CPU: Intel® Core2 Quad CPU
Speed: 2.133 GHz
Cache: 64 kB, 4096 kB
Memory: 6144 MB (max. 4 GB)
Network: eth0: 1000Mb/s - Full Duplex
Connections: Zero
Uptime:
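For anyone who prefers to script the checksum comparison mentioned above rather than run md5sum by hand, here is a minimal Python sketch. The function names are made up for illustration, and the published value has to be copied from the Downloads page yourself; nothing here is specific to unRAID.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large zip never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare a file's MD5 with a published value, ignoring case and whitespace."""
    return md5_of_file(path) == published_md5.strip().lower()
```

Call matches_published() with the path to the downloaded zip and the value shown on the Downloads page. A match only rules out a corrupted download; it says nothing about the USB key or the hardware.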
SchoolBusDriver Posted February 5, 2014

SchoolBusDriver - in http://lime-technology.com/forum/index.php?topic=31653.msg288903#msg288903 you recommend (amongst other things):

allocate a CPU to dom0
don't allocate a fixed amount of memory to dom0

Just came across this: http://wiki.xen.org/wiki/Xen_Best_Practices which agrees with your first recommendation but takes the opposite view on memory, i.e. DO allocate a fixed amount of memory.

Thanks for sharing. That is some great info.

The issue here is Tom has dedicated 2 GB of memory in beta 2. With only 2 GB of RAM assigned, a server running Cache Directories will probably crash (as it already did for one user). Tom could dedicate 4 GB of memory to Dom0 to keep users who run Cache Directories from crashing.

1. A lot of users do not have more than 4 GB of memory in their servers.
2. A lot of users do not use Cache Directories (I don't).
3. People who fall into 1 and 2 above wouldn't have any RAM left over to use for VMs, because Tom would have assigned it all to Dom0. (We only use 512 MB+ after unRAID boots and never more.)
4. Most Xen servers are bare bones and typically have 512 MB to 1 GB assigned to them. The rest is dedicated to VMs, but those servers are vanilla and don't have memory-hog apps like Cache Directories running.
5. If Tom doesn't dedicate memory to Dom0, it will get all of it. BUT... the VMs can still use it, and if a tug of war over memory were to happen between Dom0 and the VMs, Dom0 would win. The way it is now, Dom0 loses and your server crashes.
limetech (Author) Posted February 5, 2014

The issue here is Tom has dedicated 2 GB of memory in beta 2.

This is very easily changed via syslinux.cfg for experimental purposes, and it would be interesting to get reports from people doing that.

With only 2GB of RAM assigned, running Cache Directories your server will probably crash (as it did for one user already). Tom could dedicate 4GB of memory to Dom0 to prevent the users who run Cache Directories from crashing.

No way 'cache dirs' caused any OOM condition. All it does is periodically do a 'find' on the user share file system to try to keep the inodes in memory. The reason I allocated 2GB, as stated before, was to get around a boot issue I was seeing. The crash (link?) was due to something else.

Edit: Yeah, cache_dirs can indeed cause OOM with large directories.
ogi Posted February 5, 2014

The issue here is Tom has dedicated 2 GB of memory in beta 2.

This is very easily changed via syslinux.cfg for experimental purposes and it would be interesting to get reports from people doing that.

With only 2GB of RAM assigned, running Cache Directories your server will probably crash (as it did for one user already). Tom could dedicate 4GB of memory to Dom0 to prevent the users who run Cache Directories from crashing.

No way 'cache dirs' caused any OOM condition. All it does is periodically do a 'find' on the user share file system to try and keep the inodes in memory. The reason I allocated 2GB, as stated before, was to get around a boot issue I was seeing. The crash (link?) was due to something else.

Hey Tom,

Here is a link to the post regarding the crash: http://lime-technology.com/forum/index.php?topic=31653.msg288851#msg288851

Ogi
pyrater Posted February 5, 2014

Feb 4 19:06:38 Tower kernel: Out of memory: Kill process 11269 (qemu-dm) score 60 or sacrifice child
Feb 4 19:06:38 Tower kernel: Killed process 11269 (qemu-dm) total-vm:352628kB, anon-rss:78148kB, file-rss:5948kB
limetech (Author) Posted February 5, 2014

Feb 4 19:06:38 Tower kernel: Out of memory: Kill process 11269 (qemu-dm) score 60 or sacrifice child
Feb 4 19:06:38 Tower kernel: Killed process 11269 (qemu-dm) total-vm:352628kB, anon-rss:78148kB, file-rss:5948kB

Can you make it happen again? Maybe have a 'tail -f /var/log/syslog' running in a telnet window.
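The "Killed process" line quoted above follows a fixed layout, so a saved syslog can be scanned for earlier OOM kills after the fact. The following is a minimal Python sketch, assuming the exact message layout shown in this thread (other kernel versions word this line differently); the name find_oom_kills is hypothetical.

```python
import re

# Matches the "Killed process" line as it appears in the log excerpt above.
OOM_KILL = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def find_oom_kills(lines):
    """Return one dict per OOM-kill line found in an iterable of syslog lines."""
    kills = []
    for line in lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append({
                "pid": int(match.group("pid")),
                "name": match.group("name"),
                "total_vm_kb": int(match.group("total_vm")),
                "anon_rss_kb": int(match.group("anon_rss")),
                "file_rss_kb": int(match.group("file_rss")),
            })
    return kills
```

Pass it an open file object for /var/log/syslog, or for a syslog extracted from an attached zip. It only reports which process the kernel killed, not which process used up the memory in the first place.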
pyrater Posted February 5, 2014

I can try. I added more RAM to Dom0, upping it to 5 GB. With cache_dirs running, it is now at 2.8 GB and climbing. I will run that command once it gets closer to 5 GB and let it die.

xentop - 14:39:58 Xen 4.3.1
3 domains: 2 running, 1 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 8387900k total, 8150848k used, 237052k free CPUs: 12 @ 1809MHz
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
archVM --b--- 28 0.0 524288 6.3 525312 6.3 1 1 14234 7458 0 0 0 0 0 0 0
Domain-0 -----r 27855 249.6 5677428 67.7 no limit n/a 9 0 0 0 0 0 0 0 0 0 0
Windows7 -----r 23369 30.1 1832804 21.9 1836032 21.9 2 1 12696548 9313423 0 0 0 0 0 0 0

Arch = Mumble server
Windows = uTorrent, Sick Beard, CouchPotato, PIA VPN
unRAID = Plex, cache_dirs (I will be moving Plex off to Arch once the storage issue is sorted)
pyrater Posted February 5, 2014

When it crashed, I believe I had Dom0 set to only 1 GB of RAM while running the same addons.
pyrater Posted February 5, 2014

Got it to crash again. Unfortunately I was not running the command you asked for when it failed (however, I was running htop!), as I wasn't expecting it to crash. This is definitely related to cache_dirs, as it ran fine for the past 18 hours until I came home from work and started cache_dirs to see if I could get it to crash again....

-----------------------------------------------------------------

The only change to cache_dirs I am using is ulimit -v 30720 (30 MB).

It will work on 6.0 if you comment out the "ulimit -v 5000" line. Use this suggestion at your own risk, there's probably a better solution than completely commenting the line out.

Aha! Interesting how attempting to limit cache_dirs to 5MB of virtual memory causes the errors.

EDIT: It seems setting that value to anything less than 15MB or 15360 causes things to break.

ref: http://lime-technology.com/forum/index.php?topic=4500.msg287629#msg287629

syslog.zip syslog-20140205-053255.zip
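The ulimit values in the post above are easier to compare once the units are spelled out: bash's "ulimit -v" takes its argument in units of 1024 bytes, so 30720 is 30 MiB, 15360 is 15 MiB and 5000 is a little under 5 MiB. A minimal Python sketch of the conversion, plus a hypothetical helper that starts a command under the same kind of address-space cap using the standard resource module (Unix only). This is not how cache_dirs itself applies the limit; it is an illustration of the mechanism.

```python
import resource
import subprocess


def kib_to_mib(kib):
    """Convert a `ulimit -v` style value (units of 1024 bytes) to MiB."""
    return kib / 1024


def run_with_vm_limit(command, limit_kib):
    """Run `command` with its virtual memory capped at `limit_kib` KiB.

    Roughly what `ulimit -v limit_kib` followed by the command does in a shell.
    """
    limit_bytes = limit_kib * 1024

    def apply_limit():
        # Runs in the child just before exec; RLIMIT_AS is given in bytes.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(command, preexec_fn=apply_limit)
```

A command started under too small a limit fails when it cannot allocate or map memory, which would be consistent with the breakage reported below 15360, though the thread does not establish the exact cause.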
pyrater Posted February 6, 2014

Crashed again, 3rd time....... Output from root@Tower:~# tail -f /var/log/syslog

Tower login: root
Linux 3.10.24p-unRAID.
Last login: Wed Feb 5 15:40:57 -0800 2014 on /dev/pts/0 from Office.
root@Tower:~# tail -f /var/log/syslog
Feb 5 15:25:45 Tower kernel: br0: port 2(vif1.0) entered listening state
Feb 5 15:26:00 Tower kernel: br0: port 2(vif1.0) entered learning state
Feb 5 15:26:15 Tower kernel: br0: topology change detected, propagating
Feb 5 15:26:15 Tower kernel: br0: port 2(vif1.0) entered forwarding state
Feb 5 15:32:22 Tower sudo: root : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/etc/rc.d/rc.samba restart > /dev/null 2>&1
Feb 5 15:32:54 Tower shfs/user: shfs_rmdir: rmdir: /mnt/disk1/BIN/vmwin-pc/mnt/user/Data (39) Directory not empty
Feb 5 15:40:55 Tower in.telnetd[5022]: connect from 192.168.2.80 (192.168.2.80)
Feb 5 15:40:57 Tower login[5023]: ROOT LOGIN on '/dev/pts/0' from 'Office'
Feb 5 15:41:39 Tower in.telnetd[5065]: connect from 192.168.2.80 (192.168.2.80)
Feb 5 15:41:40 Tower login[5066]: ROOT LOGIN on '/dev/pts/3' from 'Office'
Feb 5 15:48:02 Tower in.telnetd[5523]: connect from 192.168.2.80 (192.168.2.80)
Feb 5 15:48:04 Tower login[5524]: ROOT LOGIN on '/dev/pts/4' from 'Office'
Feb 5 15:48:24 Tower in.telnetd[5537]: connect from 192.168.2.80 (192.168.2.80)
Feb 5 15:48:26 Tower login[5538]: ROOT LOGIN on '/dev/pts/4' from 'Office'
Feb 5 17:03:04 Tower kernel: BUG: unable to handle kernel paging request at 000000810000007a
Feb 5 17:03:04 Tower kernel: IP: [<ffffffff810a0adb>] isolate_migratepages_range+0x2cb/0x600
Feb 5 17:03:04 Tower kernel: PGD 0
Feb 5 17:03:04 Tower kernel: Oops: 0000 [#1] SMP
Feb 5 17:03:04 Tower kernel: Modules linked in: tun md_mod w83627hf hwmon_vid xen_netback xen_blkback xen_gntalloc xen_gntdev bridge stp llc sata_mv mperf k10temp hwmon shpchp pci_hotplug forcedeth sata_nv pata_amd [last unloaded: md_mod]
Feb 5 17:03:04 Tower kernel: CPU: 5 PID: 3528 Comm: Plex Media Serv Not tainted 3.10.24p-unRAID #13
Feb 5 17:03:04 Tower kernel: Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014 10/22/2009
Feb 5 17:03:04 Tower kernel: task: ffff8801d496e180 ti: ffff8801ce87c000 task.ti: ffff8801ce87c000
Feb 5 17:03:04 Tower kernel: RIP: e030:[<ffffffff810a0adb>] [<ffffffff810a0adb>] isolate_migratepages_range+0x2cb/0x600
Feb 5 17:03:04 Tower kernel: RSP: e02b:ffff8801ce87d7e0 EFLAGS: 00010002
Feb 5 17:03:04 Tower kernel: RAX: 0000008100000002 RBX: ffff8801ce87d8c0 RCX: 0000000000000002
Feb 5 17:03:04 Tower kernel: RDX: 8000000000000000 RSI: 0000000000000004 RDI: 000000000000000c
Feb 5 17:03:04 Tower kernel: RBP: ffff8801ce87d878 R08: ffff880007fffdc0 R09: 0000000000000001
Feb 5 17:03:04 Tower kernel: R10: 0000000000000000 R11: 0000000000000771 R12: 0000000000100e0e
Feb 5 17:03:04 Tower kernel: R13: ffffffff81688ac0 R14: 0000000000000001 R15: ffffea0004038380
Feb 5 17:03:04 Tower kernel: FS: 00007f23eb63b700(0000) GS:ffff880210d40000(0000) knlGS:0000000000000000
Feb 5 17:03:04 Tower kernel: CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 5 17:03:04 Tower kernel: CR2: 000000810000007a CR3: 00000001d2654000 CR4: 0000000000000660
Feb 5 17:03:04 Tower kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 5 17:03:04 Tower kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 5 17:03:04 Tower kernel: Stack:
Feb 5 17:03:04 Tower kernel: ffffffff00000001 ffff8801ce87dfd8 ffff8801d496e180 0000000000101000
Feb 5 17:03:04 Tower kernel: 0000000000000403 0000000000000150 0000000000000000 ffffffff81688f40
Feb 5 17:03:04 Tower kernel: ffff8801ce87d8d0 000000000087d850 ffffea0004032fc0 0000000000000008
Feb 5 17:03:04 Tower kernel: Call Trace:
Feb 5 17:03:04 Tower kernel: [<ffffffff810a10d7>] compact_zone+0x23f/0x2cf
Feb 5 17:03:04 Tower kernel: [<ffffffff810a1200>] compact_zone_order+0x99/0xae
Feb 5 17:03:04 Tower kernel: [<ffffffff810a13dd>] try_to_compact_pages+0x9a/0xe7
Feb 5 17:03:04 Tower kernel: [<ffffffff81494360>] __alloc_pages_direct_compact+0xa4/0x18e
Feb 5 17:03:04 Tower kernel: [<ffffffff810915c9>] __alloc_pages_nodemask+0x4a0/0x858
Feb 5 17:03:04 Tower kernel: [<ffffffff813e2cab>] sk_page_frag_refill+0x74/0x132
Feb 5 17:03:04 Tower kernel: [<ffffffff8141c510>] tcp_sendmsg+0x404/0xb6e
Feb 5 17:03:04 Tower kernel: [<ffffffff8143b601>] inet_sendmsg+0x58/0x91
Feb 5 17:03:04 Tower kernel: [<ffffffff8149ba72>] ? _raw_spin_unlock_irqrestore+0x19/0x1c
Feb 5 17:03:04 Tower kernel: [<ffffffff813dee1d>] sock_sendmsg+0x6d/0x80
Feb 5 17:03:04 Tower kernel: [<ffffffff8149afeb>] ? io_schedule+0xae/0xd1
Feb 5 17:03:04 Tower kernel: [<ffffffff8149976e>] ? __wait_on_bit_lock+0x76/0x85
Feb 5 17:03:04 Tower kernel: [<ffffffff813dfdc8>] ___sys_sendmsg.part.32+0x163/0x1d1
Feb 5 17:03:04 Tower kernel: [<ffffffff811ed6c7>] ? fuse_file_aio_read+0x7f/0x88
Feb 5 17:03:04 Tower kernel: [<ffffffff810cc552>] ? do_sync_read+0x7a/0x9f
Feb 5 17:03:04 Tower kernel: [<ffffffff8149ba1b>] ? _raw_spin_lock+0x9/0xd
Feb 5 17:03:04 Tower kernel: [<ffffffff810dd2c7>] ? dput+0xd3/0x175
Feb 5 17:03:04 Tower kernel: [<ffffffff813e0ab4>] __sys_sendmsg+0x49/0x6a
Feb 5 17:03:04 Tower kernel: [<ffffffff813e0ade>] SyS_sendmsg+0x9/0xb
Feb 5 17:03:04 Tower kernel: [<ffffffff8149c729>] system_call_fastpath+0x16/0x1b
Feb 5 17:03:04 Tower kernel: Code: 85 c0 0f 89 39 02 00 00 49 8b 17 4c 89 f8 80 e6 80 74 04 49 8b 47 30 8b 40 1c ff c8 0f 85 1f 02 00 00 49 8b 47 08 48 85 c0 74 0d <48> 8b 40 78 48 c1 e8 1d 83 e0 01 eb 02 31 c0 85 c0 0f 84 ff 01
Feb 5 17:03:04 Tower kernel: RIP [<ffffffff810a0adb>] isolate_migratepages_range+0x2cb/0x600
Feb 5 17:03:04 Tower kernel: RSP <ffff8801ce87d7e0>
Feb 5 17:03:04 Tower kernel: CR2: 000000810000007a
Feb 5 17:03:04 Tower kernel: ---[ end trace 654e47442e593b34 ]---
limetech (Author) Posted February 6, 2014

Got it to crash again,

Right, I stand corrected: it is possible for cache_dirs to consume your memory. I need to study this further. A viable workaround is probably to set up some swap space.
pyrater Posted February 6, 2014

Not sure if cache_dirs was running during the 3rd crash...... Now running a swap file (plg) and no cache_dirs (deleted from box).
limetech (Author) Posted February 6, 2014

not sure if cache_dirs was running the 3rd crash...... now running swap file (plg) and no cache_dir (deleted from box)

If this runs OK, re-enable cache_dirs and use a swap file. I thought you had posted the cache_dirs command and options you were using somewhere, but I can't find it now. Please repost if you don't mind.
pyrater Posted February 6, 2014

not sure if cache_dirs was running the 3rd crash...... now running swap file (plg) and no cache_dir (deleted from box)

If this runs ok, re-enable cache_dirs and use a swap file. I thought somewhere you posted what the cache_dirs command with options you were using, but I can't find it now, please repost if you don't mind.

Pyrater: the only change to cache_dirs I am using is ulimit -v 30720 (30 MB).

It will work on 6.0 if you comment out the "ulimit -v 5000" line. Use this suggestion at your own risk, there's probably a better solution than completely commenting the line out.

Aha! Interesting how attempting to limit cache_dirs to 5MB of virtual memory causes the errors.

EDIT: It seems setting that value to anything less than 15MB or 15360 causes things to break.

I'll let this run tonight with the swap file and without cache_dirs to ensure we are not chasing the wrong thing. Assuming this is stable, I will then re-enable cache_dirs.

Currently rocking:

Swapfile Plugin local version: v0.5.3
Swap file location and filename: /mnt/cache/swapfile
Swap file size: 4096.06 MB used: 0.03 MB
limetech (Author) Posted February 6, 2014

Ill let this run tonight with the swap and without cache_dirs to ensure we are not chasing the wrong thing. Assuming this is stable i will then re-enable cache_dirs.
hoek Posted February 6, 2014

Maybe not what people want to hear, but just don't use cache_dirs. It's a great concept, but after chasing my tail with OOMs in the past I gave up and just reorganized my media, and I now use disk shares for sources instead of one share for all video. Not one OOM since.

The whole point is to prevent spin-ups. If you use direct disk paths in XBMC, you scan your paths individually as you add media. The only times XBMC will cause a spin-up, apart from actually watching something, are if you force a scan or didn't let it finish building all the artwork from a previous scan.
WeeboTech Posted February 6, 2014

What if you just allocate more memory to the unRAID machine? This is supposed to be 64-bit, with an increased capacity for memory and memory management.

cache_dirs never worked for me; I always had memory issues. Even just rsyncing huge directories in parallel would do this. It depends on how many files you have versus how much low memory is available. I've always had to drop the cache before and after an operation on a huge number of directories. I thought migration to 64-bit would have made this less of an issue.

Look at all the Plex Media Server processes. Those could also be competing for RAM.
spants Posted February 6, 2014

I now tried a completely different USB key which I formatted before, I copied the config directory from the old key and the complete contents of the download but still get the strange screen as soon as the bzimage is loaded, the screen changes to the attached picture

Maybe your download is corrupted? You can check the md5sum of the download zip file against the value on the Downloads page.

Hi Tom, I now checked the md5 from the download I used for my 2 usb keys and the md5 value matches from the download. Anything I can do?

STrange thing I used the same usb key in another PC and there it booted fine without these funny symbols. Could this motherboard have a problem with xen? this is the details from the working one

Motherboard: Gigabyte Technology Co., Ltd. - X48-DQ6
CPU: Intel® Core2 Quad CPU
Speed: 2.133 GHz
Cache: 64 kB, 4096 kB
Memory: 6144 MB (max. 4 GB)
Network: eth0: 1000Mb/s - Full Duplex
Connections: Zero
Uptime:

I had something similar on a server recently... my (very old) video card was on its way out. I only found this out after I had sent the motherboard away and paid a "fault not found" penalty. I replaced the video card and all was good.

Tony
MrLondon Posted February 6, 2014

Spants, the problem is that I am actually using the graphics built into the CPU on this server; it's in the i3 540 chip...
pyrater Posted February 6, 2014

It has been stable for the last 10 hours without cache_dirs, using less than 1400 MB of the 5000 MB of RAM I allocated to it. It is also currently using 318 MB of the 4 GB swap file I set up. I will continue to monitor it, then enable cache_dirs after work tonight and see whether the swap file fixed it.
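Swap usage like the figure reported above can also be read straight from /proc/meminfo instead of the plugin page. A minimal Python sketch, assuming the usual Linux layout of that file ("Key: value kB" per line); the function names are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (in kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_mib(values):
    """Return swap in use, in MiB, from a dict produced by parse_meminfo()."""
    return (values["SwapTotal"] - values["SwapFree"]) / 1024
```

On the server itself, read the text with open("/proc/meminfo").read() and pass it to parse_meminfo(). Logging the result every few minutes while cache_dirs runs would show whether swap keeps growing before a crash.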
coppit Posted February 6, 2014

It's hard for me to believe that any network protocol running between unRaid storage and a domU via a virtual network connection is going to be slower than the speed of a single disk, even an SSD. If you have a raid5 setup with a large stripe size, then sure, but that's not unRaid.

I don't doubt that virtfs is faster, but anecdotally I'm not seeing any speed difference between writing to my SSD cache drive over Samba and writing to it directly. A caveat: I was blown away by the 10x speedup in network writes after switching to SSD, so maybe it's 9x versus 10x and I'm not noticing the difference.

On a related note, I used to have a separate computer writing 2 OTA 1080p MPEG streams to a dedicated disk, and it would have problems. Now I'm running that in a Windows VM that writes to a network mount of an unRaid share (with the cache disk disabled), and it's rock-solid.
MrLondon Posted February 6, 2014

Hi there all, today I installed the beta on my production server as I wanted to test Xen. However, as soon as I select the Xen option it just shows a screen with lots of strange numbers. Unfortunately I don't know how to capture it other than with a picture. I have already booted several times. When I choose the non-Xen option it works fine. Below is the mainboard I am using. I have attached the syslog from the normal boot.

Motherboard: ASUSTeK Computer INC. - P7H55-M LX
CPU: Intel® Core i3 CPU 540 @ 3.07GHz
Speed: 3.066 GHz
Cache: 128 kB, 512 kB, 4096 kB
Memory: 4096 MB (max. 16 GB)
Network: eth0: 1000Mb/s - Full Duplex

My guess is "virtualization technology" is not turned on in your mobo BIOS settings.

Actually the opposite: I had VT-d enabled in the BIOS, but my CPU does not support it. I never had a problem before, but I guess it was never called upon. So now I have enabled just VT-x and disabled VT-d, and it booted!!!
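Whether the CPU advertises VT-x can be checked from a normal (non-Xen) boot by looking for the vmx flag (svm on AMD) in /proc/cpuinfo. A minimal Python sketch follows; the function name is hypothetical. Two caveats: VT-d is a separate I/O feature that does not show up as a CPU flag, so this check says nothing about it, and under Xen the flag is typically hidden from Dom0, so run it from the non-Xen boot option.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in /proc/cpuinfo text.

    "vmx" indicates Intel VT-x and "svm" indicates AMD-V.
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            found.update(flag for flag in value.split() if flag in ("vmx", "svm"))
    return found
```

Feed it open("/proc/cpuinfo").read(). An empty result from a non-Xen boot means either the CPU lacks the feature or it is switched off in the BIOS.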
MrLondon Posted February 6, 2014

OK, so I am now in the webgui, but I cannot create a share for my newly installed cache drive. Under Main it shows the new drive as Cache, but every time I try to create a share via the GUI and click Add Share, it thinks for a few seconds and then goes back to the Add Share screen. I don't get the option to enable NFS/Samba. I had the same problem with 5.05.

syslog_06022014.txt
limetech (Author) Posted February 6, 2014

ok, so I am not in the webgui but I cannot create a share for my newly installed cache drive. Under Main it shows the new drive as Cache but every time I try to create a share via the gui and click add share it thinks for a few sec and it goes back to the add share screen, I don't get the option to enabled NFS/Samba. I had the same problem with 5.05.

system log please
MrLondon Posted February 6, 2014

Seems like I was stupid: I had set too high a minimum limit for a cache share, which might be what kept my 250 GB SSD from working. I have now been able to create a cache_only share.... So onwards with creating the Xen guest.