limetech (Author), Posted May 8, 2012

> Just ran mover again, after sab had downloaded overnight (only thing on my cache drive) and the system hangs again, sys log attached. Any help would be appreciated as I'm having to 'move' everything myself atm.

Looks like you might be running out of memory. How much RAM do you have? Have you set up a swap file?

Rich, Posted May 8, 2012

>> Just ran mover again, after sab had downloaded overnight (only thing on my cache drive) and the system hangs again, sys log attached. Any help would be appreciated as I'm having to 'move' everything myself atm.
>
> Looks like you might be running out of memory. How much RAM do you have? Have you set up a swap file?

I've got 4 gigs of RAM and never had this problem before. Don't know what a swap file is, I'm afraid?

lionelhutz, Posted May 8, 2012

> Hi,
>
>> To get the kernel version that unRAID 5.0 RC2 uses, get it from the official Linux kernel site, Kernel.org.
>>
>>     wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.0.30.tar.bz2
>
> Thank you. I found it in the documentation as well. What I didn't find, however, is how to recreate a bzroot from what has been compiled and tested in Slackware (I'm using Salix 13.37 as Slackware still doesn't answer...). Do you have this documented somewhere?
>
> Thanks.
>
> ehfortin

Discussing how to build a custom kernel DOES NOT belong in a release thread.

lionelhutz, Posted May 8, 2012

> Just ran mover again, after sab had downloaded overnight (only thing on my cache drive) and the system hangs again, sys log attached. Any help would be appreciated as I'm having to 'move' everything myself atm.

Try posting a full syslog. Personally, it looks like you may have a hardware issue. I suggest starting your own thread in the general support forum.

Rich, Posted May 8, 2012

>> Just ran mover again, after sab had downloaded overnight (only thing on my cache drive) and the system hangs again, sys log attached. Any help would be appreciated as I'm having to 'move' everything myself atm.
>
> Try posting a full syslog. Personally, it looks like you may have a hardware issue. I suggest starting your own thread in the general support forum.

Thanks for your reply. The only part of the syslog I didn't post was me logging in via telnet; the system had only been up for 30ish mins. But I will start a new thread if you still think it's not related to rc1/2?

limetech (Author), Posted May 8, 2012

> I don't know whether this provides any clues, but here is a sequence of events on my Ubuntu desktop machine, after I encountered a 'Stale nfs file handle':
>
> While I was browsing around my Photos share, Nautilus became unresponsive (but there was no error message). I used telnet to connect to my unRAID server and performed 'ls -l' on both the Photos directory and the 'user' parent:
>
>     root@Tower:~# ls -l /mnt/user/Photos
>     total 161248
>     drwxrwx--- 1 nobody users 4360 2011-06-20 01:11 100OLYMP1/
>     drwxrwx--- 1 nobody users 496 2003-10-26 15:42 100OLYMP2/
>     drwxrwx--- 1 nobody users 120 2012-01-27 18:00 101029/
>     drwxr-xr-x 1 nobody users 72 2012-01-27 14:00 110324/
>     drwxr-xr-x 1 nobody users 72 2012-01-27 14:00 110830/
>     drwxr-xr-x 1 nobody users 72 2012-01-27 14:00 111004/
>     drwxr-xr-x 1 nobody users 72 2012-01-27 14:00 111007/
>     drwxr-xr-x 1 nobody users 336 2011-10-17 20:07 111007YNL/
>     drwxrwxr-x 1 nobody users 592 2011-10-25 00:29 111022Baking/
>     drwxrwxr-x 1 nobody users 96 2012-01-27 18:00 111106/
>     drwxrwxr-x 1 nobody users 136 2012-01-27 14:00 111107_MYF_CrocPark/
>     drwxrwxr-x 1 nobody users 104 2012-01-27 14:00 111209_Bukidnon/
>     drwxrwxr-x 1 nobody users 136 2012-01-27 18:00 120108_CDO/
>     drwxrwxr-x 1 nobody users 48 2012-05-07 18:20 120507/
>     drwx------ 1 nobody users 720 2012-01-27 14:00 Ai-Ai\ Graduation/
>     drwxr-xr-x 1 nobody users 2224 2012-01-27 14:00 Import/
>     drwxr-xr-x 1 nobody users 240 2012-01-27 14:00 Methodist\ Youth\ Surigao/
>     drwxr-xr-x 1 nobody users 208 2011-06-20 01:11 Ruby\ in\ UK/
>     -rw-r--r-- 1 nobody users 3072000 2012-01-27 16:06 digikam4.db
>     -rw-r--r-- 1 nobody users 161874944 2012-01-27 16:06 thumbnails-digikam.db
>     root@Tower:~# ls -l /mnt/user
>     total 13024031
>     drwxr-xr-x 1 nobody users 48 2011-10-17 19:24 111007YNL/
>     drwxrwx--- 1 nobody users 424 2010-09-08 21:42 Athlon/
>     drwxr-xr-x 1 nobody users 20352 2011-12-08 08:30 Downloaded\ Files/
>     -rw-rw---- 1 nobody users 6542697514 2010-02-05 18:20 LoveStory_DVD.mkv
>     drwxrwx--- 1 nobody users 128 2012-03-25 08:01 Maildir/
>     drwxrwx--- 1 nobody users 6912 2012-05-06 16:14 Movies/
>     drwxrwx--- 1 nobody users 384 2012-04-08 18:00 Music/
>     -rw-rw---- 1 nobody users 2088899096 2010-02-07 01:20 NOTTING\ HILL.mkv
>     -rw-rw---- 1 nobody users 4691562496 2010-06-13 21:05 National\ Treasure\ 2004\ 720p.avi
>     drwxrwxr-x 1 nobody users 504 2011-12-27 08:50 Pete's\ N97/
>     drwxrwxr-x 1 nobody users 72 2012-05-07 18:20 Photos/
>     drwxrwx--- 1 nobody users 296 2011-09-14 08:08 Series/
>     drwxrwx--- 1 nobody users 176 2012-04-08 07:55 Squeeze/
>     drwxr-xr-x 1 logitechmediaserver users 1392 2012-04-08 16:23 Squeeze-7.7.2/
>     drwxr-xr-x 1 root root 480 2012-03-30 12:32 Temporary/
>     drwxrwxr-x 1 nobody users 520 2011-11-12 19:58 UMC/
>     drwxrwx--- 1 nobody users 600 2010-09-04 20:22 UMC2.07/
>     drwxrwx--- 1 nobody users 600 2010-10-16 09:13 UMC2.08.1/
>     drwxrwxrwx 1 nobody users 600 2011-06-07 10:42 UMC2.08.1x/
>     drwxr-xr-x 1 nobody users 496 2011-09-13 08:41 UMC2.10/
>     drwxrwxr-x 1 nobody users 584 2012-02-15 22:54 UMC2.11/
>     drwxrwx--- 1 nobody users 696 2011-06-06 08:34 UMCold/
>     drwxrwx--- 1 nobody users 504 2012-01-02 09:01 Videos/
>     drwxrwxr-x 1 nobody users 168 2011-11-14 10:32 Wii/
>     drwxrwx--- 1 nobody users 576 2011-10-18 13:54 Work/
>     drwxr-xr-x 1 root root 504 2012-04-08 22:45 XFarG7/
>     drwxrwx--- 1 nobody users 72 2010-10-29 02:18 ZA30/
>     drwxrwx--- 1 nobody users 864 2010-10-29 01:21 ZK10/
>     drwxrwx--- 1 nobody users 120 2010-10-25 15:36 ZVideo/
>     drwxrwx--- 1 nobody users 184 2010-10-29 02:14 ZVideo2/
>     drwxrwx--- 1 nobody users 72 2010-10-19 23:43 mediaserver/
>     drwxrwx--- 1 nobody users 112 2011-02-26 13:12 mp3/
>     -rw-r--r-- 1 root root 18020 2012-03-30 12:32 nolimetangere.odt
>     -rw-r--r-- 1 root root 342926 2012-03-30 12:30 output.pdf
>     -rw-r--r-- 1 nobody users 30312 2012-01-11 00:35 pro.odt
>     drwxrwx--- 1 nobody users 80 2011-09-15 07:19 series/
>     root@Tower:~#
>
> I then looked at the same directories from Ubuntu:
>
>     peter@desktop:~$ ls -l /net/tower/mnt/user
>     ls: cannot access /net/tower/mnt/user/Photos: Stale NFS file handle
>     total 0
>     drwxr-xr-x 2 root root 0 May 7 18:19 Movies
>     drwxr-xr-x 2 root root 0 May 7 18:19 Music
>     d??? ? ? ? ? ? Photos
>     drwxr-xr-x 2 root root 0 May 7 18:19 series
>     drwxr-xr-x 2 root root 0 May 7 18:19 Series
>     drwxr-xr-x 2 root root 0 May 7 18:19 UMC
>     drwxr-xr-x 2 root root 0 May 7 18:19 Videos
>     peter@desktop:~$ ls -l /net/tower/mnt/user/Photos
>     ls: cannot access /net/tower/mnt/user/Photos: Stale NFS file handle
>     peter@desktop:~$ sudo umount -f /net/tower/mnt/user/Photos
>     [sudo] password for peter:
>     peter@desktop:~$ ls -l /net/tower/mnt/user
>     total 0
>     drwxr-xr-x 2 root root 0 May 7 18:19 Movies
>     drwxr-xr-x 2 root root 0 May 7 18:19 Music
>     drwxrwxr-x 1 99 users 72 May 7 18:20 Photos
>     drwxr-xr-x 2 root root 0 May 7 18:19 series
>     drwxr-xr-x 2 root root 0 May 7 18:19 Series
>     drwxr-xr-x 2 root root 0 May 7 18:19 UMC
>     drwxr-xr-x 2 root root 0 May 7 18:19 Videos
>     peter@desktop:~$ ls -l /net/tower/mnt/user
>     total 8
>     drwxrwx--- 1 99 users 6912 May 6 16:14 Movies
>     drwxrwx--- 1 99 users 384 Apr 8 18:00 Music
>     drwxrwxr-x 1 99 users 72 May 7 18:20 Photos
>     drwxrwx--- 1 99 users 296 Sep 14 2011 series
>     drwxr-xr-x 2 root root 0 May 7 18:19 Series
>     drwxrwxr-x 1 99 users 520 Nov 12 19:58 UMC
>     drwxrwx--- 1 99 users 504 Jan 2 09:01 Videos
>     peter@desktop:~$
>
> I use autofs to mount nfs shares automatically, hence I don't have to issue the mount command.
>
> Between the last two 'ls -l /net/tower/mnt/user' I had opened the Photos share in Nautilus - note that ownership of most folders has changed from 'root' to '99'.
>
> Here is the line showing details of the 'Photos' share from the output of 'mount' from Ubuntu.
>
>     tower:/mnt/user/Photos on /net/tower/mnt/user/Photos type nfs (rw,nosuid,nodev,vers=3,hard,intr,nolock,udp,sloppy,addr=10.2.0.100)

With version 5, you shouldn't see any files or directories owned by 'root' or in the group 'root'. Did you run the 'New Permissions' utility after upgrading to version 5?

99 is the UID for user 'nobody'.

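A minimal sketch of checking a share for the ownership limetech describes, assuming a shell on the server and a standard `find`. The helper name `list_owned_by` is hypothetical and not part of unRAID; it only lists matching entries and changes nothing.

```shell
# Hypothetical helper (not an unRAID tool): list everything under a
# directory that belongs to the given user OR the given group.
# Arguments: <directory> <user name or UID> <group name or GID>
list_owned_by() {
    find "$1" \( -user "$2" -o -group "$3" \) -print
}

# Example: entries under /mnt/user still owned by root or in group root.
# list_owned_by /mnt/user root root
```

In the quoted server-side listing, Temporary/, XFarG7/, nolimetangere.odt and output.pdf show 'root root', so a check like this would be expected to report at least those entries.
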
limetech (Author), Posted May 8, 2012

> A report on 'spurious spindowns', with my apologies for its length.

Thanks for the detailed report. Yes indeed there's a potential issue when the clock is adjusted in a negative direction. Fixed in -rc3.

ehfortin, Posted May 8, 2012

Hi,

> Discussing how to build a custom kernel DOES NOT belong in a release thread.

You are right. The question was prompted by a remark related to RC2, but I should have moved the topic elsewhere after the initial answer.

Have a nice day.

ehfortin

LinuxGuyGary, Posted May 8, 2012

As I understand it, this means that the rc2 release now supports the BR10i, as the kernel version is back to 3.0.30? Can anyone confirm?

cyrnel, Posted May 8, 2012

After a few days on rc2 I'm getting 25+ load averages. I can still ssh in & see what's up but web access is useless and file access impossibly slow.

A number of unmenu packages are enabled. I'll disable them for the next boot but will wait if anyone wants more info.

Board is Supermicro MBD-X9SCL-F-O w/4GB.
Parity and 5 data drives are on motherboard ports.
4 data drives and cache drive are on AOC-SASLP-MV8.

Syslog attached.

Edit: File access seemed fine last night; we watched one DVD and one BR without noticing anything. Nothing interesting in syslog since the BLK_EH_NOT_HANDLED two days ago. The sas/ata9 activity around May 5, 4:45am looks ugly but I need help from better eyes.

Attachment: syslog.txt

BRiT, Posted May 8, 2012

> As I understand it, this means that the rc2 release now supports the BR10i as the kernel version is back to 3.0.30 ? can anyone confirm?

Yes. The BR10i is an LSI Controller. LSI Controllers work in Beta 12/12a and RC2. This has been confirmed by multiple people with their LSI Controllers.

BRiT, Posted May 8, 2012

> After a few days on rc2 I'm getting 25+ load averages. I can still ssh in & see what's up but web access is useless and file access impossibly slow.

What shows up in your process list or in top?

    ps -ef

or

    top -n 1

cyrnel, Posted May 8, 2012

Should have included that. Didn't see a culprit though python sure has lots of cumulative time.

    Tasks: 115 total,   1 running, 113 sleeping,   0 stopped,   1 zombie
    Cpu(s):  0.3%us,  0.7%sy,  0.0%ni, 85.6%id, 13.1%wa,  0.0%hi,  0.3%si,  0.0%st
    Mem:   4137476k total,  4009972k used,   127504k free,   300088k buffers
    Swap:        0k total,        0k used,        0k free,  3463920k cached

      PID USER  PR  NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
        1 root  20   0  828 304 264 S    0  0.0 0:06.25 init
        2 root  20   0    0   0   0 S    0  0.0 0:00.00 kthreadd
        3 root  20   0    0   0   0 S    0  0.0 0:00.29 ksoftirqd/0
        6 root  RT   0    0   0   0 S    0  0.0 0:00.00 migration/0
        7 root  RT   0    0   0   0 S    0  0.0 0:00.00 migration/1
        9 root  20   0    0   0   0 S    0  0.0 0:00.10 ksoftirqd/1
       11 root  RT   0    0   0   0 S    0  0.0 0:00.00 migration/2
       13 root  20   0    0   0   0 S    0  0.0 0:00.14 ksoftirqd/2
       14 root  RT   0    0   0   0 S    0  0.0 0:00.00 migration/3
       15 root  20   0    0   0   0 S    0  0.0 0:00.32 kworker/3:0
       16 root  20   0    0   0   0 S    0  0.0 0:00.06 ksoftirqd/3
       17 root   0 -20    0   0   0 S    0  0.0 0:00.00 khelper
      153 root  20   0    0   0   0 S    0  0.0 0:20.08 sync_supers
      155 root  20   0    0   0   0 S    0  0.0 0:00.00 bdi-default
      157 root   0 -20    0   0   0 S    0  0.0 0:00.00 kblockd
      272 root   0 -20    0   0   0 S    0  0.0 0:00.00 ata_sff
      282 root  20   0    0   0   0 S    0  0.0 0:00.00 khubd

cyrnel, Posted May 8, 2012

hdparm is the zombie.

    4 Z 0 15256 1291 0 80 0 - 0 exit ? 00:00:00 hdparm <defunct>

Edit 10:36pm: System load over 30 with no disk activity. Powerdown didn't complete: "Can't find sd[a-h]...". Switched off after editing out the package launcher from the go script. Currently running a non-correcting parity check at ~80MB/s.

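High load with an almost idle CPU, as in the top output above (85.6%id, 13.1%wa), is often caused by processes stuck in uninterruptible sleep, since Linux counts those toward the load average. A minimal sketch for spotting them alongside zombies, assuming a procps-style `ps` that accepts `-eo`; the function name `filter_stuck` is made up for illustration.

```shell
# Hypothetical filter: keep the header line plus any process whose
# state (third column) starts with D (uninterruptible sleep) or Z (zombie).
# Expects input in the column order: pid, ppid, stat, comm.
filter_stuck() {
    awk 'NR == 1 || $3 ~ /^[DZ]/'
}

# Example: ps -eo pid,ppid,stat,comm | filter_stuck
```

A zombie such as the hdparm entry holds no resources itself, so any D-state entries are usually the more interesting ones when chasing a hang; the PPID column shows which parent to look at next.
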
Talsit, Posted May 9, 2012

Up and running under ESXi 5.0 update 1, 8 data drives on two M1015's. No issues. Thanks Tom!

pantner, Posted May 9, 2012

Tried to start a parity check before I left for work today. Before I started it I hit spin up so all my drives were ready. When this was done I started the parity check; I then heard disks making odd noises.

Refreshed the page and I saw drives 4-9 were spun down.
Hit refresh again: all drives were spinning.
Hit refresh again: all drives were spun down.

This is the first parity check with RC2. I did do one in RC2-test and it appeared to work fine.

I stopped the parity check after that. Will do a clean boot when I get home, try again, and if it continues to happen I will post my syslog.

jgs2n, Posted May 9, 2012

I am running 5.2 with Lion and get the following error on my Lion machine:

"Something wrong with the volume's CNID DB, using temporary CNID DB instead. Check server messages for details. Switching to read-only mode."

Time Machine eventually seems to work. Is this a netatalk issue? Relevant part of syslog is attached. Thanks.

Attachment: syslog-2012-05-09.txt

mvdzwaan, Posted May 9, 2012

> I am running 5.2 with Lion and get the following error on my Lion machine:
>
> "Something wrong with the volume's CNID DB, using temporary CNID DB instead. Check server messages for details. Switching to read-only mode."
>
> Time Machine eventually seems to work. Is this a netatalk issue? Relevant part of syslog is attached. Thanks.

See http://lime-technology.com/forum/index.php?topic=15657.0

RackIt, Posted May 9, 2012

I rebooted the array, and the web interface and array are offline. My syslog is attached.

Any help would be appreciated.

Thanks.

Attachment: syslog_2012-05-09.txt

MikeL, Posted May 9, 2012

I had to revert back to RC1, as RC2 started acting just like B12 and B13. After a little while the server would become slower, and in no time (like maybe 5 minutes or less) it would become completely unresponsive (both webgui and folders).

The first time, it lasted almost 2 days, but yesterday I had to force a restart, then last night I received about 20 text messages from my kids letting me know they were getting mad because they couldn't watch anything.

When I got home from work this morning I reverted back to RC1, so now it's a wait-and-see game. If this doesn't work, I'll have no choice but to go back another step to B14. (That one is not perfect, but it did run more than 5 months with just small issues and problems.)

jbartlett, Posted May 9, 2012

> As RC2 started acting just like B12, and B13. After a little while the server would become slower, and in no time (Like maybe 5 minutes or less) it would become completely unresponsive. (Both webgui, and folders.)

Were you able to telnet in and run top to see how much free memory you had available when it was in slowdown?

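A minimal sketch of capturing that number without an interactive top session, assuming a Linux /proc/meminfo. It adds MemFree, Buffers and Cached, the same three figures top prints in its header; treat the sum as a rough upper bound, since not all cached memory can be released (files in RAM-backed filesystems are counted there too). The function name `reclaimable_kb` is hypothetical.

```shell
# Hypothetical helper: print MemFree + Buffers + Cached in kB.
# Reads /proc/meminfo by default, or a meminfo-format file given as $1.
reclaimable_kb() {
    awk '$1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { sum += $2 }
         END { print sum + 0 }' "${1:-/proc/meminfo}"
}

# Example: log a timestamped reading once a minute while the slowdown develops.
# while true; do echo "$(date) $(reclaimable_kb) kB"; sleep 60; done
```

With the figures from cyrnel's top output earlier in the thread (127504k free, 300088k buffers, 3463920k cached) the sum comes to 3891512 kB, so on that box most of the 4 GB was cache rather than process memory.
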
limetech (Author), Posted May 9, 2012

> I had to revert back to RC1, as RC2 started acting just like B12 and B13. After a little while the server would become slower, and in no time (like maybe 5 minutes or less) it would become completely unresponsive (both webgui and folders).
>
> The first time, it lasted almost 2 days, but yesterday I had to force a restart, then last night I received about 20 text messages from my kids letting me know they were getting mad because they couldn't watch anything.
>
> When I got home from work this morning I reverted back to RC1, so now it's a wait-and-see game. If this doesn't work, I'll have no choice but to go back another step to B14. (That one is not perfect, but it did run more than 5 months with just small issues and problems.)

I can't help you without at least a system log and without a description of what plugins you might be using. I *never* see this kind of behavior on any test server; in fact, I've never seen this behavior *ever* on any test server. Reporting "it doesn't work" without any kind of information is pointless. I'd say 99% of all problems reported I've *never* seen before, and most of the time, the only way I can fix the issue is to be able to recreate it.

Sorry, I don't mean to be picking on you in particular.

MikeL, Posted May 9, 2012

I understand how important a log can be in tracking down issues. But as I stated, the system was completely unresponsive. (As in frozen!) No telnet, no web page, no folder, nothing!

In hindsight, I should have gotten a log yesterday when I first noticed it was getting very sluggish.

>> I had to revert back to RC1, as RC2 started acting just like B12 and B13. After a little while the server would become slower, and in no time (like maybe 5 minutes or less) it would become completely unresponsive (both webgui and folders).
>>
>> The first time, it lasted almost 2 days, but yesterday I had to force a restart, then last night I received about 20 text messages from my kids letting me know they were getting mad because they couldn't watch anything.
>>
>> When I got home from work this morning I reverted back to RC1, so now it's a wait-and-see game. If this doesn't work, I'll have no choice but to go back another step to B14. (That one is not perfect, but it did run more than 5 months with just small issues and problems.)
>
> I can't help you without at least a system log and without a description of what plugins you might be using. I *never* see this kind of behavior on any test server; in fact, I've never seen this behavior *ever* on any test server. Reporting "it doesn't work" without any kind of information is pointless. I'd say 99% of all problems reported I've *never* seen before, and most of the time, the only way I can fix the issue is to be able to recreate it.
>
> Sorry, I don't mean to be picking on you in particular.

AwesomeAustn, Posted May 9, 2012

>> I had to revert back to RC1, as RC2 started acting just like B12 and B13. After a little while the server would become slower, and in no time (like maybe 5 minutes or less) it would become completely unresponsive (both webgui and folders).
>>
>> The first time, it lasted almost 2 days, but yesterday I had to force a restart, then last night I received about 20 text messages from my kids letting me know they were getting mad because they couldn't watch anything.
>>
>> When I got home from work this morning I reverted back to RC1, so now it's a wait-and-see game. If this doesn't work, I'll have no choice but to go back another step to B14. (That one is not perfect, but it did run more than 5 months with just small issues and problems.)
>
> I can't help you without at least a system log and without a description of what plugins you might be using. I *never* see this kind of behavior on any test server; in fact, I've never seen this behavior *ever* on any test server. Reporting "it doesn't work" without any kind of information is pointless. I'd say 99% of all problems reported I've *never* seen before, and most of the time, the only way I can fix the issue is to be able to recreate it.
>
> Sorry, I don't mean to be picking on you in particular.

Same exact thing happened to me. I am using Simple Features and all of its plugins except the web server. The syslog attached may be the one that's using RC1 as I had to revert back.

Attachment: syslog.txt

mvdzwaan, Posted May 9, 2012

> Same exact thing happened to me. I am using Simple Features and all of its plugins except the web server. The syslog attached may be the one that's using RC1 as I had to revert back.

And what happens if you run a stock unRAID install without Simple Features? Please run a stock configuration for a couple of days so we can isolate the problem.