limetech Posted January 12, 2013

Kernel 3.4.25 just got posted yesterday with this fix:

commit 711cf004f2148f1d1967bc5eca025019e5001212
Author: Sonny Rao
Date: Thu Dec 20 15:05:07 2012 -0800

    mm: fix calculation of dirtyable memory

    commit c8b74c2f6604923de91f8aa6539f8bb934736754 upstream.

    The system uses global_dirtyable_memory() to calculate number of
    dirtyable pages/pages that can be allocated to the page cache. A bug
    causes an underflow thus making the page count look like a big unsigned
    number. This in turn confuses the dirty writeback throttling to
    aggressively write back pages as they become dirty (usually 1 page at a
    time). This generally only affects systems with highmem because the
    underflowed count gets subtracted from the global count of dirtyable
    memory.

    The problem was introduced with v3.2-4896-gab8fabd

    Fix is to ensure we don't get an underflowed total of either highmem or
    global dirtyable memory.

    Signed-off-by: Sonny Rao
    Signed-off-by: Puneet Kumar
    Acked-by: Johannes Weiner
    Tested-by: Damien Wyart
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

I think this might fix the issues you guys are seeing with slow writes. I'm building that kernel now and will be able to post a release for you to try. Might be some time after football games though.
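For anyone wondering what "underflow" means in that commit message, here is a minimal sketch in shell arithmetic. It only mimics 32-bit unsigned subtraction; the function names and page counts are made up for illustration and this is not the kernel's code.

```shell
# Illustration only: mimic a 32-bit unsigned subtraction in shell arithmetic.
# sub_u32 and sub_clamped are hypothetical names, not kernel functions.

# Subtract the way a 32-bit unsigned counter would: a result below zero
# wraps around to a very large number.
sub_u32() {
    echo $(( ($1 - $2) & 0xFFFFFFFF ))
}

# Subtract but never go below zero, which is the general idea the commit
# message describes ("ensure we don't get an underflowed total").
sub_clamped() {
    if [ "$1" -gt "$2" ]; then
        echo $(( $1 - $2 ))
    else
        echo 0
    fi
}
```

With these definitions, `sub_u32 1000 1500` prints 4294966796 (the "big unsigned number"), while `sub_clamped 1000 1500` prints 0.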
moose (Author) Posted January 13, 2013
Thanks Tom! Enjoy the games and we'll test whenever it's ready.
Helmonder Posted January 13, 2013
This is good news! Have fun with the games, and we're ready to test as soon as you are!
RobJ Posted January 14, 2013
Quote: Well, I'm having the same problem! I "upgraded" my C2SEE to an X9SCM-F today, and my speeds plummeted. I added "sysctl vm.highmem_is_dirtyable=1" to my GO script and the speeds shot back up, but only for small files. A file larger than 5GB just craps out at the 5GB mark. I know I'm running a couple of plugins, but I have most of them disabled right now. I literally unplugged my old motherboard (which was working great) and plugged in the new one. I made no changes to the software. Any help would be much appreciated.
Unfortunately, there is not much to learn from your syslog, as it is rather short (booted at 16:22:31, last log entry at 16:26:57). There are no errors apparent, except that the server had previously crashed or been shut down improperly, and was therefore replaying transactions and starting a parity check, which would of course make the system appear very slow until it ends. You cancelled it very quickly, then after a few minutes stopped the cache_dirs process, and there the syslog ends. Perhaps try capturing a syslog after testing some problematic transfers? If this syslog was captured after the slow 5GB transfer, then nothing appeared in the syslog about it.
StevenD Posted January 14, 2013
Thanks. This was immediately after the copy failure.
limetech Posted January 14, 2013
For you guys with this 'slowdown' issue, please try this 'test' release and report back here:
http://download.lime-technology.com/download/unRAID%20Server%205.0-rc10-test%20AiO.zip
md5sum: 58fade2d78b11b77bf0fbba06903f421
If this helps the problem I'll create an 'official' release for it, but for the time being it's not necessary for everyone to download. As with most releases, it's only necessary to copy the files bzimage and bzroot to your flash device and then reboot the server.
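A rough sketch of the "check the md5sum, then copy bzimage and bzroot" step, for those who prefer to do it from a shell. The helper names `verify_md5` and `install_release` are hypothetical, and the local zip filename, extraction directory and the /boot flash mount point in the usage line are assumptions to adjust for your own setup.

```shell
# Hypothetical helpers; nothing here is part of unRAID itself.

# verify_md5 FILE EXPECTED: succeeds only if FILE's md5sum equals EXPECTED.
verify_md5() {
    actual_md5=$(md5sum "$1" | cut -d' ' -f1)
    [ "$actual_md5" = "$2" ]
}

# install_release SRC_DIR FLASH_DIR: copy only the two files named in the
# post (bzimage and bzroot) from the unpacked release to the flash device.
install_release() {
    cp "$1/bzimage" "$1/bzroot" "$2/"
}
```

Usage might look like `verify_md5 rc10-test.zip 58fade2d78b11b77bf0fbba06903f421 && install_release ./rc10-test /boot`, followed by a reboot, assuming the zip was saved as rc10-test.zip, unpacked into ./rc10-test, and the flash device is mounted at /boot.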
jangjong Posted January 14, 2013
Do we test this with "sysctl vm.highmem_is_dirtyable=1" or no?
limetech Posted January 14, 2013
Quote: Do we test this with "sysctl vm.highmem_is_dirtyable=1" or no?
No, remove anything you are doing with this setting.
Helmonder Posted January 14, 2013
I am running this release without the dirtyable parameter. Speeds are below 1 MB/s. After activating the dirtyable parameter, speeds climb immediately, though it appears more slowly than they used to. I'm at 8 MB/s after copying a Blu-ray, still climbing. The next file starts at 14 MB/s and moves up to 20 MB/s, which is comparable to what I used to get in the previous version. The release did not solve anything for me.
limetech Posted January 14, 2013
Well, that's disappointing. As a sanity check, type this command:
uname -a
You should see "3.4.25-unRAID" in there.
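If you would rather script that sanity check than eyeball it, a small sketch follows. It assumes the kernel release is the third whitespace-separated field of the `uname -a` line, which is the usual layout on Linux; `kernel_release_of` and `check_kernel` are made-up helper names.

```shell
# Hypothetical helpers for the uname sanity check.

# kernel_release_of LINE: print the third field of a `uname -a` line,
# which on Linux is the kernel release string.
kernel_release_of() {
    echo "$1" | awk '{ print $3 }'
}

# check_kernel WANTED [LINE]: succeed if the release in LINE equals WANTED.
# LINE defaults to the live output of `uname -a` on this machine.
check_kernel() {
    uname_line=${2:-$(uname -a)}
    [ "$(kernel_release_of "$uname_line")" = "$1" ]
}
```

On the server itself, `check_kernel 3.4.25-unRAID && echo "test kernel is running"` would be one way to use it.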
jangjong Posted January 14, 2013
Same result for me, not seeing any difference, and uname -a shows 3.4.25-unRAID.
Helmonder Posted January 14, 2013
Will do that tonight, am on the road!
Glimmerman911 Posted January 14, 2013
I upgraded to rc10-test: no change to my write speeds, still averaging 12 MB/s or so, whereas rc6-test2 and every earlier version back to 4.x gave 24 MB/s. I am going to swap out my Supermicro controllers for IBM M1015 cards and see if that makes a difference.
StevenD Posted January 14, 2013
Is anybody else's server flat out crashing during a copy operation? I hit 2-5GB and then the server just craters. Most of the time I can telnet in and shut it down gracefully. I just built/upgraded this server this past weekend. I ran an overnight memtest, and everything was functioning correctly before the upgrade. I've disabled everything except simplefeatures and crashplan. The server will run just fine until I try to copy data to it. I've tried RC8 to RC10-test. Thanks.
jangjong Posted January 14, 2013
I have a similar problem. I don't think the server actually crashes, but it sometimes becomes unresponsive during a big file copy and gives me an error that the file transfer cannot be completed. When it cancels, I can access it normally again. This started happening when I upgraded from 4GB RAM to 16GB RAM, I think. Maybe I should switch back to 4GB and test it.
StevenD Posted January 14, 2013
I lose access to the web console... it never comes back. Maybe I should pull one of the 8GB DIMMs.
jangjong Posted January 14, 2013
Actually, if it becomes unresponsive, look at this thread: http://lime-technology.com/forum/index.php?topic=23124.msg212864
Do you get errors like this in your syslog? After I upgraded my RAM to 16GB, mine too became unresponsive when I was copying a big file, and I saw a bunch of errors in the syslog. Setting the min_free_kbytes value to 65536 helped me...
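A small sketch of how one might check the current min_free_kbytes value before raising it. The path /proc/sys/vm/min_free_kbytes is the standard Linux location behind the vm.min_free_kbytes sysctl; the helper name `needs_min_free_bump` is made up, and 65536 is simply the value reported to help above, not a general recommendation.

```shell
# Hypothetical helper; only reads a value, changes nothing.

# needs_min_free_bump WANTED [FILE]: succeed if the value stored in FILE is
# lower than WANTED. FILE defaults to the live kernel setting.
needs_min_free_bump() {
    mfk_file=${2:-/proc/sys/vm/min_free_kbytes}
    mfk_current=$(cat "$mfk_file") || return 2
    [ "$mfk_current" -lt "$1" ]
}
```

For example, `needs_min_free_bump 65536 && sysctl -w vm.min_free_kbytes=65536` would raise the value only when it is currently lower. The change lasts until reboot unless the same sysctl line is added to a startup script.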
StevenD Posted January 14, 2013
Thanks! I'll have a look at this tomorrow. I've had so many hangs/crashes I decided to run a parity check. It's reporting it will be done in 11 hours.
Helmonder Posted January 14, 2013
Quote: Will do that tonight, am on the road !
root@Tower:~# uname -a
Linux Tower 3.4.25-unRAID #1 SMP Sat Jan 12 16:14:59 PST 2013 i686 Intel® Core i3-2120T CPU @ 2.60GHz GenuineIntel GNU/Linux
limetech Posted January 14, 2013
You guys with slow writes... is there anything in your motherboard BIOS that refers to "disk write cache" or "write caching", etc.? This degree of slowdown is exactly what I'd expect if write caching was turned off in the hard drives.
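Separately from the BIOS, the drives' own write-cache setting can usually be queried from the command line with `hdparm -W`. The sketch below only parses that output; it assumes hdparm is installed and prints a line such as "write-caching = 1 (on)", which may differ between hdparm versions. `write_cache_state` is a made-up helper name.

```shell
# Hypothetical helper: reads `hdparm -W /dev/sdX` output on stdin and prints
# "on", "off", or "unknown" depending on the write-caching line it finds.
write_cache_state() {
    awk '/write-caching/ {
             if ($0 ~ /\(on\)/)       state = "on"
             else if ($0 ~ /\(off\)/) state = "off"
         }
         END { print (state == "" ? "unknown" : state) }'
}
```

Something like `hdparm -W /dev/sdb | write_cache_state` would then report the state for one drive, where /dev/sdb is a placeholder for your own device name.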
sheppp Posted January 14, 2013
I can't test (or check) until tomorrow night because I am still attempting to preclear some 4TB drives prior to firing up my initial array on this box with a new X9SCM motherboard. (I RMAed the one that I had problems with last week because of NIC port 1, and because it crashed the LAN side of my entire network a couple of times on the NIC 2 port.) FYI, the NIC 1 port in the new X9SCM still doesn't work. As nice as Supermicro's tech support has been to deal with, I'm close to being done with the whole Supermicro IPMI experience. My other unRAID box, with a non-Supermicro motherboard, works flawlessly. I wish that I could find another decent IPMI motherboard.
jangjong Posted January 14, 2013
Not sure if this helps, but I just downgraded my RAM from 16GB to 4GB and I am getting a solid write speed of 15 - 30 MB/s for files. No crapping out or anything. But I also know my 16GB doesn't have any problem, because it ran memtest for a few days without any errors...
Quote: You guys with slow write... is there anything in your motherboard bios that refers to "disk write cache" or "write caching", etc.? This degree of slow down is exactly what I'd expect if write caching was turned off in the hard drives.
My motherboard doesn't have this option. My board is an ECS H61H2-M2. I don't think anyone here uses this board, lol.
Glimmerman911 Posted January 14, 2013
I swapped out my 2 Supermicro AOC-SAS2LP-MV8 cards for 2 IBM M1015 controller cards and got a small increase in speed, from 12 to 14 MB/s average. Still slower than the 24 MB/s under rc6-test2.
bonienl Posted January 15, 2013
Quote: I lose access to the web console...it never comes back.
When this happens, can you telnet into the box and do "netstat -nope"? Do you see emhttp there with a TCP session in CLOSE_WAIT?
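To avoid scanning the whole netstat listing by eye, a filter along these lines could be used. It assumes the usual `netstat -nope` column order, where the connection state is the sixth field and the PID/program name appears later on the line; `emhttp_close_wait` is a made-up helper name.

```shell
# Hypothetical helper: reads `netstat -nope` output on stdin and prints the
# TCP lines that belong to emhttp and are in CLOSE_WAIT.
# Exit status is 0 if at least one such line was found, 1 otherwise.
emhttp_close_wait() {
    awk '$1 ~ /^tcp/ && $6 == "CLOSE_WAIT" && $0 ~ /\/emhttp/ { print; found = 1 }
         END { exit(found ? 0 : 1) }'
}
```

From a telnet session, `netstat -nope | emhttp_close_wait || echo "no emhttp session in CLOSE_WAIT"` would run the check.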
moose (Author) Posted January 15, 2013
With 5.0-rc-10-test and no "sysctl vm.highmem_is_dirtyable=1" command, I am getting ~ 1 MB/s write speed. I verified that the kernel is 3.4.25; the output of the "netstat -nope" command (during a slow write) is below. No emhttp with a TCP session in CLOSE_WAIT.

Moose login: root
Password:
Linux 3.4.25-unRAID.
root@Moose:~# uname -a
Linux Moose 3.4.25-unRAID #1 SMP Sat Jan 12 16:14:59 PST 2013 i686 Intel(R) Xeon(R) CPU E31270 @ 3.40GHz GenuineIntel GNU/Linux
root@Moose:~# netstat -nope
Active Internet connections (w/o servers)
Proto Recv-Q  Send-Q Local Address     Foreign Address      State       User Inode PID/Program name    Timer
tcp   2426412 0      192.168.0.31:445  192.168.0.100:50374  ESTABLISHED 0    9464  2174/smbd           keepalive (6692.89/0/0)
tcp   0       172    192.168.0.31:23   192.168.0.100:50424  ESTABLISHED 0    9468  2214/in.telnetd: 19 on (0.47/0/0)
Active UNIX domain sockets (w/o servers)
Proto RefCnt Flags Type    State      I-Node PID/Program name  Path
unix  9      [ ]   DGRAM              6207   1043/syslogd      /dev/log
unix  2      [ ]   DGRAM              4124   788/udevd         @/org/kernel/udev/udevd
unix  2      [ ]   DGRAM              10450  1200/emhttp
unix  2      [ ]   DGRAM              9313   1192/crond
unix  3      [ ]   STREAM  CONNECTED  7278   1187/dbus-daemon
unix  3      [ ]   STREAM  CONNECTED  7277   1187/dbus-daemon
unix  2      [ ]   DGRAM              10418  1177/acpid
unix  2      [ ]   DGRAM              175    1170/ntpd
unix  2      [ ]   DGRAM              7257   1153/rpc.statd
unix  2      [ ]   DGRAM              6213   1149/rpc.portmap
unix  2      [ ]   DGRAM              7244   1047/klogd
unix  3      [ ]   DGRAM              4129   788/udevd
unix  3      [ ]   DGRAM              4128   788/udevd
root@Moose:~#

Tom, please see this link for all X9SCM-F (v2.0b) BIOS settings:
I don't see any option for "disk write cache" or "write caching" in the BIOS.