downloadski Posted October 2, 2012

I have some strange issues with my server and I hope someone can help me. When I move files from my PC to the unRAID server (cache drive), it runs at 70-80 MB/s for large files (Blu-ray 1:1 rips), and that works for 200-300 GB. At some point, always in the middle of a large file (10-20 GB), it slows down to 2-10 MB/s. If I leave it like this, it sometimes gets back to normal speed. Sometimes, if I stop the copy from the PC and just restart it, it runs at full speed again. After a fresh startup of the unRAID server it always runs well. I suspected my source PC as well, but last night I let it download so the PC was kept quite busy and did not go to sleep. I am running 5.0-rc6-r8168-test. Updated to rc8a, same issue. My hardware is in my sig. I can see nothing strange when I look with the top command, and I find nothing in the syslog. Does anyone have an idea what could be happening here?

Attachment: syslog_slowdown.txt.txt
downloadski (Author) Posted October 2, 2012

The move from the cache disk to the array also slows down, I think. Normally a 900 GB copy to the array runs during the night (about 9 hours). This one started at 08:38, is still running at 20:36, and still has 340 GB to go. Is there an add-on to see the write speed from the cache to the array, other than looking at the main screen and calculating it yourself? My other server has a red-balled parity drive, and it seems to try to delete all the files moved in the current mover run from the array, but the array is in read-only mode (presumably because of the missing parity). This does not seem a very smart process: it goes over all the disks to try this, which only wastes time.
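Editorial aside: until an add-on is in place, the average mover speed can be worked out from the figures in the post above. This is a minimal Python sketch, not an unRAID tool. It assumes decimal units (1 GB = 1000 MB) and, because the post does not give the total size of the slow run, it assumes that run was also a 900 GB job. `avg_mb_per_s` is a hypothetical helper name.

```python
from datetime import datetime


def avg_mb_per_s(gigabytes: float, seconds: float) -> float:
    """Average throughput in MB/s, assuming 1 GB = 1000 MB."""
    return gigabytes * 1000 / seconds


# Normal night run from the post: about 900 GB in about 9 hours.
normal = avg_mb_per_s(900, 9 * 3600)

# Slow run: started 08:38, still running at 20:36, 340 GB left.
# ASSUMPTION: the total job was also 900 GB; the post does not say so.
elapsed = (
    datetime.strptime("20:36", "%H:%M") - datetime.strptime("08:38", "%H:%M")
).total_seconds()
slow = avg_mb_per_s(900 - 340, elapsed)

print(f"normal run: {normal:.1f} MB/s")
print(f"slow run:   {slow:.1f} MB/s")
```

Under those assumptions the normal run averages roughly 28 MB/s and the slow run roughly 13 MB/s, so the slow run is at about half the usual mover speed.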
dgaschk Posted October 2, 2012

unMenu has performance monitoring.
downloadski (Author) Posted October 4, 2012

I found that. During a slow period I captured the display of the network monitor. I also looked with the command "top" and saw the following:

Cpu(s): 7.1%us, 0.3%sy, 0.0%ni, 90.0%id, 2.6%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16608292k total, 354884k used, 16253408k free, 384k buffers
Swap: 0k total, 0k used, 0k free, 300304k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2628 root 20 0 13000 1512 1168 S 1 0.0 0:03.97 emhttp
4481 root 20 0 3452 1124 936 D 1 0.0 0:00.02 smartctl
4482 root 20 0 3452 1128 936 D 1 0.0 0:00.02 smartctl
4483 root 20 0 3452 1124 936 D 1 0.0 0:00.02 smartctl
4484 root 20 0 3452 1132 936 D 1 0.0 0:00.02 smartctl
4485 root 20 0 3452 1128 936 D 1 0.0 0:00.02 smartctl
4491 root 20 0 0 0 0 Z 1 0.0 0:00.02 smartctl <defunct>
4492 root 20 0 0 0 0 Z 1 0.0 0:00.02 smartctl <defunct>
4493 root 20 0 0 0 0 Z 1 0.0 0:00.02 smartctl <defunct>
4494 root 20 0 3452 1128 936 D 1 0.0 0:00.02 smartctl
4495 root 20 0 3452 1132 936 D 1 0.0 0:00.02 smartctl
4496 root 20 0 3452 1128 936 D 1 0.0 0:00.02 smartctl
4497 root 20 0 3452 1132 936 D 1 0.0 0:00.02 smartctl
4498 root 20 0 3452 1132 936 D 1 0.0 0:00.02 smartctl
4499 root 20 0 3452 1128 936 D 1 0.0 0:00.02 smartctl
1027 root 20 0 0 0 0 S 0 0.0 0:00.10 kworker/1:2
1251 root 20 0 1908 612 524 S 0 0.0 0:00.05 syslogd
2914 root 39 19 0 0 0 S 0 0.0 1:14.19 kipmi0
4486 root 20 0 0 0 0 Z 0 0.0 0:00.01 smartctl <defunct>
4487 root 20 0 0 0 0 Z 0 0.0 0:00.01 smartctl <defunct>
4488 root 20 0 0 0 0 Z 0 0.0 0:00.01 smartctl <defunct>
4489 root 20 0 0 0 0 Z 0 0.0 0:00.01 smartctl <defunct>
4490 root 20 0 0 0 0 Z 0 0.0 0:00.01 smartctl <defunct>
31441 root 20 0 6840 1920 1484 S 0 0.0 0:00.01 in.telnetd
1 root 20 0 828 280 240 S 0 0.0 0:05.46 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0 0.0 0:00.03 ksoftirqd/0
6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0
7 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1
9 root 20 0 0 0 0 S 0 0.0 0:00.02 ksoftirqd/1
11 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/2
13 root 20 0 0 0 0 S 0 0.0 0:00.03 ksoftirqd/2
14 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/3

Is the system running smartctl on all the HDDs, and does that slow things down?
downloadski (Author) Posted October 4, 2012

And after a reboot of the server:

top - 20:23:08 up 7 min, 1 user, load average: 0.72, 0.63, 0.33
Tasks: 137 total, 2 running, 135 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.9%us, 4.8%sy, 0.0%ni, 87.6%id, 3.2%wa, 0.0%hi, 1.6%si, 0.0%st
Mem: 16608292k total, 13997396k used, 2610896k free, 56960k buffers
Swap: 0k total, 0k used, 0k free, 13665376k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6287 dune 20 0 16304 3980 3012 R 22 0.0 0:35.52 smbd
3404 root 20 0 29984 1436 560 S 22 0.0 0:36.83 shfs
504 root 20 0 0 0 0 S 6 0.0 0:01.09 kswapd0
3364 root 20 0 0 0 0 S 2 0.0 0:03.49 flush-65:32
1000 root 20 0 0 0 0 S 1 0.0 0:00.04 kworker/0:2
2604 root 20 0 11900 1492 1168 S 0 0.0 0:00.56 emhttp
2890 root 39 19 0 0 0 S 0 0.0 0:02.39 kipmi0
1 root 20 0 828 280 240 S 0 0.0 0:05.40 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/0
4 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/0:0
6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0
7 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1
8 root 20 0 0 0 0 S 0 0.0 0:00.03 kworker/1:0
9 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/1
10 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/0:1
11 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/2
13 root 20 0 0 0 0 S 0 0.0 0:00.01 ksoftirqd/2
14 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/3
16 root 20 0 0 0 0 S 0 0.0 0:00.01 ksoftirqd/3
17 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/4
18 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/4:0
19 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/4
20 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/5
21 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/5:0
22 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/5
23 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/6
24 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/6:0
25 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/6
26 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/7
28 root 20 0 0 0 0 S 0 0.0 0:00.00 ksoftirqd/7
29 root 0 -20 0 0 0 S 0 0.0 0:00.00 khelper
177 root 20 0 0 0 0 S 0 0.0 0:00.36 sync_supers
179 root 20 0 0 0 0 S 0 0.0 0:00.00 bdi-default
181 root 0 -20 0 0 0 S 0 0.0 0:00.00 kblockd
341 root 0 -20 0 0 0 S 0 0.0 0:00.00 ata_sff
351 root 20 0 0 0 0 S 0 0.0 0:00.00 khubd
464 root 0 -20 0 0 0 S 0 0.0 0:00.00 rpciod
566 root 20 0 0 0 0 S 0 0.0 0:00.00 fsnotify_mark
586 root 0 -20 0 0 0 S 0 0.0 0:00.00 nfsiod
589 root 0 -20 0 0 0 S 0 0.0 0:00.00 cifsiod
595 root 0 -20 0 0 0 S 0 0.0 0:00.00 crypto
652 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/5:1
693 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/7:1
694 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/6:1
695 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/4:1
696 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/3:1
697 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/2:1
774 root 0 -20 0 0 0 S 0 0.0 0:00.00 deferwq
802 root 16 -4 2348 964 496 S 0 0.0 0:00.05 udevd
910 root 20 0 0 0 0 S 0 0.0 0:00.00 scsi_eh_0
911 root 20 0 0 0 0 S 0 0.0 0:00.00 scsi_eh_1
Joe L. Posted October 4, 2012

Quoting downloadski's "top" listing from the slow period, and the question: "Is the system running smartcontrol on all the hdd and does that slow down things?"

Doesn't the "D" in the process-state column indicate uninterruptible sleep? "Z" = defunct (the process has ended, but its parent has not yet waited for its exit status).
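Editorial aside: the state column Joe L. refers to can be tallied with a short script rather than read by eye. This is a minimal Python sketch that assumes the classic 12-column `top` row layout shown in this thread (state in the 8th field, command in the 12th). The sample rows are a handful copied from the listing, not the full output, and `count_states` is a hypothetical helper name.

```python
from collections import Counter
from typing import Optional

# A few rows copied from the "top" listing in this thread (not the full output).
TOP_ROWS = """\
2628 root 20 0 13000 1512 1168 S 1 0.0 0:03.97 emhttp
4481 root 20 0 3452 1124 936 D 1 0.0 0:00.02 smartctl
4491 root 20 0 0 0 0 Z 1 0.0 0:00.02 smartctl <defunct>
4494 root 20 0 3452 1128 936 D 1 0.0 0:00.02 smartctl
1251 root 20 0 1908 612 524 S 0 0.0 0:00.05 syslogd
"""

# Meaning of the process-state letters that appear in the thread.
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (usually waiting on I/O)",
    "Z": "zombie / defunct",
}


def count_states(rows: str, command: Optional[str] = None) -> Counter:
    """Count process states, optionally for one command name only."""
    counts: Counter = Counter()
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) < 12:
            continue  # skip headers, blank lines, or truncated rows
        state, cmd = fields[7], fields[11]
        if command is None or cmd == command:
            counts[state] += 1
    return counts


for state, n in sorted(count_states(TOP_ROWS, "smartctl").items()):
    print(f"{n} x {state}: {STATE_NAMES.get(state, 'other')}")
```

Applied to the full listing, the same count shows how many smartctl processes are blocked in "D" versus already finished as "Z" at that moment, which is the distinction Joe L. is drawing.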
downloadski (Author) Posted October 4, 2012

I guess checking the mainboard options for power saving etc. might be wise. Edit: I disabled ACPI as far as I could and set the HDD spin-down to 4 hours in the unRAID GUI. Will see how this copy run goes.
downloadski (Author) Posted October 5, 2012

I doubt the problem is my source PC. Copying to my Atom-based server runs at 40-48 MB/s most of the time, and never drops below 10 MB/s as it does with the other server.
downloadski (Author) Posted October 14, 2012

I noticed that after a mover run (450 GB in 15 hours = dead slow) the memory is almost completely allocated (15+ GB in the graph of the web GUI). Some hours after the mover run is done, all the memory still seems to be in use. When I then start moving data to the cache drive, the speed is bad, around 15-20 MB/s. After a server reset, free memory is 15+ GB again; I start a copy and get 70+ MB/s. I then see the memory being allocated, for the copy I presume. Could this be the problem?
dgaschk Posted October 14, 2012

Quoting downloadski: "I noticed that after a mover run (450 GB in 15 hours = dead slow) the memory is almost completely allocated (15+ GB in the graph of the webgui. [...] Could this be the problem ?"

What is the value after subtracting "cached"?
downloadski (Author) Posted October 15, 2012

I ordered an i3-2100T CPU and will try it with that. The Xeon E3-1265L V2 puts the board in PCI-E3 mode, and Supermicro advised not to use the E3-12x5 series CPUs. Perhaps this is an underlying cause of lots of my problems.

Quoting dgaschk: "What is the value after subtracting "cached"?"

Do you mean the normal used memory without the cached amount? I think that is below 1 GB. Is this shown with: free -m
downloadski (Author) Posted October 15, 2012

OK, after a boot:

root@Tower2:~# free -m
                    total  used   free   shared  buffers  cached
Mem:                16219  467    15752  0       7        294
-/+ buffers/cache:         165    16053
Swap:               0      0      0

During a copy:

root@Tower2:~# free -m
                    total  used   free   shared  buffers  cached
Mem:                16219  14504  1714   0       63       14163
-/+ buffers/cache:         277    15941
Swap:               0      0      0

During the copy the GUI also becomes unresponsive; once the copy is done it is back.

After the copy:

root@Tower2:~# free -m
                    total  used   free   shared  buffers  cached
Mem:                16219  14442  1776   0       63       14097
-/+ buffers/cache:         281    15937
Swap:               0      0      0

When I then wait a few minutes and start the next copy job, it runs slower (48-70 MB/s for the first job, now 20-35 MB/s):

root@Tower2:~# free -m
                    total  used   free   shared  buffers  cached
Mem:                16219  13576  2642   0       90       13173
-/+ buffers/cache:         312    15906
Swap:               0      0      0

And now the PC slows down (5-20 MB/s):

root@Tower2:~# free -m
                    total  used   free   shared  buffers  cached
Mem:                16219  15682  536    0       55       15332
-/+ buffers/cache:         294    15924
Swap:               0      0      0

I canceled the last copy, as 5 MB/s is wasting my time.

root@Tower2:~# free -m
                    total  used   free   shared  buffers  cached
Mem:                16219  345    15873  0       0        294
-/+ buffers/cache:         51     16167
Swap:               0      0      0

But now I see that a parity check has started. Perhaps it is scheduled for today, but I ran a full check 2 days ago, so this strikes me as odd. Perhaps I pressed it by accident because of the slowed-down GUI? (I guess not, but I never rule out making stupid mistakes myself.)

I attached a screen print of the GUI after a copy job. During the copy, the graph looks like a saw (small peaks), so it seems to free some memory after a while and then retake it. When the copy is done the CPU use drops, but the memory stays in use. Is this because no other process claims it, so it stays reserved for the copy process?
dgaschk Posted October 17, 2012

At most 312M is being used:

-/+ buffers/cache: 312 15906
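Editorial aside: the figure dgaschk points to is simply "used" minus "buffers" minus "cached" from the `Mem:` row. This is a minimal Python sketch of that subtraction using `Mem:` rows copied from the `free -m` output posted in this thread. It assumes the older `free` column order shown there (total, used, free, shared, buffers, cached), and `app_used_mb` is a hypothetical helper name.

```python
def app_used_mb(mem_line: str) -> int:
    """Return used - buffers - cached (in MB) from a 'Mem:' row of `free -m`.

    ASSUMPTION: older column order, as in this thread:
    Mem: total used free shared buffers cached
    """
    fields = mem_line.split()
    total, used, free, shared, buffers, cached = map(int, fields[1:7])
    return used - buffers - cached


# 'Mem:' rows copied from the posts in this thread.
samples = {
    "after boot": "Mem: 16219 467 15752 0 7 294",
    "second copy, slower": "Mem: 16219 13576 2642 0 90 13173",
    "slowest point": "Mem: 16219 15682 536 0 55 15332",
}

for label, line in samples.items():
    print(f"{label}: about {app_used_mb(line)} MB used by processes")
```

The results can differ by a megabyte from the `-/+ buffers/cache` row, presumably because `free -m` rounds each column separately. Either way, the memory actually held by processes stays in the low hundreds of MB; the rest of the "used" figure is page cache.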
downloadski (Author) Posted October 19, 2012

I placed the i3-2100T in the mainboard and moved 400 GB to the cache drive. I saw some fluctuations, but those were due to what the PC was doing. No slowdown this time. The mover run went fine, 400 GB in about 4 hours. So the warning from Supermicro not to use an E3-12x5 series Xeon in this mainboard seems valid. The 1265L V2 did work, but these slowdowns drove me mad. I will do some more tests, but it seems I made an expensive mistake here. Perhaps I will look for a mainboard that supports it and update the Atom server with it. Or just sell it.