ich777 Posted March 9, 2017

Hello, I have a strange problem. If I transfer a big file (about 3 GB or more) to Unraid, the web interface, Dockers, VMs and shares stop responding. The file transfer (to a cache drive) copies about 3 GB at full speed, then drops to 0 KB/s. I wait a minute or two and then everything is OK again. The transfer then went up to about 50 MB/s, and after 10 seconds it went down to ~14 KB/s (the web interface is again not responding), and so on. After the transfer is complete I have to wait about 1 minute, and then everything works without problems.

I also tested transferring such a file from within a VM: same behavior, with Docker, VMs and the web interface unresponsive. Even if one of my Dockers does a big file transfer, the whole system becomes unresponsive as described above.

I already checked a few threads but found nothing with the same error. I also downloaded the TipsAndTweaks app and set 'vm.dirty_background_ratio' to '2' and 'vm.dirty_ratio' to '3'. Nothing changed, so I set them back to '10' and '20'.

My server:
Unraid 6.3.2 Pro
ASUS Z9PA-D8
2x Intel Xeon 2670v1
64GB DDR3 1333
850W Corsair power supply
1x Dell PERC H310 (IT mode)
1x DigitalDevices Octopus CI PCIe (passthrough to Windows 10 Pro)
1x DigitalDevices CineV6 DVB-C Tuner PCIe (passthrough to Windows 10 Pro)
1x DigitalDevices CineV6 DVB-C Tuner add-on card (passthrough to Windows 10 Pro)
Array: 6x WD Red of various sizes (one of them for parity)
Cache: 2x SanDisk 120GB + 2x SanDisk 240GB
Unassigned Devices app (1x SanDisk 32GB, 1x WD Blue 500GB)
2 Dockers (1x McMyAdmin, 1x jDownloader2)
4 VMs (always running: 1x Ubuntu 14.04, 1x Windows 10 Pro; running if needed: 2x Windows 7)
NICs: 2x onboard Intel 82574L Gigabit LAN (network is in bonding mode, balance-alb (6); I already turned off bonding to see if it helps, but there was no change, so I turned it back on)

I attached the diagnostics file from my server. Please help, I don't know what else to do. Sorry for my poor English, I hope someone understands what I wrote.

chipsserver-diagnostics-20170309-2025.zip
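For context on the two sysctl values mentioned in the post above: as far as the kernel documentation describes them, vm.dirty_background_ratio is the percentage of available memory at which background writeback starts, and vm.dirty_ratio is the percentage at which a writing process is itself made to flush. A rough sketch of what those percentages could mean on a 64 GB machine follows. It assumes, as a simplification, that all 64 GB count as "available" (the kernel actually uses free plus reclaimable memory, so real thresholds are lower), and the function name is hypothetical.

```python
GIB = 1024 ** 3


def dirty_thresholds(available_bytes, background_ratio, dirty_ratio):
    """Return (background_bytes, hard_limit_bytes) for the given percentages."""
    background = available_bytes * background_ratio // 100
    hard_limit = available_bytes * dirty_ratio // 100
    return background, hard_limit


# Assumption: treat the full 64 GB as available memory (an upper bound).
ram = 64 * GIB

# The two setting pairs tried in the post: the 10/20 values and the 2/3 tweak.
for bg_pct, hard_pct in [(10, 20), (2, 3)]:
    bg, hard = dirty_thresholds(ram, bg_pct, hard_pct)
    print(
        f"{bg_pct}%/{hard_pct}%: background flush near {bg / GIB:.1f} GiB, "
        f"writers throttled near {hard / GIB:.1f} GiB"
    )
```

Under these assumptions the 10/20 settings allow several GiB of unwritten data to accumulate in RAM before writers are throttled, which is at least in the same range as the "about 3 GB at full speed, then a stall" pattern. This is only an illustration of the arithmetic, not a diagnosis.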
trurl Posted March 9, 2017

Maybe unrelated to your stated problem, but it looks like you added a cache pool after you had already set up your cache-prefer shares, and they still have files on the array. I recommend you stop the Docker service and VMs, then run mover to get them moved to cache. You might also install the Fix Common Problems plugin, which would have told you about this and might find other problems you don't know about.

Then, to simplify things, go ahead and try it with one NIC and no Dockers or VMs, and see how the transfer goes.
ich777 Posted March 9, 2017

First, I want to thank you for the quick answer! You are right, I removed the entire cache pool last week and rebuilt it because I upgraded it to SSDs. But everything from the shares with the option 'Prefer' should be on the cache pool. I double-checked it and looked on all disks, and everything is where it belongs.

I downloaded the Fix Common Problems app and it only finds that automatic update is turned off (turned off by me) and that a second registration key was found. I fixed that immediately.

Then I started my testing (the test file is an 8.5 GB img file):
With everything turned off (Docker, VMs): still the same behavior.
With Dockers on: copy speed is OK, but everything else (Dockers and web GUI) is not responding.
With everything on: copy speed is OK and everything responds great, but for the last 300 MB the speed goes down to 15 MB/s, after 10 seconds it goes up to 105 MB/s, it finishes, and then nothing responds for about 1 minute.

I think it has something to do with the memory. If all my Dockers and two VMs are turned on, the memory is almost full and the copy speed is as I described in the first post. (I know that Linux holds a lot in the cache, but is this right?) I don't understand why nothing responds for about a minute. Can copying from system memory to the SSDs make the system this unresponsive?
ich777 Posted March 13, 2017

Is there someone out there who can help me with this?
jonp Posted March 13, 2017

Hi, and sorry to hear you are having difficulty. Have you eliminated the network itself as the problem? Have you tried a new network cable to the unRAID system or your test machine? How about new ports on the router / switch it's attached to? RAM is definitely not the problem here.
ich777 Posted March 14, 2017

Hi, thank you for the reply. Yes, I have ruled out the network as the problem. Before I put Unraid on the server it ran Windows Server 2012 R2 Essentials and never had such problems. I also tried a direct connection between the server and 2 different computers with 2 new Cat6 cables, and got the same result again.

What is strange is that if I go to my VM through the RDP connection and begin the file transfer there (the file was 20 GB because of the much faster copy speed), I experience the same as described above: the copy speed goes down to 0 KB/s, Unraid is unresponsive, the VM doesn't respond, and Docker doesn't work. After ~30 seconds the copy continues and the VM and everything else respond normally. After a few seconds the copy speed drops again and everything is, again, unresponsive.

Maybe it's the Unassigned Devices plugin? Should I try to make a 'New Config' under the Tools section?

Edited March 14, 2017 by ich777
trurl Posted March 14, 2017

2 hours ago, ich777 said: Should i try to make a 'New Config' under the Tools section?

The only thing New Config does is reset your drive assignments.
ich777 Posted March 18, 2017

Hi, I've copied a VM image to another share (on the cache disk) and got the same problem as described above. I also discovered this in the log:

Mar 18 20:23:24 chipsServer kernel: perf: interrupt took too long (2591 > 2500), lowering kernel.perf_event_max_sample_rate to 77000
Mar 18 20:23:47 chipsServer kernel: perf: interrupt took too long (3322 > 3238), lowering kernel.perf_event_max_sample_rate to 60000

What does this mean? Is this part of my problem?

Edit: Another strange thing I discovered is that when I copy files from my cache pool to my unassigned drives, the copy speed is constant. Something must be wrong with the cache disks...

Edited March 18, 2017 by ich777 (attached screenshot)
psparks Posted March 19, 2017

Hi ich777,

After running into the same problem and pulling my hair out for more than a few days with about 100 different configurations, I think I figured out what's going on. It looks like it's a problem with the way the SSDs are made, and how much actual 'fast' (SLC) memory they have vs. the slower (TLC) memory. I've got an OCZ Trion 150 500GB, and it looks like there is ~7GB of SLC memory on there, which moves at around 400-500MB/s. After that 7GB it drops down to HDD speeds of around 70-80MB/s. This is rather pitiful in my opinion. It wasn't a problem until I started downloading Blu-rays via Sonarr/SABnzbd, and my entire system would lock up whenever they were unpacking/moving.

As to why it 'locks up' and makes unRAID unresponsive, I couldn't tell you, but I'm almost certain this is the cause of our troubles. Let me know if that makes sense for you and if you come to a solution.

JonP, any idea why a really slow cache drive would lock things up?

Here is an article from PCWorld about the exact issue: http://www.pcworld.com/article/2947864/storage/ocz-trion-100-review-an-affordable-ssd-with-a-problem.html

Edited March 19, 2017 by psparks (forgot link)
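To put the figures from the post above into numbers, here is a minimal sketch of a two-speed model: the first part of a write lands in the SLC region at the fast rate, and the remainder goes at the slow rate. The 7 GB SLC size and the speed ranges come from the post; using the midpoints (450 MB/s and 75 MB/s), constant speeds, a 20 GB example file, and the function name are all assumptions made for illustration.

```python
def transfer_seconds(file_gb, slc_gb, fast_mb_s, slow_mb_s):
    """Estimate write time when the first slc_gb go fast and the rest go slow."""
    fast_part_gb = min(file_gb, slc_gb)
    slow_part_gb = max(file_gb - slc_gb, 0)
    # 1 GB is taken as 1000 MB here to match drive-vendor units.
    return fast_part_gb * 1000 / fast_mb_s + slow_part_gb * 1000 / slow_mb_s


# Assumed midpoints of the ranges quoted in the post.
SLC_GB, FAST, SLOW = 7, 450, 75

for size_gb in (3, 7, 20):
    secs = transfer_seconds(size_gb, SLC_GB, FAST, SLOW)
    avg = size_gb * 1000 / secs
    print(f"{size_gb} GB file: about {secs:.0f} s, average {avg:.0f} MB/s")
```

Note that a model like this only predicts a slowdown to the slow rate once the fast region is full. It does not by itself predict a drop to 0 KB/s or an unresponsive system, so it leaves the lock-up question open.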
ich777 Posted March 20, 2017

Hi psparks, thanks for the reply. Yeah, I read a similar article about the SanDisk SSDs too, but in that article the write speed (within Windows, connected to a SATA3 port) drops from ~450MB/s to about ~150MB/s when the SSD transfers the data from the SLC to the TLC memory. So why does the speed in my Unraid config drop to 0MB/s for about a minute, with everything completely unresponsive and locked up?

Before I changed my cache pool to SSDs, I had 2x 500GB WD Blue in my pool. Write speed was good, but after 2GB of transfer my copy speed dropped to ~30MB/s and stayed at that level. This is very strange and I don't understand why it is happening.

After some more testing with the cache SSDs, I discovered that the lock-up is not consistent. For example, if I copy a file (~20GB), the write speed drops after 3GB, goes up for about 2GB, drops, and so on. If I wait a few hours and copy the same file again, the write speed is constant for about 6GB, then drops, goes up, copies 4GB, drops, and so on. If I copy the file again without waiting (after everything responds like it should), the speed is constant for about 10GB, then drops and is really slow for the rest of the copy.

Can someone help us out?
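One way to record where in a transfer the speed drops, as described in the post above, is to write a test file in fixed-size chunks and time each chunk after forcing it to disk. The following is a minimal sketch under stated assumptions, not a proper benchmark: the function name, the 64 MiB chunk size, and the use of random data (to avoid any compression effects) are choices made for illustration only.

```python
import os
import time


def timed_write(path, total_bytes, chunk_bytes=64 * 1024 * 1024):
    """Write total_bytes to path in chunks, fsync after each chunk,
    and return the measured throughput of every chunk in MB/s."""
    buf = os.urandom(chunk_bytes)  # random data, reused for every chunk
    rates = []
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            start = time.perf_counter()
            f.write(buf[:n])
            f.flush()
            os.fsync(f.fileno())  # wait until the chunk has reached the device
            elapsed = time.perf_counter() - start
            rates.append(n / 1e6 / elapsed if elapsed > 0 else float("inf"))
            written += n
    return rates
```

For example, calling `timed_write("/mnt/cache/testfile.bin", 20 * 10**9)` would write a 20 GB file and return one rate per 64 MiB chunk; the path is an assumption about where the cache pool is mounted. Because each chunk is synced, the numbers reflect the drive rather than the page cache, so they will not match the speeds shown by a normal network copy.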
daze Posted March 29, 2017

I had a similar issue, and in my case it was a NIC, which doesn't seem to be the case here from what I read. If you have another NIC lying around, install it without bonding and see if it makes any difference.
ich777 Posted May 5, 2017

Okay, it was the two cache drives in my case (2x 120GB SanDisk). I ordered 2x 240GB SanDisk drives and once again got the same behaviour. After that, my last attempt was to put 2x 500GB WD Blue in, and now it works. But can you please look into why unRAID becomes unresponsive with disks like these that have about 10GB of cache on them (like the SanDisks, or the OCZ drives in @psparks' case)? If I put the SSD in a Windows or an Ubuntu machine there is no noticeable speed drop.

Thank you all for the help! I appreciate it!
klhutchins Posted June 6, 2017

I'm running into a similar issue. When I transfer to my cache drive, it is extremely slow, my Docker containers become unresponsive, and the web UI stops responding. The cache drive is a SanDisk SDSSDA120G. I get better speeds when copying to a share on a WD Red.

One thing I noticed was that the drive gets pretty warm, so I bumped the warning to 55°C and the threshold to 60°C. (The manufacturer has a rating up to 70°C.) Unfortunately this hasn't done much. (I'm not even sure whether that only changes when I get notifications, or whether it throttles based on temperature.)

I may need to look into getting a different SSD for my cache.

Edited June 6, 2017 by klhutchins
defiant Posted February 8, 2018

I'm having this same problem when I move files from an unassigned SSD to my BTRFS cache pool (2x512GB 850 Pros). Plex doesn't load, nor does the unRAID GUI.
trurl Posted February 8, 2018

1 minute ago, defiant said: i'm having this same problem when i move files from an unassigned SSD to my BTRFS cache pool (2x512GB 850 Pros). Plex doesn't load, nor does the unraid gui.

Start your own thread.