joshstrange Posted August 6, 2017

I've had this issue for a while and I'm about at my wits' end. I've tried a ton of different things and the problem persists. I first saw issues as my array started to fill up: when NZBGet unpacked a download, the load average would spike as a ton of processes entered state "D", uninterruptible sleep (see picture). My CPU is sitting there doing nothing and memory is fine, but the load average will climb to 20, 30, 40+ and the web UI becomes unresponsive, "docker ps" won't complete, and all docker containers and VMs just lock up and won't do anything. Looking at the docker logs for the containers I see messages about "database locked", "query took over 1 seconds", etc. (sqlite DBs). I've tried limiting NZBGet's resource allotment, but it appears it was just the canary, as the system can lock up even without it running.

I have 10x5TB drives in the array and 1x256GB SSD cache drive. The array drives were all sitting at about 95% full, but I just deleted about 1TB of content to get all the drives under 90%, as my previous hunch was that nearly full drives were causing slow reads/writes and that clearing out some space would alleviate the problem. It hasn't. I've done this (or moved data) before and I thought it helped fix the issue, but it may have just been my imagination.

Pretty much my entire setup is automated, so I don't pay much attention to it unless my VM locks up or one of my docker containers stops responding. I will do ANYTHING to try to fix this as it's becoming a time sink. I have 3 unraid boxes at the highest paid tier, but this is the only box I run multiple docker containers on, as it's my "powerhouse", while the other two are glorified NFS servers that are mounted on the main box for Plex to serve up.
Here are some stats and I've also attached the diagnostic logs:

Model: Custom
M/B: ASRock - Z97 Extreme6
CPU: Intel® Core™ i7-4770K CPU @ 3.50GHz
HVM: Enabled
IOMMU: Disabled
Cache: 256 kB, 1024 kB, 8192 kB
Memory: 32 GB (max. installable capacity 32 GB)
Network: eth0: 1000 Mb/s, full duplex, mtu 1500
         eth1: not connected
Kernel: Linux 4.9.10-unRAID x86_64
OpenSSL: 1.0.2k

Attachment: tower-diagnostics-20170806-1315.zip
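The symptom described in the post above (idle CPU, high load average, many processes in state "D") can be checked without the web UI by reading `/proc` directly. The following is only a minimal sketch, assuming Python 3 is available on the server; the function names `parse_stat_line` and `uninterruptible_processes` are made up for this illustration and are not part of any Unraid tooling.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs here (for example, not a Linux host)
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were looking at it
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling `uninterruptible_processes()` a few times during a lock-up shows which commands are stuck waiting on I/O, which can help tell a single misbehaving container apart from a system-wide storage stall.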
Squid Posted August 6, 2017

First off, hopefully you live in China and have multiple computers in your household, with various IPs from different providers, connecting to your server. If not, YOU ARE BEING HACKED. Remove your server from the DMZ.
joshstrange Posted August 6, 2017

I assume you are referencing this:

Aug 6 13:31:14 Tower sshd[7408]: Received disconnect from 116.31.116.43 port 35126:11: [preauth]
Aug 6 13:31:14 Tower sshd[7408]: Disconnected from 116.31.116.43 port 35126 [preauth]
Aug 6 13:32:24 Tower sshd[8082]: Received disconnect from 116.31.116.43 port 59440:11: [preauth]
Aug 6 13:32:24 Tower sshd[8082]: Disconnected from 116.31.116.43 port 59440 [preauth]
Aug 6 13:32:54 Tower sshd[6366]: Received disconnect from 96.29.187.74 port 40130:11: disconnected by user
Aug 6 13:32:54 Tower sshd[6366]: Disconnected from 96.29.187.74 port 40130
Aug 6 13:32:55 Tower sshd[6275]: Received disconnect from 96.29.187.74 port 33723:11: disconnected by user
Aug 6 13:32:55 Tower sshd[6275]: Disconnected from 96.29.187.74 port 33723
Aug 6 13:33:34 Tower sshd[10060]: Received disconnect from 116.31.116.43 port 25303:11: [preauth]
Aug 6 13:33:34 Tower sshd[10060]: Disconnected from 116.31.116.43 port 25303 [preauth]
Aug 6 13:34:42 Tower sshd[11499]: Received disconnect from 59.45.175.11 port 50098:11: [preauth]

I have pubkey-only auth enabled, but yes, this is annoying as hell in my syslogs. A while back I looked into something like fail2ban for unraid but didn't find anything.

Edit: Also, my server is not in a DMZ; just 22, 80, and 443 are exposed through the router (which yes, you could argue is close enough to being in a DMZ).

Edit 2: I just set DenyHosts up again. I had it before but ran into some issue with it (I can't remember what the problem was). Hopefully that will block the attacker.

Edited August 6, 2017 by joshstrange: Added info
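To get a rough picture of how many pre-authentication disconnects each remote address is generating, syslog lines like the ones quoted above can be tallied per IP. This is a hedged sketch only: the regular expression is written against the exact line shape shown in this thread, other sshd versions may log differently, and the function name `count_preauth_disconnects` is invented for the example.

```python
import re
from collections import Counter

# Matches lines of the shape quoted in the post, e.g.
# "... sshd[7408]: Received disconnect from 116.31.116.43 port 35126:11: [preauth]"
PREAUTH_RE = re.compile(
    r"sshd\[\d+\]: Received disconnect from (\S+) port \d+:\d+: \[preauth\]"
)


def count_preauth_disconnects(lines):
    """Count pre-auth disconnects per remote address in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PREAUTH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open syslog file to `count_preauth_disconnects` and looking at `most_common()` gives the noisiest addresses first. Disconnects marked "disconnected by user" after a successful login are deliberately not counted.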
joshstrange Posted August 8, 2017

Running the mover will also cause my array to lock up. It's like writing to my array is extremely slow. Is there any utility I can use to find the culprit of the slowness?
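One way to approach the question above is to compare two snapshots of `/proc/diskstats` and see which device is busy and how long its writes take on average. The sketch below assumes Python 3 and the standard Linux diskstats layout (device name in the third column, followed by the read, write, and I/O-time counters); the helper names `parse_diskstats`, `compare`, and `sample` are hypothetical and exist only for this illustration.

```python
import time


def parse_diskstats(text):
    """Parse /proc/diskstats text into {device: counters}."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # not a full stats line
        stats[f[2]] = {
            "writes": int(f[7]),       # writes completed
            "ms_writing": int(f[10]),  # total ms spent on writes
            "ms_busy": int(f[12]),     # total ms the device had I/O in flight
        }
    return stats


def compare(before, after, interval_s):
    """Return {device: (busy_percent, avg_write_ms)} between two snapshots."""
    report = {}
    for name, new in after.items():
        old = before.get(name)
        if old is None:
            continue  # device appeared between snapshots
        writes = new["writes"] - old["writes"]
        ms_writing = new["ms_writing"] - old["ms_writing"]
        busy_pct = 100.0 * (new["ms_busy"] - old["ms_busy"]) / (interval_s * 1000.0)
        avg_write_ms = ms_writing / writes if writes else 0.0
        report[name] = (busy_pct, avg_write_ms)
    return report


def sample(interval_s=5.0, path="/proc/diskstats"):
    """Take two live snapshots interval_s apart and compare them."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return compare(before, after, interval_s)
```

Running `sample()` while the mover or an unpack is active, then sorting the result by busy percentage, should point at the disk (or disks) that stay close to 100% busy with long average write times. A single drive standing out from the rest would be worth a closer look.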
trurl Posted August 8, 2017

Have you checked whether you have actually stopped the hack attempts? Why do you need those ports exposed to the internet anyway?
joshstrange Posted August 8, 2017

@trurl I have DenyHosts set up and I have blocked the attempts. I need those ports open for the following reasons:

22: This is my jump server into my home network; it is key-only login
80: To redirect to 443
443: To serve up my SSL apps running in docker containers

That said, this is slightly divergent from my main issue, which is that writing to the array takes a long time, causes IO blocking, and causes the load average to jump to 40+.