timekiller Posted February 20, 2019

I am almost done setting up my first Unraid server and have a problem. As the title says, I'm getting `fork: retry: Resource temporarily unavailable` EVERYWHERE.

A little background. High level:

- i7-6700K @ 4GHz
- 32GB RAM
- 13 drives totaling 42TB usable, 1TB SSD cache
- Bunch of Docker containers, no running VMs (yet)

So that's the setup. Now a little history: I am migrating from a home-built storage server that was running on hardware RAID 6. I cobbled together enough old drives on the Unraid server to move my data. I set up the new server and everything was looking good. I rsync'd my data from the old server (about 19TB) without issue. Once the rsync was done, I decommissioned the old server and pulled its drives for the new server.

Here is where the problems started. While preclearing the new drives I started seeing `fork: retry: Resource temporarily unavailable` when trying to tab-complete in the terminal. Then I started seeing issues with my Docker containers; Unifi-Video especially would be stopped every night. I'd look at the container's log and see the same fork error, and I'm getting a ton of emails from cron with the subject `cron for user root /etc/rc.d/rc.diskinfo --daemon &> /dev/null` failing the same way. I got through the preclear and added the new drives, but then I had more data to sync back. I added a 10TB drive and mounted it with Unassigned Devices so I could rsync the data directly. The fork error got WAY out of control, to the point that no Docker containers are working.
I did some googling and found the error means I'm hitting a resource limit (obvious) and that I should look at `ulimit -a` to see what my limits are:

    root@Storage:~# ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 127826
    max locked memory       (kbytes, -l) unlimited
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 40960
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) unlimited
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 127826
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

I suspected open files was the issue and doubled (then doubled again) the number. I also increased max user processes and tried to increase pipe size, but it wouldn't let me. I thought this helped, but now I'm having the same issues, and I'm not sure whether it actually helped or I was just looking during a lull in the problem.

The problem seems to be tied to high disk I/O. I've seen the issue during preclear, during disk-to-disk sync, and during a parity rebuild (I rearranged my drives after installing the new ones). It COULD be that this will all go away when I'm done with the massive data moves, but I want to know why it's happening and fix it now. Any help is appreciated.
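For what it's worth, `fork: retry: Resource temporarily unavailable` is `fork()` failing with `EAGAIN`, which on Linux points at the process/thread limits (the per-user `ulimit -u`, or the kernel-wide caps) rather than the open-files limit. A quick generic check (plain Linux, nothing Unraid-specific assumed) of how close the box is to those caps:

```shell
# Compare total threads on the system against the kernel-wide cap.
# fork() returns EAGAIN when this cap or the per-user limit would be exceeded,
# so these numbers matter more than "open files" for this particular error.
threads_in_use=$(ps -eLf | tail -n +2 | wc -l)   # one line per thread (LWP)
threads_max=$(cat /proc/sys/kernel/threads-max)  # kernel-wide thread cap
per_user_max=$(ulimit -u)                        # max user processes
echo "threads in use:  $threads_in_use"
echo "kernel-wide cap: $threads_max"
echo "per-user cap:    $per_user_max"
```

If `threads in use` is anywhere near either cap, raising `open files` won't help; something is leaking processes or threads.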
trurl Posted February 20, 2019

Go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post.
timekiller Posted February 20, 2019 (Author)

Attached. storage-diagnostics-20190220-1100.zip
trurl Posted February 20, 2019

Your syslog also has a lot of wrong csrf tokens. This is caused by having another browser still open to your server somewhere after a reboot.

Your docker image is many times larger than necessary. If you have things configured correctly it is unlikely you will ever need more than 20G. If you are filling it up, then making it larger will only make it take longer to fill. You are currently using less than 20G, and if that is increasing it typically means you have something misconfigured.

As for your problem, I don't see anything obvious. This complaint isn't at all common, so I don't know what you are doing differently from other people who don't get this. Possibly these symptoms don't have a common cause, but rather multiple causes. Maybe you could try eliminating some things to narrow it down.

Each of your dockers has its own support thread that you can easily access by clicking on its icon and selecting Support.
timekiller Posted February 20, 2019 (Author)

Thanks trurl. I definitely have Unraid open on several laptops, so I'm sure that explains the csrf errors. As for the docker image size, I knew I would have a bunch of containers and didn't want to run out of space. Besides, I have a 1TB SSD; I figured I had the room.

As for the actual problem: I had a feeling this was not common, as searching turned up nothing Unraid-related. I think Unifi-Video is pushing me over whatever limit I am hitting. I stopped the container and haven't seen any errors for a few hours. The app is the NVR for my 8 security cameras, so there is a lot of disk I/O there. Still, this is nothing different from what I had running on my previous (Ubuntu) server, and when I checked the ulimits there, they were actually much lower in a lot of cases.

My only thought here is that it has something to do with how the Unraid FUSE file system works. I know that as data is written to a drive, it also has to read from all drives to calculate parity. Maybe the strain of 8 video streams plus my disk-to-disk rsync is just too much? I did not have any problems when I copied the bulk of my data from the old server (over gigabit ethernet).

My concern is that if I schedule a regular parity check, I have to knock my security system offline for the 24+ hours it takes to finish. I am still on the trial license, as I wanted to give the system time to show me any potential issues, and this is a pretty big one for me. I'm hoping I'm not going to be given the runaround on this. I could easily open the Unifi-Video support thread, but we both know the maintainer just packages Ubiquiti's video app in a Docker container, so he's not going to know much about that proprietary application.
Ubiquiti is a large company (that honestly is focusing on their new hardware NVR), and they are just going to blame the OS, since the app works on Ubuntu without hitting a resource limit. And they'd be right; they can't support every OS under the sun. Hitting a resource limit is an OS restriction placed on user-space applications. Being Linux, my go-to was to look at ulimit, but that didn't seem to help. Is there anything Unraid-specific I can look at to try to resolve this?
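Since the ulimits themselves look generous, one thing worth doing before blaming the app or the OS is finding out which process is actually eating the process/thread budget. This is a generic Linux sketch (not an Unraid feature); the guess that Unifi-Video's Java/MongoDB processes would top the list is my assumption:

```shell
# Show the ten biggest thread consumers on the box.
# nlwp = "number of lightweight processes" (threads); a runaway
# container's main process will float to the top of this list.
ps -eo pid,nlwp,comm --sort=-nlwp | head -n 11

# Total threads owned by root, the user hitting the fork error.
root_threads=$(ps -eLf | awk '$1 == "root"' | wc -l)
echo "root threads: $root_threads"
```

Comparing that total against `ulimit -u` for root shows how much headroom is left, and watching it while the NVR container runs would confirm or rule out Unifi-Video as the culprit.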
trurl Posted February 20, 2019

46 minutes ago, timekiller said:
> My only thought here is it has something to do with how the unraid fuse file system works. I know as data is written to a drive, it also has to read from all drives to calculate parity.

There are actually 2 different methods you can choose for calculating parity. The one you mention isn't actually the default method, so unless you changed the default it isn't reading all drives. See here: https://forums.unraid.net/topic/50397-turbo-write/

FUSE and parity don't have anything to do with each other; parity just treats the complete disk as a bunch of bits and has nothing at all to do with files.

I always recommend taking things a bit at a time, getting one thing working well before adding anything else. It sounds like what you are doing now will not be a typical load. Get the initial data load out of the way before trying to get your applications going, then add the other stuff a little at a time.
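For readers landing here later: the two parity methods in the linked thread are "read/modify/write" (the default: reads only parity plus the target disk) and "reconstruct write" aka turbo write (reads all data disks but writes faster). They can be switched from the console; this fragment is a sketch from memory, not taken from the thread above, so verify the `mdcmd` path and value numbering against your own Unraid version before using it:

```shell
# Unraid tuning fragment -- assumed invocation, verify on your release.
# md_write_method selects the parity write method for array writes.
/usr/local/sbin/mdcmd set md_write_method 1   # reconstruct write ("turbo"): all disks spin, faster writes
/usr/local/sbin/mdcmd set md_write_method 0   # read/modify/write: only parity + target disk are touched
```

The same setting is exposed in the webGui under Settings > Disk Settings as "Tunable (md_write_method)", which is the safer place to change it.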