aptalca Posted March 25, 2020
2 hours ago, samcool55 said:
    So, for some reason, it does one WU and then it all basically dies. If I delete the container, delete the appdata folder and download it again, it works right away, once. No WU for almost 24 hours just doesn't seem right.
    21:12:06:68:192.168.1.57:New Web connection
    21:40:27:WU01:FS00:Connecting to 65.254.110.245:8080
    21:40:27:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
    21:40:27:WU01:FS00:Connecting to 18.218.241.186:80
    21:40:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
    21:40:28:ERROR:WU01:FS00:Exception: Could not get an assignment
    ******************************* Date: 2020-03-24 *******************************
    23:43:27:WU01:FS00:Connecting to 65.254.110.245:8080
    23:43:27:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
    23:43:27:WU01:FS00:Connecting to 18.218.241.186:80
    23:43:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
    23:43:28:ERROR:WU01:FS00:Exception: Could not get an assignment
    03:02:27:WU01:FS00:Connecting to 65.254.110.245:8080
    03:02:28:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
    03:02:28:WU01:FS00:Connecting to 18.218.241.186:80
    03:02:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
    03:02:28:ERROR:WU01:FS00:Exception: Could not get an assignment
    ******************************* Date: 2020-03-25 *******************************
    08:24:27:WU01:FS00:Connecting to 65.254.110.245:8080
    08:24:28:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
    08:24:28:WU01:FS00:Connecting to 18.218.241.186:80
    08:24:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
    08:24:28:ERROR:WU01:FS00:Exception: Could not get an assignment
    ******************************* Date: 2020-03-25 *******************************
    14:24:27:WU01:FS00:Connecting to 65.254.110.245:8080
    14:24:28:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
    14:24:28:WU01:FS00:Connecting to 18.218.241.186:80
    14:24:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
    14:24:28:ERROR:WU01:FS00:Exception: Could not get an assignment
    My other F@H system that runs W10 and the client keeps getting WUs, so it's confusing...
Jobs are distributed server side. We have no control over it. They may have different priorities based on CPU size, GPU type, etc.
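For anyone who wants to see how often each work server turned the client down, here is a minimal sketch that tallies the "No WUs available" warnings in a log like the one quoted above. It assumes bash with GNU sed and grep, and a container named foldingathome (the name used elsewhere in this thread). count_no_wu is a made-up helper name, not part of the F@H client or of the image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count "No WUs available" warnings per work server.
# Reads a Folding@home client log on stdin, prints "<server> <count>" lines.
count_no_wu() {
  # 1. Strip ANSI color sequences (ESC[...m) that the client writes to the log.
  # 2. Keep only the failed-assignment messages.
  # 3. Reduce each message to the quoted server address.
  # 4. Count occurrences per address.
  sed $'s/\x1b\\[[0-9;]*m//g' |
    grep -o "Failed to get assignment from '[^']*': No WUs available" |
    sed "s/.*from '\([^']*\)'.*/\1/" |
    sort | uniq -c | awk '{print $2, $1}'
}
```

Usage would look something like `docker logs foldingathome 2>&1 | count_no_wu`. A steady count on every server only confirms what the reply says: the assignment servers had nothing for that configuration, which is not something the container can fix.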
J89eu Posted March 25, 2020
Can this run on AMD GPUs? I have a Vega 56 and it seems the Windows app does work with GPU but perhaps not on Linux?
aptalca Posted March 25, 2020
1 hour ago, J89eu said:
    Can this run on AMD GPUs? I have a Vega 56 and it seems the Windows app does work with GPU but perhaps not on Linux?
Folding@home works with AMD GPUs; however, we do not support them with this image, simply because none of us has a suitable test environment. I have one AMD GPU, but it crashes my Unraid servers when I try to pass it through to a Linux VM. I don't believe there is currently a way to install the necessary AMD drivers on Unraid for use in containers, but again, my knowledge of AMD in containers is not very deep.
earthworm Posted March 26, 2020
22 hours ago, J89eu said:
    Can this run on AMD GPUs? I have a Vega 56 and it seems the Windows app does work with GPU but perhaps not on Linux?
I have 2 older AMD GPUs (5xxx, 6xxx) and neither of them has ever received a work unit, which is disappointing because they would certainly be faster than any CPU I own. My systems are running Windows.
jmbrnt Posted March 26, 2020
Something seems really broken with this container, or at least the job F@H is sending me. It folds for a couple of seconds, then dies. Log below.
    21:21:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
    21:21:13:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
    21:21:13:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
    21:21:13:WU00:FS00:0xa7:ERROR:
    21:21:13:WU00:FS00:0xa7:ERROR:Fatal error:
    21:21:13:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 25 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
    21:21:13:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
    21:21:13:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
    21:21:13:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
    21:21:13:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
    21:21:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
    21:21:18:WU00:FS00:0xa7:WARNING:Unexpected exit() call
    21:21:18:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
    21:21:18:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
    21:21:18:WU00:FS00:0xa7:Saving result file md.log
    21:21:18:WU00:FS00:0xa7:Saving result file science.log
    21:21:18:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
    21:22:13:WU00:FS00:Starting
    21:22:13:WU00:FS00:Removing old file './work/00/logfile_01-20200326-205104.txt'
    21:22:13:WU00:FS00:Running FahCore: /app/usr/bin/FAHCoreWrapper /config/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 258 -checkpoint 15 -np 31
Edit: talking to someone with more F@H experience, this seems like a dud WU. Will re-install the container.
Edited March 26, 2020 by jmbrnt
mschindl Posted March 27, 2020
Hello, it works well for CPU processing, but has anyone gotten it running on Ubuntu with Docker and a GPU (i.e. M2200)?
What I did:
    # Add the package repositories
    distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
    curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
    curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
    sudo systemctl restart docker
    docker run -d -it \
      --name=foldingathome \
      -e PUID=1000 \
      -e PGID=1000 \
      -e TZ=Europe/Berlin \
      -e NVIDIA_VISIBLE_DEVICES=all \
      -p 7396:7396 \
      -v /DATAINT/Docker-Conf/foldinghome:/config \
      --restart unless-stopped \
      linuxserver/foldingathome
But I got the following error with the newest driver on Ubuntu 18.04:
    root@Server:~# nvidia-smi
    Thu Mar 26 14:47:54 2020
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 440.64       Driver Version: 440.64       CUDA Version: 10.2     |
    root@Server:~# docker logs -f foldingathome
    13:40:16:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
    10:53:15:******************************* System ********************************
    10:53:15: CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
    10:53:15: CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
    10:53:15: CPUs: 8
    10:53:15: Memory: 31.14GiB
    10:53:15:Free Memory: 29.29GiB
    10:53:15: Threads: POSIX_THREADS
    10:53:15: OS Version: 4.15
    10:53:15:Has Battery: true
    10:53:15: On Battery: false
    10:53:15: UTC Offset: 1
    10:53:15: PID: 259
    10:53:15: CWD: /config
    10:53:15: OS: Linux 4.15.0-91-generic x86_64
    10:53:15: OS Arch: AMD64
    10:53:15: GPUs: 1
    10:53:15: GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:5 GM206 [Quadro M2200]
    10:53:15: CUDA: Not detected: cuInit() returned 100
    10:53:15: OpenCL: Not detected: clGetPlatformIDs() returned -1001
    12:22:50:<config>
    12:22:50:  <!-- Remote Command Server -->
    12:22:50:  <password v='********'/>
    12:22:50:
    12:22:50:  <!-- Slot Control -->
    12:22:50:  <power v='FULL'/>
    12:22:50:
    12:22:50:  <!-- User Information -->
    12:22:50:  <passkey v='********************************'/>
    12:22:50:  <team v='xxx'/>
    12:22:50:  <user v='xxx'/>
    12:22:50:
    12:22:50:  <!-- Folding Slots -->
    12:22:50:  <slot id='0' type='CPU'>
    12:22:50:    <paused v='true'/>
    12:22:50:  </slot>
    12:22:50:  <slot id='1' type='GPU'>
    12:22:50:    <paused v='true'/>
    12:22:50:  </slot>
    12:22:50:</config>
hawihoney Posted March 27, 2020
Tried to install this docker but it seems to be completely broken. The Web-UI does nothing. The links don't work, can't change identity, etc. What's wrong here? Thanks in advance.
Mikey160984 Posted March 27, 2020
Did anyone try to use more than one NVIDIA GPU with this container? Editing the config for a second GPU slot is no problem, but do both GPUs work as they should?
aptalca Posted March 27, 2020
5 hours ago, mschindl said:
    Hello, it works well for CPU processing, but did someone get it running on ubuntu with docker and GPU (i.e. M2200)?
You forgot "--runtime=nvidia"
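After adding the flag, one way to confirm the GPU is actually usable is to look at the System block of the client log for the CUDA and OpenCL lines, since those were the ones reporting "Not detected" in the failing run. The sketch below assumes bash and the log format shown in the quoted post. check_gpu_compute is a hypothetical helper name, not something shipped with the image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scan a Folding@home client log (stdin) for the
# "CUDA: Not detected" / "OpenCL: Not detected" lines seen in this thread.
check_gpu_compute() {
  local log
  log="$(cat)"
  if grep -Eq '(CUDA|OpenCL): *Not detected' <<<"$log"; then
    echo "GPU compute not detected"
    return 1
  fi
  # Absence of the lines is only a hint: it does not prove a GPU is in use.
  echo "no 'Not detected' lines found"
}
```

For example, `docker logs foldingathome 2>&1 | check_gpu_compute` after recreating the container. If the "Not detected" lines are still there, the container was most likely started without the NVIDIA runtime.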
aptalca Posted March 27, 2020
2 hours ago, hawihoney said:
    Tried to install this docker but it seems to be completely broken. The Web-UI does nothing. The links don't work, can't change identity, etc. What's wrong here? Thanks in advance.
Try an incognito window
aptalca Posted March 27, 2020
1 hour ago, Mikey160984 said:
    Did anyone try to use more then one nvidia gpu with this container? editing the config for a second gpu slot is no problem, but do both gpus work as they should?
I tested with 2 GPUs and it works. There's a screenshot of it in this article: https://blog.linuxserver.io/2020/03/21/covid-19-a-quick-update/
By the way, you don't need to edit the config at all (in fact, don't). If you allow both GPUs via the NVIDIA arguments, they'll both be used automatically.
Mikey160984 Posted March 27, 2020
Haha... I had the same GT710 lying around, so I thought I'd put it in the server too. Do I only have to add another variable with the GPU ID, or should I type both IDs in the same field? Maybe you could assist me with this; it could be interesting for more people.
Edit: Got it. I had to reinstall the container and typed "all" under NVIDIA_VISIBLE_DEVICES. Now both GPUs are listed; one is folding and the second one is waiting for a WU. Should work fine, I think.
Edited March 27, 2020 by Mikey160984
mschindl Posted March 27, 2020
2 hours ago, aptalca said:
    You forgot "--runtime=nvidia"
Thanks, there was also a package missing.
    root@Server:~# sudo apt-get install nvidia-docker2
    docker run -d -it \
      --runtime=nvidia \
      --name=foldingathome \
      -e PUID=1000 \
      -e PGID=1000 \
      -e TZ=Europe/Berlin \
      -e NVIDIA_VISIBLE_DEVICES=all \
      -p 7396:7396 \
      -v /DATAINT/Docker-Conf/foldinghome:/config \
      --restart unless-stopped \
      linuxserver/foldingathome
Thank you
arough Posted March 27, 2020
Can someone help me with my F@H docker only using one of my CPU cores? As you can see in the image, F@H somehow sees that I have 4 cores but doesn't use them. I already tried CPU pinning, but to no avail. Maybe there's some setting inside the docker that I'm not seeing? Thanks in advance
hawihoney Posted March 28, 2020
Any idea why I always get HTTP_NOT_FOUND in this Docker's logs:
    04:50:26:WU00:FS00:Connecting to 65.254.110.245:8080
    04:50:26:WARNING:WU00:FS00:Failed to get ID from '65.254.110.245:8080': 10001: Server responded: HTTP_NOT_FOUND
    04:50:26:WU00:FS00:Connecting to 18.218.241.186:80
    04:50:26:WARNING:WU00:FS00:Failed to get ID from '18.218.241.186:80': 10001: Server responded: HTTP_NOT_FOUND
    04:50:26:ERROR:WU00:FS00:Exception: Could not get an assignment ID
Thanks in advance.
Mikey160984 Posted March 28, 2020
Another short question, or a little problem: after a WU has been folded it is sent to the F@H servers, and then it displays "cleanup". But it gets stuck there. I didn't have this problem with the docker from mobiousnine. After restarting the container, the "cleanup" WU is gone from the list. The log says:
    ...
    21:06:38:WU02:FS00:Cleaning up
    21:06:38:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
    21:06:38:WU02:FS00:Cleaning up
    21:06:38:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
    21:07:38:WU02:FS00:Cleaning up
    21:07:38:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
    21:09:15:WU02:FS00:Cleaning up
    21:09:15:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
    21:11:52:WU02:FS00:Cleaning up
    21:11:52:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
    21:16:07:WU02:FS00:Cleaning up
    21:16:07:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
    21:22:58:WU02:FS00:Cleaning up
    21:22:58:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
And I'm not sure whether the points were added to my account, since the WU never fully finished. I'm also folding on two other Windows PCs and I don't have this problem there. Has anyone else seen this?
Edited March 28, 2020 by Mikey160984
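To find out which work directories the client keeps failing to remove, the repeated error lines can be collapsed into a short list. This is only a sketch for inspecting the problem, not a fix. It assumes bash and the error wording shown in the log above, and stuck_work_dirs is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list each work directory the client failed to remove.
# Reads a Folding@home client log on stdin, prints one directory per line.
stuck_work_dirs() {
  grep -o "Failed to remove directory '[^']*'" |
    sed "s/.*'\([^']*\)'$/\1/" |
    sort -u
}
```

With a container named foldingathome, `docker logs foldingathome 2>&1 | stuck_work_dirs` would print ./work/02 for the log above. Another post in this thread shows the client's working directory as /config, so, assuming the same layout, `docker exec foldingathome ls -la /config/work/02` should show which leftover files are blocking the removal.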
Mikey160984 Posted March 29, 2020
On 3/27/2020 at 1:47 PM, aptalca said:
    I tested with 2 gpus and it works There's a screenshot of it in this article: https://blog.linuxserver.io/2020/03/21/covid-19-a-quick-update/ By the way, you don't need to edit the config at all (in fact, don't). If you allow both gpus via nvidia arguments, they'll both be used automatically.
Another question about 2 GPUs: if I enter only the ID of the GPU I want to use in the docker settings, the system still uses both GPUs (as with the "all" argument). Tested with a reinstall (yes, I deleted the folder in appdata). Is this normal?
tjb_altf4 Posted March 29, 2020
Moved over to the Nvidia/Unraid build 6.8.3 today, so I migrated my folding VM over to the F@H docker I already had set up. Compared to the GPU-only folding VM, which seemed to pump all the cores (4C/4T) on the VM, this smashes a single core. Hopefully the docker only utilising one core isn't a bottleneck for F@H.
aptalca Posted March 29, 2020
7 hours ago, Mikey160984 said:
    Another question for 2 GPUs If I type in the the docker only the GPU ID I want to use, the sytem still uses the two GPUs (like the "all" argument). Testet with a reinstall (yes, deletet the folder in appdata). Is this normal?
If you set it to only one GPU's ID, F@H will still see both GPUs, but it won't be able to start the job on one of them. You'll see an error in the log, something like "no compute devices matched gpu #0 blah blah you may need to update your graphics drivers". I paused that GPU so it no longer receives jobs it won't be able to complete.
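As a rough sketch of the single-GPU setup described above: the NVIDIA container runtime accepts GPU indices or UUIDs in NVIDIA_VISIBLE_DEVICES, and `nvidia-smi -L` on the host lists both. The UUID and the host path below are placeholders, not values from this thread; the remaining flags are copied from the docker run command posted earlier.

```shell
# List the GPUs on the host with their index and UUID.
nvidia-smi -L

# Expose only one GPU to the container.
# GPU-xxxxxxxx-... is a placeholder: substitute a UUID printed by the command above.
# /path/to/appdata/foldingathome is a placeholder host path.
docker run -d \
  --runtime=nvidia \
  --name=foldingathome \
  -e PUID=1000 \
  -e PGID=1000 \
  -e TZ=Europe/Berlin \
  -e NVIDIA_VISIBLE_DEVICES=GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
  -p 7396:7396 \
  -v /path/to/appdata/foldingathome:/config \
  --restart unless-stopped \
  linuxserver/foldingathome
```

As the reply notes, the client will still list the second GPU. Its slot then needs to be paused (for example from the web UI) so it stops requesting work it cannot run.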
SniperkroZ Posted March 30, 2020
Hi, I am running the docker and have FAHControl set up, but my GPU (yes, I know, only a 1060) only runs when set to medium, and then it ramps up to 100% usage. Is there any way to let the GPU run when set to light power?
Squid Posted March 30, 2020
18 minutes ago, SniperkroZ said:
    Hi, I am running the docker and have FAHControl set up, but my gpu (yes i know only a 1060) only runs when set to medium and then ramps up to 100%usage. Is there any way to let the gpu run when set at light power?
no
shiftylilbastrd Posted April 1, 2020
Is it possible to get this to run only "on idle"? I'm running on Unraid and it doesn't seem to ever start folding when the option is checked.
Michel Amberg Posted April 7, 2020
On 4/1/2020 at 10:50 PM, shiftylilbastrd said:
    Is it possible to get this to run only "on idle". Running on Unraid and it doesn't seem to ever start folding when the option is checked.
I have the same problem. With the default config it seems like it will wait for idle forever. Does this work for anyone else? I have my GPU assigned to both my Plex container and now this one. Is that an issue?
aptalca Posted April 7, 2020
24 minutes ago, Michel Amberg said:
    I have the same problem. With the default config it seems like it will wait for idle forever. Does this work for anyone else? I have my GPU assigned for both my Plex container and now this is that an issue?
Shouldn't be an issue
norbertt Posted April 8, 2020
On 4/1/2020 at 10:50 PM, shiftylilbastrd said:
    Is it possible to get this to run only "on idle". Running on Unraid and it doesn't seem to ever start folding when the option is checked.
Same here.