[Support] Linuxserver.io - Folding@home



2 hours ago, samcool55 said:

So, for some reason, it does one WU and then it all basically dies.

If I delete the container, delete the appdata folder and download it again, it works right away, once.

No WUs for almost 24 hours just doesn't seem right.

 

 

21:12:06:68:192.168.1.57:New Web connection
21:40:27:WU01:FS00:Connecting to 65.254.110.245:8080
21:40:27:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
21:40:27:WU01:FS00:Connecting to 18.218.241.186:80
21:40:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
21:40:28:ERROR:WU01:FS00:Exception: Could not get an assignment
******************************* Date: 2020-03-24 *******************************
23:43:27:WU01:FS00:Connecting to 65.254.110.245:8080
23:43:27:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
23:43:27:WU01:FS00:Connecting to 18.218.241.186:80
23:43:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
23:43:28:ERROR:WU01:FS00:Exception: Could not get an assignment
03:02:27:WU01:FS00:Connecting to 65.254.110.245:8080
03:02:28:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
03:02:28:WU01:FS00:Connecting to 18.218.241.186:80
03:02:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
03:02:28:ERROR:WU01:FS00:Exception: Could not get an assignment
******************************* Date: 2020-03-25 *******************************
08:24:27:WU01:FS00:Connecting to 65.254.110.245:8080
08:24:28:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
08:24:28:WU01:FS00:Connecting to 18.218.241.186:80
08:24:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
08:24:28:ERROR:WU01:FS00:Exception: Could not get an assignment
******************************* Date: 2020-03-25 *******************************
14:24:27:WU01:FS00:Connecting to 65.254.110.245:8080
14:24:28:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
14:24:28:WU01:FS00:Connecting to 18.218.241.186:80
14:24:28:WARNING:WU01:FS00:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
14:24:28:ERROR:WU01:FS00:Exception: Could not get an assignment

 

 

 

My other F@H system, which runs the Windows 10 client, keeps getting WUs, so it's confusing...

Jobs are distributed server side. We have no control over it.

 

They may have different priorities based on cpu size, gpu type, etc.
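
For anyone who wants to see how often the client has been turned away, the failed requests can be counted straight from the log. This is only a minimal sketch: the function name `count_assignment_failures` is made up for illustration, and the sample lines are copied from the log quoted above.

```shell
#!/bin/sh
# Count how many assignment requests ended without a work unit.
# Reads FAHClient log text on stdin and counts the ERROR lines the client
# prints once every assignment server has refused the request.
# (The pattern also matches the "Could not get an assignment ID" variant.)
count_assignment_failures() {
  # grep -c prints 0 and exits non-zero when nothing matches,
  # so "|| true" keeps the function's exit status clean.
  grep -c 'Could not get an assignment' || true
}

# Sample taken from the log quoted above; prints 2.
count_assignment_failures <<'EOF'
21:40:27:WU01:FS00:Connecting to 65.254.110.245:8080
21:40:27:WARNING:WU01:FS00:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
21:40:28:ERROR:WU01:FS00:Exception: Could not get an assignment
23:43:27:WU01:FS00:Connecting to 65.254.110.245:8080
23:43:28:ERROR:WU01:FS00:Exception: Could not get an assignment
EOF
```

Against a running container this would be something like `docker logs foldingathome 2>&1 | count_assignment_failures`, assuming the container is named `foldingathome`.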

1 hour ago, J89eu said:

Can this run on AMD GPUs? I have a Vega 56 and it seems the Windows app does work with GPU but perhaps not on Linux?

Folding@home works with AMD GPUs; however, we do not support them with this image, simply because none of us has a suitable test environment.

I have one AMD GPU, but it crashes my Unraid servers when I try to pass it through to a Linux VM.

I don't believe there is currently a way to install the necessary AMD drivers on Unraid for use in containers, but again, my knowledge of AMD in containers is not very deep.

22 hours ago, J89eu said:

Can this run on AMD GPUs? I have a Vega 56 and it seems the Windows app does work with GPU but perhaps not on Linux?

I have 2 older AMD GPUs (5xxx, 6xxx) and neither of them has ever received a work unit which is disappointing because they would certainly be faster than any CPU I own.  My systems are running Windows.


Something seems really broken with this container, or at least the job F@H is sending me. It folds for a couple of seconds, then dies. Log below.

 


21:21:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
21:21:13:WU00:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
21:21:13:WU00:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
21:21:13:WU00:FS00:0xa7:ERROR:
21:21:13:WU00:FS00:0xa7:ERROR:Fatal error:
21:21:13:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 25 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
21:21:13:WU00:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
21:21:13:WU00:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
21:21:13:WU00:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
21:21:13:WU00:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
21:21:13:WU00:FS00:0xa7:ERROR:-------------------------------------------------------
21:21:18:WU00:FS00:0xa7:WARNING:Unexpected exit() call
21:21:18:WU00:FS00:0xa7:WARNING:Unexpected exit from science code
21:21:18:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
21:21:18:WU00:FS00:0xa7:Saving result file md.log
21:21:18:WU00:FS00:0xa7:Saving result file science.log
21:21:18:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
21:22:13:WU00:FS00:Starting
21:22:13:WU00:FS00:Removing old file './work/00/logfile_01-20200326-205104.txt'
21:22:13:WU00:FS00:Running FahCore: /app/usr/bin/FAHCoreWrapper /config/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 258 -checkpoint 15 -np 31
 

 

Edit: talking to someone with more F@H experience, this seems like a dud WU. Will re-install the container. 
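
To put the thread count the core was started with next to the rank count GROMACS complained about, both numbers can be pulled out of the log. A rough sketch only: the helper names `np_from_log` and `ranks_from_log` are invented here, and the patterns assume the log lines look like the ones quoted above.

```shell
#!/bin/sh
# Print the value passed to FahCore via "-np" (one line per launch line found).
np_from_log() {
  sed -n 's/.*Running FahCore:.* -np \([0-9][0-9]*\).*/\1/p'
}

# Print the rank count named in the GROMACS domain decomposition error.
ranks_from_log() {
  sed -n 's/.*no domain decomposition for \([0-9][0-9]*\) ranks.*/\1/p'
}

# Two lines copied from the log above.
log=$(cat <<'EOF'
21:21:13:WU00:FS00:0xa7:ERROR:There is no domain decomposition for 25 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm
21:22:13:WU00:FS00:Running FahCore: /app/usr/bin/FAHCoreWrapper /config/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 258 -checkpoint 15 -np 31
EOF
)

printf 'threads requested: %s\n' "$(printf '%s\n' "$log" | np_from_log)"
printf 'ranks in error:    %s\n' "$(printf '%s\n' "$log" | ranks_from_log)"
```

If the same error keeps coming back on fresh work units, those two numbers are probably the first thing worth mentioning when asking for help, since the error text itself suggests changing the number of ranks.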

Edited by jmbrnt

Hello,

 

It works well for CPU processing, but has anyone got it running on Ubuntu with Docker and a GPU (e.g. a Quadro M2200)?

 

What I did:

 

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

 

docker run -d -it \
  --name=foldingathome \
  -e PUID=1000 \
  -e PGID=1000 \
  -e TZ=Europe/Berlin \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -p 7396:7396 \
  -v /DATAINT/Docker-Conf/foldinghome:/config \
  --restart unless-stopped \
  linuxserver/foldingathome

 

But I got the following error with the newest driver on Ubuntu 18.04:

 

root@Server:~# nvidia-smi
Thu Mar 26 14:47:54 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64       Driver Version: 440.64       CUDA Version: 10.2     |

root@Server:~# docker logs -f foldingathome
13:40:16:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually

 

10:53:15:******************************* System ********************************
10:53:15:        CPU: Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
10:53:15:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
10:53:15:       CPUs: 8
10:53:15:     Memory: 31.14GiB
10:53:15:Free Memory: 29.29GiB
10:53:15:    Threads: POSIX_THREADS
10:53:15: OS Version: 4.15
10:53:15:Has Battery: true
10:53:15: On Battery: false
10:53:15: UTC Offset: 1
10:53:15:        PID: 259
10:53:15:        CWD: /config
10:53:15:         OS: Linux 4.15.0-91-generic x86_64
10:53:15:    OS Arch: AMD64
10:53:15:       GPUs: 1
10:53:15:      GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:5 GM206 [Quadro M2200]
10:53:15:       CUDA: Not detected: cuInit() returned 100
10:53:15:     OpenCL: Not detected: clGetPlatformIDs() returned -1001

 

12:22:50:<config>
12:22:50:  <!-- Remote Command Server -->
12:22:50:  <password v='********'/>
12:22:50:
12:22:50:  <!-- Slot Control -->
12:22:50:  <power v='FULL'/>
12:22:50:
12:22:50:  <!-- User Information -->
12:22:50:  <passkey v='********************************'/>
12:22:50:  <team v='xxx'/>
12:22:50:  <user v='xxx'/>
12:22:50:
12:22:50:  <!-- Folding Slots -->
12:22:50:  <slot id='0' type='CPU'>
12:22:50:    <paused v='true'/>
12:22:50:  </slot>
12:22:50:  <slot id='1' type='GPU'>
12:22:50:    <paused v='true'/>
12:22:50:  </slot>
12:22:50:</config>

 


5 hours ago, mschindl said:

Hello,

 

It works well for CPU processing, but has anyone got it running on Ubuntu with Docker and a GPU (e.g. a Quadro M2200)?

 


You forgot "--runtime=nvidia"
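
A quick way to confirm the flag took effect is to look for the "Not detected" lines the client prints at startup, the same `CUDA:` and `OpenCL:` lines shown in the log above. A small sketch, where `gpu_not_detected` is a made-up helper name and the sample text is copied from that log:

```shell
#!/bin/sh
# Read FAHClient log text on stdin and print any line saying that the
# CUDA or OpenCL libraries were not detected.
# Exit status is 0 when such a line exists, non-zero when none is found.
gpu_not_detected() {
  grep -E '(CUDA|OpenCL): Not detected'
}

# Sample from the startup log quoted above; both detection lines are printed.
gpu_not_detected <<'EOF'
10:53:15:       GPUs: 1
10:53:15:      GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:5 GM206 [Quadro M2200]
10:53:15:       CUDA: Not detected: cuInit() returned 100
10:53:15:     OpenCL: Not detected: clGetPlatformIDs() returned -1001
EOF
```

With a container named `foldingathome`, that would be `docker logs foldingathome 2>&1 | gpu_not_detected`. If the container was only restarted rather than recreated, the log may still contain lines from earlier starts, so check the timestamps.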

1 hour ago, Mikey160984 said:

Did anyone try to use more than one NVIDIA GPU with this container? Editing the config for a second GPU slot is no problem, but do both GPUs work as they should?

I tested with 2 gpus and it works

There's a screenshot of it in this article: https://blog.linuxserver.io/2020/03/21/covid-19-a-quick-update/

 

By the way, you don't need to edit the config at all (in fact, don't). If you allow both gpus via nvidia arguments, they'll both be used automatically.
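
For reference, `NVIDIA_VISIBLE_DEVICES` takes either `all` or a comma-separated list of GPU indices or UUIDs, so several cards go in the same field. The sketch below builds that value from `nvidia-smi -L` output. `visible_devices_value` is a hypothetical helper name, and the GPU names and UUIDs in the sample are placeholders, not real devices.

```shell
#!/bin/sh
# Turn "nvidia-smi -L" output (on stdin) into a comma-separated UUID list
# that can be used as the value of NVIDIA_VISIBLE_DEVICES.
visible_devices_value() {
  # Keep only the "GPU-..." part inside "(UUID: ...)" and join with commas.
  sed -n 's/.*(UUID: \(GPU-[^)]*\)).*/\1/p' | paste -sd, -
}

# Placeholder listing in the format printed by "nvidia-smi -L".
visible_devices_value <<'EOF'
GPU 0: Example GPU A (UUID: GPU-00000000-0000-0000-0000-000000000001)
GPU 1: Example GPU B (UUID: GPU-00000000-0000-0000-0000-000000000002)
EOF
```

On a real host this would be `nvidia-smi -L | visible_devices_value`, with the result passed as `-e NVIDIA_VISIBLE_DEVICES=<value>`. When every card should fold, `all` remains the simplest choice.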


Haha... I've got the same GT710 lying around, so I thought I'd put it in the server too...

So do I only have to add another variable with the GPU ID, or should I enter both IDs in the same field? Maybe you could assist me with this. It could be interesting for other people too.

Edit:

Got it. I had to reinstall the container and enter "all" under NVIDIA_VISIBLE_DEVICES. Now both GPUs are listed: one is folding, the second one is waiting for a WU. Should work fine, I think.

Edited by Mikey160984
2 hours ago, aptalca said:

You forgot "--runtime=nvidia"

Thanks, there was also a package missing.

 

root@Server:~# sudo apt-get install nvidia-docker2

 

docker run -d -it \
  --runtime=nvidia \
  --name=foldingathome \
  -e PUID=1000 \
  -e PGID=1000 \
  -e TZ=Europe/Berlin \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -p 7396:7396 \
  -v /DATAINT/Docker-Conf/foldinghome:/config \
  --restart unless-stopped \
  linuxserver/foldingathome

 

 

Thank you


Can someone help me with my F@H docker only using one of my CPU cores?

As you can see in the image, F@H somehow sees that I have 4 cores but doesn't use them.

I already tried CPU pinning, but to no avail.

Maybe there is some setting inside the Docker container that I'm not seeing?

 

Thanks in advance

fath.jpg


Any idea why I always get HTTP_NOT_FOUND in this Docker container's logs?

 

04:50:26:WU00:FS00:Connecting to 65.254.110.245:8080
04:50:26:WARNING:WU00:FS00:Failed to get ID from '65.254.110.245:8080': 10001: Server responded: HTTP_NOT_FOUND
04:50:26:WU00:FS00:Connecting to 18.218.241.186:80
04:50:26:WARNING:WU00:FS00:Failed to get ID from '18.218.241.186:80': 10001: Server responded: HTTP_NOT_FOUND
04:50:26:ERROR:WU00:FS00:Exception: Could not get an assignment ID


Thanks in advance.

 


Another short question, or little problem... After a WU has been folded it is sent to the F@H servers, and then it displays "cleanup". But it sticks there. I didn't have this problem with the docker from mobiousnine. After restarting the container, the "cleanup" WU is gone from the list.

 

The log says:

 

...

21:06:38:WU02:FS00:Cleaning up
21:06:38:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
21:06:38:WU02:FS00:Cleaning up
21:06:38:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
21:07:38:WU02:FS00:Cleaning up
21:07:38:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
21:09:15:WU02:FS00:Cleaning up
21:09:15:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
21:11:52:WU02:FS00:Cleaning up
21:11:52:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
21:16:07:WU02:FS00:Cleaning up
21:16:07:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"
21:22:58:WU02:FS00:Cleaning up
21:22:58:ERROR:WU02:FS00:Exception: Failed to remove directory './work/02': boost::filesystem::remove: Directory not empty: "./work/02"

 

And I'm not sure whether the points were added to my account, as it did not fully finish...

I'm also folding with two other Windows PCs, and there I do not have this problem.

Has anyone else had the same problem?
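
One way to find out what is blocking the cleanup is to list whatever is still sitting in the work directory, together with its owner and permissions. The startup log posted earlier in the thread shows the client's working directory as `/config`, so `./work/02` should correspond to `work/02` inside the host folder mapped to `/config` (an assumption about this particular setup). A sketch with a hypothetical helper name, `list_leftovers`; the demo runs against a throwaway directory rather than real appdata.

```shell
#!/bin/sh
# List everything left inside a work-unit directory, with owner and
# permissions, so stray files that block removal are easy to spot.
# Prints nothing when the directory is already empty.
list_leftovers() {
  find "$1" -mindepth 1 -exec ls -ld {} +
}

# Demo on a temporary directory standing in for <appdata>/work/02.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/work/02"
: > "$demo_dir/work/02/leftover.tmp"
list_leftovers "$demo_dir/work/02"
rm -rf "$demo_dir"
```

If the leftover files turn out to belong to a different user than the PUID/PGID the container runs with, that would be worth mentioning when reporting the problem.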

Edited by Mikey160984
On 3/27/2020 at 1:47 PM, aptalca said:

I tested with 2 gpus and it works

There's a screenshot of it in this article: https://blog.linuxserver.io/2020/03/21/covid-19-a-quick-update/

 

By the way, you don't need to edit the config at all (in fact, don't). If you allow both gpus via nvidia arguments, they'll both be used automatically.

 

Another question for 2 GPUs

If I enter only the ID of the GPU I want to use in the Docker settings, the system still uses both GPUs (like the "all" argument). Tested with a reinstall (yes, I deleted the folder in appdata).
 

Is this normal?


Moved over to Nvidia/Unraid build 6.8.3 today, so I migrated my folding VM over to the F@H docker I already had set up.

Compared to the GPU-only folding VM, which seemed to pump all the cores (4C/4T) on the VM, this smashes a single core.

Hopefully Docker only utilising one core isn't a bottleneck for F@H.

7 hours ago, Mikey160984 said:

 

Another question for 2 GPUs

If I enter only the ID of the GPU I want to use in the Docker settings, the system still uses both GPUs (like the "all" argument). Tested with a reinstall (yes, I deleted the folder in appdata).
 

Is this normal?

If you set it to only one GPU's ID, F@H will still see both GPUs, but it won't be able to start the job on one of them. You'll see an error in the log, something like "no compute devices matched gpu #0 blah blah you may need to update your graphics drivers".

 

I paused that gpu so it no longer receives jobs it won't be able to complete.

18 minutes ago, SniperkroZ said:

Hi, I am running the docker and have FAHControl set up, but my GPU (yes, I know, only a 1060) only runs when set to medium and then ramps up to 100% usage.

Is there any way to let the gpu run when set at light power?

No.

On 4/1/2020 at 10:50 PM, shiftylilbastrd said:

Is it possible to get this to run only "on idle"?

Running on Unraid and it doesn't seem to ever start folding when the option is checked.

 

I have the same problem. With the default config it seems like it will wait for idle forever. Does this work for anyone else? I have my GPU assigned to both my Plex container and now this one. Is that an issue?

24 minutes ago, Michel Amberg said:

I have the same problem. With the default config it seems like it will wait for idle forever. Does this work for anyone else? I have my GPU assigned to both my Plex container and now this one. Is that an issue?

Shouldn't be an issue

