Release: Folding@Home Docker


Recommended Posts

On 3/17/2020 at 2:15 AM, saarg said:

 

Not sure if those cards are still supported by the driver. You also need to install the nvidia plugin and download the nvidia build.

For the driver to work, you have to remove the blacklisting so the driver can load.

Oh, okay. I'll keep running it with CPU only, then.

 

There's no way to only get COVID19 work units, is there?

Link to comment
10 minutes ago, cyberspectre said:

Oh, okay. I'll keep running it with CPU only, then.

 

There's no way to only get COVID19 work units, is there?

F@H (and probably BOINC now) are prioritizing COVID-19 WUs. (With F@H, you select that you're folding for "Any disease".) But, due to the massive surge in people folding and doing their part, COVID-19 WUs aren't always available on either platform. BOINC (which has lower adoption internationally than F@H) never seems to run out of WUs in the interim, whereas F@H seems to be constantly running short of everything.

 

Net result is that I'm running both, as they are complementary research efforts and not mutually exclusive.

Link to comment

<!-- Folding Slots -->
  <slot id='0' type='CPU'>
    <paused v='true'/>
  </slot>
  <slot id='3' type='GPU'>
    <paused v='true'/>
  </slot>
  <slot id='2' type='GPU'/>
 

This is the entry I put into the Folding Slots. It shows both of my cards, but only my 1070 works. I can't tell you why it does that or how to choose one card or the other. All I know is it works now... Good luck.

Link to comment
3 hours ago, IGOBYD said:

<!-- Folding Slots -->
  <slot id='0' type='CPU'>
    <paused v='true'/>
  </slot>
  <slot id='3' type='GPU'>
    <paused v='true'/>
  </slot>
  <slot id='2' type='GPU'/>
 

This is the entry I put into the Folding Slots. It shows both of my cards, but only my 1070 works. I can't tell you why it does that or how to choose one card or the other. All I know is it works now... Good luck.

Definitely worked! Check the log below:

 

05:08:55: <!-- Folding Slots -->
05:08:55: <slot id='0' type='CPU'>
05:08:55: <paused v='true'/>
05:08:55: </slot>
05:08:55: <slot id='3' type='GPU'>
05:08:55: <paused v='true'/>
05:08:55: </slot>
05:08:55: <slot id='2' type='GPU'/>
05:08:55:</config>
05:08:55:Trying to access database...
05:08:55:Successfully acquired database lock
05:08:55:Enabled folding slot 00: PAUSED cpu:9 (by user)
05:08:55:Enabled folding slot 03: PAUSED gpu:0:GP107 [GeForce GTX 1050 LP] 1862 (by user)
05:08:55:Enabled folding slot 02: READY gpu:1:GP102 [GeForce GTX 1080 Ti] 11380
 

Link to comment

[screenshot attachment]

 

2 GPUs assigned to F@H, both are working, no CPU slot enabled (it still uses some CPU cores to pass work to the GPUs; for my setup it's using 4 cores at roughly 75%).

config.xml:

  <slot id='0' type='GPU'/>
  <slot id='1' type='GPU'/>

Docker setting, NVIDIA_VISIBLE_DEVICES:

all

If you have an Nvidia GPU assigned to a VM, disable it for F@H, or the system will crash.

I'm not really sure how to do that (I have an AMD card), but if I understand it correctly you can change the syslinux configuration:

pci-stub.ids=xxxx:xxxx,xxxx:xxxx (the IOMMU IDs of the graphics/VGA device and its sound device!), then reboot the system after the change.

Link to comment
On 3/18/2020 at 6:39 PM, DayspringGaming said:

After having been folding most of the week, I'm now getting the following errors in my logs:

Any ideas?


22:37:52:WU01:FS00:0xa7:ERROR:

22:37:52:WU01:FS00:0xa7:ERROR:-------------------------------------------------------

22:37:52:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown

22:37:52:WU01:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902

22:37:52:WU01:FS00:0xa7:ERROR:

22:37:52:WU01:FS00:0xa7:ERROR:Fatal error:

22:37:52:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 20 ranks that is compatible with the given box and a minimum cell size of 1.37225 nm

22:37:52:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings

22:37:52:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition

22:37:52:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS

22:37:52:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors

22:37:52:WU01:FS00:0xa7:ERROR:-------------------------------------------------------

22:37:57:WU01:FS00:0xa7:WARNING:Unexpected exit() call

22:37:57:WU01:FS00:0xa7:WARNING:Unexpected exit from science code

 

I have the same issue. 

Link to comment

So I have set everything up; the docker claims 40k+ points, but only the first 9k are registered.

 

Also, if I restart the container, it gets reassigned and starts folding; if I leave it be, nothing happens.

Very different behavior compared to folding on my Windows machine.

 

Getting a few warnings and errors:

ERROR:WU03:FS01:Exception: Failed to connect to 40.114.52.201:80: Connection timed out

ERROR:WU03:FS01:Exception: Could not get an assignment

WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11779 run:0 clone:8658 gen:5 core:0x22 unit:

 

is this expected?

 

 

Link to comment

How is this possible? 

 

[screenshot attachment]

 

when set up for an Nvidia GPU? There is an AMD Radeon RX580 in the server as well. The log file shows both GPUs, but it is grabbing GPU:0, not GPU:1, which is the Nvidia.

 

Any thoughts for fixing this?

 

 

 

[screenshot attachment]

 

 

 

<config>
  <!-- Client Control -->
  <fold-anon v='true'/>

  <!-- Folding Slot Configuration -->
  <gpu v='true'/>

  <!-- HTTP Server -->
  <!-- The following allows access from the local network -->
  <allow v='10.X.X.0/24'/>

  <!-- Remote Command Server -->
  <!-- Change the password for remote access -->
  <password v='PASSWORD'/>

  <!-- User Information -->
  <!-- Change team number and username if desired. Currently folding for UnRAID team! -->
  <team v='227802'/> <!-- Your team number (Team UnRAID is # 227802)-->
  <user v='frodm'/> <!-- Enter your user name here -->
  <passkey v='xxxxxxxxxxxxxxxxxxxxx'/> <!-- 32 hexadecimal characters if provided (Get one here: http://fah-web.stanford.edu/cgi-bin/getpasskey.py)-->

  <!-- Web Server -->
  <!-- The following allows access from the local network -->
  <web-allow v='10.X.X.0/24'/>

  <!-- CPU Use -->
  <power v='medium'/> 
  
  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
  <slot id='1' type='GPU'/>
</config>

 

 

FaH logfile.rtf

Edited by frodr
Link to comment

I was able to get both GPUs and the CPU up in the control panel. I added to the Folding Slots:

 

 <!-- Folding Slots -->
  <slot id='0' type='GPU'/>
  <slot id='1' type='GPU'/>
  <slot id='2' type='CPU'/>
</config>

 

Two questions:

 

1) How to disable GPU:0? 

Removing slot id=0 is not possible. If I remove id=1, only the AMD card is visible in Web Control.

 

2) The Folding keeps stopping, and like now, it is not starting again. Any thoughts?

 

 

[screenshot attachment]

 

Logg:

Mar 26 15:17:01 a26d1e0fc4ed CRON[40]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
15:39:00:WU01:FS02:Connecting to 65.254.110.245:8080
15:39:00:WARNING:WU01:FS02:Failed to get assignment from '65.254.110.245:8080': No WUs available for this configuration
15:39:00:WU01:FS02:Connecting to 18.218.241.186:80
15:39:01:WARNING:WU01:FS02:Failed to get assignment from '18.218.241.186:80': No WUs available for this configuration
15:39:01:ERROR:WU01:FS02:Exception: Could not get an assignment

Link to comment
On 3/25/2020 at 6:51 AM, frodr said:

Any thoughts for fixing this?

The AMD card won't work in the docker; the drivers aren't present in the Unraid OS.

The Nvidia card can work with the LSIO Nvidia build of Unraid (which adds the drivers and the Nvidia runtime). However, if you want both cards working, create a VM for them with a couple of cores and retain the docker for CPU-only folding.

Edited by tjb_altf4
Link to comment
On 3/29/2020 at 10:41 AM, tjb_altf4 said:

The AMD card won't work in the docker; the drivers aren't present in the Unraid OS.

The Nvidia card can work with the LSIO Nvidia build of Unraid (which adds the drivers and the Nvidia runtime). However, if you want both cards working, create a VM for them with a couple of cores and retain the docker for CPU-only folding.

I'm aware of the AMD card not working. But the Folding@Home docker is picking up this GPU for unknown reasons. If I set up according to SIO's video with the correct nvidia plugin setup, the Folding@Home docker picks up only the AMD GPU. Only when I added another GPU slot did the Nvidia GPU show in the Control Panel.

 

Now I want to get rid of the AMD card in the Control Panel, and to get the docker working.

Link to comment
  • 2 weeks later...

Hi!

 

I've been running the F@H docker for a few weeks now and have hit some issues... the first is probably related to the docker, the second probably not. Any help on the matter is appreciated! :)


1. Impossible to cleanup work folders

From time to time F@H has trouble cleaning up the work folder since a "fuse" file there and it's not removable by the application...

 

Here the error:

15:52:51:WU00:FS00:Cleaning up
15:52:51:ERROR:WU00:FS00:Exception: Failed to remove directory './work/00': boost::filesystem::remove: Directory not empty: "./work/00"

Here the content:

# v work/00/
total 6868
-rw-r--r-- 0 nobody users 7029760 Apr  9 17:36 .fuse_hidden0000a8d90000004b

Here the "lsof":

# lsof work/00/.fuse_hidden0000a8d90000004b 
COMMAND     PID   USER   FD   TYPE DEVICE SIZE/OFF              NODE NAME
FAHCoreWr 11585 nobody    8r   REG   0,41  7029760 10977524093294902 work/00/.fuse_hidden0000a8d90000004b
FahCore_a 11589 nobody    8r   REG   0,41  7029760 10977524093294902 work/00/.fuse_hidden0000a8d90000004b

I can manually force the deletion, but it would be preferable for the system to handle it on its own... :)

Any idea?

 

2. WU not compatible with CPU?

At the beginning I used to get a lot of "No WUs available for this configuration" and some "gromacs" errors...

 Here an example of the "gromacs" error:

05:39:44:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
05:39:44:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
05:39:44:WU01:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
05:39:44:WU01:FS00:0xa7:ERROR:
05:39:44:WU01:FS00:0xa7:ERROR:Fatal error:
05:39:44:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 20 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
05:39:44:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
05:39:44:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
05:39:44:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
05:39:44:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
05:39:44:WU01:FS00:0xa7:ERROR:-------------------------------------------------------

After some not-so-deep research on various forums, I gathered that the number of available "cpus" can determine whether the application is able to organize the job... there were also comments about using only multiples of 6 for the "cpus" parameter... maybe I misunderstood something, but I applied that workaround and limited the number of "cpus" F@H can use directly from within config.xml:

<config>
  <!-- Folding Slot Configuration -->
  <cpus v='18'/>

  <!-- Slot Control -->
  <power v='FULL'/>

  <!-- User Information -->
  <passkey v='***********'/>
  <team v='***********'/>
  <user v='***********'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
  <!-- slot id='1' type='GPU'/ -->
</config>

This enabled me to start folding.

 

A note: the UnRaid machine is an AMD 3950X (16C/32T).

Another note: I also have a "service" GPU (GeForce GT 730) that never received a job, so I disabled it to remove it from the UI.


Can anyone confirm this behavior?

Edited by sirfaber
Link to comment
On 4/9/2020 at 12:29 PM, sirfaber said:

2. WU not compatible with CPU?

At the beginning I used to get a lot of "No WUs available for this configuration" and some "gromacs" errors...

 Here an example of the "gromacs" error:


05:39:44:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
05:39:44:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
05:39:44:WU01:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-avx-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
05:39:44:WU01:FS00:0xa7:ERROR:
05:39:44:WU01:FS00:0xa7:ERROR:Fatal error:
05:39:44:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 20 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
05:39:44:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
05:39:44:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
05:39:44:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
05:39:44:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
05:39:44:WU01:FS00:0xa7:ERROR:-------------------------------------------------------

After some not-so-deep research on various forums, I gathered that the number of available "cpus" can determine whether the application is able to organize the job... there were also comments about using only multiples of 6 for the "cpus" parameter... maybe I misunderstood something, but I applied that workaround and limited the number of "cpus" F@H can use directly from within config.xml:


<config>
  <!-- Folding Slot Configuration -->
  <cpus v='18'/>

  <!-- Slot Control -->
  <power v='FULL'/>

  <!-- User Information -->
  <passkey v='***********'/>
  <team v='***********'/>
  <user v='***********'/>

  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>
  <!-- slot id='1' type='GPU'/ -->
</config>

This enabled me to start folding.

 

A note: the UnRaid machine is an AMD 3950X (16C/32T).

Another note: I also have a "service" GPU (GeForce GT 730) that never received a job, so I disabled it to remove it from the UI.


Can anyone confirm this behavior?

 

 

I tried this and it did not work for me. I'm still getting the following in my logs for this particular work unit; all others worked fine:


ERROR:There is no domain decomposition for 10 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm

 

I'm on an AMD Ryzen 9 3900X (12C/24T). I edited the /config/config.xml file in the docker container with vim while it was running. I tried restarting after editing the file; still the same error.

Link to comment
On 4/9/2020 at 12:29 PM, sirfaber said:

1. Impossible to cleanup work folders

From time to time F@H has trouble cleaning up the work folder because a "fuse" file is left there and it's not removable by the application...

 

Here the error:


15:52:51:WU00:FS00:Cleaning up
15:52:51:ERROR:WU00:FS00:Exception: Failed to remove directory './work/00': boost::filesystem::remove: Directory not empty: "./work/00"

Here the content:


# v work/00/
total 6868
-rw-r--r-- 0 nobody users 7029760 Apr  9 17:36 .fuse_hidden0000a8d90000004b

Here the "lsof":


# lsof work/00/.fuse_hidden0000a8d90000004b 
COMMAND     PID   USER   FD   TYPE DEVICE SIZE/OFF              NODE NAME
FAHCoreWr 11585 nobody    8r   REG   0,41  7029760 10977524093294902 work/00/.fuse_hidden0000a8d90000004b
FahCore_a 11589 nobody    8r   REG   0,41  7029760 10977524093294902 work/00/.fuse_hidden0000a8d90000004b

I can manually force the deletion, but it would be preferable for the system to handle it on its own... :)

Any idea?

 

Did you ever discover how to correct this?

 

I installed the docker container today and this is happening on every WU.

 

EDIT: They eventually clean themselves up. I did nothing, and after a few more WUs it was able to clean up after itself.

 

Edited by draeh
Link to comment
  • 2 months later...
On 4/22/2020 at 9:48 PM, draeh said:

 

Did you ever discover how to correct this?

 

I installed the docker container today and this is happening on every WU.

 

EDIT: They eventually clean themselves up. I did nothing, and after a few more WUs it was able to clean up after itself.

 

Nope.

Actually I stopped looking at F@H logs altogether :P

It's working and crunching data and that's what matters.

Link to comment

Now that there is an option to fold specifically for COVID-19 in the menus, is there an update that can push this functionality out to docker instances? It is available in the client on Windows, but when I try to manage the docker instance and change it to COVID-19 through the dropdown menu in the basic view, it doesn't show as an option. I also get an error when attempting the same change via the "advanced view" (running on my main Windows box), for which I enabled remote access so I could change my docker settings on my Unraid machine.

Link to comment

Does anybody have an idea why my F@H will not use more than 2 CPU cores when I have 24 available? I can't figure out why it is not using the rest of them. I am using the latest LSIO docker.

 

I have tried all of the following below with no success:

  • New installation with default configuration, no CPU pinning
  • Changing power from MEDIUM to FULL
  • CPU pinning to specific cores (CPU 2 - 10 HT; 18 total)
  • Specifying the number of cores (18) in config.xml

When I look in the F@H container log I can see that it recognizes my CPU correctly as it shows the following:

Quote

01:30:28:******************************* System ********************************
01:30:28: CPU: Intel(R) Xeon(R) CPU X5680 @ 3.33GHz
01:30:28: CPU ID: GenuineIntel Family 6 Model 44 Stepping 2
01:30:28: CPUs: 24
01:30:28: Memory: 47.23GiB
01:30:28:Free Memory: 275.46MiB
01:30:28: Threads: POSIX_THREADS

 

Here is what my config.xml file looks like. Any advice or ideas would be very much appreciated; I am pretty stumped.

Quote

01:30:28:***********************************************************************
01:30:28:<config>
01:30:28: <!-- Folding Slot Configuration -->
01:30:28: <cause v='COVID_19'/>
01:30:28: <cpus v='18'/>

:
01:30:28: <!-- HTTP Server -->
01:30:28: <allow v='172.16.1.0/24'/>

:
01:30:28: <!-- Slot Control -->
01:30:28: <power v='FULL'/>

:
01:30:28: <!-- User Information -->
01:30:28: <passkey v='*****'/>
01:30:28: <team v='*****'/>
01:30:28: <user v='*****'/>

:
01:30:28: <!-- Web Server -->
01:30:28: <web-allow v='172.16.1.0/24'/>

:
01:30:28: <!-- Folding Slots -->
01:30:28: <slot id='0' type='CPU'>
01:30:28: <paused v='true'/>
01:30:28: </slot>
01:30:28:</config>

 

[screenshot attachment]

Link to comment
  • 5 weeks later...

I've been using the docker container to fold with 4 GPUs / no CPU. Everything seems to be working well, but I've noticed that the container uses a CPU core for each GPU slot, and each core it uses is pinned at 100% utilization. Is anyone else seeing similar behavior?

 

I realize the GPUs have to be fed data to fold, but it seems like that shouldn't take 100% of a core. The CPUs are Xeon X5690s, so not exactly new, but not slouches either. Can anyone offer any thoughts? Am I misunderstanding something about how all this works?

Link to comment
1 hour ago, Execut1ve said:

I've been using the docker container to fold with 4 GPUs / no CPU. Everything seems to be working well, but I've noticed that the container uses a CPU core for each GPU slot, and each core it uses is pinned at 100% utilization. Is anyone else seeing similar behavior?

 

I realize the GPUs have to be fed data to fold, but it seems like that shouldn't take 100% of a core. The CPUs are Xeon X5690s, so not exactly new, but not slouches either. Can anyone offer any thoughts? Am I misunderstanding something about how all this works?

That's normal. The CPU thread is used to move data to and from the GPU, and that's a substantial amount of data.

That's why it's important to pin the right cores for the F@H docker, to prevent lag in the important stuff.

Link to comment
12 hours ago, testdasi said:

That's normal. The CPU thread is used to load data to and from the GPU and it's a substantial amount of data to load.

That's why it's important to ensure you pin the right cores for the F@H docker to prevent lag to the important stuff.

Hm, I wonder if I'd notice a hit to folding performance if I assigned the container 2 cores and 2 hyperthreads instead of 4 cores? Time for some experimentation!

Link to comment

After some informal experimentation, I'm not seeing much difference (if any) in total PPD between allocating the container 4 cores (with nothing on their HTs) vs. allocating 2 cores with their 2 HTs.

 

For reference, I'm folding on 4 GPUs: 3 Zotac 1060 mining variants and 1 GTX 960. They are connected to the mainboard via powered PCIe riser cables; two are in x8 slots and two are in x4 slots, all PCIe Gen 2. The computer is a PowerEdge R710 server with dual Xeon X5690 processors. I'm averaging 800k-1M total PPD, with each card sitting in the 200k-250k range. I don't notice any substantial difference between the cards in the x4 slots vs. the x8 slots.

 

Can anyone else with a hyperthreaded CPU offer any observations?

Link to comment
