MobiusNine

Members
  • Posts

    13

Everything posted by MobiusNine

  1. I'm not experiencing the issue anymore, but I'm not quite sure what caused it to stop. It could be related to the fact that I removed a second, unnecessary link from my HBA to my NetApp DS4246 that might have been causing weird issues. It could also have something to do with upgrading to 6.9.0 stable. I had googled around initially and saw a few posts like the one you linked, and sort of guessed that it was something funky going on with drivers in the newer kernel, but could never confirm the actual cause.
  2. Changed Status to Closed Changed Priority to Minor
  3. The network connection for my server running 6.9.0 RC2 randomly drops and the only way to reestablish it is by rebooting the server. There are instances where it seems the server is running fine for days/weeks, but then I have to do something that requires a reboot and I have to reboot several times hoping the network connection won't drop out. I think it has something to do with this warning regarding a firmware error in the syslog: 'Feb 15 21:29:59 NAS kernel: ixgbe 0000:07:00.0: Warning firmware error detected FWSM: 0xFFFFFFFF' syslog.log
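To see whether the firmware warning lines up with each network drop, the syslog can be scanned for it. The following is a minimal Python sketch, assuming the classic syslog line format quoted in the post; `find_fwsm_warnings` is a hypothetical helper name, and the sample line is the one from the post.

```python
import re

# Pattern for the ixgbe warning quoted in the post. The PCI address and the
# FWSM register value are captured so repeated events can be compared.
FWSM_RE = re.compile(
    r"kernel: ixgbe (?P<pci>[0-9a-f:.]+): Warning firmware error detected "
    r"FWSM: (?P<fwsm>0x[0-9A-Fa-f]+)"
)


def find_fwsm_warnings(lines):
    """Return (timestamp, pci_address, fwsm_value) for each matching line."""
    hits = []
    for line in lines:
        match = FWSM_RE.search(line)
        if match:
            # Classic syslog timestamps ("Feb 15 21:29:59") are 15 characters.
            hits.append((line[:15], match.group("pci"), match.group("fwsm")))
    return hits


sample = [
    "Feb 15 21:29:59 NAS kernel: ixgbe 0000:07:00.0: "
    "Warning firmware error detected FWSM: 0xFFFFFFFF",
]
print(find_fwsm_warnings(sample))
```

To run it against a real log, pass an open file object instead of `sample`; the path to the log depends on the system and is not given in the post.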
  4. Hey Squid, I was wondering if you could help me figure out why portions of my Docker template are not being pulled through from GitHub to CA. Here is what the template looks like when you pull it from CA. Here is the template on GitHub. https://github.com/MobiusNine/docker-templates/blob/master/MobiusNine/FoldingAtHome.xml A large portion of the overview, including the information on how to use the docker, is missing, which can be problematic.
  5. I'm glad to hear it! Also, after looking into it, I don't think the version of CUDA that Folding at Home reports matters, as the version of CUDA installed with the drivers in Unraid seems to be the latest and the work is passed through to it. I could be wrong, but it works and I can't find anything to the contrary.
  6. The odd thing is that FaH reports CUDA 6.1 for me, but when I run nvidia-smi in the console for the docker it reports 410.78 for the driver and 10.0 for CUDA.
  7. I am passing all as shown in the attached picture, and yes, the latest image is accessible through Community Apps. I wonder if using a Quadro is causing some other sort of issue.
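One possible reading of the mismatch, offered as an assumption rather than something confirmed in this thread: the `CUDA: 6.1` line in the FaH log may be the GPU's compute capability (a GTX 1060 is a compute capability 6.1 part) rather than a toolkit version. The `CUDA Driver: 10000` line that appears in the FaH logs on this page is consistent with the 10.0 shown by nvidia-smi, because CUDA encodes version numbers as 1000 × major + 10 × minor. A minimal Python sketch of that decoding, where `decode_cuda_version` is a hypothetical helper name:

```python
def decode_cuda_version(value):
    """Split a CUDA integer version (1000*major + 10*minor) into (major, minor)."""
    major, rest = divmod(value, 1000)
    return major, rest // 10


# Value taken from the "CUDA Driver: 10000" line in the FaH log.
print(decode_cuda_version(10000))  # (10, 0)
```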
  8. If there's interest I could look into integrating some sort of fan control script since there's no real way to control temps/fan speed atm.
  9. I was able to get gpu folding working. The image needed to have opencl installed, which has been taken care of. Here are some logs showing it working.
      21:26:25:******************************* System ********************************
      21:26:25: CPU: Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz
      21:26:25: CPU ID: GenuineIntel Family 6 Model 62 Stepping 4
      21:26:25: CPUs: 32
      21:26:25: Memory: 62.94GiB
      21:26:25:Free Memory: 2.23GiB
      21:26:25: Threads: POSIX_THREADS
      21:26:25: OS Version: 4.19
      21:26:25:Has Battery: false
      21:26:25: On Battery: false
      21:26:25: UTC Offset: 0
      21:26:25: PID: 31
      21:26:25: CWD: /config
      21:26:25: OS: Linux 4.19.23-Unraid x86_64
      21:26:25: OS Arch: AMD64
      21:26:25: GPUs: 1
      21:26:25: GPU 0: NVIDIA:7 GP106 [GeForce GTX 1060 6GB] 4372
      21:26:25: CUDA: 6.1
      21:26:25:CUDA Driver: 10000
      21:26:25: <!-- Folding Slots -->
      21:26:25: <slot id='0' type='GPU'/>
      21:26:25:</config>
      21:26:25:Trying to access database...
      21:26:25:Successfully acquired database lock
      21:26:25:Enabled folding slot 00: READY gpu:0:GP106 [GeForce GTX 1060 6GB] 4372
      21:26:25:WU00:FS00:Starting
      21:26:25:WU00:FS00:Running FahCore: /opt/fah/usr/bin/FAHCoreWrapper /config/cores/cores.foldingathome.org/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 31 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
      21:26:25:WU00:FS00:Started FahCore on PID 41
      21:26:25:WU00:FS00:Core PID:45
      21:26:25:WU00:FS00:FahCore 0x21 started
      21:26:26:WU00:FS00:0x21:*********************** Log Started 2019-03-15T21:26:25Z ***********************
      21:26:26:WU00:FS00:0x21:Project: 14163 (Run 43, Clone 1, Gen 85)
      21:26:26:WU00:FS00:0x21:Unit: 0x000000730002894c5c38bfb015a28477
      21:26:26:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
      21:26:26:WU00:FS00:0x21:Machine: 0
      21:26:26:WU00:FS00:0x21:Reading tar file core.xml
      21:26:26:WU00:FS00:0x21:Reading tar file integrator.xml
      21:26:26:WU00:FS00:0x21:Reading tar file state.xml
      21:26:26:WU00:FS00:0x21:Reading tar file system.xml
      21:26:26:WU00:FS00:0x21:Digital signatures verified
      21:26:26:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
      21:26:26:WU00:FS00:0x21:Version 0.0.18
      21:26:29:WU00:FS00:0x21:Completed 0 out of 12500000 steps (0%)
      21:26:29:WU00:FS00:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
      21:27:41:WU00:FS00:0x21:Completed 125000 out of 12500000 steps (1%)
      21:27:41:WU00:FS00:0x21:Completed 125000 out of 12500000 steps (1%)
      21:28:53:WU00:FS00:0x21:Completed 250000 out of 12500000 steps (2%)
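The "Completed N out of M steps" entries in the log above are enough to estimate how long the work unit will take. The following is a rough Python sketch, assuming a constant step rate and a log that does not cross midnight; `estimate_seconds_remaining` is a hypothetical helper name, and the sample text is copied from the log in the post.

```python
import re
from datetime import datetime

# Matches FahCore progress entries such as
# "21:27:41:WU00:FS00:0x21:Completed 125000 out of 12500000 steps (1%)".
PROGRESS_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x[0-9a-f]+:"
    r"Completed (\d+) out of (\d+) steps"
)


def estimate_seconds_remaining(log_text):
    """Extrapolate remaining time from the first and last progress entries."""
    points = [
        (datetime.strptime(ts, "%H:%M:%S"), int(done), int(total))
        for ts, done, total in PROGRESS_RE.findall(log_text)
    ]
    if len(points) < 2:
        return None
    (t0, done0, _), (t1, done1, total) = points[0], points[-1]
    if done1 == done0:
        return None  # no progress between the two samples
    elapsed = (t1 - t0).total_seconds()
    # Linear extrapolation: elapsed time scaled by remaining / completed steps.
    return elapsed * (total - done1) / (done1 - done0)


log = (
    "21:26:29:WU00:FS00:0x21:Completed 0 out of 12500000 steps (0%) "
    "21:27:41:WU00:FS00:0x21:Completed 125000 out of 12500000 steps (1%) "
    "21:27:41:WU00:FS00:0x21:Completed 125000 out of 12500000 steps (1%) "
    "21:28:53:WU00:FS00:0x21:Completed 250000 out of 12500000 steps (2%)"
)
print(estimate_seconds_remaining(log))  # 7056.0
```

For this log the estimate is 7056 seconds, a little under two hours: 2% took 144 seconds, which works out to 72 seconds per percent.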
  10. I can't seem to get the GPU detected when passing it to the container. Anyone have any ideas? I tried installing the NVIDIA driver 410 on the image, but all that did was increase the build time for the image and cause Folding at Home to not detect CUDA, which it did beforehand at version 6.1.
      09:58:37:************************* Folding@home Client *************************
      09:58:37: Website: http://folding.stanford.edu/
      09:58:37: Copyright: (c) 2009-2014 Stanford University
      09:58:37: Author: Joseph Coffland <[email protected]>
      09:58:37: Args: --config /config/config.xml
      09:58:37: Config: /config/config.xml
      09:58:37:******************************** Build ********************************
      09:58:37: Version: 7.4.4
      09:58:37: Date: Mar 4 2014
      09:58:37: Time: 12:02:38
      09:58:37: SVN Rev: 4130
      09:58:37: Branch: fah/trunk/client
      09:58:37: Compiler: GNU 4.4.7
      09:58:37: Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
      09:58:37: -fno-unsafe-math-optimizations -msse2
      09:58:37: Platform: linux2 3.2.0-1-amd64
      09:58:37: Bits: 64
      09:58:37: Mode: Release
      09:58:37:******************************* System ********************************
      09:58:37: CPU: Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz
      09:58:37: CPU ID: GenuineIntel Family 6 Model 62 Stepping 4
      09:58:37: CPUs: 32
      09:58:37: Memory: 62.94GiB
      09:58:37:Free Memory: 1.49GiB
      09:58:37: Threads: POSIX_THREADS
      09:58:37: OS Version: 4.19
      09:58:37:Has Battery: false
      09:58:37: On Battery: false
      09:58:37: UTC Offset: 0
      09:58:37: PID: 28
      09:58:37: CWD: /config
      09:58:37: OS: Linux 4.19.23-Unraid x86_64
      09:58:37: OS Arch: AMD64
      09:58:37: GPUs: 0
      09:58:37: CUDA: 6.1
      09:58:37:CUDA Driver: 10000
      09:58:37:***********************************************************************
  11. Here's my repository. The updated FoldingAtHome is the only template in it. https://github.com/MobiusNine/docker-templates It starts an a7 unit without issue.
      07:15:49:WU00:FS00:Starting
      07:15:49:WU00:FS00:Running FahCore: /opt/fah/usr/bin/FAHCoreWrapper /config/cores/cores.foldingathome.org/Linux/AMD64/AVX/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 704 -lifeline 28 -checkpoint 15 -np 16
      07:15:49:WU00:FS00:Started FahCore on PID 37
      07:15:49:WU00:FS00:Core PID:41
      07:15:49:WU00:FS00:FahCore 0xa7 started
      07:15:49:FS00:Unpaused
      07:15:49:WU00:FS00:Starting
      07:15:49:WU00:FS00:Running FahCore: /opt/fah/usr/bin/FAHCoreWrapper /config/cores/cores.foldingathome.org/Linux/AMD64/AVX/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 704 -lifeline 28 -checkpoint 15 -np 16
      07:15:49:WU00:FS00:Started FahCore on PID 37
      07:15:49:WU00:FS00:Core PID:41
      07:15:49:WU00:FS00:FahCore 0xa7 started
      07:15:49:WU00:FS00:0xa7:*********************** Log Started 2019-03-15T07:15:49Z ***********************
      07:15:49:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
      07:15:49:WU00:FS00:0xa7: Type: 0xa7
      07:15:49:WU00:FS00:0xa7: Core: Gromacs
      07:15:49:WU00:FS00:0xa7: Website: https://foldingathome.org/
      07:15:49:WU00:FS00:0xa7: Copyright: (c) 2009-2018 foldingathome.org
      07:15:49:WU00:FS00:0xa7: Author: Joseph Coffland <[email protected]>
      07:15:49:WU00:FS00:0xa7: Args: -dir 00 -suffix 01 -version 704 -lifeline 37 -checkpoint 15 -np 16
      07:15:49:WU00:FS00:0xa7: Config: <none>
      07:15:49:WU00:FS00:0xa7:************************************ Build *************************************
      07:15:49:WU00:FS00:0xa7: Version: 0.0.17
      07:15:49:WU00:FS00:0xa7: Date: Apr 27 2018
      07:15:49:WU00:FS00:0xa7: Time: 19:09:21
      07:15:49:WU00:FS00:0xa7: Repository: Git
      07:15:49:WU00:FS00:0xa7: Revision: 21359963583d09ec2063ef946399441c4df4ccd7
      07:15:49:WU00:FS00:0xa7: Branch: master
      07:15:49:WU00:FS00:0xa7: Compiler: GNU 6.3.0 20170516
      07:15:49:WU00:FS00:0xa7: Options: -std=gnu++98 -O3 -funroll-loops
      07:15:49:WU00:FS00:0xa7: Platform: linux2 4.14.0-3-amd64
      07:15:49:WU00:FS00:0xa7: Bits: 64
      07:15:49:WU00:FS00:0xa7: Mode: Release
      07:15:49:WU00:FS00:0xa7: SIMD: avx_256
      07:15:49:WU00:FS00:0xa7:************************************ System ************************************
      07:15:49:WU00:FS00:0xa7: CPU: Intel(R) Xeon(R) CPU E5-2667 v2 @ 3.30GHz
      07:15:49:WU00:FS00:0xa7: CPU ID: GenuineIntel Family 6 Model 62 Stepping 4
      07:15:49:WU00:FS00:0xa7: CPUs: 32
      07:15:49:WU00:FS00:0xa7: Memory: 62.94GiB
      07:15:49:WU00:FS00:0xa7:Free Memory: 981.30MiB
      07:15:49:WU00:FS00:0xa7: Threads: POSIX_THREADS
      07:15:49:WU00:FS00:0xa7: OS Version: 4.19
      07:15:49:WU00:FS00:0xa7:Has Battery: false
      07:15:49:WU00:FS00:0xa7: On Battery: false
      07:15:49:WU00:FS00:0xa7: UTC Offset: 0
      07:15:49:WU00:FS00:0xa7: PID: 41
      07:15:49:WU00:FS00:0xa7: CWD: /config/work
      07:15:49:WU00:FS00:0xa7: OS: Linux 4.19.23-Unraid x86_64
      07:15:49:WU00:FS00:0xa7: OS Arch: AMD64
      07:15:49:WU00:FS00:0xa7:********************************************************************************
      07:15:49:WU00:FS00:0xa7:Project: 13818 (Run 65, Clone 1, Gen 0)
      07:15:49:WU00:FS00:0xa7:Unit: 0x0000000180fccb095c70623aa7181a41
      07:15:49:WU00:FS00:0xa7:Digital signatures verified
      07:15:49:WU00:FS00:0xa7:Calling: mdrun -s frame0.tpr -o frame0.trr -x frame0.xtc -cpi state.cpt -cpt 15 -nt 16
      07:15:49:WU00:FS00:0xa7:Steps: first=0 total=250000
      07:15:51:WU00:FS00:0xa7:Completed 2170 out of 250000 steps (0%)
      07:15:51:WU00:FS00:0xa7:Completed 2170 out of 250000 steps (0%)
      07:16:00:WU00:FS00:0xa7:Completed 2500 out of 250000 steps (1%)
      07:16:00:WU00:FS00:0xa7:Completed 2500 out of 250000 steps (1%)
  12. How would I go about doing that? I've tried to figure out how to publish a docker without success so far. Would you be able to point me in the right direction?
  13. I'm fairly certain the work unit issues with captinsano's FAH docker container are the result of an old phusion base image. I forked the project on GitHub, updated the Dockerfile to use a more recent phusion base, built it on Docker Hub, and the new image was able to handle a4 and a7 work units.