Rhomax

  1. I've had a 3 TB drive fail, which I have replaced, and I've started the data rebuild. After about 24 hours (at approximately 45% complete) the rebuild rate dropped to about 500 KB/s (from an initial ~10 MB/s), and I noticed that another drive was now showing ~2,000 errors. I restarted, and now have the same problem even earlier: only 14% rebuilt after 13 hours, at a rate of ~700 KB/s. It seems likely to me that another drive is failing (now showing 31 errors) and is slowing everything down because its read speed is limited. At this rate the rebuild seems unlikely ever to complete, as the estimated finish varies between 30 and 60 days. I only have the one parity drive (as most of this data is not that crucial) and I'm OK with losing some data, but I would like to limit the damage and know which files were affected. Does anyone have any suggestions as to what I should do here?
     - Should I copy off as much data as possible from the currently emulated drive to another location and then start with that drive from scratch?
     - Should I copy off as much data as possible from the currently failing disk and then replace that one?
     - Any other ideas?
  2. Actually, it looks like this latest image worked once I removed the 'disable NVML' config option. I'd set it to true whilst trying to work out what was going on. Thanks for your help @PTRFRLL! Really appreciate it.
  3. Thanks for your help! Same NVML issue unfortunately... this is what the logs look like. Much the same as before:
      20211201 03:15:53 T-Rex NVIDIA GPU miner v0.24.7 - [Linux]
      20211201 03:15:53 r.3aed4eddcdb3
      20211201 03:15:53
      20211201 03:15:53
      20211201 03:15:53 NVIDIA Driver version N/A
      20211201 03:15:53
      20211201 03:15:53 + GPU #0: [00:01.0|2489] GeForce RTX 3060 Ti, 7982 MB
      20211201 03:15:53
      20211201 03:15:53 WARN: DevFee 1% (ethash)
      20211201 03:15:53
      20211201 03:15:53 URL : stratum+tcp://daggerhashimoto.usa.nicehash.com:3353
      20211201 03:15:53 PASS: x
      20211201 03:15:53 WRK : Unraid-TRex
      20211201 03:15:53
      20211201 03:15:53 WARN: NVML is disabled. You won't see GPUs stats.
      20211201 03:15:53 Starting on: daggerhashimoto.usa.nicehash.com:3353
      20211201 03:15:53 ApiServer: HTTP server started on 0.0.0.0:4067
      20211201 03:15:53 ----------------------------------------------------
      20211201 03:15:53 For control navigate to: http://172.17.0.7:4067/trex
      20211201 03:15:53 ----------------------------------------------------
      20211201 03:15:53 Using protocol: stratum2.
      20211201 03:15:53 TREX: Can't initialize device [ID=0, GPU #0], NVML wasn't initialized
      20211201 03:15:53 WARN: Miner is going to shutdown...
      20211201 03:15:53 Extranonce is set to: 4b5c0e
      20211201 03:15:53 Authorizing...
      20211201 03:15:53 Main loop finished. Cleaning up resources...
      20211201 03:15:53 ApiServer: stopped listening on 0.0.0.0:4067
      20211201 03:15:53 Authorized successfully.
      20211201 03:15:53 ethash epoch: 457, diff: 1.04 G
      20211201 03:15:56 T-Rex finished.
  4. I get an unauthorised message:
      Error response from daemon: Head https://ghcr.io/v2/ptrfrll/nv-docker-trex/manifests/test: unauthorized
  5. I tried updating the Nvidia drivers to 495.44 and recreated the Docker container several times, and I still get the same result. In summary: using no tag, or the tags CUDA11, latest or 3.5.2, I get the following error:
      TREX: Can't initialize device [ID=0, GPU #0], NVML wasn't initialized
     I'm wondering whether the NVML error message is not the actual root cause and the problem is something like device initialisation more generally, but this is what I see from nvidia-smi, and nothing obvious stands out to me:
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 495.44       Driver Version: 495.44       CUDA Version: 11.5     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
      | 53%   54C    P2    45W / 200W |    131MiB /  7982MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+

      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |    0   N/A  N/A      1646      C   /trex/t-rex                       129MiB |
      +-----------------------------------------------------------------------------+
  6. When I use the latest tag I get the error below. It's interesting because previously I didn't have any tag at all and it was working fine. I assumed that this meant it was defaulting to CUDA10, which is why I added the CUDA11 tag in the first place.
      Can't start T-Rex, can't initialize CUDA engine, cuda exception: CUDA_ERROR_COMPAT_NOT_SUPPORTED_ON_DEVICE. Is NVIDIA driver installed?
  7. I'm using CUDA11 with ptrfrll/nv-docker-trex:cuda11
  8. Great work on this; I have been using it for months with no major issues. I did an update and now I consistently get the following error, followed by miner shutdown:
      20211129 20:59:17 TREX: Can't initialize device [ID=0, GPU #0], NVML wasn't initialized
     I tried disabling NVML in the config file and that hasn't helped. Any ideas?
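
As a rough sanity check on the 30 to 60 day estimate in the rebuild post (item 1), the remaining time can be approximated from the figures quoted there. This is only a sketch under stated assumptions: the drive is taken as exactly 3 × 10^12 bytes, KB and MB are taken as decimal units, and the rate is assumed to stay constant, which a failing disk will not guarantee. The function name `rebuild_eta_days` is hypothetical and is not part of any Unraid tool.

```python
SECONDS_PER_DAY = 86_400
KB = 10**3   # assumption: decimal kilobytes
MB = 10**6   # assumption: decimal megabytes
TB = 10**12  # assumption: decimal terabytes


def rebuild_eta_days(drive_bytes, fraction_done, rate_bytes_per_s):
    """Days left to rebuild the rest of the drive at a constant rate."""
    remaining_bytes = drive_bytes * (1.0 - fraction_done)
    return remaining_bytes / rate_bytes_per_s / SECONDS_PER_DAY


# Figures quoted in the post: 3 TB drive, 14% done at ~700 KB/s after the restart,
# 45% done at ~500 KB/s on the first attempt, ~10 MB/s when healthy.
eta_after_restart = rebuild_eta_days(3 * TB, 0.14, 700 * KB)
eta_first_attempt = rebuild_eta_days(3 * TB, 0.45, 500 * KB)
eta_full_speed = rebuild_eta_days(3 * TB, 0.0, 10 * MB)

print(f"after restart: {eta_after_restart:.1f} days")
print(f"first attempt: {eta_first_attempt:.1f} days")
print(f"full speed:    {eta_full_speed:.1f} days")
```

Under these assumptions both degraded cases come out at roughly 40 days, inside the 30 to 60 day range reported, while a full rebuild at the initial rate would take about 3.5 days.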