Madfox

Members • Posts: 7

Everything posted by Madfox

  1. Thank you @JorgeB. This indeed worked; the disk no longer reports any errors. I'm glad I had the script, otherwise I would not even have been notified of the corruption in the first place. So if I do not encounter any more errors on this disk, I guess there is no hardware to replace?
  2. Thank you, I ran:

        btrfs dev stats -z /mnt/disk1

     Output was:

        [/dev/md1].write_io_errs 0
        [/dev/md1].read_io_errs 0
        [/dev/md1].flush_io_errs 0
        [/dev/md1].corruption_errs 18
        [/dev/md1].generation_errs 0

     Now scrubbing again with "repair errors" enabled; I will report back once it is done.
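A minimal sketch of how output like the above could be checked from a script, assuming the `[device].counter value` layout shown in the post. The function names are made up for this illustration, and the sample text is copied from the post rather than read from a live system. Note that `-z` prints the counters and then resets them to zero, so a monitoring script would normally call `btrfs dev stats` without it.

```python
import re

# Sample text copied from the post above. A real script might obtain it by
# running `btrfs dev stats /mnt/disk1` (without -z, which resets the counters).
SAMPLE = """\
[/dev/md1].write_io_errs 0
[/dev/md1].read_io_errs 0
[/dev/md1].flush_io_errs 0
[/dev/md1].corruption_errs 18
[/dev/md1].generation_errs 0
"""

# One stats line looks like: [<device>].<counter_name> <value>
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")


def parse_dev_stats(text):
    """Return {(device, counter_name): value} for every stats line found."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[(match["dev"], match["name"])] = int(match["value"])
    return stats


def nonzero_counters(stats):
    """Keep only the counters that recorded at least one error."""
    return {key: value for key, value in stats.items() if value}


if __name__ == "__main__":
    for (dev, name), value in nonzero_counters(parse_dev_stats(SAMPLE)).items():
        print(f"{dev}: {name} = {value}")
```

With the sample above, only `corruption_errs` is reported, which is the counter a notification script would care about here.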
  3. I see. When I tried it via the restore option in Duplicati, it could not edit the file. Removing it by hand worked! However, errors are still being added to the log. Even this morning I see:

        Dec 27 09:25:49 Tower kernel: btrfs_print_data_csum_error: 3 callbacks suppressed
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2269184 csum 0x8941f998 expected csum 0x56f66b68 mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2273280 csum 0x8941f998 expected csum 0x6f4cb633 mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2277376 csum 0x8941f998 expected csum 0x611605c0 mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2281472 csum 0x6b9109a4 expected csum 0xefb5e681 mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 11, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 12, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 13, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 14, gen 0
        Dec 27 09:25:49 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 27 09:25:49 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 15, gen 0

     Those errors should be in the logs attached to the previous post. What should I do about this?
  4. Hello, I've been looking around for a solution, but I'm still unsure how to proceed. I got an error from Unraid on my disk 1: "ERRORS on disk 1 No description". On investigation, the log gave 10 errors:

        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2269184 csum 0x8941f998 expected csum 0x56f66b68 mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2273280 csum 0x8941f998 expected csum 0x6f4cb633 mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2277376 csum 0x8941f998 expected csum 0x611605c0 mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2281472 csum 0x6b9109a4 expected csum 0xefb5e681 mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
        Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
        Dec 24 22:04:55 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0

     The disk and SMART were reported healthy. I then found that I could start a btrfs scrub with repair from the disk page. I did that, and it seems there are 5 errors left, all in the same file.

        UUID: 0fd6d4ef-feb6-4ae4-bfe1-4fc322633363
        Scrub started: Mon Dec 26 17:16:23 2022
        Status: finished
        Duration: 15:10:33
        Total to scrub: 4.87TiB
        Rate: 93.50MiB/s
        Error summary: csum=5
        Corrected: 0
        Uncorrectable: 5
        Unverified: 0

        Dec 26 21:19:01 Tower kernel: BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
        Dec 26 21:19:01 Tower kernel: BTRFS error (device md1): unable to fixup (regular) error at logical 3898780192768 on dev /dev/md1
        Dec 26 21:19:01 Tower kernel: BTRFS warning (device md1): checksum error at logical 3898780196864 on dev /dev/md1, physical 1603128565760, root 5, inode 13270418, offset 2269184, length 4096, links 1 (path: Photos/2017.05 - Taiwan/DSCN6177.JPG)

     I thought I could repair the corrupted file from a backup, but the restore fails because the file is not accessible (logical if it's not fixed, I guess). My questions are:

        1. How do I proceed from here? I see replies on the forum where people mention formatting the whole drive, but could I not repair just this file/sector on the disk? This is the first time I've encountered this; do I have to check the RAM for faults?
        2. Should I do more to prevent this corruption in the future? I only got the error because a backup I was running stopped when the file was not accessible.

     Attached diagnostics + syslog
     tower-diagnostics-20221227-0959.zip
     tower-syslog-20221227-0859.zip
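As an illustration of why the ten log entries and the five scrub errors describe the same damage, the warnings can be grouped by inode and file offset. This is only a sketch: the regular expression assumes the exact `csum failed` wording quoted in the post, the function names are invented here, and the sample holds one warning per distinct offset plus one repeat.

```python
import re

# Kernel warnings copied from the post: five distinct offsets plus one repeat.
SAMPLE_LOG = """\
Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2269184 csum 0x8941f998 expected csum 0x56f66b68 mirror 1
Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2273280 csum 0x8941f998 expected csum 0x6f4cb633 mirror 1
Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2277376 csum 0x8941f998 expected csum 0x611605c0 mirror 1
Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2281472 csum 0x6b9109a4 expected csum 0xefb5e681 mirror 1
Dec 24 22:04:55 Tower kernel: BTRFS warning (device md1): csum failed root 5 ino 13270418 off 2265088 csum 0x006f85b2 expected csum 0x0fba521a mirror 1
"""

CSUM_FAILED = re.compile(
    r"BTRFS warning \(device (?P<dev>\w+)\): csum failed "
    r"root (?P<root>\d+) ino (?P<ino>\d+) off (?P<off>\d+)"
)

# Taken from "length 4096" in the scrub message quoted in the post.
BLOCK_SIZE = 4096


def damaged_blocks(log_text):
    """Map (device, inode) to the sorted list of distinct failing offsets."""
    found = {}
    for match in CSUM_FAILED.finditer(log_text):
        key = (match["dev"], int(match["ino"]))
        found.setdefault(key, set()).add(int(match["off"]))
    return {key: sorted(offsets) for key, offsets in found.items()}


def is_contiguous(offsets, block_size=BLOCK_SIZE):
    """True if each offset follows the previous one by exactly one block."""
    return all(b - a == block_size for a, b in zip(offsets, offsets[1:]))


if __name__ == "__main__":
    for (dev, ino), offsets in damaged_blocks(SAMPLE_LOG).items():
        print(dev, ino, len(offsets), "blocks, contiguous:", is_contiguous(offsets))
```

For the log in this post, every warning refers to inode 13270418, and the distinct offsets form one contiguous run of five 4096-byte blocks, which is consistent with the scrub summary of 5 uncorrectable errors in a single file.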
  5. I found out Unraid is already set to boot in legacy mode. Would UEFI be something to try? What I find weird is that when I reboot the system normally, the GPU is recognized in "system devices", but when using the drivers it seems the only way to get the GPU recognized after bootup is to turn the system off and start it again (complete shutdown). It's almost as if the GPU gets into some state and needs to be reset, and the only way to do this is by removing the power. No, it's started up using a WOL request, but it is off most of the time. I guess the only thing left now is resetting the BIOS, so that will be my next move. *Update: resetting the BIOS settings did not change the situation either. I'm out of ideas! By the way, I have now had the situation where the driver was loaded and the GPU was recognized by Unraid, but the driver did not see the GPU. Only after starting and stopping the VM did the driver actually work. Unfortunately, that state did not survive a reboot. Any suggestions are very welcome!
  6. Thanks for the suggestions! Yes, it was active. SSH was disabled so I didn't check, but I did hook up an external monitor; it had booted and I could log on there. Yes, incl. vbios. It's not bound to VFIO at boot. Not sure how it's booting, I think it accepts both. I'll change it in the BIOS to legacy. If that doesn't work, there is a beta BIOS available for this board, and I'll update to that next. One thing I'll say is that it's weird that the problem of the GPU not being recognized after reboot only seems to happen with the drivers installed.
  7. Hi! I'm hoping you guys can help me out.

     Background: I'm trying to get the Nvidia drivers to work in order to bring down the server's idle power consumption. I only use the GPU for some casual gaming, and would like it to use less power when it is not in use. I got this working to a certain extent. Last time, however, when I rebooted it wouldn't load the web GUI, and I had to fall back to an older flash image.

     GPU: NVIDIA Corporation GM204 [GeForce GTX 970]
     CPU: i7-8700k
     Mobo: MSI B360M PRO-VDH

     Now I'm trying to figure out how I can get it working, but I still have trouble: after installing the Nvidia drivers (tried latest and production) and rebooting, the GPU is not recognized in "system devices". Syslog:

        May 4 21:18:14 Tower root: plugin: libvirtwol.plg installed
        May 4 21:18:14 Tower root: plugin: installing: /boot/config/plugins/nvidia-driver.plg
        May 4 21:18:14 Tower root: plugin: running: anonymous
        May 4 21:18:14 Tower root: plugin: skipping: /boot/config/plugins/nvidia-driver/nvidia-driver-2021.04.29.txz already exists
        May 4 21:18:14 Tower root: plugin: running: /boot/config/plugins/nvidia-driver/nvidia-driver-2021.04.29.txz
        May 4 21:18:14 Tower root:
        May 4 21:18:14 Tower root: +==============================================================================
        May 4 21:18:14 Tower root: | Installing new package /boot/config/plugins/nvidia-driver/nvidia-driver-2021.04.29.txz
        May 4 21:18:14 Tower root: +==============================================================================
        May 4 21:18:14 Tower root:
        May 4 21:18:14 Tower root: Verifying package nvidia-driver-2021.04.29.txz.
        May 4 21:18:14 Tower root: Installing package nvidia-driver-2021.04.29.txz:
        May 4 21:18:14 Tower root: PACKAGE DESCRIPTION:
        May 4 21:18:14 Tower root: Package nvidia-driver-2021.04.29.txz installed.
        May 4 21:18:14 Tower root: plugin: creating: /usr/local/emhttp/plugins/nvidia-driver/README.md - from INLINE content
        May 4 21:18:14 Tower root: plugin: running: anonymous
        May 4 21:18:15 Tower root:
        May 4 21:18:15 Tower root: --------------------Nvidia driver v460.73.01 found locally---------------------
        May 4 21:18:15 Tower root:
        May 4 21:18:15 Tower root: -----------------Installing Nvidia Driver Package v460.73.01-------------------
        May 4 21:18:15 Tower kernel: device br0 entered promiscuous mode
        May 4 21:18:34 Tower kernel: nvidia: loading out-of-tree module taints kernel.
        May 4 21:18:34 Tower kernel: nvidia: module license 'NVIDIA' taints kernel.
        May 4 21:18:34 Tower kernel: Disabling lock debugging due to kernel taint
        May 4 21:18:34 Tower kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 245
        May 4 21:18:34 Tower kernel: NVRM: No NVIDIA GPU found.
        May 4 21:18:34 Tower kernel: nvidia-nvlink: Unregistered the Nvlink Core, major device number 245
        May 4 21:18:34 Tower root:
        May 4 21:18:34 Tower root: --------------Installation of Nvidia driver v460.73.01 successful--------------
        May 4 21:18:34 Tower root: plugin: nvidia-driver.plg installed
        May 4 21:18:34 Tower root: plugin: installing: /boot/config/plugins/tips.and.tweaks.plg
        May 4 21:18:34 Tower root: plugin: running: anonymous
        May 4 21:18:34 Tower root: plugin: skipping: /boot/config/plugins/tips.and.tweaks/tips.and.tweaks-2021.03.09.tgz already exists
        May 4 21:18:34 Tower root: plugin: running: anonymous
        May 4 21:18:34 Tower root: plugin: skipping: /boot/config/plugins/tips.and.tweaks/tips.and.tweaks.cfg already exists
        May 4 21:18:34 Tower root: plugin: running: anonymous
        May 4 21:18:34 Tower root:

     Only after shutting down, removing the power, and starting again will the GPU show up again. However, when I look at the driver plugin, it says no devices are found. Can someone point me in the direction of a solution?

     *Update: after restarting a few times, it seems to have loaded the driver and it recognizes the GPU again.

        May 4 21:54:39 Tower root: plugin: libvirtwol.plg installed
        May 4 21:54:39 Tower root: plugin: installing: /boot/config/plugins/nvidia-driver.plg
        May 4 21:54:39 Tower root: plugin: running: anonymous
        May 4 21:54:39 Tower root: plugin: skipping: /boot/config/plugins/nvidia-driver/nvidia-driver-2021.04.29.txz already exists
        May 4 21:54:39 Tower root: plugin: running: /boot/config/plugins/nvidia-driver/nvidia-driver-2021.04.29.txz
        May 4 21:54:39 Tower root:
        May 4 21:54:39 Tower root: +==============================================================================
        May 4 21:54:39 Tower root: | Installing new package /boot/config/plugins/nvidia-driver/nvidia-driver-2021.04.29.txz
        May 4 21:54:39 Tower root: +==============================================================================
        May 4 21:54:39 Tower root:
        May 4 21:54:39 Tower root: Verifying package nvidia-driver-2021.04.29.txz.
        May 4 21:54:39 Tower root: Installing package nvidia-driver-2021.04.29.txz:
        May 4 21:54:39 Tower root: PACKAGE DESCRIPTION:
        May 4 21:54:39 Tower root: Package nvidia-driver-2021.04.29.txz installed.
        May 4 21:54:39 Tower root: plugin: creating: /usr/local/emhttp/plugins/nvidia-driver/README.md - from INLINE content
        May 4 21:54:39 Tower root: plugin: running: anonymous
        May 4 21:54:40 Tower kernel: device br0 entered promiscuous mode
        May 4 21:54:40 Tower root:
        May 4 21:54:40 Tower root: --------------------Nvidia driver v460.73.01 found locally---------------------
        May 4 21:54:40 Tower root:
        May 4 21:54:40 Tower root: -----------------Installing Nvidia Driver Package v460.73.01-------------------
        May 4 21:55:00 Tower kernel: nvidia: loading out-of-tree module taints kernel.
        May 4 21:55:00 Tower kernel: nvidia: module license 'NVIDIA' taints kernel.
        May 4 21:55:00 Tower kernel: Disabling lock debugging due to kernel taint
        May 4 21:55:00 Tower kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 245
        May 4 21:55:00 Tower kernel:
        May 4 21:55:00 Tower kernel: nvidia 0000:01:00.0: enabling device (0000 -> 0003)
        May 4 21:55:00 Tower kernel: nvidia 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
        May 4 21:55:00 Tower kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 460.73.01 Thu Apr 1 21:40:36 UTC 2021
        May 4 21:55:00 Tower kernel: Linux agpgart interface v0.103
        May 4 21:55:00 Tower root:
        May 4 21:55:00 Tower root: --------------Installation of Nvidia driver v460.73.01 successful--------------
        May 4 21:55:00 Tower root: plugin: nvidia-driver.plg installed
        May 4 21:55:00 Tower root: plugin: installing: /boot/config/plugins/tips.and.tweaks.plg
        May 4 21:55:00 Tower root: plugin: running: anonymous
        May 4 21:55:00 Tower root: plugin: skipping: /boot/config/plugins/tips.and.tweaks/tips.and.tweaks-2021.03.09.tgz already exists
        May 4 21:55:00 Tower root: plugin: running: anonymous
        May 4 21:55:00 Tower kernel: nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 460.73.01 Thu Apr 1 21:32:31 UTC 2021
        May 4 21:55:00 Tower kernel: [drm] [nvidia-drm] [GPU ID 0x00000100] Loading driver
        May 4 21:55:00 Tower kernel: [drm] Initialized nvidia-drm 0.0.0 20160202 for 0000:01:00.0 on minor 0
        May 4 21:55:00 Tower root: plugin: skipping: /boot/config/plugins/tips.and.tweaks/tips.and.tweaks.cfg already exists
        May 4 21:55:00 Tower root: plugin: running: anonymous

     *Update 2: after rebooting and taking the power off, the GPU is visible but the driver does not recognize it; no devices were found (production driver).

        May 4 22:08:20 Tower kernel: DMAR: DRHD: handling fault status reg 3
        May 4 22:08:20 Tower kernel: DMAR: [DMA Read] Request device [01:00.0] PASID ffffffff fault addr 4dd400f000 [fault reason 06] PTE Read access is not set
        May 4 22:08:24 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:24 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:28 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:28 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:32 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:32 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:37 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:37 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:41 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:41 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:45 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:45 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:49 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:49 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:54 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:54 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:08:58 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:08:58 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:02 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:02 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:07 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:07 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:11 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:11 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:15 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:15 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:19 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:19 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:24 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:24 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:28 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:28 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:32 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:32 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:37 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:37 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:41 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:41 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:45 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:45 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:50 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:50 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0
        May 4 22:09:54 Tower kernel: NVRM: GPU 0000:01:00.0: RmInitAdapter failed! (0x26:0x65:1290)
        May 4 22:09:54 Tower kernel: NVRM: GPU 0000:01:00.0: rm_init_adapter failed, device minor number 0

     So is the trouble my GPU (why does it need shutting down every time to be recognized?), and why can't I get the driver to run stably? I've attached diagnostics.
     tower-diagnostics-20210504-2211 - GPU visible but driver not recognized.zip
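The three boots described in this post each leave a different NVRM message in the syslog, so a small script could tell them apart when comparing reboots. The following is a rough sketch only: the patterns are taken from the kernel lines quoted above, while the outcome labels and function name are invented for this example and are not part of the Nvidia driver or the Unraid plugin.

```python
import re

# Patterns copied from the kernel messages quoted in the post. The labels
# ("init_failed", "not_found", "loaded") are made up for this sketch.
# Failures are listed first so that they win over the "loaded" message.
PATTERNS = [
    ("init_failed", re.compile(r"NVRM: GPU [0-9a-f:.]+: RmInitAdapter failed")),
    ("not_found", re.compile(r"NVRM: No NVIDIA GPU found")),
    ("loaded", re.compile(r"NVRM: loading NVIDIA UNIX x86_64 Kernel Module")),
]


def classify_boot(log_text):
    """Return the label of the first pattern that matches, else 'unknown'."""
    for label, pattern in PATTERNS:
        if pattern.search(log_text):
            return label
    return "unknown"


if __name__ == "__main__":
    boots = {
        "21:18": "May 4 21:18:34 Tower kernel: NVRM: No NVIDIA GPU found.",
        "21:55": "May 4 21:55:00 Tower kernel: NVRM: loading NVIDIA UNIX x86_64 "
                 "Kernel Module 460.73.01 Thu Apr 1 21:40:36 UTC 2021",
        "22:08": "May 4 22:08:24 Tower kernel: NVRM: GPU 0000:01:00.0: "
                 "RmInitAdapter failed! (0x26:0x65:1290)",
    }
    for when, text in boots.items():
        print(when, classify_boot(text))
```

Running this over a saved syslog after each reboot would give a quick record of which state the GPU came up in, which may help when correlating the behaviour with warm reboots versus full power cycles.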