mbc0

Members
  • Posts

    1118
  • Days Won

    2

Everything posted by mbc0

  1. I already have this set up, sorry; I was not clear, but that is what I meant by no useful information in the log at all. I am using the existing PSU from my previous Threadripper setup; all I have changed is the motherboard and CPU. I have the Dynamix System Temperature plugin installed and the CPU normally sits around 26C; the highest I have seen is 40C when being hammered. I can see nothing more I can do with this setup. Maybe an intermittent issue with the motherboard? Since installation I have had RAM-related issues, a corrupt NVMe cache drive, and lost all Dockers and VMs (all on separate occasions), but when running, everything is perfect for 1-3 days. I will transplant the Threadripper back in and set up another server using this suspect motherboard/CPU config to see what happens, unless you can think of anything else I can try. The only reason for doing all this was to save energy costs.
  2. Hi, I have had nothing but problems with this new motherboard/CPU config:

     M/B: ASRock Z790 Pro RS/D4
     BIOS: American Megatrends International, LLC. Version 4.01. Dated: 01/06/2023
     CPU: 13th Gen Intel® Core™ i5-13600K @ 3465 MHz

     When the server locks up, there is still power and the network card is flashing, but I cannot access the server at all, not even directly with a monitor plugged in or over SSH. I am now on a third set of 4 DIMMs since the problems began, so I would safely say it's not the RAM. Temps are never above 40C, so I am confident it is not temperature related. I have attached diags taken since powering off/on at 18:12 tonight. I have logs written to a share, but there is nothing of any use in the log and no errors. The BIOS has been reset to standard: no power limits, no overclock, everything totally standard. Where can I go from here to find the problem please? I think I am going to have to swap back to my Threadripper for reliability and put this motherboard/CPU config on a test bench.

     unraid1-diagnostics-20230226-1820.zip
  3. Been using shrmn/gsdock for years without issue, I did not realise there was another repo?
  4. I ran the 4 sticks I took out in memtest for over 24 hours in another board with no issues, and as all 8 sticks ran without issue in my previous config, I am confident there is no physical issue with the RAM (all Corsair Vengeance). I have not overclocked, but I did choose a power-saving option in the BIOS (power saving being the reason for changing from a Threadripper). I cannot remember which option it was, but I am going to set it back to standard to rule it out. I have also enabled power saving in Tips and Tweaks, as this is what brought me under the 150 W I was looking for. Do you think that could have any influence on this problem?
  5. This is worrying! I replaced all 4 sticks of RAM the other day, so the chances of it actually being the RAM are slim to none, I would have thought. It is a brand new motherboard and CPU; my old motherboard had 8 RAM slots and never missed a beat. I have now had these errors on both sets of 4, so I am assuming it is not the RAM causing the issue. Maybe the motherboard (ASRock Z790 Pro RS/D4)? I am on the latest BIOS. Are there any steps I can take to try and get on top of this issue?
  6. Unable to stop the server currently 😞

     Feb 21 11:22:18 UNRAID1 emhttpd: shcmd (2491271): umount /mnt/cache
     Feb 21 11:22:18 UNRAID1 root: umount: /mnt/cache: target is busy.
     Feb 21 11:22:18 UNRAID1 emhttpd: shcmd (2491271): exit status: 32
     Feb 21 11:22:18 UNRAID1 emhttpd: Retry unmounting disk share(s)...
     Feb 21 11:22:23 UNRAID1 emhttpd: Unmounting disks...
     Feb 21 11:22:23 UNRAID1 emhttpd: shcmd (2491272): umount /mnt/cache
     Feb 21 11:22:23 UNRAID1 root: umount: /mnt/cache: target is busy.
     Feb 21 11:22:23 UNRAID1 emhttpd: shcmd (2491272): exit status: 32
     Feb 21 11:22:23 UNRAID1 emhttpd: Retry unmounting disk share(s)...
     Feb 21 11:22:28 UNRAID1 emhttpd: Unmounting disks...
     Feb 21 11:22:28 UNRAID1 emhttpd: shcmd (2491273): umount /mnt/cache
     Feb 21 11:22:28 UNRAID1 root: umount: /mnt/cache: target is busy.
     Feb 21 11:22:28 UNRAID1 emhttpd: shcmd (2491273): exit status: 32
     Feb 21 11:22:28 UNRAID1 emhttpd: Retry unmounting disk share(s)...
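For anyone hitting the same "target is busy" loop: umount refuses while some process still has a file or its working directory open under the mount point. Below is a rough Python sketch of how one might list those processes, assuming a standard Linux /proc layout. The function name `find_open_under` and its `proc_root` parameter are made up for illustration and are not part of Unraid. It will not show kernel-level users of the mount, such as a loop-mounted image file stored on the pool.

```python
import os


def find_open_under(mount_point, proc_root="/proc"):
    """Return sorted (pid, path) pairs for open files or cwd under mount_point."""
    mount_point = mount_point.rstrip("/") or "/"
    prefix = mount_point if mount_point == "/" else mount_point + "/"
    hits = set()
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, name) for name in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited or permission denied
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == mount_point or target.startswith(prefix):
                hits.add((int(entry), target))
    return sorted(hits)


if __name__ == "__main__":
    # /mnt/cache is the mount point from the log above
    for pid, path in find_open_under("/mnt/cache"):
        print(pid, path)
```

Run as root so every process's fd directory is readable. Where `fuser` is installed, `fuser -vm /mnt/cache` gives similar information.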
  7. Hi @JorgeB, I have just had my server crash and have attached diags. Is this related to the other crashes you helped me with last week? The server is on 24/7 and this is the first problem since the possible RAM issue/corruption I had last week.

     Warning: mkdir(): Input/output error in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 351

     unraid1-diagnostics-20230221-1113.zip
  8. Hey @KluthR, all done. My backups were OK and not broken, but I changed my USB location as you advised. I do think the problem is physically reading the files mentioned in Nextcloud. Do you think that could be the problem, as it still fails?

     backup.log
  9. Hi, yes, I have run a couple of manual backups trying to diagnose the issue.
  10. Hi, can anyone please help me work out what could be causing the input/output error I am getting with CA Backup on my Nextcloud container? The container is stopped and I do not even have Calendar installed on Nextcloud.

      [20.02.2023 03:42:10] Backing Up: nextcloud
      /usr/bin/tar: nextcloud/www/nextcloud/apps/calendar/: Cannot savedir: Input/output error
      /usr/bin/tar: Exiting with failure status due to previous errors
      [20.02.2023 03:43:21] tar creation/extraction failed!
      [20.02.2023 03:43:21] Verifying Backup nextcloud
      [20.02.2023 03:44:26] Backing Up: unifi-controller
      [20.02.2023 03:44:46] Verifying Backup unifi-controller
      [20.02.2023 03:44:51] Backing Up: zigbee2mqtt
      [20.02.2023 03:44:51] Verifying Backup zigbee2mqtt
      [20.02.2023 03:44:51] done
      [20.02.2023 03:44:51] Starting gsdock... (try #1) done!
      [20.02.2023 03:44:55] Starting mariadb... (try #1) done!
      [20.02.2023 03:44:59] Starting nextcloud... (try #1) done!
      [20.02.2023 03:45:04] Starting unifi-controller... (try #1) done!
      [20.02.2023 03:45:12] A error occurred somewhere. Not deleting old backup sets of appdata
      [20.02.2023 03:45:12] Backup / Restore Completed
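The "Cannot savedir: Input/output error" line means tar failed while reading a directory listing, so the error comes from the filesystem or the device underneath, not from Nextcloud itself. One way to see how far the damage extends is to walk the tree and record every directory that cannot be listed. This is only a sketch: `unreadable_dirs` is a hypothetical helper, and the appdata path in the usage line is an assumption about where the container's data lives.

```python
import os


def unreadable_dirs(top):
    """Walk `top` and return (path, errno) for every directory that fails to list."""
    failures = []

    def record(exc):
        # os.walk passes the OSError raised while scanning a directory
        failures.append((exc.filename, exc.errno))

    for _dirpath, _dirnames, _filenames in os.walk(top, onerror=record):
        pass  # walking is enough; listing errors are collected by `record`
    return failures


if __name__ == "__main__":
    # Assumed location of the Nextcloud appdata; adjust to the real path.
    for path, err in unreadable_dirs("/mnt/user/appdata/nextcloud"):
        print(path, os.strerror(err))
```

An empty result only says the directory listings are readable; it does not read file contents, so a scrub or filesystem check is still the better test.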
  12. Thank you, I will read through this to better monitor the pools. Again, huge thanks for your help!
  13. OK, so I have scrubbed all 3 pools and no errors were found on any of them!
  14. Attached, many thanks.

      unraid1-diagnostics-20230218-1125.zip
  15. Thank you @JorgeB

      1. I have run the memtest, which passed, but I have another 4 sticks, which I have now swapped in for the existing RAM.
      2. You say there was existing corruption? How do I address that please?
      3. Is this not an NVMe problem then? "BTRFS error (device nvme2n1p1): block=161059291136 write time tree block corruption detected"
      4. I will look into ipvlan now.

      Thanks again!
  16. I also just saw this in the log when I started to shut down after I took the diags:

      Feb 18 09:48:42 UNRAID1 kernel: BUG: Bad rss-counter state mm:00000000ee5ad6d6 type:MM_ANONPAGES val:1
      Feb 18 09:48:45 UNRAID1 kernel: I/O error, dev loop3, sector 83360 op 0x1:(WRITE) flags 0x1800 phys_seg 8 prio class 0
      Feb 18 09:48:45 UNRAID1 kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
      Feb 18 09:48:45 UNRAID1 kernel: I/O error, dev loop3, sector 188192 op 0x1:(WRITE) flags 0x1800 phys_seg 8 prio class 0
      Feb 18 09:48:45 UNRAID1 kernel: BTRFS error (device loop3): bdev /dev/loop3 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
      Feb 18 09:48:45 UNRAID1 kernel: BTRFS: error (device loop3: state A) in __btrfs_update_delayed_inode:999: errno=-5 IO failure
      Feb 18 09:48:45 UNRAID1 kernel: BTRFS info (device loop3: state EA): forced readonly
      Feb 18 09:48:45 UNRAID1 kernel: BTRFS: error (device loop3: state EA) in __btrfs_run_delayed_items:1092: errno=-5 IO failure
      Feb 18 09:48:45 UNRAID1 kernel: BTRFS warning (device loop3: state EA): Skipping commit of aborted transaction.
      Feb 18 09:48:45 UNRAID1 kernel: BTRFS: error (device loop3: state EA) in cleanup_transaction:1982: errno=-5 IO failure
      Feb 18 09:48:45 UNRAID1 kernel: docker0: port 4(vethb7c8447) entered disabled state
      Feb 18 09:48:45 UNRAID1 kernel: vethd950096: renamed from eth0
      Feb 18 09:48:45 UNRAID1 root: Error response from daemon: error while removing network: network br0 id b58d2467fa57ae7061498d882571927884c846ee782347c86f3071b85475f1f0 has active endpoints
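The "bdev ... errs:" lines carry cumulative per-device counters (write, read, flush, corruption, generation), so the last such line for a device gives its running total. A small Python sketch for pulling those counters out of a syslog, in case it helps when comparing diagnostics over time. The regular expression is written against the exact wording of the lines above, and `latest_error_counters` is a hypothetical helper name.

```python
import re

# Matches lines such as:
# "BTRFS error (device loop3): bdev /dev/loop3 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0"
BDEV_ERRS = re.compile(
    r"BTRFS error \(device (?P<device>[^):\s]+)[^)]*\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def latest_error_counters(log_lines):
    """Map each bdev to the counters from its most recent 'errs:' line."""
    latest = {}
    for line in log_lines:
        match = BDEV_ERRS.search(line)
        if match:
            latest[match["bdev"]] = {name: int(match[name]) for name in COUNTERS}
    return latest


if __name__ == "__main__":
    sample = [
        "Feb 18 09:48:45 UNRAID1 kernel: BTRFS error (device loop3): "
        "bdev /dev/loop3 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0",
        "Feb 18 09:48:45 UNRAID1 kernel: BTRFS error (device loop3): "
        "bdev /dev/loop3 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0",
    ]
    print(latest_error_counters(sample))
```

Note that loop3 is a loop device, that is, an image file mounted as a filesystem, so write errors there can be a symptom of trouble on whichever pool holds the image.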
  17. Hi, I woke this morning to find most of my Dockers stopped. It looks like CA Backup stopped them ready to back up, as the backup folder was created but it is empty. I can see all these BTRFS errors on nvme2n1p1, but I cannot work out which physical drive this is. This is the third NVMe-related problem I have had on this new motherboard in as many weeks, so I really need to get to the bottom of it. Any help will be greatly appreciated!

      unraid1-diagnostics-20230218-0934.zip
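On working out which physical drive nvme2n1p1 is: the name is partition 1 (p1) of namespace 1 (n1) on NVMe controller 2, and on Linux the controller's model and serial number are normally exposed under /sys/block/<disk>/device/. A minimal sketch, assuming that sysfs layout is present; `parent_disk` and `describe_disk` are hypothetical names. The serial number can then be matched against the label on the drive or the device list in the web UI.

```python
import os
import re


def parent_disk(partition):
    """Strip the partition suffix: 'nvme2n1p1' -> 'nvme2n1', 'sdb1' -> 'sdb'."""
    match = re.fullmatch(r"(nvme\d+n\d+)(p\d+)?", partition)
    if match:
        return match.group(1)
    return re.sub(r"\d+$", "", partition)


def describe_disk(partition, sys_block="/sys/block"):
    """Return the disk name plus model and serial strings (None if not exposed)."""
    disk = parent_disk(partition)
    info = {"disk": disk}
    for attr in ("model", "serial"):
        path = os.path.join(sys_block, disk, "device", attr)
        try:
            with open(path) as handle:
                info[attr] = handle.read().strip()
        except OSError:
            info[attr] = None  # attribute missing for this device type
    return info


if __name__ == "__main__":
    print(describe_disk("nvme2n1p1"))
```

NVMe device numbering follows detection order and can change between boots or after hardware changes, so the serial number is the safer thing to track.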
  18. Hi, I feel I should know the answer to this, but can I use the iGPU on my 13600K in a VM while Docker containers like Plex still have access to it?
  19. OK, so I carried out a repair and everything is now back to looking like it should, but there is no lost+found folder on disk 10. Does that mean I have not lost any files?
  20. Found some problems! Should I do a standard repair?

      Phase 1 - find and verify superblock...
              - block cache size set to 1481960 entries
      Phase 2 - using internal log
              - zero log...
      zero_log: head block 443485 tail block 443485
              - scan filesystem freespace and inode maps...
              - found root inode chunk
      Phase 3 - for each AG...
              - scan (but don't clear) agi unlinked lists...
              - process known inodes and perform inode discovery...
              - agno = 0
              - agno = 1
              - agno = 2
              - agno = 3
              - agno = 4
      illegal attribute format -36, ino 8589934725
      bad attribute fork in inode 8589934725, would clear attr fork
      would have cleared inode 8589934725
              - agno = 5
              - process newly discovered inodes...
      Phase 4 - check for duplicate blocks...
              - setting up duplicate extent list...
              - check for inodes claiming duplicate blocks...
              - agno = 2
              - agno = 5
              - agno = 1
              - agno = 3
              - agno = 0
              - agno = 4
      illegal attribute format -36, ino 8589934725
      bad attribute fork in inode 8589934725, would clear attr fork
      would have cleared inode 8589934725
      No modify flag set, skipping phase 5
      Phase 6 - check inode connectivity...
              - traversing filesystem ...
              - agno = 0
              - agno = 1
              - agno = 2
              - agno = 3
              - agno = 4
      Metadata corruption detected at 0x46e19c, inode 0x200000085 dinode
      couldn't map inode 8589934725, err = 117
              - agno = 5
              - traversal finished ...
              - moving disconnected inodes to lost+found ...
      disconnected dir inode 10841621675, would move to lost+found
      Phase 7 - verify link counts...
      Metadata corruption detected at 0x46e19c, inode 0x200000085 dinode
      couldn't map inode 8589934725, err = 117, can't compare link counts
      No modify flag set, skipping filesystem flush and exiting.
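Reading the report above, the messages name inodes in two notations: decimal (8589934725) and hexadecimal (0x200000085). These are the same inode, since 0x200000085 = 2 × 2^32 + 0x85 = 8589934592 + 133 = 8589934725. So the check flags one damaged inode plus one disconnected directory, not several separate files. A short sketch that normalises the numbers from pasted output; `flagged_inodes` is a hypothetical helper and the pattern only covers the message wording seen in this report.

```python
import re


def flagged_inodes(report):
    """Return the set of inode numbers named in the report, as integers."""
    inodes = set()
    # "ino 123", "inode 123" or "inode 0x7b"; plurals such as "inodes" are skipped
    for match in re.finditer(r"\b(?:ino|inode) (0x[0-9a-fA-F]+|\d+)", report):
        inodes.add(int(match.group(1), 0))  # base 0 accepts decimal and 0x-prefixed hex
    return inodes


if __name__ == "__main__":
    sample = (
        "illegal attribute format -36, ino 8589934725\n"
        "Metadata corruption detected at 0x46e19c, inode 0x200000085 dinode\n"
        "disconnected dir inode 10841621675, would move to lost+found\n"
    )
    print(sorted(flagged_inodes(sample)))
```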