daan_SVK

Members
  • Posts

    115
  • Joined

  • Last visited

Converted

  • Gender
    Undisclosed

daan_SVK's Achievements

Apprentice (3/14)

3

Reputation

  1. No, I'm not running any DB containers; my setup is fairly simple. Yes, it stays up for about 10 minutes, then it stops. I might try the Nvidia one as well, just to isolate the issue.
  2. The Pihole container has 27 hours of uptime, so the 10-minute shutdown interval is not caused by the Pihole docker restarting. Host Access is enabled on the Docker configuration page; before I enabled it, I couldn't get Prometheus to connect to it. When I click the link in Prometheus for Pihole Explorer, it goes to the metrics site with all the parameters, as long as the Pihole Explorer is running. Grafana still doesn't pull any data, though. With my limited knowledge, I don't see anything obvious in the logs, I'm afraid. The standard Grafana dashboard works OK.
  3. Thanks for your reply. Yes, when I start the Explorer plugin it starts, runs for about 10 minutes and then it stops. Pihole runs on Unraid. Even with the services running, I don't get any data from the Pihole in Grafana.
  4. Hi there, I'm trying to set up the Pihole monitoring but am running into two issues: the Pihole target goes down/stops every 10 minutes or so, and even while it is up, I still get No Data in the Pihole Explorer, even after updating the JSON model with the Prometheus source. Where would I start looking for the cause? Thanks!
  5. Thank you Squid, I will ignore it.
  6. Hey guys, I am getting two new errors in my log and I was wondering if anyone could offer an opinion on how to deal with them:

     Sep 6 06:00:27 Tower nginx: 2021/09/06 06:00:27 [error] 12061#12061: *7083647 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 127.0.0.1, server: , request: "GET /admin/api.php?version HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "127.0.0.1"
     Sep 6 06:00:27 Tower nginx: 2021/09/06 06:00:27 [error] 12061#12061: *7083649 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 127.0.0.1, server: , request: "GET /admin/api.php?version HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "localhost"

     I haven't rebooted yet, just wondering if that's something I should start with. I believe those started coming up after I added a few new plugins. Thanks for reading.

     tower-diagnostics-20210906-1040.zip
  7. Thanks for your response. The RAM was stress tested with MemTest before it was installed in this server, but I believe it has XMP enabled; does that count as overclocked under these circumstances? This might be a BTRFS-specific question, but how severe is the corruption? Can it be repaired, given the cache is in RAID1? Wouldn't rebuilding the cache from scratch be a way to rectify the corrupted blocks, or are my Appdata backups also corrupted?
  8. So I rebooted the server, started the array in maintenance mode and ran the btrfs check and scrub, resulting in:

     Status: finished
     Duration: 0:05:52
     Total to scrub: 305.64GiB
     Rate: 889.13MiB/s
     Error summary: csum=14
       Corrected: 0
       Uncorrectable: 14
       Unverified: 0

     SMART comes back clean, but I do see this:

     181 Program fail count total 0x0022 100 100 000 Old age Always Never 47244705802

     What's the best option here? Replace the drives?
  9. Hi there, I just realized my cache pool is read-only due to what looks like file system corruption. I was hoping to run a file system check, so I stopped the array, but the server got stuck at unmounting the disks. What's my best course of action here with the server stuck at "unmounting" disks? I'd like to get some advice before causing unnecessary damage. Thanks for reading. tower-diagnostics-20210615-1448.zip
  10. Yes, thanks, I figured it out; I just replied to the thread too soon. Thanks for all your work and support!
  11. It does indeed seem to stop the logging. I also get this as a result in the log: Apr 15 12:18:12 Tower kernel: NVRM: Persistence mode is deprecated and will be removed in a future release. Please use nvidia-persistenced instead.
  12. It went down for me too yesterday. This is what I had to do:
      - download the new OpenVPN files from here: https://www.privateinternetaccess.com/helpdesk/kb/articles/where-can-i-find-your-ovpn-files-2
      - choose a non-CA endpoint
      - update the files in the Deluge folder in appdata
      - profit
  13. Ah, yes, thank you for pointing that out. No errors in that one:

      UUID: d1294b70-d13c-4027-b0b5-12417226b0dc
      Scrub started: Fri Apr 9 09:41:13 2021
      Status: finished
      Duration: 0:04:15
      Total to scrub: 238.90GiB
      Rate: 959.28MiB/s
      Error summary: no errors found

      The cache log still shows 43 corrupted entries:

      Apr 8 09:21:28 Tower kernel: sdc: sdc1
      Apr 8 09:21:28 Tower kernel: sd 2:0:0:0: [sdc] Attached SCSI disk
      Apr 8 09:21:28 Tower kernel: BTRFS: device fsid d1294b70-d13c-4027-b0b5-12417226b0dc devid 2 transid 9310751 /dev/sdc1 scanned by udevd (1642)
      Apr 8 09:21:59 Tower emhttpd: MTFDDAK512TBN-1AR1ZABHA_UGXVL01J1BF7U0 (sdc) 512 1000215216
      Apr 8 09:21:59 Tower emhttpd: import 31 cache device: (sdc) MTFDDAK512TBN-1AR1ZABHA_UGXVL01J1BF7U0
      Apr 8 09:21:59 Tower emhttpd: read SMART /dev/sdc
      Apr 8 11:21:28 Tower kernel: BTRFS info (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 23, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 24, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 25, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 26, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 27, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 28, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 29, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 30, gen 0
      Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 31, gen 0
      Apr 8 13:00:54 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 35, gen 0
      Apr 8 13:00:54 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 36, gen 0
      Apr 8 13:00:54 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 37, gen 0
      Apr 8 14:01:09 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 38, gen 0
      Apr 8 14:01:09 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 39, gen 0
      Apr 8 14:01:09 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 40, gen 0
      Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 41, gen 0
      Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 42, gen 0
      Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 43, gen 0
      Apr 9 09:22:25 Tower emhttpd: read SMART /dev/sdc
      Apr 9 09:40:16 Tower emhttpd: read SMART /dev/sdc
      Apr 9 09:40:30 Tower kernel: BTRFS info (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 43, gen 0

      What are my options for addressing those?
  14. Thanks JorgeB, I deleted the file I thought caused the mover to hang and ran the read-only scrub; however, I don't see any corrupted files listed in the output:

      [1/7] checking root items
      [2/7] checking extents
      [3/7] checking free space tree
      [4/7] checking fs roots
      [5/7] checking only csums items (without verifying data)
      [6/7] checking root refs
      [7/7] checking quota groups skipped (not enabled on this FS)
      Opening filesystem to check...
      Checking filesystem on /dev/sdb1
      UUID: d1294b70-d13c-4027-b0b5-12417226b0dc
      cache and super generation don't match, space cache will be invalidated
      found 128267644928 bytes used, no error found
      total csum bytes: 20018700
      total tree bytes: 368099328
      total fs tree bytes: 303235072
      total extent tree bytes: 33718272
      btree space waste bytes: 64207774
      file data blocks allocated: 1508281032704
       referenced 126905487360
  15. I will try to disable XMP and run the RAM at stock settings. The server was up for over 60 days prior, and it was also stress tested with MemTest for two days before being put into production. Is there something I can do to resolve the cache corruption? This being a pool of two devices, I should be able to run a corrective scrub, correct?
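
For anyone following the cache-corruption posts above: the `errs: wr ..., rd ..., flush ..., corrupt ..., gen ...` kernel lines quoted in post 13 can be tallied with a short script instead of being read by eye. This is only a rough sketch. It assumes the syslog lines have exactly the shape shown in that post, and `latest_counters` and `BTRFS_ERRS` are names made up for this example, not part of any Unraid or btrfs tool.

```python
import re

# Assumed line shape, taken from the syslog excerpt in post 13:
#   "... BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 43, gen 0"
BTRFS_ERRS = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def latest_counters(log_text):
    """Return {device: {counter: value}} from the last matching entry per device."""
    latest = {}
    for match in BTRFS_ERRS.finditer(log_text):
        # Later entries overwrite earlier ones, so the final value per device wins.
        latest[match.group("dev")] = {name: int(match.group(name)) for name in COUNTERS}
    return latest


# Three lines copied from the log in post 13.
sample = """\
Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 42, gen 0
Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 43, gen 0
Apr 9 09:40:30 Tower kernel: BTRFS info (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 43, gen 0
"""

if __name__ == "__main__":
    for device, counters in latest_counters(sample).items():
        print(device, counters)
```

In the log from post 13 the `corrupt` value only ever goes up and is repeated unchanged in the `BTRFS info` line after the clean scrub, so it reads like a running total rather than a count of new errors; taking the last entry per device reflects that.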