daan_SVK

Members
  • Posts

    116

Everything posted by daan_SVK

  1. same issue here. Also, what I noticed is that if I open the web terminal while this is happening, the terminal just keeps disconnecting/reconnecting. After restarting the GUI, the web terminal won't load any more, with a 503 Bad gateway error from the browser while this is in the log:
         Oct 28 15:31:17 Tower nginx: 2021/10/28 15:31:17 [crit] 31423#31423: *20062687 connect() to unix:/var/run/ttyd.sock failed (2: No such file or directory) while connecting to upstream, client: 192.168.1.56, server: , request: "GET /webterminal/ HTTP/1.1", upstream: "http://unix:/var/run/ttyd.sock:/", host: "tower.local", referrer: "http://tower.local/Main"
  2. no, I'm not running any DB containers, my setup is fairly simple. yes, it stays UP for about 10 minutes, then it stops. I might try the Nvidia one as well just to try to isolate the issue.
  3. the Pihole container has 27hrs uptime so the 10 minute shutdown interval is not caused by the Pihole docker restarting. Host Access is enabled on the Docker configuration page; before I enabled it I couldn't get Prometheus to connect to it. When I click the link in Prometheus for the Pihole Exporter, it goes to the metrics site with all the parameters as long as the Pihole Exporter is running. Grafana still doesn't pull any data though. With my limited knowledge, I don't see anything obvious in the logs I'm afraid. The standard Grafana dashboard works OK.
  4. thanks for your reply. yes, when I start the Exporter plugin it starts, runs for about 10 minutes and then it stops. Pihole runs on Unraid. Even with the services running, I don't get any data from the Pihole in Grafana.
  5. hi there, I'm trying to set up the Pihole monitoring but am running into two issues: the Pihole target goes down/stops every 10 minutes or so, and even while it is up, I still get No Data in the Pihole Exporter even after updating the JSON model with the Prometheus source. Where would I start looking for the cause? thanks!
  6. Thank you Squid, I will ignore it.
  7. Hey guys, I am getting two new errors in my log and I was wondering if anyone could offer an opinion on how to deal with them:
         Sep 6 06:00:27 Tower nginx: 2021/09/06 06:00:27 [error] 12061#12061: *7083647 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 127.0.0.1, server: , request: "GET /admin/api.php?version HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "127.0.0.1"
         Sep 6 06:00:27 Tower nginx: 2021/09/06 06:00:27 [error] 12061#12061: *7083649 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 127.0.0.1, server: , request: "GET /admin/api.php?version HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "localhost"
     I haven't rebooted yet, just wondering if that's something I should start with. I believe those started coming up after I added a few new plugins. Thanks for reading.
     tower-diagnostics-20210906-1040.zip
  8. thanks for your response, the RAM was stress tested with Memtest before it was installed in this server but I believe it has XMP enabled. Does that count as overclocked under these circumstances? This might be a BTRFS specific question but how severe is the corruption? Can it be repaired, given the cache is in RAID1? Wouldn't rebuilding the cache from scratch be a way to rectify the corrupted blocks, or are my Appdata backups also corrupted?
  9. so I rebooted the server, started the array in maintenance mode and ran the btrfs check and scrub, resulting in:
         Status: finished
         Duration: 0:05:52
         Total to scrub: 305.64GiB
         Rate: 889.13MiB/s
         Error summary: csum=14
           Corrected: 0
           Uncorrectable: 14
           Unverified: 0
     SMART comes back clean but I do see this:
         181 Program fail count total 0x0022 100 100 000 Old age Always Never 47244705802
     what's the best option here? Replace the drives?
  10. hi there, Just realized my cache pool is read only due to what looks like a file system corruption. I was hoping to run a file system check so I stopped the array but the server got stuck at unmounting the disks. What's my best course of action here with the server stuck at "unmounting" disks? I'd like to get some advice before causing unnecessary damage. thanks for reading.
      tower-diagnostics-20210615-1448.zip
  11. yes, thanks, I figured it out, just replied too soon to the thread. thanks for all your work and support!
  12. it does indeed seem to stop the logging. I also get this as a result in the log:
          Apr 15 12:18:12 Tower kernel: NVRM: Persistence mode is deprecated and will be removed in a future release. Please use nvidia-persistenced instead.
  13. went down for me too yesterday. this is what I had to do:
      - download the new OpenVPN files from here: https://www.privateinternetaccess.com/helpdesk/kb/articles/where-can-i-find-your-ovpn-files-2
      - choose a non-CA endpoint
      - update the files in the Deluge folder in appdata
      - profit
  14. ah, yes, thank you for pointing that out. No errors in that one:
          UUID: d1294b70-d13c-4027-b0b5-12417226b0dc
          Scrub started: Fri Apr 9 09:41:13 2021
          Status: finished
          Duration: 0:04:15
          Total to scrub: 238.90GiB
          Rate: 959.28MiB/s
          Error summary: no errors found
      the cache log still shows 43 corrupted entries:
          Apr 8 09:21:28 Tower kernel: sdc: sdc1
          Apr 8 09:21:28 Tower kernel: sd 2:0:0:0: [sdc] Attached SCSI disk
          Apr 8 09:21:28 Tower kernel: BTRFS: device fsid d1294b70-d13c-4027-b0b5-12417226b0dc devid 2 transid 9310751 /dev/sdc1 scanned by udevd (1642)
          Apr 8 09:21:59 Tower emhttpd: MTFDDAK512TBN-1AR1ZABHA_UGXVL01J1BF7U0 (sdc) 512 1000215216
          Apr 8 09:21:59 Tower emhttpd: import 31 cache device: (sdc) MTFDDAK512TBN-1AR1ZABHA_UGXVL01J1BF7U0
          Apr 8 09:21:59 Tower emhttpd: read SMART /dev/sdc
          Apr 8 11:21:28 Tower kernel: BTRFS info (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 23, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 24, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 25, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 26, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 27, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 28, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 29, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 30, gen 0
          Apr 8 12:01:02 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 31, gen 0
          Apr 8 13:00:54 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 35, gen 0
          Apr 8 13:00:54 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 36, gen 0
          Apr 8 13:00:54 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 37, gen 0
          Apr 8 14:01:09 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 38, gen 0
          Apr 8 14:01:09 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 39, gen 0
          Apr 8 14:01:09 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 40, gen 0
          Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 41, gen 0
          Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 42, gen 0
          Apr 8 15:01:11 Tower kernel: BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 43, gen 0
          Apr 9 09:22:25 Tower emhttpd: read SMART /dev/sdc
          Apr 9 09:40:16 Tower emhttpd: read SMART /dev/sdc
          Apr 9 09:40:30 Tower kernel: BTRFS info (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 0, flush 0, corrupt 43, gen 0
      what are my options for addressing those?
  15. thanks JorgeB, I deleted the file I thought caused the mover to hang and ran the read-only scrub, however I don't see any corrupted files listed in the output:
          [1/7] checking root items
          [2/7] checking extents
          [3/7] checking free space tree
          [4/7] checking fs roots
          [5/7] checking only csums items (without verifying data)
          [6/7] checking root refs
          [7/7] checking quota groups skipped (not enabled on this FS)
          Opening filesystem to check...
          Checking filesystem on /dev/sdb1
          UUID: d1294b70-d13c-4027-b0b5-12417226b0dc
          cache and super generation don't match, space cache will be invalidated
          found 128267644928 bytes used, no error found
          total csum bytes: 20018700
          total tree bytes: 368099328
          total fs tree bytes: 303235072
          total extent tree bytes: 33718272
          btree space waste bytes: 64207774
          file data blocks allocated: 1508281032704
           referenced 126905487360
  16. I will try to disable XMP and run the RAM at stock settings. The server was up for over 60 days prior, and it was stress tested with memtest for two days before being put into production. Is there something I can do to resolve the cache corruption? This being a pool of two devices, I should be able to run a corrective scrub, correct?
  17. You mean somebody else set it up for you? I set it up a long time ago. Outside of being a waste of space, this is not an issue, is it?
      again, this is the original configuration. I think some of those disks are over 6 years old. I tend not to fix things if they aren't broken. I was about to replace the oldest drive with the old parity drive.
      this has to do most likely with the parity corruption you pointed out. It was most likely trying to move the movie listed in the log:
          Apr 8 12:01:02 Tower shfs: copy_file: /mnt/cache/Movies/Cronos (1993)/Cronos 1993 Bluray-1080p.mkv /mnt/disk3/Movies/Cronos (1993)/Cronos 1993 Bluray-1080p.mkv (5) Input/output error
          Apr 8 12:01:03 Tower crond[2026]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
          Apr 8 13:00:54 Tower shfs: copy_file: /mnt/cache/Movies/Cronos (1993)/Cronos 1993 Bluray-1080p.mkv /mnt/disk3/Movies/Cronos (1993)/Cronos 1993 Bluray-1080p.mkv (5) Input/output error
          Apr 8 13:00:54 Tower kernel: btrfs_print_data_csum_error: 12 callbacks suppressed
      no SMART errors reported from any of the drives.
      So the Mover is the culprit here? Should Mover be disabled in similar cases? I was under the impression the array should be fully operational while the parity is calculated. I certainly never experienced any issues on my monthly parity checks. Also, should I be concerned with the cache corruption? How would I go about addressing that? Should I scrub it in maintenance mode once the parity rebuild has been completed?
  18. no particular reason to be honest, this is how it was originally set up. I started the array again a few hours ago, seems to be rebuilding without issues at the moment (at 8.5% now). Server is responsive and all dockers and services are up. I disconnected the USB caddy that was running the pre-clear on the old parity drive in case that might have been the issue. new diagnostic file is attached. thanks again for reviewing it, much appreciated.
      tower-diagnostics-20210408-1335.zip
  19. thanks Trurl, I waited overnight and the writing to the array stopped. I rebooted it as you suggested and the server came back up claiming the parity rebuild finished but I still get a "Parity Invalid" exclamation mark on the parity drive. I have attached the diagnostics as per your suggestion. Should I just start the array and attempt to rebuild the parity again? The disk was properly precleared originally.
      tower-diagnostics-20210408-0928.zip
  20. So I popped in a new pre-cleared parity drive, ran into a bump with pre-clearing the old parity drive as per this thread, and thought I was golden. The system went unresponsive when the parity rebuild was around 18%. Last thing I noticed was that the CPU usage went to 100%, which never happens as I have an 8 core x5600 and just a bunch of lightweight dockers.
      this is where I'm at:
      - the gui won't load
      - getting 500 Internal Server Error from the browser
      - I hooked up a monitor to the server, logged in, ran htop and that's where the console froze and doesn't take any input
      - the server still seems to be writing to the array or parity, but I can only tell by the solid LED light on the chassis
      - the old parity drive was hooked up for pre-clearing in an external USB dock and last time I saw it paused
      - the server still pings fine
      getting a little spooked here, what are my options?
      - should I just wait for 2 days in hope the new parity will eventually recalculate and the array comes back alive?
      - should I disconnect the old parity drive in the caddy that I was pre-clearing?
      thanks for reading, fingers crossed.
  21. this must be it, I've done this in the past once or twice and don't remember having an issue. Pre-clear is running now so I should be able to use it as a replacement for my 6y old WD RED once completed. thanks again for your help, much appreciated!
  22. yes, that's what I was doing, running it from the main page and got the error. Running it from the Tools seems to be working, the pre-clear started just fine. Odd, I was always under the impression it didn't matter where you run the pre-clear from. thanks again!
  23. hey there guys, I just upgraded my parity drive from 10TB to 12TB. It is rebuilding as we speak. However, before assigning the old 10TB parity drive to the array, I thought I'd clear it just to be safe. Popped it into an external USB case and tried to mount it with Unassigned Devices and this is what I get:
          Apr 7 09:59:27 Tower unassigned.devices: Mount of '/dev/sdh1' failed. Error message: mount: /mnt/disks/ST10000DM0004-2GR11L_ZJV697T7: wrong fs type, bad option, bad superblock on /dev/sdh1, missing codepage or helper program, or other error.
      Attempting to delete the partition yields this:
          Apr 7 09:49:29 Tower unassigned.devices: Remove parition failed result 'Error: end of file while reading /dev/sdh '
      I know the drive is good, used it as parity for two years. Should I be concerned with the above errors before popping it into the array without pre-clearing? It just threw me off. thanks!
  24. ok, I found the answer in Binhex's FAQ, this was Q26: https://github.com/binhex/documentation/blob/master/docker/faq/vpn.md thanks again
  25. hi guys, Binhex, my SONARR and RADARR containers no longer connect to the Binhex-Deluge container with PIA configured proxy. This is strange, all the indexers test OK in Jackett, Deluge is also able to download torrents OK but the Sonarr and Radarr containers no longer connect to Jackett or Deluge. When I disable Privoxy in the Binhex-Deluge container everything works again just fine so I know it has to do with the Privoxy setup. here is a snippet from the log:
          [v3.0.2.4552] NzbDrone.Core.Download.Clients.DownloadClientUnavailableException: Unable to connect to Deluge, please check your settings
           ---> System.Net.WebException: The operation has timed out.: 'http://192.168.1.24:8112/json'
           ---> System.Net.WebException: The operation has timed out.
             at System.Net.HttpWebRequest.GetResponse()
             at NzbDrone.Common.Http.Dispatchers.ManagedHttpDispatcher.GetResponse(HttpRequest request, CookieContainer cookies) in D:\a\1\s\src\NzbDrone.Common\Http\Dispatchers\ManagedHttpDispatcher.cs:line 146
             --- End of inner exception stack trace ---
             at NzbDrone.Common.Http.Dispatchers.ManagedHttpDispatcher.GetResponse(HttpRequest request, CookieContainer cookies) in D:\a\1\s\src\NzbDrone.Common\Http\Dispatchers\ManagedHttpDispatcher.cs:line 146
             at NzbDrone.Common.Http.HttpClient.ExecuteRequest(HttpRequest request, CookieContainer cookieContainer) in D:\a\1\s\src\NzbDrone.Common\Http\HttpClient.cs:line 123
             at NzbDrone.Common.Http.HttpClient.Execute(HttpRequest request) in D:\a\1\s\src\NzbDrone.Common\Http\HttpClient.cs:line 57
             at NzbDrone.Core.Download.Clients.Deluge.DelugeProxy.ExecuteRequest[TResult](JsonRpcRequestBuilder requestBuilder, String method, Object[] arguments) in D:\a\1\s\src\NzbDrone.Core\Download\Clients\Deluge\DelugeProxy.cs:line 283
             --- End of inner exception stack trace ---
             at NzbDrone.Core.Download.Clients.Deluge.DelugeProxy.ExecuteRequest[TResult](JsonRpcRequestBuilder requestBuilder, String method, Object[] arguments) in D:\a\1\s\src\NzbDrone.Core\Download\Clients\Deluge\DelugeProxy.cs:line 283
             at NzbDrone.Core.Download.Clients.Deluge.DelugeProxy.ProcessRequest[TResult](DelugeSettings settings, String method, Object[] arguments) in D:\a\1\s\src\NzbDrone.Core\Download\Clients\Deluge\DelugeProxy.cs:line 244
             at NzbDrone.Core.Download.Clients.Deluge.DelugeProxy.GetTorrentsByLabel(String label, DelugeSettings settings) in D:\a\1\s\src\NzbDrone.Core\Download\Clients\Deluge\DelugeProxy.cs:line 97
             at NzbDrone.Core.Download.Clients.Deluge.Deluge.GetItems() in D:\a\1\s\src\NzbDrone.Core\Download\Clients\Deluge\Deluge.cs:line 117
             at NzbDrone.Core.Download.TrackedDownloads.DownloadMonitoringService.ProcessClientDownloads(IDownloadClient downloadClient) in D:\a\1\s\src\NzbDrone.Core\Download\TrackedDownloads\DownloadMonitoringService.cs:line 89
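
For anyone hitting the same "The operation has timed out" trace, a quick reachability check can help tell a dropped connection apart from a service that simply isn't listening. This is only a rough sketch, assuming it is run from a machine or container that sees the network the same way Sonarr/Radarr do. The helper name `probe` is made up for illustration; the host and port you would pass in are the ones from the log above (192.168.1.24, 8112).

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a plain TCP connection attempt.

    Returns 'open', 'refused', 'timeout' or 'error'. This only tests
    whether the port accepts connections; it does not log in to Deluge.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError:
        # Anything else, e.g. no route to host or name resolution failure.
        return "error"
```

Calling `probe("192.168.1.24", 8112)` and getting `timeout` would be consistent with the trace above and would suggest the packets are being dropped somewhere on the way, while `refused` would point at the web UI not listening at all. Either way it is just a hint on where to look, not a diagnosis.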