BurningSky

Members
  • Posts

    48

Everything posted by BurningSky

  1. The server recovers by itself; how long would I need to stay in safe mode for? I'll close the browser, but I've always had it open before, so I'm not sure why that would start to be an issue now.
  2. I've had Unraid hang twice in the last 24 hours, both times for around 20-40 minutes, before recovering again. The syslog looks the same both times: messages about disk temps being around 30C, followed by messages like these:

        nginx: 2024/03/20 14:00:29 [error] 831#831: *31271114 limiting requests, excess: 21.000 by zone "authlimit", client: 192.168.3.30, server: , request: "GET /login HTTP/2.0"
        php-fpm[7716]: [WARNING] [pool www] child 10760 exited on signal 9 (SIGKILL) after 936.425570 seconds from start

      Any insights would be appreciated! ragon-diagnostics-20240320-1616.zip
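A minimal sketch, not part of the original post, of how the two message types quoted above could be tallied from a saved syslog to see which client is hitting the login rate limit and how many php-fpm children were killed. The regular expressions are modelled only on the two sample lines, and the `summarise` helper is a hypothetical name.

```python
import re
from collections import Counter

# Patterns modelled on the two syslog messages quoted in the post (assumption:
# other Unraid versions may word these messages differently).
RATE_LIMIT = re.compile(
    r'limiting requests, excess: [\d.]+ by zone "(?P<zone>[^"]+)", '
    r"client: (?P<client>[\d.]+)"
)
SIGKILL = re.compile(r"child (?P<pid>\d+) exited on signal 9 \(SIGKILL\)")


def summarise(lines):
    """Return (rate-limit hits per client IP, number of SIGKILLed php-fpm children)."""
    clients = Counter()
    killed = 0
    for line in lines:
        match = RATE_LIMIT.search(line)
        if match:
            clients[match.group("client")] += 1
        elif SIGKILL.search(line):
            killed += 1
    return clients, killed


# The two sample lines from the post.
sample = [
    'nginx: 2024/03/20 14:00:29 [error] 831#831: *31271114 limiting requests, '
    'excess: 21.000 by zone "authlimit", client: 192.168.3.30, server: , '
    'request: "GET /login HTTP/2.0"',
    "php-fpm[7716]: [WARNING] [pool www] child 10760 exited on signal 9 "
    "(SIGKILL) after 936.425570 seconds from start",
]
clients, killed = summarise(sample)
print(clients.most_common(), killed)
```

To run it against a real log, the lines of the syslog file from the diagnostics zip could be passed to `summarise` instead of `sample`.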
  3. Thanks, it happened again so I'll format the cache once I've copied data off
  4. Thanks for the feedback, I'll keep an eye on it. Doesn't point to a disk failure though?
  5. I noticed a container had stopped with a message about the log being RO, so having looked at a couple of other similar-looking issues I decided to delete the docker.img to see if that would help. After that the Docker service wouldn't restart, and I noticed in the logs that it looked like a cache issue, so I rebooted to see if that would resolve it. That led to the cache showing an "Unmountable: unsupported or no file system" error. Based on another forum post I ran "btrfs rescue zero-log /dev/sdi1" and restarted the array, and the pool appears to be back now, but I'm worried there might be a deeper issue. I've attached 2 sets of logs: 1411 was before I ran the rescue command and 1414 is after. Does it look like sdi1 is going to fail, or was there another potential cause? ragon-diagnostics-20240208-1414.zip ragon-diagnostics-20240208-1411.zip
  6. Ah of course! Thanks, forgot about the tags webpage!
  7. Found the culprit, it was Pi Alert, so that has been removed now!
  8. I'm having the same issue so might downgrade until there seems to be a resolution, how do I do that on this container? Would setting the repo to koenkk/zigbee2mqtt:1.35.1 work?
  9. I checked this morning and there haven't been any containers that have restarted in line with the messages. I updated 2 containers 18mins ago so they show as more recent status, the others are 29+ hours but in the last 18mins there have been 7 more messages. Is there anything else I should be looking at rather than just status on docker ps?
  10. I just got a message that my /var/log is getting full, and when I look there are hundreds of repeats of:

        Jan 21 10:03:35 Ragon kernel: device eth0 left promiscuous mode
        Jan 21 10:06:51 Ragon kernel: device eth0 entered promiscuous mode

      I've seen some people mention similar issues caused by multiple containers using the same port, but I can't see that here: the containers of mine that reuse a port like 8080 are mostly on br0 with their own IPs. I have a Realtek NIC, but I've installed the driver from CA for that. ragon-diagnostics-20240121-1004.zip
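As a rough illustration (not from the original post), the promiscuous-mode messages above could be counted per interface to see how often the toggling happens. The pattern is based only on the two lines quoted, and `count_promisc` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches kernel lines of the form quoted in the post.
PROMISC = re.compile(
    r"kernel: device (?P<dev>\S+) (?P<action>entered|left) promiscuous mode"
)


def count_promisc(lines):
    """Count 'entered' and 'left' promiscuous-mode events per interface."""
    counts = Counter()
    for line in lines:
        match = PROMISC.search(line)
        if match:
            counts[(match.group("dev"), match.group("action"))] += 1
    return counts


# The two sample lines from the post.
sample = [
    "Jan 21 10:03:35 Ragon kernel: device eth0 left promiscuous mode",
    "Jan 21 10:06:51 Ragon kernel: device eth0 entered promiscuous mode",
]
counts = count_promisc(sample)
for (dev, action), n in sorted(counts.items()):
    print(dev, action, n)
```

Comparing the timestamps of these events with container start and stop times is one way to narrow down a culprit, which is in line with the later finding that a single container was responsible.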
  11. After doing some digging I've added pcie_aspm=off to my syslinux config so I will see if that resolves the issue.
  12. I have an mPCIe Coral TPU connected to a PCIe to mPCIe adaptor, which is passed through to the Frigate Docker container. It seems to be working, but today I got a notification that my syslog is filling up. I had a look and see the errors below relating to the Coral device, but I don't know what could be causing them. The Docker container is running in privileged mode and the device is passed through via /dev/apex_0.

        Nov 5 08:46:19 Ragon kernel: pcieport 0000:00:01.3: AER: Multiple Corrected error received: 0000:25:00.0
        Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
        Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: device [1ac1:089a] error status/mask=00000041/00006000
        Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: [ 0] RxErr
        Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: [ 6] BadTLP

      ragon-diagnostics-20231105-0852.zip
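A small sketch, added for illustration only, of how the AER detail lines above could be tallied per PCIe address and error name, to check whether the corrected errors all come from the Coral at 0000:25:00.0. The pattern is derived from the five lines quoted and may not cover every AER message format; `tally_aer` is a hypothetical name.

```python
import re
from collections import Counter

# Matches detail lines such as "kernel: apex 0000:25:00.0: [ 0] RxErr".
AER_DETAIL = re.compile(
    r"kernel: \S+ (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):"
    r"\s+\[\s*(?P<bit>\d+)\]\s*(?P<name>\w+)"
)


def tally_aer(lines):
    """Count AER error names (e.g. RxErr, BadTLP) per PCIe device address."""
    counts = Counter()
    for line in lines:
        match = AER_DETAIL.search(line)
        if match:
            counts[(match.group("bdf"), match.group("name"))] += 1
    return counts


# The five sample lines from the post.
sample = [
    "Nov 5 08:46:19 Ragon kernel: pcieport 0000:00:01.3: AER: Multiple Corrected error received: 0000:25:00.0",
    "Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)",
    "Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: device [1ac1:089a] error status/mask=00000041/00006000",
    "Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: [ 0] RxErr",
    "Nov 5 08:46:19 Ragon kernel: apex 0000:25:00.0: [ 6] BadTLP",
]
aer_counts = tally_aer(sample)
print(dict(aer_counts))
```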
  13. I didn't realise you couldn't have 2 types of detectors at the same time! Commented out the cuda one and it's working now!
  14. I sent back the Coral USB I was having issues with and have swapped it for a mini PCIe one, which I have connected to a PCIe to mPCIe adaptor, but I am still having some issues. I have installed the drivers and it is being recognised by the Coral Driver app:

        Coral TPU1:
        Status: ALIVE
        Temperature: 39.30 °C
        Frequency: 500 MHz
        Driver Version: 1.2
        Framework Version: 1.1.4

      I can also see the device under sysdevs:

        [1ac1:089a] 25:00.0 System peripheral: Global Unichip Corp. Coral Edge TPU

      I had previously deleted the section for mapping the TPU but I have re-added it with the following:

        Config Type: Device
        Name: Coral TPU/NCS2
        Mapping Value: /dev/apex_0
        Description: Use /dev/bus/usb for USB devices and /dev/apex_0 for PCIe devices (you must install the drivers first for PCIe devices). Remove this if you are not using it.

      In Frigate I added the following to detectors:

        detectors:
          coral:
            type: edgetpu
            device: pci

      But I get an error saying that no EdgeTPU was detected. Have I misconfigured something?

        2023-11-03 12:49:18.767219029 [INFO] Preparing go2rtc config...
        2023-11-03 12:49:18.767580529 [INFO] Starting Frigate...
        2023-11-03 12:49:18.768623682 [INFO] Starting NGINX...
        2023-11-03 12:49:18.957519850 [WARN] Using go2rtc binary from '/config/go2rtc' instead of the embedded one
        2023-11-03 12:49:18.960160839 [INFO] Starting go2rtc...
        2023-11-03 12:49:19.063972373 12:49:19.063 INF go2rtc version 1.8.1 linux/amd64
        2023-11-03 12:49:19.064282260 12:49:19.064 INF [api] listen addr=0.0.0.0:1984
        2023-11-03 12:49:19.064866554 12:49:19.064 INF [rtsp] listen addr=0.0.0.0:8554
        2023-11-03 12:49:19.064871792 12:49:19.064 INF [webrtc] listen addr=0.0.0.0:8555/tcp
        2023-11-03 12:49:19.771751395 [2023-11-03 12:49:19] frigate.app INFO : Starting Frigate (0.12.1-367d724)
        2023-11-03 12:49:19.811722914 [2023-11-03 12:49:19] peewee_migrate INFO : Starting migrations
        2023-11-03 12:49:19.815743064 [2023-11-03 12:49:19] peewee_migrate INFO : There is nothing to migrate
        2023-11-03 12:49:19.840242161 [2023-11-03 12:49:19] detector.coral INFO : Starting detection process: 577
        2023-11-03 12:49:19.956163315 [2023-11-03 12:49:19] detector.cuda INFO : Starting detection process: 580
        2023-11-03 12:49:19.956171347 [2023-11-03 12:49:19] frigate.app INFO : Output process started: 585
        2023-11-03 12:49:19.956179867 [2023-11-03 12:49:19] frigate.detectors.plugins.edgetpu_tfl INFO : Attempting to load TPU as pci
        2023-11-03 12:49:19.956187410 Process detector:coral:
        2023-11-03 12:49:19.956194534 [2023-11-03 12:49:19] frigate.detectors.plugins.edgetpu_tfl INFO : TPU found
        2023-11-03 12:49:19.956218629 [2023-11-03 12:49:19] frigate.detectors.plugins.edgetpu_tfl ERROR : No EdgeTPU was detected. If you do not have a Coral device yet, you must configure CPU detectors.

      Is there any way to check if the docker is actually seeing the device?
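On the question of checking whether the container sees the device: one rough approach, sketched here as an assumption rather than anything Frigate provides, is to run a few lines of Python inside the container and check that the mapped node exists, is a character device, and is readable and writable by the current user. The `describe_device` helper is hypothetical, and the path is the /dev/apex_0 mapping from the post.

```python
import os
import stat


def describe_device(path):
    """Report whether `path` exists, is a character device, and is read/writable."""
    try:
        info = os.stat(path)
    except OSError:
        # Covers a missing node as well as permission problems on the path.
        return {"path": path, "exists": False, "char_device": False, "read_write": False}
    return {
        "path": path,
        "exists": True,
        "char_device": stat.S_ISCHR(info.st_mode),
        "read_write": os.access(path, os.R_OK | os.W_OK),
    }


# /dev/apex_0 is the device mapping described in the post.
print(describe_device("/dev/apex_0"))
```

This only confirms that the device node was passed through. It says nothing about whether the Edge TPU runtime can open it, so a passing check does not rule out a configuration problem in Frigate itself.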
  15. Thanks for that, I'll check the cables for the parity disk. What did you notice in the logs that led you to believe that?
  16. Hoping someone might be able to help me here. I've had a few issues recently with Unraid, but a chunk of them seemed to be down to a bad SSD in my cache pool, which I have now taken out of the pool (but it is still in the server for now). I had noticed that my parity disk was showing increasing numbers of errors, but the count would fluctuate up and down and seemed to settle after the changes I made recently (replaced the SSD with SMART issues and swapped out my SAS controller). However, yesterday there were ~3000 errors on parity and now there are over 9000! I've taken a quick look in the syslog and I've started to see these errors repeating:

        Oct 16 05:26:35 Ragon kernel: I/O error, dev sdd, sector 3019156232 op 0x0:(READ) flags 0x0 phys_seg 29 prio class 2
        Oct 16 05:26:35 Ragon kernel: md: disk0 read error, sector=3019156168
        Oct 16 07:37:41 Ragon kernel: I/O error, dev sdg, sector 1016 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
        Oct 16 07:37:41 Ragon kernel: md: disk4 read error, sector=952

      sdd, which is the parity disk, appears a lot more often, so it looks like it might be failing, which is annoying as it's only a year or so old. Wondering if anyone could take a look and suggest whether there are any issues other than 2 failing disks; there aren't any SMART errors on either disk.
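To put numbers on "sdd is in there a lot more", a short sketch (not from the original post) that counts the kernel I/O errors per device and the md read errors per array slot. The patterns are based only on the four lines quoted above, and `count_read_errors` is a hypothetical name.

```python
import re
from collections import Counter

# Patterns modelled on the syslog lines quoted in the post.
IO_ERR = re.compile(r"kernel: I/O error, dev (?P<dev>\w+), sector (?P<sector>\d+)")
MD_ERR = re.compile(r"kernel: md: (?P<disk>disk\d+) read error, sector=(?P<sector>\d+)")


def count_read_errors(lines):
    """Return (I/O errors per block device, md read errors per array slot)."""
    by_dev = Counter()
    by_slot = Counter()
    for line in lines:
        io_match = IO_ERR.search(line)
        if io_match:
            by_dev[io_match.group("dev")] += 1
            continue
        md_match = MD_ERR.search(line)
        if md_match:
            by_slot[md_match.group("disk")] += 1
    return by_dev, by_slot


# The four sample lines from the post.
sample = [
    "Oct 16 05:26:35 Ragon kernel: I/O error, dev sdd, sector 3019156232 op 0x0:(READ) flags 0x0 phys_seg 29 prio class 2",
    "Oct 16 05:26:35 Ragon kernel: md: disk0 read error, sector=3019156168",
    "Oct 16 07:37:41 Ragon kernel: I/O error, dev sdg, sector 1016 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2",
    "Oct 16 07:37:41 Ragon kernel: md: disk4 read error, sector=952",
]
by_dev, by_slot = count_read_errors(sample)
print(by_dev.most_common(), by_slot.most_common())
```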
  17. I ordered a Coral TPU to use with Frigate and at first it seemed to work but now it isn't. When I do lsusb on the host the device is showing as a Global Unichip Corp device rather than Google, and then when passed through to a container it just comes up with no name at all. I had installed ich777's Coral drivers but was told to remove that as maybe that causes issues so I'm wondering if maybe they haven't uninstalled. Has anyone come across this issue before and have any idea how to resolve?
  18. I don't think I've misconfigured anything...

        docker run -d \
          --name='frigate' \
          --net='br0' \
          --ip='192.168.0.48' \
          --privileged=true \
          -e TZ="Europe/London" \
          -e HOST_OS="Unraid" \
          -e HOST_HOSTNAME="Unraid" \
          -e HOST_CONTAINERNAME="frigate" \
          -e 'TCP_PORT_5000'='5000' \
          -e 'TCP_PORT_8554'='8554' \
          -e 'FRIGATE_RTSP_PASSWORD'='xxxxx' \
          -e 'NVIDIA_VISIBLE_DEVICES'='GPU-6b2....5a0' \
          -e 'NVIDIA_DRIVER_CAPABILITIES'='compute,utility,video' \
          -e 'TCP_PORT_8555'='8555' \
          -e 'UDP_PORT_8555'='8555' \
          -e 'TCP_PORT_1984'='1984' \
          -l net.unraid.docker.managed=dockerman \
          -l net.unraid.docker.webui='http://[IP]:[PORT:5000]' \
          -l net.unraid.docker.icon='https://raw.githubusercontent.com/yayitazale/unraid-templates/main/frigate.png' \
          -v '/mnt/cache/appdata/frigate':'/config':'rw' \
          -v '/mnt/user/Media/frigate':'/media/frigate':'rw' \
          -v '/mnt/user/appdata/trt-models':'/trt-models':'ro' \
          -v '/mnt/user/appdata/trt-models':'/trt-models':'ro' \
          -v '/etc/localtime':'/etc/localtime':'rw' \
          --device='/dev/bus/usb' \
          --shm-size=256mb \
          --mount type=tmpfs,target=/tmp/cache,tmpfs-size=100000000 \
          --restart unless-stopped \
          --runtime=nvidia \
          'ghcr.io/blakeblackshear/frigate:stable-tensorrt'
  19. I have multiple times now and keep getting the same result unfortunately!
  20. I did, I removed and restarted but the results from lsusb both on the host and container are the same, Global Unichip Corp on the host, blank in the container.
  21. Just had a look at lsusb on the host and in the container and noticed it's started misbehaving...

      Unraid:

        root@Ragon:~# lsusb
        Bus 006 Device 002: ID 1a6e:089a Global Unichip Corp.
        Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
        Bus 005 Device 002: ID 0781:5567 SanDisk Corp. Cruzer Blade
        Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
        Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
        Bus 003 Device 002: ID 051d:0002 American Power Conversion Uninterruptible Power Supply
        Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
        Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
        Bus 001 Device 002: ID 1cf1:0030 Dresden Elektronik ZigBee gateway [ConBee II]
        Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

      Frigate:

        # lsusb
        Bus 006 Device 002: ID 1a6e:089a
        Bus 006 Device 001: ID 1d6b:0003 Linux 6.1.49-Unraid xhci-hcd xHCI Host Controller
        Bus 005 Device 002: ID 0781:5567 SanDisk Cruzer Blade
        Bus 005 Device 001: ID 1d6b:0002 Linux 6.1.49-Unraid xhci-hcd xHCI Host Controller
        Bus 004 Device 001: ID 1d6b:0003 Linux 6.1.49-Unraid xhci-hcd xHCI Host Controller
        Bus 003 Device 002: ID 051d:0002 American Power Conversion Back-UPS RS 900G FW:879.L4 .I USB FW:L4
        Bus 003 Device 001: ID 1d6b:0002 Linux 6.1.49-Unraid xhci-hcd xHCI Host Controller
        Bus 002 Device 001: ID 1d6b:0003 Linux 6.1.49-Unraid xhci-hcd xHCI Host Controller
        Bus 001 Device 002: ID 1cf1:0030 dresden elektronik ingenieurtechnik GmbH ConBee II
        Bus 001 Device 001: ID 1d6b:0002 Linux 6.1.49-Unraid xhci-hcd xHCI Host Controller
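A hedged sketch of picking the Coral out of lsusb output by USB ID rather than by name, since the name column is blank inside the container. It rests on an assumption not stated in the post: that the USB Coral enumerates as 1a6e:089a (Global Unichip Corp.) until the Edge TPU runtime first initialises it, and as 18d1:9302 (Google Inc.) afterwards. `find_coral` is a hypothetical helper.

```python
import re

# One lsusb line: "Bus 006 Device 002: ID 1a6e:089a Global Unichip Corp."
LSUSB = re.compile(
    r"Bus (?P<bus>\d{3}) Device (?P<dev>\d{3}): "
    r"ID (?P<vid>[0-9a-f]{4}):(?P<pid>[0-9a-f]{4})\s*(?P<name>.*)"
)

# Assumption: these are the two IDs a USB Coral can show up under.
CORAL_IDS = {
    ("1a6e", "089a"): "not yet initialised by the runtime",
    ("18d1", "9302"): "initialised by the runtime",
}


def find_coral(lsusb_output):
    """Return (bus, device, state) for every line whose ID matches a known Coral ID."""
    hits = []
    for line in lsusb_output.splitlines():
        match = LSUSB.match(line.strip())
        if match and (match["vid"], match["pid"]) in CORAL_IDS:
            state = CORAL_IDS[(match["vid"], match["pid"])]
            hits.append((match["bus"], match["dev"], state))
    return hits


# Lines taken from the host and container output in the post.
host = """Bus 006 Device 002: ID 1a6e:089a Global Unichip Corp.
Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub"""
container = """Bus 006 Device 002: ID 1a6e:089a
Bus 005 Device 002: ID 0781:5567 SanDisk Cruzer Blade"""
print(find_coral(host))
print(find_coral(container))
```

If that assumption holds, the Global Unichip Corp. entry on its own would not indicate a fault, only that nothing has initialised the device since it was plugged in.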
  22. Yes to both, 3.2 gen 2 USB port. Is there a way to check in the docker if it's actually seeing the device? Can I trigger anything in the docker to see if it connects to the TPU outside of Frigate?
  23. Looks like the module works, so I assume it's the USB passthrough? Is there another passthrough method I should try?

        python3 examples/classify_image.py \
          --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
          --labels test_data/inat_bird_labels.txt \
          --input test_data/parrot.jpg
        /Users/burningsky/Downloads/edgetpu_runtime/coral/pycoral/examples/classify_image.py:79: DeprecationWarning: ANTIALIAS is deprecated and will be removed in Pillow 10 (2023-07-01). Use LANCZOS or Resampling.LANCZOS instead.
          image = Image.open(args.input).convert('RGB').resize(size, Image.ANTIALIAS)
        ----INFERENCE TIME----
        Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
        13.2ms
        2.9ms
        2.9ms
        2.8ms
        2.9ms
        -------RESULTS--------
        Ara macao (Scarlet Macaw): 0.75781
  24. I'll test it on another PC tonight. Yeah, I could see it listed as Google Inc on port 6 slot 2, and /dev/bus/usb had a folder structure mirroring the ports on Unraid.