Kaldek

Everything posted by Kaldek

  1. I will admit that my current flash drive, a SanDisk Cruzer Fit 32GB, has lasted me a very long time without issue while connected to a USB2 port.
  2. Ah, I seem to have missed the part where you wrote you're running unRAID on a QNAP box.
  3. Likely. Also, I'm a bit surprised your motherboard has *no* USB2 headers. They're common even on new boards, even if it's only a single two-port header.
  4. I burned through a few flash drives before switching to USB2 ports only. My current unit has lasted 3 years now. However, I am shortly switching to a USB DOM (Disk On Module) for some extra reliability. They are more expensive but use quality SLC flash. The only downside is that a DOM mounts to a motherboard header and is harder to get to, but it should be unlikely I ever need to touch it.
  5. I'm in the middle of some major array disk maneuvering which will require a "New Config" in a few days to remove some drives from the array. However, I have both a BTRFS cache pool (mirrored 1TB SSDs) and a ZFS RAIDz pool of 4x 480GB enterprise grade SSDs. All of my Appdata and Domains live on the ZFS pool; if I lose that, I'm hosed. So, does the "New Config" option support keeping both traditional "pools" and ZFS pools? It just says "Pool Slots" but doesn't clarify whether that will retain ZFS pools.
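     To be safe, I'm taking a flash backup before I touch anything. A minimal sketch of what I'll run (the stock /boot/config location holds the disk and pool assignments; the backup destination is just an example path):

        #!/bin/bash
        # Back up the entire flash config before running "New Config",
        # so the current disk/pool layout can be restored if needed
        BACKUP_DIR="/mnt/user/backups/flash-config-$(date +%F)"  # example path
        mkdir -p "$BACKUP_DIR"
        rsync -a /boot/config/ "$BACKUP_DIR/"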
  6. Hi folks, long time user here. Upgraded to 6.12 and then 6.12.2, and decided to create a 4-drive SSD ZFS RaidZ pool using some enterprise grade SSDs I was given, and to use that pool for all my VMs and Docker containers. Everything went great, except when I moved the libvirt.img file from the old cache pool to the new ZFS pool. Here's what I did:
     - Set the system share to use the new ZFS pool
     - Shut down the Docker engine via Settings --> Docker
     - mv /mnt/cache/system/docker /mnt/zfs-cache/system
     - Restarted Docker - no issues
     - Shut down the VM engine via Settings --> VM Manager
     - mv /mnt/cache/system/libvirt /mnt/zfs-cache/system
     - Validated that the file exists within /mnt/user/system/libvirt but physically exists only on the ZFS pool
     - Attempted to restart the VM engine
     This gave me "libvirt service failed to start", and the system logs gave me a bunch of btrfs errors saying that the "file already existed", along with information about /dev/loop4 and duplicate entities. The issue went away after a reboot, but why did it happen in the first place? I did not have this issue when I moved the docker.img file. Diagnostics file also attached. unraid-diagnostics-20230710-1317.zip
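     For anyone who hits the same thing before rebooting: my guess is that the old libvirt.img was still attached to a loop device when the VM service tried to start. A minimal check I'd try first - untested in this exact scenario, and assuming /etc/libvirt is where unRAID mounts the image (with /dev/loop4 being just what my log happened to show):

        # See whether libvirt.img is still attached to a loop device
        losetup -a | grep libvirt

        # If the old copy is still looped (my log pointed at /dev/loop4),
        # unmount and detach it before restarting the VM service
        umount /etc/libvirt
        losetup -d /dev/loop4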
  7. Concurred; mine had also just filled up, with the same space consumed by the log.json file used by Plex.
  8. Well, DNS kept going after having IPv6 disabled, but Docker had a fit and lost its mind. A server reboot was required. I'll just reboot it every couple of days until RC7 is out.
  9. It's been 7 days for me with no issues, and the only change I made was to disable IPv6. User JorgeB above mentioned there is an RC7 floating about that fixes a Docker bug which could cause the issue. He must be a member of the team, as he's a moderator.
  10. I can say with some confidence this is not a DNS server issue for me:
     - Nothing changed except the unRAID version
     - unRAID was always set to use the router, followed by Google, for DNS
     - /etc/resolv.conf shows as blank when the problem occurs
  11. I can confirm the same behaviour. I also tried to stop the array so I could change DNS settings and see if that forced DNS to come back, but it constantly said that Docker was still running. I'm not sure if I mentioned this in my original post, but this also made it impossible to stop the array: Docker would never stop, so the cache pool could never be unmounted.
  12. This issue appears to be occurring every few days. I can't ping any hostnames from the CLI, and /etc/resolv.conf is blank. I do not use DHCP for the server address, and my DNS servers are statically assigned.

     In addition, when it happens I am unable to reboot the server, as the shares will never unmount; it constantly tells me /mnt/cache is busy. There are definitely no clients holding shares open when this happens.

     I have attached diagnostics from when the server is working, and will attach them again when it next fails. I have made one change today after the last failure: disabling IPv6. My dual-stack ISP connection isn't always the best when it comes to IPv6 working all the time, so I've disabled it to see if that helps, since this seems to mainly be a network issue. unraid-diagnostics-20230521-1953.zip
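     Until there's a proper fix, the stopgap I can use from the console when it happens is to rewrite resolv.conf by hand. A minimal sketch, assuming my statically assigned DNS servers (substitute your own):

        # Manually restore DNS when /etc/resolv.conf goes blank
        # (192.168.0.254 is my router; 8.8.8.8 is the Google fallback)
        printf 'nameserver 192.168.0.254\nnameserver 8.8.8.8\n' > /etc/resolv.conf

        # Verify name resolution is back
        ping -c 2 google.com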
  13. For what it's worth, here's the code from my discussion with ChatGPT. This is - as yet - untested, but knock yourselves out if you want to see what was generated. Note that this script is intended to be run "After start of array".

        #!/bin/bash
        CONTAINER_NAME="frigate"

        # Wait for 2 minutes for the container to start
        sleep 120

        # Get the current container configuration
        CONFIG_JSON=$(docker inspect --type container --format '{{json .}}' ${CONTAINER_NAME})

        # Extract the current command and entrypoint
        CMD=$(echo ${CONFIG_JSON} | jq -r '.Config.Cmd | join(" ")')
        ENTRYPOINT=$(echo ${CONFIG_JSON} | jq -r '.Config.Entrypoint | join(" ")')

        # Extract all options from the HostConfig property
        # (note: HostConfig keys do not all map 1:1 to docker run flags,
        # which is part of why this is untested)
        HOST_CONFIG_OPTIONS=$(echo ${CONFIG_JSON} | jq -r '.HostConfig | to_entries | map(select(.key != "Devices")) | map("--" + .key + "=\"" + (.value | tostring) + "\"") | join(" ")')

        # Replace the USB device option with the new bus ID
        HOST_CONFIG_OPTIONS=$(echo ${HOST_CONFIG_OPTIONS} | sed 's@--device="/dev/bus/usb/004/002@--device="/dev/bus/usb/004/003@g')

        # Extract the image name
        IMAGE=$(echo ${CONFIG_JSON} | jq -r '.Config.Image')

        # Build the new container run command (the image name has to come
        # after the options and before the command; the entrypoint is
        # passed via --entrypoint, which expects a single executable)
        NEW_CMD="docker run --name ${CONTAINER_NAME} ${HOST_CONFIG_OPTIONS} --entrypoint ${ENTRYPOINT} ${IMAGE} ${CMD}"

        # Stop and remove the existing container so the name can be reused
        docker stop ${CONTAINER_NAME}
        docker rm ${CONTAINER_NAME}

        # Run the container with the new configuration
        eval ${NEW_CMD}
  14. Not technically true, since I have it working as long as I start the container twice with the two different USB bus entries.
  15. Depends what the question is, but yes, it's amazing for turning ideas into code. I don't trust it 100% of course, and I'm using it to give me ideas and examples. I get to bypass all the grief I'd get by asking a human. In my view, these generative AI models are necessary. The amount of time we all burn on questions when the respondent has their own emotions about the question and how they want to answer it is utterly insane. ChatGPT in particular has a very simple, concise and objective response to everything asked of it. The trick is knowing how to phrase your questions, hence the term "prompt engineering". I'm much better at this than I ever was at "Google-Fu".
  16. So folks, my unRAID server has an Intel server NIC in it with dual XFP ports and runs the ixgbe driver. Over a few unRAID revisions now, there are random instances where the NIC driver dies (and yes, I uploaded the diagnostics files when it happened). There's really been no solution to this issue, and it of course usually happens when I'm overseas for work and it's 12+ hours before I can remote in to PiKVM and bounce the server to get the network back up.

     I got a little tired of this, so here's the result of me and ChatGPT4 having a bit of a discussion about how to deal with it automatically. The solution documented here is a pair of User Scripts: a "ping watchdog" and a supervisor for the watchdog (in case the watchdog dies).

     Here is the watchdog script, called "ping_watchdog". It pings a pair of IP addresses (my core switch and my gateway) so that a single IP being down doesn't trigger the reboot - sometimes my gateway is off the air for a while as I do some arcane Mikrotik things on it. This script is set to run at first array start only and stays running forever (unless it dies for some reason; see the supervisor script below).

        #!/bin/bash
        TARGET_IP_1="192.168.0.254"  # Replace with your gateway router IP address
        TARGET_IP_2="192.168.0.240"  # Replace with your core switch IP address
        PING_COUNT=4                 # Number of pings to send
        PING_TIMEOUT=5               # Timeout for each ping in seconds
        FAIL_THRESHOLD=30            # Consecutive failed ping checks before restarting
        CHECK_INTERVAL=60            # Time in seconds between ping checks

        failed_pings=0

        ping_check() {
            local target_ip=$1
            ping -c $PING_COUNT -W $PING_TIMEOUT $target_ip >/dev/null 2>&1
            return $?
        }

        while true; do
            ping_check $TARGET_IP_1
            result1=$?
            ping_check $TARGET_IP_2
            result2=$?

            if [ $result1 -ne 0 ] && [ $result2 -ne 0 ]; then
                failed_pings=$((failed_pings + 1))
                echo "$(date) - Pings to $TARGET_IP_1 and $TARGET_IP_2 failed. Consecutive failed ping checks: $failed_pings"
            else
                failed_pings=0
            fi

            if [ $failed_pings -ge $FAIL_THRESHOLD ]; then
                echo "$(date) - Restarting unRAID server due to $FAIL_THRESHOLD consecutive failed ping checks"
                /usr/local/sbin/powerdown -r
                exit 0
            fi

            sleep $CHECK_INTERVAL  # Wait before the next check
        done

     Next is "ping_watchdog_supervisor", which is set to run every hour. If the first script is seen as not running, it kicks it off again.

        #!/bin/bash
        PING_WATCHDOG_SCRIPT="ping_watchdog"

        pid=$(pgrep -f "^/bin/bash.*/tmp/user.scripts/tmpScripts/$PING_WATCHDOG_SCRIPT")

        if [ -z "$pid" ]; then
            echo "$(date) - Ping watchdog script not running. Restarting..."
            /usr/local/emhttp/plugins/user.scripts/start_script.sh "$PING_WATCHDOG_SCRIPT"
        else
            echo "$(date) - Ping watchdog script running with PID $pid"
        fi

     Coupled together, these two scripts ensure that if my NIC ever dies, unRAID performs a clean reboot without hurting the array.
  17. Hi folks. About to upgrade to Frigate 0.12.0 in case that answers this, but I have an issue with my Frigate install on unRAID: my Coral TPU's bus ID changes after Frigate loads the Tensor code into it. At boot time the device is at /dev/bus/usb/004/003, but it then flips to /dev/bus/usb/004/002 after Frigate loads the Tensor software into it. I'll be damned if I'm going to run this container as privileged and expose all of /dev/bus/usb to it just to deal with that issue. Right now I'm in the middle of a rather long chat with GPT4 about how to get around this (currently: dumping the JSON config of the running container, finding the bus ID, changing it, and then restarting the container with the Google-ified bus ID using the same JSON code method). Total pain in the ass of course, just to flip a couple of bits and then restart a container. Don't suppose 0.12.0 resolves that issue with the Coral TPU by any chance?
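     In the meantime, here's how I'm spotting which bus/device the Coral is currently on. The vendor IDs are what the Coral is documented to enumerate as (Global Unichip before the firmware loads, Google afterwards):

        # Before the firmware loads, the Coral shows up as Global Unichip (1a6e:089a);
        # after Frigate loads the Tensor code it re-enumerates as Google Inc. (18d1:9302)
        # on a new device number, which is exactly the bus ID flip described above
        lsusb | grep -Ei '1a6e|18d1'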
  18. Hi mate, I have set up the Docker container from the official repo and it's working well, with a few items I suspect you're a good person to discuss them with:
     - Access via the tunnel to the myunraid.net URL does not work unless I set TLS to "Yes" rather than "Strict", so that it uses the self-signed certificate (and I set TLS verification to off in the Cloudflare portal).
     - What is the correct setup if the internal host is accessed via DNS (e.g. host.mydomain.local) rather than IP address?
     It's literally day 1 here, so these are questions I would probably be able to work out later anyway. Figured it can't hurt to ask.
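     For the second item, here's the sort of cloudflared config I'm experimenting with - a sketch only, assuming the standard config.yml ingress format, with hypothetical hostnames, paths and tunnel ID:

        # Write the tunnel config (all names below are examples, not real values)
        cat > /mnt/user/appdata/cloudflared/config.yml <<'EOF'
        tunnel: <your-tunnel-id>
        credentials-file: /home/nonroot/.cloudflared/<your-tunnel-id>.json
        ingress:
          - hostname: unraid.example.com
            service: https://host.mydomain.local
            originRequest:
              noTLSVerify: true   # accept the self-signed cert at the origin
          - service: http_status:404
        EOF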
  19. So Linus Tech Tips just released this video about server grade USB 2.0 storage, which should be excellent for unRAID OS. What are the community's thoughts on this?
  20. Hi there, I've just started using this. It's working great, thanks for your efforts.
  21. Perhaps the better question is whether Limetech are aware they are actively being probed to find a way in. It happened to Solarwinds, it can happen here. It's just way too easy to do this stuff and is asymmetric warfare. We get one chance to detect the breach whilst the attacker has unlimited time to perfect it by finding even the remotest weakness and using that as the beachhead.
  22. As someone whose full-time job is InfoSec for a multinational, I need to reply here. Saying not to expose stuff to the Internet is obvious, but it doesn't remove the problem. The biggest concern I have with unRAID is a supply chain attack, and unRAID is popular enough now that e-Crime actors will already be looking at targeting the platform. The question is this: who validates all the plugins etc. to make sure they don't have malicious code in them, or the ability to subsequently download malicious payloads? Never mind Docker or VMs; those are the responsibility of the end user - period. The risk is going to come from the plugins that everyone uses:
     - Unassigned Devices
     - Nvidia
     - Fix Common Problems
     - Nerd Tools
     - Preclear Disks
     - etc.
     So, what's Limetech's response? Better to get ahead of this before it actually happens. If the risk for 3rd party plugins lies with the end user, the relevant section of the customer agreements needs to be pointed out.
  23. My server, using an Intel dual-port 10Gb XFP NIC, is routinely removing the driver with the following syslog messages, which kills all network connectivity:

        Mar 17 08:18:19 MU-TH-UR kernel: ixgbe 0000:03:00.0: Adapter removed
        Mar 17 08:18:19 MU-TH-UR kernel: ixgbe 0000:03:00.0: Warning firmware error detected FWSM: 0xFFFFFFFF

     I have serial console and local access, so I am at least able to bounce the server to bring it back. Diagnostics file attached. Not sure what to do here, as I cannot find any references to firmware error 0xFFFFFFFF and can only find references to errors 0x0118801B and 0x00000000. I note the driver used in 6.9.1 is Intel version 5.10.21, and there is a 5.11.3 available which mentions "bug fixes" (no details provided). mu-th-ur-diagnostics-20210317-0927.zip
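     For anyone wanting to compare driver versions on their own box, this is how I'm checking what's actually loaded (assuming eth0 is the ixgbe interface):

        # Driver name, version and NIC firmware as reported by the running module
        ethtool -i eth0

        # Version string baked into the ixgbe kernel module itself
        modinfo ixgbe | grep -i '^version'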
  24. If @ich777 is responsible for this part of unRAID, I'd say they'll be working on a fix ASAP. They already provided a permanent fix for Nvidia drivers.