ScottAS2

Members
  • Posts: 37
  • Joined
  • Last visited
  1. This chimes with a discovery I've made in the meantime: bypassing the cache and writing directly to the array seems to solve the problem. Both cache drives (BTRFS/RAID1) are Crucial BX500 1TB SATA SSDs, part number CT1000BX500SSD1. I'm not sure how to check whether they have a DRAM cache, but the data sheet makes no mention of one, which probably means "no".
  2. I believe this is the mistake. You should select the "custom" network that can be created in Unraid's Docker settings. This allows the Time Machine container to have its own IP address on your 192.168.178.0/24 network and to exchange multicast DNS/DNS Service Discovery (mDNS/DNS-SD) messages with the devices on it.

The "bridge" network puts Docker containers on an internal 172.17.0.0/16 network (by default) that you can think of as NAT-ed behind your Unraid server's IP. In most cases that's fine, and you just forward the appropriate ports through that NAT to a container*, but the mDNS and DNS-SD messages that clients need in order to discover and communicate with the Time Machine service won't cross from one network to the other. Even if they did, a Time Machine container on the "bridge" network only knows its 172.17.0.0/16 address, which is no use to the machines on 192.168.178.0/24 that are looking to back up. A rough sketch of the equivalent Docker commands is below.

* In theory, you might be able to set Time Machine up like this - it's "just" an SMB server - but it would be more work to set up the clients, and Time Machine is notoriously picky; without the multicast stuff it might decline to accept the server.
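For illustration only - my understanding is that Unraid does something broadly equivalent to this behind the scenes when you pick the custom network, so this is not a suggestion to set it up by hand; the network name, parent interface, image name, and addresses are placeholders:

```
# Create a "custom" network whose containers sit directly on the LAN
# (macvlan shown; Unraid can also use ipvlan - adjust parent/subnet/gateway to your setup):
docker network create -d macvlan \
  --subnet=192.168.178.0/24 --gateway=192.168.178.1 \
  -o parent=br0 customnet

# Give the Time Machine container its own LAN address so mDNS/DNS-SD discovery works:
docker run -d --name timemachine --network customnet --ip 192.168.178.50 <your-timemachine-image>
```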
  3. Hi all, I've been facing a problem with my Unraid server for a while whereby a large file transfer to the server will somehow upset Docker and cause problems, including:

  • The Docker portions of the webGUI become slow.
  • Attempts to stop, remove, or kill Docker containers fail with error messages along the lines of "attempted to kill container but did not receive a stop event from the container". This happens regardless of whether I use the webGUI, the command line, or Compose.
  • Connections to Docker containers from elsewhere on the network fail.
  • (Particularly annoying) my ADS-B receiver Docker container stops working and needs to be manually restarted before it'll work again, even if the offending transfer has already finished.

Examples of big inbound transfers that can cause this problem:

  • A backup coming in to the server from offsite (through rsync directly on Unraid)
  • Windows File History backing up to the server (through Unraid's built-in SMB server functionality)
  • A large download through Lancache on the server (through the Lancache Docker container)

You'll notice that the first two of those should have nothing to do with Docker. Frustratingly, the following do not cause the problem:

  • A backup going offsite from the server (rsync again)
  • macOS backing up through a Time Machine Docker container
  • Media being served out to other devices (built-in SMB)

Clearly, my server is running out of some resource that Docker needs, but I'm at a loss as to what it is. Despite the problem being associated with a lot of network traffic, I doubt it's bandwidth itself, since one of the triggers comes from offsite, and it's unlikely my 100Mb connection (less VPN overhead) is saturating the server's gigabit Ethernet. While the CPU does run up during problematic transfers, it doesn't seem to be totally overloaded, and there's bags and bags of free memory.

Let's take as a specimen what happened yesterday evening and see what Netdata recorded. As you can see from the screenshots, the ADS-B receiver fell over some time around 17:00; I believe this was triggered by Windows backing up to Unraid's SMB server. The CPU runs up a bit for about 20 minutes at 16:30, then a bit more an hour later, but it never goes above 60%. There's some disk I/O at the same times. Nonetheless, we have all of the memory. And there's a network traffic spike, but nothing gigabit Ethernet shouldn't be able to handle.

Can anyone suggest other metrics I should examine? Or is there a way to decrease the general niceness of the Docker daemon so it will just take a bigger share of what resources there are? (A sketch of the sort of thing I can capture from the console during a transfer is below.)

Vital statistics:
  • Unraid v6.12.6
  • Dell PowerEdge R720XD
  • Diagnostics are attached: unraid-diagnostics-20231228-1006.zip
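In case it helps anyone point me at the right metric, this is roughly the sort of thing I could leave running from the Unraid console while a big transfer is in progress - a sketch only; the log path, the 30-second interval, the 15-second timeouts, and the checks themselves are arbitrary choices of mine:

```
# Sample a few numbers every 30s and append them to a log on the array,
# to see what is exhausted when Docker stops responding.
LOG=/mnt/user/system/docker-stall-$(date +%F_%H%M).log
while true; do
  {
    echo "=== $(date) ==="
    cat /proc/loadavg                                    # load average, running/total tasks
    awk '/procs_blocked/ {print "tasks blocked on I/O:", $2}' /proc/stat
    grep -E 'MemAvailable|Dirty|Writeback' /proc/meminfo # memory headroom and pending writeback
    # If the Docker daemon itself is wedged, these would hang forever - bound them:
    timeout 15 docker ps --format '{{.Names}}: {{.Status}}' || echo "docker ps timed out"
    timeout 15 docker stats --no-stream                  || echo "docker stats timed out"
  } >> "$LOG"
  sleep 30
done
```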
  4. Yes, that's a reasonable thing to do on the first run, but on subsequent runs it's just frustrating, because you've already examined it and decided (against the clear advice of the plugin) to continue doing it anyway. A third option has occurred to me: perhaps have a button to suppress a particular warning, like Fix Common Problems does. Thanks - I'll take that up with Unraid if I can't figure it out.
  5. Hello all. First of all, thank you for this excellent plugin. @Squid's version was already pretty good, but you've taken it to the next level, @KluthR! I have a couple of tiny problems and one medium-sized one, though. Let's get the small ones out of the way.

Firstly, could I please prevail upon you not to raise notifications about configuration warnings at runtime? Every week I get notifications telling me "NOT stopping [container] because it should be backed up WITHOUT stopping!". I know! I chose that setting! In defiance of the very clear advice on the UI that this isn't recommended, too. By all means note in the log that the backup isn't stopping the container, or even raise a warning dialog when saving the settings, but please don't raise a notification. It doesn't tell me anything and is just noise. I'd turn off warnings, but sometimes there are warnings about things I don't already know about and want to do something about. Talking of which...

Last night, for several containers I got three warning notifications in addition to the configuration one mentioned above. They weren't obvious in the log because they were marked with the [ℹ️] that I presume denotes information, not the [⚠️] for warnings, making them hard to spot. They were of the format "[[timestamp]][ℹ️][[container]] Stopping [container]... Error while stopping container! Code: - trying 'docker stop' method".

That brings me on to my medium-sized problem: I got a warning for some, but not all, of my containers saying "Stopping [container]... Error while stopping container! Code: - trying 'docker stop' method". The next message notes that "That _seemed_ to work.". Indeed, if I SSH into Unraid and run "docker stop [container]", it does work. By what method does the plugin try to stop containers on its first attempt? And do you have any tips for how I can debug my containers to see why they don't stop on that first attempt? (A sketch of what I've been poking at is below.)
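For reference, this is the sort of thing I could poke at from the Unraid console while the backup runs - a sketch only, and I've no idea whether it matches whatever the plugin does internally; "mycontainer" is a placeholder:

```
# What stop signal and timeout is the container configured with?
docker inspect -f 'StopSignal={{.Config.StopSignal}} StopTimeout={{.Config.StopTimeout}}' mycontainer

# Watch what actually happens to the container while the backup job runs:
docker events --filter container=mycontainer

# Time a plain stop; if it takes longer than the stop timeout, the daemon falls back to SIGKILL:
time docker stop mycontainer
```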
  6. I nearly forgot to do this: now at over two weeks with no crashes. I'm going to declare this fixed by switching to ipvlan.
  7. 10 days of uptime with ipvlan. Then again, I've been here before...
  8. *sigh* It died again overnight. Let's give the ipvlan thing a go.
  9. So I've just passed seven days with almost everything up on 6.12.4. I wonder if it was related to the macvlan changes, since I am using macvlan for Docker with a parent interface that is a bridge. Does anyone know where I would have been expected to see the "macvlan call traces and crashes" referred to in the release notes, if that was the cause of my crashes? My guesses so far are sketched below.
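For what it's worth, these are the places I'd thought to look - guesses on my part rather than anything from the release notes, and if the crash takes the whole box down the trace may only survive where syslog mirroring or remote logging is enabled:

```
# Current boot's syslog (stock Unraid location):
grep -iE 'call trace|macvlan' /var/log/syslog

# Kernel ring buffer with human-readable timestamps:
dmesg -T | grep -iE 'call trace|macvlan'
```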
  10. Usually the custom eth0 network is the one to go for, because that allows the service to advertise itself on the network. What sort of "not working" are you getting?
  11. Deleting backups from Time Machine manually isn't all that hard. If you connect to your Time Machine share in the Finder, mount the disk image, and go into it, you'll see a bunch of folders named for the date and time of the backup they represent. You can then delete them as you see fit. As for Time Machine not recognising the reduced quota, try removing the backup location and adding it again, which you can do without disrupting your history. Although it will show the angst-inducing "waiting to complete first backup" message when you re-add the location, don't panic: it'll find the existing history once it does the first "new" backup.
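If you'd rather use the Terminal than the Finder, macOS's tmutil can do the same job once the backup disk image is mounted - a sketch only; the volume and machine names are illustrative, and the exact "tmutil delete" syntax differs between macOS versions:

```
# List the snapshots Time Machine knows about:
tmutil listbackups

# Older macOS: delete a snapshot by its path inside Backups.backupdb:
sudo tmutil delete "/Volumes/Time Machine Backups/Backups.backupdb/MyMac/2023-11-05-123456"

# Newer macOS (Big Sur and later): give the destination mount point and the timestamp:
sudo tmutil delete -d "/Volumes/Time Machine Backups" -t 2023-11-05-123456
```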
  12. I suspect this is a powers-of-10 vs powers-of-2 issue, and the 3TB you're declaring in the Docker parameters is slightly larger than the 3TB your drive actually holds. Time Machine's "do I need to delete old backups?" check looks only at the declared size and pays no regard to the underlying storage (it will also ignore any storage contents that aren't sparsebundles!). I suspect that if you declare a slightly smaller size, say 2.5TB*, Time Machine will realise it's full and start deleting. The arithmetic is sketched below.

* n TB is about 9.05% smaller than n TiB, but some wiggle room is probably a good idea.
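A back-of-the-envelope check of that footnote, in case it helps (plain bash arithmetic; the 3TB figure is just the example from this thread):

```
echo $((3 * 1000**4))   # 3 TB  (decimal) = 3000000000000 bytes
echo $((3 * 1024**4))   # 3 TiB (binary)  = 3298534883328 bytes
# 3000000000000 / 3298534883328 ~ 0.9095, i.e. n TB is about 9% smaller than n TiB,
# so a "3TB" declaration read as TiB overshoots what a 3TB drive actually holds.
```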
  13. I find it can fiddle around for a few minutes after each backup has completed, but it eventually stops and the disk spins down (sometimes just in time for the next backup to spin it back up 😕). I’ve no idea what it’s doing, though. Updating cached indexes or some other metadata perhaps? How long after your backup does the activity persist?
  14. How did you get on with this, @halorrr? Was resetting disk.cfg the solution? Although I don't know for certain, I would guess that the entries you ask about are the equivalents of the per-disk preferences you get when clicking on the disk name in the "Main" page of the web interface. How did you reset it? Copy the file over from a fresh install?
  15. Yes, this looks just like what I've got. I managed to get uptime to over seven days, slowly bringing back Docker services as I grew more confident. A few hours after I brought up my ADS-B constellation, the server crashed. Thinking I'd found the culprit, I brought up everything except ADS-B, but the server crashed again a few days later. I'm therefore of the opinion that the probability of crashing increases the more heavily-loaded the server is, but I still don't know an exact cause, let alone a solution. If anyone knows the answers to any of the questions I posed above, please speak up!