June 3, 2020

I have been having some issues with Docker where all containers would stop responding, and I was then unable to restart my machine cleanly to get things back up. I had one such occurrence this morning, and since the machine booted back up I have been unable to get Docker containers running at all. I have tried deleting the Docker image and starting fresh, with the same results. I get the following output from a "docker ps" command via SSH:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

I bought new hardware about a month ago and moved to it. It was working fine for a couple of weeks before I got a couple of these lockups, which resulted in my current state. I was also getting some BTRFS errors on my cache pool a few days ago (I thought due to some unclean shutdowns), so I recently formatted the cache and restored the data from a backup.

There is also a lot of output being displayed on the console screen, but I'm hoping most of it comes across in the attached diagnostics, as I only catch small portions on the screen. I do see a couple of lines pass by from time to time related to WireGuard, whose plugin I uninstalled a few weeks ago as I no longer needed it.

Hardware information:
CPU: Threadripper 3970X
RAM: Corsair Vengeance RGB Pro 64GB (4x16GB) DDR4 3200
Motherboard: Gigabyte TRX40 Aorus Pro WiFi (with WiFi card removed)
Storage: 2x WD SN550 SSD for cache, 3x Seagate IronWolf 10TB, and 1x Seagate EXOS 10TB
Add-in NIC: Mellanox ConnectX-2 10Gb SFP+

Plugins:
Unassigned Devices
Community Applications
Nerd Tools (iperf, screen, sshfs-fuse installed with this)
Unassigned Devices Plus
Wake On Lan Support

I usually run a dozen or so Docker containers, but none are able to run at this time. I think that is all the information I can provide, but I am a tad sleep deprived from a new child at home, so I apologize if I overlooked anything obvious.

tower-diagnostics-20200603-1112.zip
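
For anyone hitting the same "Cannot connect to the Docker daemon" message: it only says the client could not reach the socket, not why. A rough way to narrow it down is to check whether the socket file exists at all and whether anything is listening on it. The sketch below is only an illustration; the function name `docker_socket_status` and its return labels are made up for this example and are not part of Docker or Unraid.

```python
import os
import socket
import stat


def docker_socket_status(path="/var/run/docker.sock"):
    """Classify a Unix socket path (hypothetical helper, not a Docker API).

    Returns one of: 'missing', 'not-a-socket', 'unreachable', 'listening'.
    """
    if not os.path.exists(path):
        # No socket file: the daemon most likely never started.
        return "missing"
    if not stat.S_ISSOCK(os.stat(path).st_mode):
        # Something exists at the path, but it is not a Unix socket.
        return "not-a-socket"

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(2.0)
    try:
        client.connect(path)
    except OSError:
        # Stale socket file, permission problem, or a daemon that is not accepting.
        return "unreachable"
    finally:
        client.close()
    return "listening"


if __name__ == "__main__":
    print(docker_socket_status())
```

A result of "missing" or "unreachable" points at the daemon failing to start, which fits a corrupt Docker image or a deeper system problem. It does not say which of those it is, so the syslog in the diagnostics is still the place to look.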
June 3, 2020 Community Expert

The Docker image is corrupt, but there are other errors in the syslog that suggest a possible hardware issue. I would start with memtest.
June 3, 2020

I'm having a similar issue with the glass-isc-dhcp Docker container. I start it and after 1 or 2 minutes it crashes. These are the errors I'm getting from the logs: "Error: Command failed: ./bin/dhcpd-pools -c /etc/dhcp/dhcpd.conf -l /var/lib/dhcp/dhcpd.leases -f j -A -s e" and "npm ERR! Failed at the [email protected] start script."
June 4, 2020 Author

Thanks for the pointer. It does appear I have a bad stick of RAM. I've swapped in some lesser RAM while I try to get a replacement, and have begun testing with the temporary setup.