bauercole Posted June 3, 2020

I have been having some issues with Docker where all containers would stop responding, and then I was unable to restart my machine cleanly to get things back up. I had one such occurrence this morning, and since the machine booted back up I have been unable to get Docker containers running at all. I have tried deleting the Docker image and starting fresh, with the same results. I get the following output from a "docker ps" command via ssh:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

About a month ago I bought new hardware and moved to it. It was working fine for a couple of weeks before I got a couple of these lockups, which resulted in my current state. I was also getting some BTRFS errors on my cache pool a few days ago (I thought due to some unclean shutdowns), so I recently formatted the cache and restored the data from a backup. There is also a lot of output being displayed on the console screen, but I'm hoping most of that comes across in the attached diagnostics, as I only catch small portions of it on the screen. I do see a couple of lines pass by from time to time related to WireGuard, even though I uninstalled that plugin a few weeks ago because I no longer needed it.

Hardware information:
CPU: Threadripper 3970X
RAM: Corsair Vengeance RGB Pro 64GB (4x16GB) DDR4 3200
Motherboard: Gigabyte TRX40 Aorus Pro WiFi (with wifi card removed)
Storage: 2x WD SN550 SSD for cache, 3x Seagate IronWolf 10TB, and 1x Seagate EXOS 10TB
Add-in NIC: Mellanox ConnectX-2 10Gb SFP+

Plugins:
Unassigned Devices
Community Applications
Nerd Tools (iperf, screen, sshfs-fuse installed with this)
Unassigned Devices Plus
Wake On Lan Support

I usually run a dozen or so Docker containers, but none are able to run at this time. I think that is all the information I can provide, but I am a tad sleep deprived from a new child at home, so I apologize if I overlooked anything obvious.
Attachment: tower-diagnostics-20200603-1112.zip
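For anyone landing here with the same "Cannot connect to the Docker daemon" message, a quick way to tell "the socket file is not there" apart from "the socket is there but nothing is answering" is to probe it directly. The following is only a minimal sketch in Python, not something from the original post: the function name `probe_docker_socket` and its status strings are made up for illustration, and the only detail taken from the thread is the socket path in the error message.

```python
import os
import socket
import stat

# Path taken from the error message quoted in the post above.
DOCKER_SOCKET = "/var/run/docker.sock"


def probe_docker_socket(path=DOCKER_SOCKET, timeout=2.0):
    """Return a short status string for the Unix socket at `path`.

    Hypothetical helper for illustration; the status names are arbitrary.
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"            # no file at that path: daemon never created it
    except OSError:
        return "unreadable"         # e.g. a parent directory is not accessible

    if not stat.S_ISSOCK(mode):
        return "not-a-socket"       # something exists there, but it is not a socket

    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except PermissionError:
        return "permission-denied"  # socket exists, current user may not use it
    except OSError:
        return "not-listening"      # stale socket file, no daemon behind it
    finally:
        sock.close()
    return "listening"


if __name__ == "__main__":
    print(probe_docker_socket())
```

A result of "missing" or "not-listening" only says the daemon is not up; it does not say why. In this thread the cause turned out to be below Docker entirely, so the syslog and a memory test are still the places to look.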
JorgeB Posted June 3, 2020

The Docker image is corrupt, but there are other issues in the syslog that suggest a possible hardware problem. I would start with memtest.
rojarrolla Posted June 3, 2020

I'm having a similar issue with the glass-isc-dhcp Docker container. I start it, and after 1 or 2 minutes it crashes. These are the errors I'm getting from the logs:

"Error: Command failed: ./bin/dhcpd-pools -c /etc/dhcp/dhcpd.conf -l /var/lib/dhcp/dhcpd.leases -f j -A -s e"

and

"npm ERR! Failed at the [email protected] start script."
bauercole (Author) Posted June 4, 2020

Thanks for the pointer. It does appear I have a bad stick of RAM. I've swapped in some lesser RAM while I try to get a replacement, and have begun testing with the temporary setup.