Docker Containers dropping off and crashing

November 17, 2023

Apologies if this is a lot, but...

So ever since I managed to fix my docker image issue -

For the past few weeks, the containers that access the internet (specifically Plex, Bitwarden and Nextcloud) have been disconnecting and reconnecting constantly (ECONNREFUSED). Bitwarden also keeps saying that my user ID and password are wrong.

4 days ago I decided to delete the docker image again and reinstall my previous apps, but now my containers become unresponsive, and after about a day the Unraid web UI becomes unresponsive too. I also realized that ever since the Appdata Backup update my backups have been failing, so I don't have a proper appdata backup, though some of the appdata is probably still salvageable. Should I get ready to nuke the appdata and redo all my Docker containers, or is there a way out of this? 😅

P.S. I'm using pfSense as a VM on a Proxmox mini PC with 2x 1 Gbps WAN connections, in case this matters.

tower-diagnostics-20231117-1204.zip

November 17, 2023

Community Expert

Nov 17 10:43:26 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
Nov 17 10:43:26 Tower kernel: br0: port 1(eth0) entered blocking state
Nov 17 10:43:26 Tower kernel: br0: port 1(eth0) entered forwarding state
Nov 17 10:43:27 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down
Nov 17 10:43:27 Tower kernel: br0: port 1(eth0) entered disabled state
Nov 17 10:43:40 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
Nov 17 10:43:40 Tower kernel: br0: port 1(eth0) entered blocking state
Nov 17 10:43:40 Tower kernel: br0: port 1(eth0) entered forwarding state
Nov 17 10:43:40 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down
Nov 17 10:43:41 Tower kernel: br0: port 1(eth0) entered disabled state
Nov 17 10:44:23 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
Nov 17 10:44:23 Tower kernel: br0: port 1(eth0) entered blocking state
Nov 17 10:44:23 Tower kernel: br0: port 1(eth0) entered forwarding state
Nov 17 10:44:24 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down
Nov 17 10:44:24 Tower kernel: br0: port 1(eth0) entered disabled state
Nov 17 10:44:26 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
Nov 17 10:44:26 Tower kernel: br0: port 1(eth0) entered blocking state
Nov 17 10:44:26 Tower kernel: br0: port 1(eth0) entered forwarding state
Nov 17 10:44:27 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down
Nov 17 10:44:27 Tower kernel: br0: port 1(eth0) entered disabled state
Nov 17 10:44:56 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
Nov 17 10:44:56 Tower kernel: br0: port 1(eth0) entered blocking state
Nov 17 10:44:56 Tower kernel: br0: port 1(eth0) entered forwarding state
Nov 17 10:44:57 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down
Nov 17 10:44:57 Tower kernel: br0: port 1(eth0) entered disabled state
Nov 17 10:45:02 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Nov 17 10:45:02 Tower kernel: br0: port 1(eth0) entered blocking state

The NIC keeps dropping the link and sometimes links at only 100 Mbit/s, which suggests a cable, switch, or NIC problem.
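To see how often the link flaps over a longer syslog, the pattern above can be counted with a short script. This is only a sketch: it assumes the e1000e message format shown in the excerpt, and the function name `summarize_link_events` is made up for illustration. The sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the e1000e link messages in the format seen in the syslog excerpt.
LINK_RE = re.compile(r"(\S+): NIC Link is (?:Up (\d+) Mbps|Down)")

def summarize_link_events(lines):
    """Return (number of link-down events, Counter of negotiated speeds in Mbps)."""
    downs = 0
    speeds = Counter()
    for line in lines:
        match = LINK_RE.search(line)
        if not match:
            continue  # not a link up/down message
        if match.group(2) is None:
            downs += 1  # "NIC Link is Down"
        else:
            speeds[int(match.group(2))] += 1  # "NIC Link is Up <speed> Mbps"
    return downs, speeds

SAMPLE = [
    "Nov 17 10:44:26 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None",
    "Nov 17 10:44:27 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down",
    "Nov 17 10:44:56 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None",
    "Nov 17 10:44:57 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Down",
    "Nov 17 10:45:02 Tower kernel: e1000e 0000:00:1f.6 eth0: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None",
]

downs, speeds = summarize_link_events(SAMPLE)
print(f"link down events: {downs}")
for speed, count in sorted(speeds.items()):
    print(f"linked at {speed} Mbps: {count} time(s)")
```

On a healthy gigabit link, the expectation would be no down events and only 1000 Mbps entries; repeated 100 Mbps negotiations are what pointed at the cable here.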

November 18, 2023

Author

@JorgeB Oh great! Yes, this seems to be one of the issues too, and changing the cable seems to have fixed the speed (I was wondering why I was capped at 100 Mbps for Usenet).

I'm still getting crashes of Plex and other Docker containers though, with some showing 'Uptime: 15 hours (unhealthy)', and when I try to restart such a container it stops and will not start again.
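The "(unhealthy)" label comes from the container's Docker health check, and the last check result usually says why it failed. A minimal sketch, assuming the JSON printed by `docker inspect --format '{{json .State.Health}}' <container>` has been captured as a string. The helper name `summarize_health` and the sample JSON are made up for illustration and are not taken from the diagnostics in this thread.

```python
import json

def summarize_health(health_json: str) -> str:
    """Summarize the JSON from `docker inspect --format '{{json .State.Health}}'`."""
    health = json.loads(health_json)
    if health is None:
        # Containers without a HEALTHCHECK report null here.
        return "no health check defined"
    status = health.get("Status", "unknown")
    streak = health.get("FailingStreak", 0)
    log = health.get("Log") or []
    last = log[-1] if log else {}
    output = (last.get("Output") or "").strip()
    return f"{status}, failing streak {streak}, last exit code {last.get('ExitCode')}: {output}"

# Hypothetical sample, shaped like Docker's health state but invented for this example.
SAMPLE = json.dumps({
    "Status": "unhealthy",
    "FailingStreak": 3,
    "Log": [
        {"Start": "2023-11-18T10:00:00Z", "End": "2023-11-18T10:00:05Z",
         "ExitCode": 1, "Output": "connection refused\n"},
    ],
})

print(summarize_health(SAMPLE))
```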

Edited November 18, 2023 by jvlarc

November 18, 2023

Community Expert

Reboot to clear the logs and post new diags after the problem.

November 18, 2023

Author

Here are the diags after restarting. None of the containers can be started unless the docker image is deleted and redownloaded, and even then they only stay up temporarily.

tower-diagnostics-20231119-0015.zip

November 19, 2023

Community Expert

18 hours ago, jvlarc said:

all containers cannot be started unless docker image deleted and redownloaded

Do you mean the docker.img or just deleting and re-adding the containers? I don't see a new docker image being created, nor any relevant errors in the diags.

November 19, 2023

Author

Docker.img

November 19, 2023

Community Expert

I don't see a new docker image being created in those diags, but since you're using a docker folder I'm not sure that would show up. In any case, it's not a docker.img.

November 20, 2023

Author

Sorry, I did mean the docker directory; I was on mobile earlier. I will try to delete it and reinstall again.

November 20, 2023

Author

I just checked my ZFS pool and I'm getting an error message. I'm running a scrub now to try to fix the error.

Quote

  pool: cache
 state: ONLINE
status: One or more devices has experienced an error resulting in data corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub in progress since Mon Nov 20 13:03:40 2023
        252G scanned at 0B/s, 161G issued at 241M/s, 252G total
        0B repaired, 63.90% done, 00:06:26 to go
config:

        NAME           STATE     READ WRITE CKSUM
        cache          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            /dev/sdb1  ONLINE       0     0     6
            /dev/sdc1  ONLINE       0     0     6

errors: 1 data errors, use '-v' for a list

I have tried deleting and recreating the docker directory, and it crashes a few minutes in.

tower-diagnostics-20231120-1316.zip
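The per-device counters in the pool status are easy to miss, so here is a small sketch that pulls out every device row with a non-zero READ, WRITE or CKSUM count. It assumes plain integer counters as in the output quoted in this post (very large counts, which zpool may abbreviate, are not handled), and the name `devices_with_errors` is made up for illustration.

```python
import re

# One device row: name, state, then the READ, WRITE and CKSUM counters.
DEVICE_ROW = re.compile(
    r"(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)\s+(\d+)\s+(\d+)\s+(\d+)"
)

def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for rows with any non-zero counter."""
    flagged = []
    for name, state, read, write, cksum in DEVICE_ROW.findall(status_text):
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            flagged.append((name, state) + counts)
    return flagged

# The config section from the status output quoted in this post.
STATUS = """
NAME STATE READ WRITE CKSUM
cache ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
/dev/sdb1 ONLINE 0 0 6
/dev/sdc1 ONLINE 0 0 6
"""

for name, state, read, write, cksum in devices_with_errors(STATUS):
    print(f"{name}: state={state} read={read} write={write} cksum={cksum}")
```

Here both mirror members show the same checksum count. That may hint at a cause upstream of the individual disks, such as memory, but this is an inference and not something the output itself proves.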

November 20, 2023

Author

Here are the latest diags, taken after the containers crashed/blocked, which prevents me from running them.

tower-diagnostics-20231120-1609.zip

November 20, 2023

Community Expert

4 hours ago, jvlarc said:

resulting in data corruption

This is a bad sign and usually the result of a hardware issue. Start by running memtest.

November 20, 2023

Author

I actually ran memtest a few weeks ago for 3 passes, but I recently switched my cache over to ZFS. I also just booted into safe mode, checked with 'zpool status -x', and found the corrupted file, which was a movie. After deleting it, the pool reports no errors and I'm able to delete the docker directory properly. However, the problem still persists.

Will run a memtest now and update.

November 23, 2023

Author

Quick update: I ran a parity check over the past 2 days and it corrected 666 errors. I then did the same thing as before and deleted the docker folder, but I'm still getting the same issue of containers crashing and becoming unavailable.

I ran memtest last night and it completed this morning without any errors, too.

What else should I do?

Edited November 23, 2023 by jvlarc

November 23, 2023

Community Expert

I would try with just one stick of RAM; if the issues continue, try a different one. That will basically rule out bad RAM, or board issues when fully loaded with DIMMs.

November 23, 2023

Author

Gotcha, let me try that, but I will only be able to update in 2 weeks' time. Thank you as always for the help, @JorgeB.

Also, is there a chance that I've got some corrupted appdata somewhere? Should I nuke the whole appdata and restart?

Edited November 23, 2023 by jvlarc

November 23, 2023

Community Expert

3 minutes ago, jvlarc said:

Should I nuke the whole app data and restart?

If that's possible it's worth a try.

November 25, 2023

Author

OK, so I didn't nuke my entire appdata, but deleting the Plex appdata seems to have stopped the crashes. It has stayed up, but it still disconnects and reconnects after some time. Looking at the log files, they keep showing ports being blocked and then unblocked constantly.

Could it be some issue with my reverse proxy settings?

November 27, 2023

Community Expert

Could be.
