dockers stopped

August 20, 2024

I'm having an issue where all Docker containers stop and fail to start again unless I restart the array. Before this I had frequent crashes caused by a faulty PCIe-to-SATA expansion card. After it was removed the system booted.

tower-diagnostics-20240821-0714.zip

August 21, 2024

There are BTRFS errors:

Aug 19 14:06:31 Tower kernel: btrfs_validate_extent_buffer: 1850 callbacks suppressed
Aug 19 14:06:31 Tower kernel: BTRFS warning (device nvme1n1p1): checksum verify failed on logical 966566641664 mirror 1 wanted 0xc1a226c5 found 0x6bade2d5 level 0
Aug 19 14:06:31 Tower kernel: BTRFS warning (device nvme1n1p1): checksum verify failed on logical 966566641664 mirror 2 wanted 0xc1a226c5 found 0xee1c9cd7 level 0
Aug 19 14:06:31 Tower kernel: BTRFS warning (device nvme1n1p1): checksum verify failed on logical 966566641664 mirror 1 wanted 0xc1a226c5 found 0x6bade2d5 level 0
Aug 19 14:06:31 Tower kernel: BTRFS warning (device nvme1n1p1): checksum verify failed on logical 966566641664 mirror 2 wanted 0xc1a226c5 found 0xee1c9cd7 level 0
Aug 19 14:06:31 Tower kernel: I/O error, dev loop2, sector 20264160 op 0x0:(READ) flags 0x1000 phys_seg 4 prio class 0
Aug 19 14:06:31 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 86, rd 90, flush 0, corrupt 0, gen 0
Aug 19 14:06:31 Tower kernel: BTRFS warning (device nvme1n1p1): checksum verify failed on logical 966566641664 mirror 1 wanted 0xc1a226c5 found 0x6bade2d5 level 0
Aug 19 14:06:31 Tower kernel: BTRFS warning (device nvme1n1p1): checksum verify failed on logical 966566641664 mirror 2 wanted 0xc1a226c5 found 0xee1c9cd7 level 0
Aug 19 14:06:31 Tower kernel: loop: Write error at byte offset 10375249920, length 4096.
Aug 19 14:06:31 Tower kernel: I/O error, dev loop2, sector 20264160 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
Aug 19 14:06:31 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 87, rd 90, flush 0, corrupt 0, gen 0

They are probably in your docker image.

You should delete and recreate it. If it continues, you should do a memcheck.
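If you want to see which device the error counters belong to without reading every line, a small script can pull out the last `errs:` line per device. This is only a sketch: the function name `latest_btrfs_counters` is made up for illustration, and the regex assumes the exact message format shown in the log above.

```python
import re

# Matches the per-device counter line the kernel prints, e.g.
# "BTRFS error (device loop2): bdev /dev/loop2 errs: wr 87, rd 90, flush 0, corrupt 0, gen 0"
# Assumption: the message format is exactly as in the excerpt above.
COUNTER_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def latest_btrfs_counters(lines):
    """Return {device: {counter: value}} from the last counter line seen per device."""
    latest = {}
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            fields = match.groupdict()
            dev = fields.pop("dev")
            latest[dev] = {name: int(value) for name, value in fields.items()}
    return latest


# Two lines copied from the log above.
sample = [
    "Aug 19 14:06:31 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 86, rd 90, flush 0, corrupt 0, gen 0",
    "Aug 19 14:06:31 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 87, rd 90, flush 0, corrupt 0, gen 0",
]
counters = latest_btrfs_counters(sample)
print(counters)
```

Here the counters are reported against `loop2`, while the checksum warnings are reported against `nvme1n1p1`.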

August 21, 2024

Run a pool scrub and post the results and new diags.

August 21, 2024

Author

The SSD with docker on it gave errors:

UUID:             80bf3011-b153-4b87-9e8c-711b01222bd3
Scrub started:    Wed Aug 21 14:40:36 2024
Status:           finished
Duration:         0:00:28
Total to scrub:   69.41GiB
Rate:             2.48GiB/s
Error summary:    verify=8 csum=1
  Corrected:      0
  Uncorrectable:  9
  Unverified:     0
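In case it helps anyone compare scrub runs, here is a rough sketch that pulls the three counts out of scrub output like the above. The function name `parse_scrub_summary` is hypothetical, and it assumes the `Corrected:`, `Uncorrectable:` and `Unverified:` labels appear exactly as they do in this output.

```python
import re


def parse_scrub_summary(text):
    """Extract the corrected/uncorrectable/unverified counts from scrub status text."""
    counts = {}
    for key in ("Corrected", "Uncorrectable", "Unverified"):
        # Assumption: each label is followed by a colon and an integer.
        match = re.search(rf"{key}:\s*(\d+)", text)
        if match:
            counts[key.lower()] = int(match.group(1))
    return counts


# Error summary copied from the scrub result above.
summary = "Error summary: verify=8 csum=1 Corrected: 0 Uncorrectable: 9 Unverified: 0"
counts = parse_scrub_summary(summary)
print(counts)

if counts.get("uncorrectable", 0) > 0:
    print("Scrub found errors it could not repair")
```

With 9 uncorrectable errors and 0 corrected, the scrub could not repair anything.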

tower-diagnostics-20240822-0741.zip

August 22, 2024

Look at the syslog for the list of corrupt files, delete or restore them from a backup, then run another scrub to confirm 0 errors.

August 22, 2024

Author

I deleted the docker image.

The errors are still present on the one SSD (the one that contained the docker image).

August 22, 2024

There are other corrupt files; they are listed in the syslog, e.g.:

path: appdata/binhex-plexpass/Plex Media Server/Media/localhost/3/70320c9411d2e1173ca6a8e17276181f3667518.bundle/Contents/Thumbnails/thumb1.jpg
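To collect every affected file in one go, something like the sketch below could be run over the syslog. It is only an illustration: `corrupt_paths` is a made-up name, and it assumes each corrupt file shows up as a `path: ...)` fragment at the end of a syslog line, which is all that can be inferred from the example above.

```python
import re

# Assumption: the file path is the last thing on the line, after "path: ",
# and is closed by a single ")" at the end of the line.
PATH_RE = re.compile(r"path: (.+)\)\s*$")


def corrupt_paths(lines):
    """Return the unique paths mentioned in the given syslog lines, in first-seen order."""
    seen = []
    for line in lines:
        match = PATH_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# The fragment quoted above, repeated to show that duplicates are collapsed.
fragment = (
    "path: appdata/binhex-plexpass/Plex Media Server/Media/localhost/3/"
    "70320c9411d2e1173ca6a8e17276181f3667518.bundle/Contents/Thumbnails/thumb1.jpg)"
)
paths = corrupt_paths([fragment, fragment, "unrelated line"])
for path in paths:
    print(path)
```

The resulting list is what needs to be deleted or restored from backup before the next scrub.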

August 23, 2024

Author

The SSD became corrupted with a "no file system" error, so I formatted it and restored from backup. Scrub is no longer giving an error. However, the system went offline last night, and plugging in an external monitor doesn't detect a signal until reboot. So it seems the PCIe expansion card was not the issue.

I ran memtest86 and it passed. I can't run the unraid memcheck because it's disabled without a graphics card.

Attached is the diagnostic taken after booting again following the crash.

I have the gen 13 i3 with the latest BIOS. It's not meant to be part of the Intel voltage issue, so it shouldn't be causing the crash...

August 23, 2024

Author

diagnostic

tower-diagnostics-20240823-0819.zip

August 23, 2024

Enable the syslog server and post that after a crash, but if it's a hardware problem there likely won't be anything relevant logged.

August 23, 2024

6 hours ago, Azura said:

I can't run the unraid memcheck because it's disabled without a graphics card.

It is worth pointing out that there is now the "Live Memory Tester" plugin that can be run while Unraid is running. Passing its tests does not necessarily mean you have no RAM issue, but failing any test definitely does.

August 25, 2024

Author

Just had a reset crash, not the usual type of crash

tower-diagnostics-20240825-1958.zip

syslog

August 25, 2024

Author

4 minutes ago, Azura said:

Just had a reset crash. After downloading the logs the system went down again. If there's nothing in the logs I'll start swapping out hardware.

tower-diagnostics-20240825-1958.zip

syslog

August 25, 2024

Syslog you posted is the normal one after a reboot, not the persistent syslog.

August 26, 2024

Author

How do I find the persistent one? I ticked "Mirror to flash" and the guide says it's in "/boot/logs". There is no boot folder, so I'm guessing it's the root folder of the USB.

Capturing diagnostic information | Unraid Docs

August 26, 2024

Author

August 26, 2024

To download to the share you need to set the local server IP in the Remote syslog server.

August 28, 2024

Author

Thanks

It crashed again after 3 days, logs attached. It seems odd for this to be hardware related if it happens every 3 days.

syslog-192.168.0.121.log

August 28, 2024

Author

Seems like this is the crash point:

Aug 29 05:09:59 Tower autofan: Highest disk temp is 41C, adjusting fan speed from: 175 (68% @ 1340rpm) to: 150 (58% @ 1171rpm)
Aug 29 05:20:01 Tower crond[1707]: failed parsing crontab for user root: Invalid frequency setting of /usr/local/emhttp/plugins/ca.update.applications/scripts/updateApplications.php >/dev/null 2>&1
Aug 29 05:29:16 Tower emhttpd: spinning down /dev/sde
Aug 29 05:30:25 Tower emhttpd: spinning down /dev/sdc
Aug 29 05:32:42 Tower emhttpd: spinning down /dev/sdd
Aug 29 05:32:45 Tower emhttpd: spinning down /dev/sdf
Aug 29 05:33:39 Tower kernel: mdcmd (57): set md_write_method 0
Aug 29 05:33:39 Tower kernel:
Aug 29 05:35:05 Tower autofan: Highest disk temp is 0C, adjusting fan speed from: 150 (58% @ 1167rpm) to: OFF (0% @ 0rpm)
Aug 29 06:05:01 Tower crond[1707]: failed parsing crontab for user root: Invalid frequency setting of /usr/local/emhttp/plugins/ca.update.applications/scripts/updateApplications.php >/dev/null 2>&1
Aug 29 06:55:01 Tower crond[1707]: failed parsing crontab for user root: Invalid frequency setting of /usr/local/emhttp/plugins/ca.update.applications/scripts/updateApplications.php >/dev/null 2>&1
Aug 29 07:00:25 Tower kernel: veth193667f: renamed from eth0
Aug 29 07:00:37 Tower kernel: eth0: renamed from vetha3fcec2
Aug 29 07:00:40 Tower kernel: veth0cba01d: renamed from eth0
Aug 29 07:00:41 Tower kernel: eth0: renamed from veth0273903
Aug 29 07:00:45 Tower kernel: vethd666a3b: renamed from eth0
Aug 29 07:00:47 Tower kernel: eth0: renamed from veth7589674
Aug 29 07:00:50 Tower kernel: docker0: port 2(vethaad0a38) entered disabled state
Aug 29 07:00:50 Tower kernel: vethade13f7: renamed from eth0
Aug 29 07:00:50 Tower kernel: docker0: port 2(vethaad0a38) entered disabled state
Aug 29 07:00:50 Tower kernel: vethaad0a38 (unregistering): left allmulticast mode
Aug 29 07:00:50 Tower kernel: vethaad0a38 (unregistering): left promiscuous mode
Aug 29 07:00:50 Tower kernel: docker0: port 2(vethaad0a38) entered disabled state
Aug 29 07:01:04 Tower emhttpd: read SMART /dev/sdd
Aug 29 07:01:05 Tower kernel: docker0: port 2(veth11fffe6) entered blocking state
Aug 29 07:01:05 Tower kernel: docker0: port 2(veth11fffe6) entered disabled state
Aug 29 07:01:05 Tower kernel: veth11fffe6: entered allmulticast mode
Aug 29 07:01:05 Tower kernel: veth11fffe6: entered promiscuous mode
Aug 29 07:01:05 Tower emhttpd: read SMART /dev/sdf
Aug 29 07:01:05 Tower kernel: eth0: renamed from veth4dabbdb
Aug 29 07:01:05 Tower kernel: docker0: port 2(veth11fffe6) entered blocking state
Aug 29 07:01:05 Tower kernel: docker0: port 2(veth11fffe6) entered forwarding state
Aug 29 07:01:09 Tower kernel: veth4d63e1d: renamed from eth0
Aug 29 07:01:09 Tower kernel: docker0: port 1(vethee93c1c) entered disabled state
Aug 29 07:01:09 Tower kernel: docker0: port 1(vethee93c1c) entered disabled state
Aug 29 07:01:09 Tower kernel: vethee93c1c (unregistering): left allmulticast mode
Aug 29 07:01:09 Tower kernel: vethee93c1c (unregistering): left promiscuous mode
Aug 29 07:01:09 Tower kernel: docker0: port 1(vethee93c1c) entered disabled state
Aug 29 07:01:31 Tower kernel: docker0: port 1(vethe9bcb46) entered blocking state
Aug 29 07:01:31 Tower kernel: docker0: port 1(vethe9bcb46) entered disabled state
Aug 29 07:01:31 Tower kernel: vethe9bcb46: entered allmulticast mode
Aug 29 07:01:31 Tower kernel: vethe9bcb46: entered promiscuous mode
Aug 29 07:01:32 Tower kernel: eth0: renamed from vethf6a58a6
Aug 29 07:01:32 Tower kernel: docker0: port 1(vethe9bcb46) entered blocking state
Aug 29 07:01:32 Tower kernel: docker0: port 1(vethe9bcb46) entered forwarding state

Aug 29 07:50:01 Tower rc.rsyslogd: Syslog server daemon... Started.
Aug 29 07:50:01 Tower file.activity: Starting File Activity
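For finding the same spot in a longer persistent syslog, here is a rough sketch that reports the last line logged before each restart and how long the log was silent. It assumes the `rc.rsyslogd: Syslog server daemon... Started.` line is the first entry after a reboot, as it appears to be in the excerpt above. The function names are made up, the year is passed in because syslog lines don't carry one, and month names are parsed with the default English locale.

```python
from datetime import datetime

# Assumption: this message is the first thing logged after the server comes back up.
RESTART_MARKER = "rc.rsyslogd: Syslog server daemon... Started."


def parse_ts(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' syslog timestamp, or return None."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None  # blank or truncated line


def find_restarts(lines, year=2024):
    """Return (last line before restart, silent period) for each restart marker found."""
    restarts = []
    prev_ts, prev_line = None, None
    for line in lines:
        ts = parse_ts(line, year)
        if ts is None:
            continue
        if RESTART_MARKER in line and prev_ts is not None:
            restarts.append((prev_line, ts - prev_ts))
        prev_ts, prev_line = ts, line
    return restarts


# Three lines copied from the excerpt above, with the blank line between them.
sample = [
    "Aug 29 07:01:32 Tower kernel: docker0: port 1(vethe9bcb46) entered forwarding state",
    "",
    "Aug 29 07:50:01 Tower rc.rsyslogd: Syslog server daemon... Started.",
    "Aug 29 07:50:01 Tower file.activity: Starting File Activity",
]
restarts = find_restarts(sample)
for last_line, silence in restarts:
    print(f"silent for {silence} after: {last_line}")
```

Keying on the restart marker rather than the largest gap matters here, because the excerpt also has a quiet stretch of 50 minutes between 06:05 and 06:55 that is not a crash.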

August 29, 2024

Unfortunately there's nothing relevant logged, so this can be a hardware issue. One thing you can try is to boot the server in safe mode with all docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning on the other services one by one.

September 4, 2024

Author

Thanks,

I ended up deleting all plugins except the basic ones you need, and the crashing stopped. It has run longer than usual. The last plugin I installed was the fan speed plugin, so I have a feeling it was that one.

September 12, 2024

Author

Scratch that, the issue suddenly returned and the system went into an on/off loop. I swapped the gen 12 cpu with a gen 12 computer and it's running again. It will take a while to work out the cause, but I'm considering that it may be instability with gen 13 Intel on older motherboards. I'm reading they're not as stable.
