sohailoo Posted September 24, 2023 (edited)
I have some containers that stop once or twice a day for a reason I can't figure out. Most of these containers are not related to each other in any way. The attached syslog is from maybe a minute or two after it happened. Ignore all btrfs-related issues, since I already solved them (changed to xfs). There are also errors for 2 scripts, run.sh and kill.sh; ignore them too, since they're not related to the issue (it was happening before I added them) and I have already removed those scripts.
tower-diagnostics-20230925-0209.zip
Squid Posted September 24, 2023
It would be worthwhile to run memtest from the boot menu for a minimum of a couple of passes. If you boot via UEFI, you will need to temporarily switch to legacy boot to run memtest. You have a whack of segfaults, which are often caused by bad memory and could also result in what you are seeing.
sohailoo (Author) Posted September 25, 2023
23 hours ago, Squid said: It would be worthwhile to run memtest from the boot menu for a minimum of a couple of passes. If you boot via UEFI, you will need to temporarily switch to legacy boot to run memtest. You have a whack of segfaults, which are often caused by bad memory and could also result in what you are seeing.
Yeah, it appears that I have a faulty stick. I replaced the RAM and tested it to make sure there are no errors; now I'm getting a ton of errors in the Unraid logs.
tower-diagnostics-20230926-0123.zip
JorgeB Posted September 26, 2023
Looks more like a power/connection issue with parity; it could also be this: https://forums.unraid.net/topic/103938-69x-lsi-controllers-ironwolf-disks-disabling-summary-fix/
sohailoo (Author) Posted October 1, 2023
On 9/26/2023 at 10:44 AM, JorgeB said: Looks more like a power/connection issue with parity, could also be this: https://forums.unraid.net/topic/103938-69x-lsi-controllers-ironwolf-disks-disabling-summary-fix/
I've followed the guide and replaced the cable to the parity disk just in case. It worked fine for 4 days, but today at exactly 2:00 am (syslog time) it happened again and 10-20 containers stopped. I noticed immediately because I had a code server tab open, and while I was working it showed a message that it had disconnected.
tower-diagnostics-20231002-0202.zip
JorgeB Posted October 2, 2023
I would try to disable any auto update/auto cleanup/backup or any other user scripts you may have running at that time.
itimpi Posted October 2, 2023
9 hours ago, sohailoo said: I've done the guide and replaced the cable to the parity just in case. worked fine for 4 days but today exactly at 2:00 am (syslog time) it happened again and 10-20 containers stopped, i noticed it immediately because i had a tab open of code server and it gave me a message while working that it disconnected tower-diagnostics-20231002-0202.zip
This suggests you may have something scheduled to run at that time (e.g. appdata backup), so it is worth checking for that. You also seem to have an invalid entry in /etc/cron.d/root, so it is worth checking what that is.
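One way to follow up on the cron suggestion is to scan the crontab text for entries that would fire at 02:00 and for lines that do not parse. The following is only a rough sketch in Python: the function names `field_matches` and `scan_crontab` are made up for illustration, it assumes the common five-field schedule followed by a command, it only looks at the minute and hour fields, and it skips `@daily`-style shortcuts rather than evaluating them.

```python
import re

# One comma-separated part of a cron field: "*", "5", "1-3", optionally "/step"
FIELD_RE = re.compile(r"^(\*|\d+(-\d+)?)(/\d+)?$")


def field_matches(field, value, lo, hi):
    """Return True if a cron field such as '*/30', '0' or '1-3,7' matches value."""
    for part in field.split(","):
        if not FIELD_RE.match(part):
            raise ValueError(f"bad cron field: {field!r}")
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if step == 0:
            raise ValueError(f"zero step in cron field: {field!r}")
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = int(base)
            # "5/10" means "from 5 up to the maximum, every 10"
            end = hi if step_text else start
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


def scan_crontab(text, hour=2, minute=0):
    """Return (entries firing at hour:minute, lines that do not parse)."""
    hits, invalid = [], []
    for lineno, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        if re.match(r"^[A-Za-z_][A-Za-z0-9_]*\s*=", stripped):
            continue  # environment assignment such as PATH=...
        if stripped.startswith("@"):
            continue  # @daily, @reboot, ...: not evaluated in this sketch
        parts = stripped.split(None, 5)
        if len(parts) < 6:
            invalid.append((lineno, line))  # too few fields for schedule + command
            continue
        try:
            if (field_matches(parts[0], minute, 0, 59)
                    and field_matches(parts[1], hour, 0, 23)):
                hits.append((lineno, line))
        except ValueError:
            invalid.append((lineno, line))
    return hits, invalid
```

To try it on the server, the contents of /etc/cron.d/root could be read into a string and passed to `scan_crontab`. Anything in the second list is a candidate for the invalid entry mentioned above, and anything in the first list is a candidate for whatever is stopping the containers at 2:00 am.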
Vr2Io Posted October 2, 2023
You should check each Docker container's log to identify the reason it stopped.
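Besides the logs themselves, the state that Docker records for each container can narrow down the stop reason. Below is a minimal Python sketch that summarises the JSON printed by `docker inspect`. The function name `summarize_stops` and the wording of the hints are illustrative assumptions; the `State` fields it reads (`Running`, `ExitCode`, `OOMKilled`, `FinishedAt`) are part of the standard inspect output, and the exit-code hints rely on the usual 128 + signal-number convention.

```python
import json


def summarize_stops(inspect_json):
    """Summarise stopped containers from the JSON text printed by `docker inspect`."""
    rows = []
    for container in json.loads(inspect_json):
        state = container.get("State", {})
        if state.get("Running"):
            continue  # only stopped containers are of interest
        code = state.get("ExitCode")
        if state.get("OOMKilled"):
            hint = "killed by the out-of-memory killer"
        elif code == 137:
            hint = "SIGKILL (128+9): forcibly killed"
        elif code == 143:
            hint = "SIGTERM (128+15): asked to stop, e.g. by docker stop or a script"
        elif code == 139:
            hint = "SIGSEGV (128+11): the process segfaulted"
        elif code == 0:
            hint = "clean exit"
        else:
            hint = "application error, check the container log"
        rows.append({
            "name": container.get("Name", "").lstrip("/"),
            "exit_code": code,
            "finished_at": state.get("FinishedAt"),
            "hint": hint,
        })
    return rows
```

The input can come from `docker inspect $(docker ps -aq)`, and `docker logs --tail 100 <name>` shows the last messages of a single container. If many unrelated containers show the same `FinishedAt` time and a SIGTERM-style exit code, that would point toward something external stopping them (such as a scheduled job) rather than each container failing on its own.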