
Some containers stopping for some reason


sohailoo


I have some containers that stop once or twice a day for a reason I can't figure out. Most of these containers are not related to each other in any way. This is the syslog from maybe a minute or two after it happened. Ignore all btrfs-related issues, since I already solved them (changed to xfs). There are also errors for two scripts, run.sh and kill.sh. Ignore them too: they're not related to the issue (it was happening before I added them) and I have already removed those scripts.

tower-diagnostics-20230925-0209.zip

 

 


It would be worthwhile to run memtest from the boot menu for a minimum of a couple of passes. If you boot via UEFI, you will need to temporarily switch to legacy boot to run memtest.

You have a whack of segfaults, which are often caused by bad memory and could also result in what you are seeing.
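To get a rough idea of how many segfaults a log contains and which processes they hit, something like the following sketch could be run over the syslog from the diagnostics zip. It assumes the usual Linux kernel message shape `name[pid]: segfault at ...`. The function name `count_segfaults` and the sample lines are made up for illustration and are not taken from the attached diagnostics.

```python
import re
from collections import Counter

# Assumed shape of a kernel segfault message: "<process>[<pid>]: segfault at ..."
# Process names containing spaces would only be captured by their last word.
SEGFAULT_RE = re.compile(r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at ")


def count_segfaults(lines):
    """Count segfault messages per process name in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            counts[match.group("proc")] += 1
    return counts


# Hypothetical sample lines for illustration only.
SAMPLE_LINES = [
    "Sep 25 02:07:11 Tower kernel: node[12345]: segfault at 0 ip 0000000000000000 sp 00007ffd00000000 error 14",
    "Sep 25 02:07:12 Tower kernel: node[12399]: segfault at 0 ip 0000000000000000 sp 00007ffd00000000 error 14",
    "Sep 25 02:07:15 Tower kernel: python3[222]: segfault at 8 ip 0000000000000000 sp 00007ffd00000000 error 4",
    "Sep 25 02:08:00 Tower crond[1500]: exit status 1 from user root",
]

if __name__ == "__main__":
    for proc, n in count_segfaults(SAMPLE_LINES).most_common():
        print(f"{proc}: {n}")
```

Segfaults spread across many unrelated processes would fit the bad-memory theory, while segfaults confined to a single process point more toward that one application.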

23 hours ago, Squid said:

It would be worthwhile to run memtest from the boot menu for a minimum of a couple of passes. If you boot via UEFI, you will need to temporarily switch to legacy boot to run memtest.

You have a whack of segfaults, which are often caused by bad memory and could also result in what you are seeing.

Yeah, it appears that I have a faulty stick. I replaced the RAM and tested it to make sure there are no errors, but now I'm getting a ton of errors in the Unraid logs.

tower-diagnostics-20230926-0123.zip

On 9/26/2023 at 10:44 AM, JorgeB said:

Looks more like a power/connection issue with parity, could also be this:

https://forums.unraid.net/topic/103938-69x-lsi-controllers-ironwolf-disks-disabling-summary-fix/

I've followed the guide and replaced the cable to the parity disk just in case. It worked fine for 4 days, but today at exactly 2:00 am (syslog time) it happened again and 10-20 containers stopped. I noticed it immediately because I had a code-server tab open, and while I was working it gave me a message that it had disconnected.

tower-diagnostics-20231002-0202.zip

9 hours ago, sohailoo said:

I've followed the guide and replaced the cable to the parity disk just in case. It worked fine for 4 days, but today at exactly 2:00 am (syslog time) it happened again and 10-20 containers stopped. I noticed it immediately because I had a code-server tab open, and while I was working it gave me a message that it had disconnected.

tower-diagnostics-20231002-0202.zip

 

This suggests you may have something scheduled to run at that time (e.g. an appdata backup), so it is worth checking for that. You also seem to have an invalid entry in /etc/cron.d/root, so it is worth looking at that file to see what the entry is.
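As a rough way to do both checks at once, the sketch below scans crontab-style text for lines that do not parse as five schedule fields plus a command, and for entries that would fire at 02:00. It assumes plain numeric fields (`*`, lists, ranges, steps); month or weekday names, and `@daily`-style shortcuts, are not handled. The function names and the sample entries are hypothetical and are not the contents of the poster's /etc/cron.d/root.

```python
import re

# Numeric-only cron fields: digits, '*', '/', ',' and '-'. Month/weekday names
# are NOT supported by this sketch and would be reported as invalid.
FIELD_RE = re.compile(r"^[0-9*/,\-]+$")
ENV_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*=")


def field_values(field, lo, hi):
    """Expand one numeric cron field into the set of values it matches."""
    values = set()
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            a, b = rng.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return values


def check_crontab(text, hour=2, minute=0):
    """Return (invalid, hits): unparsable lines and entries firing at hour:minute."""
    invalid, hits = [], []
    for number, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        # Skip blanks, comments, VAR=value lines and @shortcuts (not checked here).
        if not line or line[0] in "#@" or ENV_RE.match(line):
            continue
        parts = line.split(None, 5)
        if len(parts) < 6 or not all(FIELD_RE.match(p) for p in parts[:5]):
            invalid.append((number, line))
            continue
        try:
            minutes = field_values(parts[0], 0, 59)
            hours = field_values(parts[1], 0, 23)
        except ValueError:
            invalid.append((number, line))
            continue
        if minute in minutes and hour in hours:
            hits.append((number, line))
    return invalid, hits


# Hypothetical entries for illustration only.
SAMPLE = "\n".join([
    "# example crontab",
    "0 2 * * * /usr/local/bin/backup.sh",
    "15 * * * * /usr/local/bin/poll.sh",
    "30 4 * * 0 /usr/local/bin/weekly.sh",
    "0 3 * * /usr/local/bin/broken.sh",
])

if __name__ == "__main__":
    bad, at_two = check_crontab(SAMPLE)
    print("invalid:", bad)
    print("runs at 02:00:", at_two)
```

To try it on a real system, the text of /etc/cron.d/root could be read from disk and passed to `check_crontab` in place of `SAMPLE`. Anything scheduled by plugins outside that file would still need to be checked separately.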
