  • Docker crashes, doesn't start, and the system needs to be rebooted


    Vitor Ventura
    • Minor




    The diagnostics appear to have been captured after a reboot. When it happens again, post new diagnostics before rebooting.

     

    I also see that btrfs is detecting data corruption in the pool:

     

    Apr 14 16:58:16 Tower kernel: BTRFS info (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 526, gen 0

     

    Run a scrub, then delete the corrupt files or restore them from a backup. It is also a good idea to run memtest.
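To keep an eye on those counters between scrubs, here is a minimal Python sketch that pulls the error counts out of a syslog line like the one quoted above. The regular expression is written against that single sample line, and the names `STATS_RE` and `parse_btrfs_errs` are made up for this example, so treat it as an illustration rather than a complete parser.

```python
import re

# Pattern written against the one sample line quoted in this thread
# (assumption: other kernels may format the message slightly differently).
STATS_RE = re.compile(
    r"BTRFS info \(device (?P<fs>\S+)\): bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_btrfs_errs(line):
    """Return the device and its error counters, or None if the line does not match."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    parsed = {name: int(match.group(name)) for name in COUNTERS}
    parsed["device"] = match.group("bdev")
    return parsed


sample = (
    "Apr 14 16:58:16 Tower kernel: BTRFS info (device sdb1): "
    "bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 526, gen 0"
)

result = parse_btrfs_errs(sample)
# Keep only the counters that are not zero.
nonzero = {k: v for k, v in result.items() if k != "device" and v}
print(result["device"], nonzero)  # /dev/sdb1 {'corrupt': 526}
```

On the command line, the scrub itself is started with `btrfs scrub start <mountpoint>`, and `btrfs device stats -z <mountpoint>` prints the counters and resets them to zero. The pool's mount point is not given in this thread, so substitute your own.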

    1 hour ago, JorgeB said:

    Run a scrub, then delete the corrupt files or restore them from a backup,

    Also make sure you do the above, then reset the btrfs stats to see whether any new corruption is detected.


    Btrfs is detecting new data corruption. Memtest is only definitive if it finds errors, and since you have multiple sticks, fix the corrupt files, reset the stats, and try running with just one stick of RAM. If the corruption comes back, try a different stick; that will basically rule out bad RAM.

    On 4/22/2024 at 10:28 AM, JorgeB said:

    Btrfs is detecting new data corruption. Memtest is only definitive if it finds errors, and since you have multiple sticks, fix the corrupt files, reset the stats, and try running with just one stick of RAM. If the corruption comes back, try a different stick; that will basically rule out bad RAM.

     

    New problems have come up. Diagnostics attached:

    tower-diagnostics-20240714-1151.zip


    The server ran out of RAM. It appears to be caused by the container that is starting all these processes:

     

    Jul 14 01:17:03 Tower kernel: [  17461]  1000 17461  8475627     3963   663552        0           200 chromium
    Jul 14 01:17:03 Tower kernel: [  17465]  1000 17465  8473539     3593   651264        0           200 chromium
    Jul 14 01:17:03 Tower kernel: [  17480]  1000 17480 296529358    17344  1032192        0           300 chromium
    Jul 14 01:17:03 Tower kernel: [  17482]  1000 17482 296529074     7857   868352        0           300 chromium
    Jul 14 01:17:03 Tower kernel: [  17484]  1000 17484 296527041     5723   782336        0           300 chromium
    Jul 14 01:17:03 Tower kernel: [  17540]  1000 17540  8473129     5707   688128        0           200 chromium
    Jul 14 01:17:03 Tower kernel: [  17562]  1000 17562 296526822     4187   729088        0           300 chromium
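To see at a glance which program dominates a dump like this, the rows can be grouped by process name. Below is a rough Python sketch. It assumes the columns follow the kernel's usual OOM task table (pid, uid, tgid, total_vm, rss, pgtables_bytes, swapents, oom_score_adj, name), which is not shown in the excerpt, and that rss is counted in 4 KiB pages. `ROW_RE`, `PAGE_KIB` and `summarize` are names invented for this example.

```python
import re
from collections import defaultdict

PAGE_KIB = 4  # assumption: 4 KiB pages, the common x86-64 default

# Assumed column order: [pid] uid tgid total_vm rss pgtables_bytes
# swapents oom_score_adj name
ROW_RE = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<pgtables_bytes>\d+)\s+"
    r"(?P<swapents>\d+)\s+(?P<oom_score_adj>-?\d+)\s+(?P<name>.+?)\s*$"
)


def summarize(lines):
    """Count processes and sum resident pages per process name."""
    totals = defaultdict(lambda: {"count": 0, "rss_pages": 0})
    for line in lines:
        match = ROW_RE.search(line)
        if match is None:
            continue  # not a task row, skip it
        entry = totals[match.group("name")]
        entry["count"] += 1
        entry["rss_pages"] += int(match.group("rss"))
    return dict(totals)


# The seven rows quoted in this thread.
sample = """\
Jul 14 01:17:03 Tower kernel: [  17461]  1000 17461  8475627     3963   663552        0           200 chromium
Jul 14 01:17:03 Tower kernel: [  17465]  1000 17465  8473539     3593   651264        0           200 chromium
Jul 14 01:17:03 Tower kernel: [  17480]  1000 17480 296529358    17344  1032192        0           300 chromium
Jul 14 01:17:03 Tower kernel: [  17482]  1000 17482 296529074     7857   868352        0           300 chromium
Jul 14 01:17:03 Tower kernel: [  17484]  1000 17484 296527041     5723   782336        0           300 chromium
Jul 14 01:17:03 Tower kernel: [  17540]  1000 17540  8473129     5707   688128        0           200 chromium
Jul 14 01:17:03 Tower kernel: [  17562]  1000 17562 296526822     4187   729088        0           300 chromium
""".splitlines()

summary = summarize(sample)
for name, entry in sorted(summary.items(), key=lambda kv: -kv[1]["rss_pages"]):
    rss_mib = entry["rss_pages"] * PAGE_KIB / 1024
    print(f"{name}: {entry['count']} processes, {rss_mib:.1f} MiB resident")
```

Running it over the full dump from the syslog, rather than only the rows quoted here, gives the process count and resident memory per program, which makes a runaway container easier to spot.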

     





