• [6.12.6 / 6.12.8] Random crash


    elgatobavaria
    • Closed Urgent

    Hi,
    my Unraid system is randomly crashing with 6.12.6. The syslog shows some restarts of the Samba service. It seems that another user has a similar problem: https://forums.unraid.net/topic/152651-array-randomly-shutting-down/

    The syslog was recorded using a syslog server on my Windows PC.
    MemTest was already run for 24h. No problems found.

    Best regards 
    Mathias


    Update:
    Since there was no response from anyone for over a week (although the ticket is marked as urgent), I moved my Unraid system to different hardware. But the problem still exists on the new system.

     

    nasratisbona-diagnostics-20240213-1532.zip syslog.txt




    User Feedback

    Recommended Comments

    I'm not seeing anything relevant logged. Is that the full persistent syslog? It only covers 20 minutes of uptime.

     

    Try enabling the mirror to flash drive option in the syslog server and post that one after the new crash.

    Link to comment

    Hi Jorge, I switched back to the original system and enabled the "mirror to flash" functionality. I'll post if the error occurs again. In the meantime I updated to 6.12.8.

    Link to comment

    This morning the NAS had crashed again. Attached is the mirrored syslog, taken directly from the USB device. I can't see anything directly before the crash. I got a Telegram message from Healthchecks.io at 07:15 this morning (the check is set up to send a message after 30 minutes if no ping occurs within 5 minutes).

    syslog

    Link to comment

    Unfortunately there's nothing relevant logged. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

    Link to comment

    @JorgeB After 8 days and 9 hours the NAS is still running in safe mode with the VM (Home Assistant) enabled and the Docker service started, but with no containers running. I will now start enabling the containers one by one. It seems that a hardware defect can be ruled out.

    Is it possible to use a cron job in safe mode? I want to check the health of the server via https://healthchecks.io/
    UPDATE, how to change the crontab:

    1. get current crontab with the command crontab -l >file

    2. modify the output file using any means, eg. sed, perl etc.

    3. apply new crontab with the command crontab file
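
    The three steps above can be sketched as a small shell script. This is only an illustration under assumptions: the ping URL is a placeholder (not a value from this thread), the 5-minute schedule is chosen to match the check described earlier, and the function names are made up for this example.

```shell
#!/bin/sh
# Sketch only: PING_URL is a placeholder, replace it with your own check's URL.
PING_URL="https://hc-ping.com/your-uuid-here"
CRON_LINE="*/5 * * * * curl -fsS -m 10 --retry 5 -o /dev/null $PING_URL"

# Append a line to a crontab dump unless an identical line is already present.
add_cron_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Steps 1-3 from the post: dump the crontab, modify the file, apply it again.
update_crontab() {
    tmp=$(mktemp) || return 1
    # If no crontab exists yet, the dump is simply empty.
    crontab -l > "$tmp" 2>/dev/null || true
    add_cron_line "$tmp" "$CRON_LINE"
    crontab "$tmp"
    rm -f "$tmp"
}
```

    Calling update_crontab once should add the ping entry, and crontab -l can be used to confirm it. Because the line is only appended when it is missing, running the script again should not create duplicates.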

    Edited by elgatobavaria
    Link to comment
    13 hours ago, elgatobavaria said:

    Is it possible to use a cron job in safe mode?

    I believe yes.

    Link to comment

    I don't know what fixed the issue, but after:
    - Safe mode for multiple days
    - Starting the Docker containers one by one
    - Starting the VM
    - Disabling safe mode
    - Removing unused plugins
    the system seems to run stable. It seems that it's not a hardware problem. The system has been stable for several days.

    Link to comment



