  • CRASHES!!!


    koolnerd7
    • Urgent

    Anyone have any idea how to stop these crashes? I posted the diagnostics, but I doubt they will help. It's been locking up on almost every update. I tried everything people say to try, including RAM tests, swapping hardware, checking with known good hardware, and stopping all Docker containers to run it as a basic NAS. Everything! It still locks up every few days and I have to hold the power button until it turns off (a hard reset, I guess it's called). Anyone know what's going on? Is Unraid not stable?

    diagnostics-20240618-1934.zip




    User Feedback

    Recommended Comments

    The syslog in the diagnostics is the RAM copy and only shows what happened since the reboot.   It could be worth enabling the syslog server to get a log that survives a reboot so we can see what happened leading up to the crash.  The mirror to flash option is the easiest to set up, but if you are worried about excessive wear on the flash drive you can put your server’s address into the Remote Server field.

    Link to comment

    ok, I just enabled it. I will send the new log from the flash drive on next crash. I've done this before and looked through the log, and compared it to the active one side by side and it looked the same. I'm sure I don't know what to look for though, so any help is much appreciated. thanks

    Link to comment

    The server crashed this morning, 06/20/2024, at 9:54 am; I got a router notification that it disconnected. Attached is the syslog file from the logs folder, taken directly from the flash drive. I had the syslog server enabled, so it should have been mirroring to the flash drive. If I did something wrong, please let me know. There were also diagnostics zip files in the logs folder, but they were all old.

    syslog

    Link to comment

    Syslog only has this:

     

    Jun 19 07:05:49 Zim rsyslogd: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="16081" x-info="https://www.rsyslog.com"] start
    Jun 19 07:52:22 Zim shfs: /usr/sbin/zfs create 'cache/Downloads' 2>&1
    Jun 19 11:14:18 Zim webGUI: Successful login user root from 192.168.1.207
    Jun 19 13:16:55 Zim shfs: /usr/sbin/zfs create 'cache/nextcloud' 2>&1
    Jun 19 16:02:51 Zim apcupsd[2096]: UPS Self Test switch to battery.
    Jun 19 16:02:59 Zim apcupsd[2096]: UPS Self Test completed: Battery OK
    Jun 19 21:34:55 Zim webGUI: Successful login user root from 192.168.1.207
    Jun 20 03:09:16 Zim shfs: /usr/sbin/zfs unmount 'cache/Downloads' 2>&1
    Jun 20 03:09:16 Zim shfs: /usr/sbin/zfs destroy 'cache/Downloads' 2>&1
    Jun 20 03:09:16 Zim shfs: /usr/sbin/zfs unmount 'cache/nextcloud' 2>&1
    Jun 20 03:09:16 Zim shfs: /usr/sbin/zfs destroy 'cache/nextcloud' 2>&1
    Jun 20 03:46:07 Zim shfs: /usr/sbin/zfs create 'cache/Downloads' 2>&1
    Jun 20 07:35:52 Zim shfs: /usr/sbin/zfs create 'cache/nextcloud' 2>&1


     

    Nothing relevant, at what time was the crash?
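    For anyone checking a persisted syslog against a known crash time, here is a minimal Python sketch that pulls out only the entries logged shortly before the crash. The function names `parse_ts` and `lines_before` are hypothetical, the year is assumed to be the crash year (syslog stamps omit it), and the sample lines are copied from the log above.

```python
from datetime import datetime, timedelta

def parse_ts(line, year):
    """Parse a leading 'Mon DD HH:MM:SS' stamp; return None if the line has none."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None

def lines_before(lines, crash_time, window_minutes):
    """Return stamped lines logged within window_minutes before crash_time."""
    start = crash_time - timedelta(minutes=window_minutes)
    hits = []
    for raw in lines:
        line = raw.strip()
        # Assumption: every entry belongs to the same year as the crash.
        ts = parse_ts(line, crash_time.year)
        if ts is not None and start <= ts <= crash_time:
            hits.append(line)
    return hits

# Two entries copied from the flash syslog posted in this thread.
SAMPLE = [
    "Jun 20 03:46:07 Zim shfs: /usr/sbin/zfs create 'cache/Downloads' 2>&1",
    "Jun 20 07:35:52 Zim shfs: /usr/sbin/zfs create 'cache/nextcloud' 2>&1",
]
crash = datetime(2024, 6, 20, 9, 54)  # offline time reported by the router
for line in lines_before(SAMPLE, crash, window_minutes=180):
    print(line)
```

    With these sample lines, only the 07:35:52 entry falls inside the three-hour window, which matches the observation that nothing was logged close to the crash.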

    Link to comment
    7 minutes ago, JorgeB said:

    Nothing relevant, at what time was the crash?

    I have my router set up to notify me when the server goes offline. When the server crashes, it stays powered on, but the network disconnects and the whole thing locks up. I got that notification from my router at 9:54 am saying the server went offline, so that's when I think it crashed.

     

    I set up the remote server syslog this time to a share which should show the full log next time it crashes. I'm not sure why the flash log is only a few lines. So next crash I'll send the log from the flash and the log from the share, they should be identical, but I guess we'll see. thanks 

     

     

    Edit update: I just checked the remote syslog on the share. It's working and it's already longer than the one on the flash, so hopefully we get some good info on the next crash.
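    To see exactly what the remote copy has that the flash mirror lacks, a small Python sketch like the one below could help. It assumes both logs use the same line format, which may not hold if the remote syslog server rewrites the host or stamp. The function name `missing_from_mirror` is hypothetical, and the sample lines are taken from the log earlier in this thread and split arbitrarily for illustration.

```python
def missing_from_mirror(mirror_lines, remote_lines):
    """Return remote lines (in order) that have no exact match in the mirror."""
    seen = {line.strip() for line in mirror_lines}
    return [
        line.strip()
        for line in remote_lines
        if line.strip() and line.strip() not in seen
    ]

# Illustrative split of real entries: pretend the flash copy stopped one line early.
flash = [
    "Jun 20 03:09:16 Zim shfs: /usr/sbin/zfs destroy 'cache/nextcloud' 2>&1",
    "Jun 20 03:46:07 Zim shfs: /usr/sbin/zfs create 'cache/Downloads' 2>&1",
]
remote = flash + [
    "Jun 20 07:35:52 Zim shfs: /usr/sbin/zfs create 'cache/nextcloud' 2>&1",
]

for line in missing_from_mirror(flash, remote):
    print(line)
```

    For real files, the two lists would come from reading each log with `open(path).readlines()`, with paths for your own flash drive and share.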

    Edited by koolnerd7
    Link to comment

    5 hours ago, JorgeB said:

    Nothing relevant, at what time was the crash?

    Just crashed again. This makes 2 crashes in 24 hours.

    The server went offline at 5:14 pm and was restarted at 5:36 pm.

    Attached is the syslog taken directly from the flash drive right after the crash and before the restart. Also attached is the syslog from the remote server share after the restart.

    When the crash happened, the only thing the server was doing was streaming 1 show on Plex to one device (a TV), but it has also crashed overnight when the server is idle, so I'm not sure Plex is relevant. It has also crashed with Docker disabled. I don't use VMs, so that's always disabled. The server is only used for Plex and Nextcloud, plus other Docker containers related to those two use cases. Any help would be appreciated.

    Again, most of the hardware was swapped out with known good hardware, including motherboards, CPUs, multiple sets of RAM, even cases and power supplies. (I have a bunch of hardware I can swap and test again if needed.) Not sure what else to do here. It seems like an older version of Unraid (6.12.4) crashed less than the current version. Uptime was definitely longer with 6.12.4, but there were at least 3 crashes that I can remember.

    Is this just an Unraid thing and I should just wait for a fix? Is Unraid really this unstable?

    Flash drive syslog right after crash
    syslog after server restart.log

    Link to comment

    Unfortunately there's nothing relevant logged.

     

    8 hours ago, koolnerd7 said:

    was doing was streaming 1 show on plex

    Plex has been known to crash many servers, does it crash with Plex stopped?

    Link to comment

    It crashes with Docker completely stopped. Everywhere I read says that's a hardware issue, but I've swapped almost everything. Should I just go buy brand new hardware? All of it?!

    Link to comment

    You can try a less drastic approach: since you have multiple RAM sticks, try running the server with just one. If it still crashes, try the other one. That will basically rule out the RAM; the next suspects would be the board/CPU and PSU.

    Link to comment

    Ok, currently running 1 stick of RAM. I'll report back if there are any updates. 

     

    If stable with this single stick of RAM, how do I proceed?

     

    1. Leave it alone with 1 stick of RAM forever.

    2. Replace the RAM with a new set and go back to dual channel. If it crashes with the new RAM, should I test each RAM slot to rule out the board/CPU?

    Is it not possible to be on unraid's side? Are issues just defaulted to hardware, as if Unraid is perfect?

    Edited by koolnerd7
    Link to comment
    12 minutes ago, koolnerd7 said:

    If stable with this single stick of RAM, how do I proceed?

    Try the other one. If it crashes, it's likely that DIMM; if it doesn't, it could be the board when running in dual channel.

     

    13 minutes ago, koolnerd7 said:

    Is it not possible to be on unraid's side?

    It can also be software related, especially containers; several containers are known to crash servers sometimes, e.g. Plex. For that, you can boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.

     

     

     

    Link to comment

