• [6.8.1] crash on boot (sometimes)


    JorgeB
    • Solved Minor

    There have been several reports of servers crashing on boot after updating to v6.8.1, and they all crash in the same place:

     

    /etc/rc.d/rc.M: line 164:   modprobe -r $DRIVERS

     

    Some examples

     

    https://forums.unraid.net/topic/87378-server-fails-to-start-after-update/

    https://forums.unraid.net/topic/87435-system-hang-681/

    https://forums.unraid.net/topic/87664-sometimes-my-unraid-isn’t-starting-since-last-update/

     

    I just had the same thing happen to me after upgrading one of my servers to v6.8.1:

     

    iKVM_capture.jpg.63546531bbd2d37540b34bd53115223a.jpg

     

    Rebooted and it crashed again:

     

    26682209_iKVM_capture(1).jpg.89bec61a909f4369c31bfd11b1b61d92.jpg

     

    Ran chkdsk but no errors found:

     

    1670611132_flasherror1.PNG.296176ca39ed9c3f37678083bd8f4210.PNG

    729240766_flasherror2.PNG.c8518c4aea818ef4c30e4c64289b5e2d.PNG

     

    Rebooted again and this time the server started. It then rebooted successfully two more times, crashed on the next two attempts, and booted correctly again on one further attempt. I'm attaching diags mainly to show the hardware used, but the other reports involve very different hardware, both AMD and Intel based servers.

     

    It's strange because it only happens sometimes. My last boots:

     

    X X (ran chkdsk) V V V X X V

     

    X - crashed

    V - booted

     

    tower5-diagnostics-20200123-0752.zip

     

    Video showing where it crashes:

     

    This server uses

    Edited by johnnie.black




    User Feedback

    Recommended Comments



    20 minutes ago, darrenyorston said:

    At the moment I get the error 100% of the time.

    If you read above, you can see we already found out what's causing the problem: Intel NICs using the igb driver. That doesn't mean everyone using one will crash, but AFAIK everyone who is having issues is using one.
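
For anyone unsure which driver their NICs use, here is a minimal sketch that reads it from sysfs. It relies only on the standard Linux `/sys/class/net/<iface>/device/driver` link; the function name `list_nic_drivers` and its optional directory argument are made up for illustration (the argument just makes it easy to try against a test directory).

```shell
# Print "<interface> <driver>" for every interface that has a bound kernel driver.
# The optional first argument (hypothetical, for testing) overrides /sys/class/net.
list_nic_drivers() {
    net_dir="${1:-/sys/class/net}"
    for iface in "$net_dir"/*; do
        # Virtual interfaces (lo, bonds, bridges) have no device/driver link; skip them.
        [ -e "$iface/device/driver" ] || continue
        driver=$(basename "$(readlink -f "$iface/device/driver")")
        printf '%s %s\n' "$(basename "$iface")" "$driver"
    done
}
```

Running `list_nic_drivers` with no argument lists the real interfaces. Any line ending in `igb` is an interface using the driver discussed in this report.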

    Link to comment
    9 hours ago, Hoopster said:

    Both of my onboard NICs are Intel.  One uses the igb driver and the other uses the e1000e driver.

     

    7 hours ago, limetech said:

    Yes, that is true, though at present only one server has both connected to a switch, using the bonding driver in simple failover mode.

    Did you re-assign the interfaces under network rules?

     

    Only when a re-assignment is done, the file network-rules.cfg is created and a driver reload happens upon reboot (this is the point where the call trace occurs).

     

    As a test you could change the assignment of the interfaces (e.g. swap eth0 and eth1) and reboot the system.
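
A quick way to see whether a given server is in the "re-assignment done" state is to check for the file itself. This is only a sketch: it assumes the flash drive is mounted at `/boot` and that the file sits in `/boot/config`, and the function name `has_network_rules` is hypothetical.

```shell
# Report whether network-rules.cfg exists, i.e. whether the driver reload
# on boot described above would be expected to happen.
# Assumption: the default path /boot/config/network-rules.cfg; pass another path to override.
has_network_rules() {
    cfg="${1:-/boot/config/network-rules.cfg}"
    if [ -f "$cfg" ]; then
        echo "present: $cfg (driver reload expected on boot)"
    else
        echo "absent: $cfg (no driver reload expected on boot)"
        return 1
    fi
}
```

The function returns 0 when the file is present and 1 when it is not, so it can be used in an `if` from a terminal session.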

    Link to comment
    6 hours ago, bonienl said:

     

    Did you re-assign the interfaces under network rules?

     

    Only when a re-assignment is done, the file network-rules.cfg is created and a driver reload happens upon reboot (this is the point where the call trace occurs).

     

    As a test you could change the assignment of the interfaces (e.g. swap eth0 and eth1) and reboot the system.

    Strange finding: both my adapters are using the igb driver. Switching which MAC was assigned to eth0/eth1 in network-rules.cfg allowed me to boot into the system.

     

    However, after rebooting again, it got stuck at the same modprobe -r $DRIVERS step.

    Link to comment

    Darn, looks like I have this problem as well. Too bad I did a remote reboot and now I can't get into my system (I know, I should never do a remote reboot). :(

    I have an ASRock Z390 Taichi board and am using both NICs.

    Can't really give any more info until I get home.

    Link to comment

    Another case with onboard Intel i211 NICs (x2), as well as a Mellanox ConnectX-3.

    No boot was possible after a power cycle, or even in safe mode (GUI). Restoring 6.8.0 was the only change that got my system working again.

     

    Link to comment

    I have two onboard Intel NICs and noticed the error repeatedly. I wasn't able to boot my machine at all, which I found weird because it rebooted from the upgrade without issue. I hadn't made any configuration changes to the interfaces or anything. I was finally able to boot consistently after a few attempts at changing settings in the BIOS.
    I started out unbootable with the primary NIC enabled with a ROM type of PXE and the secondary NIC enabled with its ROM type disabled. I then completely disabled the secondary NIC and was still unable to boot. So I left the secondary NIC disabled, changed the ROM type on the primary from PXE to disabled, and I could magically boot again.

    Edited by evenstay
    Link to comment

    Just tried upgrading my home NAS to 6.8.1, have the same issue.

     

    SuperMicro X11SSL-F motherboard

     

    EDIT: and how do I downgrade safely to 6.8.0 without losing any data or configuration?

     

    EDIT2: After reading through other threads, I restored 6.8.0 from /previous folder that is present in the root directory of the USB stick and it solved my issue (as expected)
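
For reference, a rough sketch of that restore from a shell. It assumes the USB stick is mounted at `/boot` and that the release files in `previous` are the ones matching `bz*`. Both are assumptions, not something confirmed in this thread, and the function name `restore_previous` is hypothetical. Back up the stick before trying anything like this.

```shell
# Copy the previous release's files from <flash>/previous back to the flash root.
# Assumptions: flash mounted at /boot (override with $1); OS files match bz*.
restore_previous() {
    flash="${1:-/boot}"
    prev="$flash/previous"
    [ -d "$prev" ] || { echo "no previous folder at $prev" >&2; return 1; }
    # Expand the glob into the function's positional parameters.
    set -- "$prev"/bz*
    [ -e "$1" ] || { echo "no bz* files in $prev" >&2; return 1; }
    cp -v -- "$@" "$flash"/
}
```

A reboot is needed afterwards for the restored files to be used. Nothing under the config folder is touched by this copy.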

    Edited by n0stalghia
    Link to comment
    1 hour ago, limetech said:

    Please see if Unraid OS 6.8.2 solves this issue.

    Solved for me.

     

    I do get some questionable driver-related messages:

    Jan 26 20:13:30 vesta kernel: igb: loading out-of-tree module taints kernel.
    Jan 26 20:13:30 vesta kernel: igb 0000:06:00.0 eth1: mixed HW and IP checksum settings.
    Jan 26 20:13:30 vesta kernel: igb 0000:07:00.0 eth2: mixed HW and IP checksum settings.

     

    Edited by bonienl
    Link to comment
    9 minutes ago, RyanOver9000 said:

    I installed 6.8.2 today and am now receiving this error. :( Going to try to do a completely fresh upgrade and see what happens.

     

    Deleted the network-rules.cfg and it booted right up. Thanks for the help.
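
If you would rather not delete the file outright, a small sketch that moves it aside so it can be put back later. The path `/boot/config/network-rules.cfg` is an assumption about where the file lives on the flash drive, and `disable_network_rules` is a hypothetical name.

```shell
# Move network-rules.cfg aside instead of deleting it, so it can be restored.
# Assumption: the default path /boot/config/network-rules.cfg; pass another path to override.
disable_network_rules() {
    cfg="${1:-/boot/config/network-rules.cfg}"
    [ -f "$cfg" ] || { echo "nothing to do: $cfg not found"; return 0; }
    mv -- "$cfg" "$cfg.bak" && echo "moved $cfg to $cfg.bak"
}
```

Renaming `network-rules.cfg.bak` back to `network-rules.cfg` undoes the change.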

    Link to comment
    On 1/23/2020 at 6:18 PM, tompapajr said:

    Running into the same issue here.  Brand new install (6.8.1) on a SuperMicro X10DRL-i  which has 2x Intel i210 Gigabit Adapters and a Realtek RTL8211E (Dedicated for IPMI). Tried also booting Safe Mode without any success.  Worked fine on first boot-up and configuration.  First time I rebooted the system, I got stuck at modprobe -r $DRIVERS. 

     

    Let me know if there's anything I can do to help.

    6.8.2 resolved my issues too.

    Link to comment




