• unRAID 6.6.5: Total system lockup, unresponsive from console, SSH, and network; parity check stops, no disk activity at all, no VM working, nothing functioning whatsoever.


    Rudder2
    • Solved Urgent

    Total system lockup: unresponsive from console, SSH, and network; parity check stops, no disk activity at all, no VM working, nothing functioning whatsoever.  I suffer this lockup whenever I install 6.6.0+.  I thought it was because of the RealTek NIC problem, so I downgraded to 6.5.3, then updated to 6.6.3 and had worse network issues, so I downgraded to 6.5.3 again.  Then I upgraded to 6.6.5; the RealTek NIC error has been rectified but the lockup problem remains.

     

    The system will run for about 5 or 6 days and lock up, then run 6 hours and lock up, then 3 hours and lock up, then 30 minutes and lock up.  I can't get system logs because they are not stored on flash.  I have already downgraded to the stable 6.5.3 image.  My system logs from the first boot after a lockup are attached.  The first was taken after the first lockup and the last after the 4th lockup in one day.  I don't know if they will help because they were taken after a reset button press to get the system back up.
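
Since the logs are lost on every hard reset, one way around it is to mirror the syslog onto persistent storage while the server is still up. What follows is only a minimal Python sketch of that idea. It assumes the live log is at /var/log/syslog and that the flash drive is mounted at /boot (both paths are assumptions), the function names are made up for illustration, and Python is not part of a stock unRAID install.

```python
import os
import time


def copy_new_bytes(src_path, dst_path, offset):
    """Append the bytes of src_path past `offset` to dst_path; return the new offset."""
    if os.path.getsize(src_path) < offset:
        offset = 0  # source was rotated or truncated, so start from the top
    with open(src_path, "rb") as src:
        src.seek(offset)
        chunk = src.read()
    if chunk:
        with open(dst_path, "ab") as dst:
            dst.write(chunk)
            dst.flush()
            os.fsync(dst.fileno())  # force the data out before a possible hang
    return offset + len(chunk)


def mirror_forever(src_path, dst_path, interval_s=5.0):
    """Poll src_path and keep dst_path up to date until the process is killed."""
    offset = 0
    while True:
        offset = copy_new_bytes(src_path, dst_path, offset)
        time.sleep(interval_s)


# Hypothetical usage (paths are assumptions, not taken from the diagnostics):
# mirror_forever("/var/log/syslog", "/boot/logs/syslog-mirror.txt")
```

The `fsync` call is the important part: without it the last lines before a freeze may still be sitting in the page cache when the reset button is pressed. Frequent small writes do add wear to a flash drive, so this is something to run only while chasing a crash.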

     

    BIOS was updated to the latest BIOS for my motherboard after the second lockup because I saw a hardware error about microcode in the log file 1923.  That same error is in log file 2357.

    rudder2-server-diagnostics-20181114-1923.zip

    rudder2-server-diagnostics-20181114-2357.zip
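
To compare the microcode before and after a BIOS update, or between unRAID versions, the revision the kernel reports can be read from /proc/cpuinfo, which on x86 Linux carries a `microcode` field per logical CPU. A small Python sketch follows; the function name is made up for illustration, and it only reports what the kernel sees, not what the BIOS image contains.

```python
def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions listed in /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


# Usage on a running Linux system:
# with open("/proc/cpuinfo") as f:
#     print(microcode_revisions(f.read()))
```

More than one value in the result would mean the cores are not all running the same revision, which would be worth mentioning in a report.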







    On 11/16/2018 at 12:20 PM, limetech said:

    This would be a good question for you to pose to ASRock. 

    So, ASRock basically told me to buy a board that they designed for Linux, because they designed this board for winblows, and refused to answer the question about the microcode version in their BIOS.

    Link to comment

    Is there any solution yet? I am still getting complete lockups when using a Win10 VM (see my bug report) and can't find any solution. I have tried so many things already.

    Link to comment
    On 11/19/2018 at 6:20 PM, jonathanm said:

    Is there a list?

    They haven't responded back to me yet.  I looked at all their current boards and not one lists Linux. 

     

    I'm hoping they message me back, but I'm starting to think ASRock is not the manufacturer to use if you want to run Linux.  I've read on the Ubuntu forums that others have had the same issue with ASRock.  I wasn't even asking them about a Linux-related problem.  I was just curious why they haven't produced a BIOS with updated microcode since R24, whether it was because of compatibility or they just haven't gotten around to it, to help with our diagnostics.  I like ASRock boards but I think this will be my last.  I don't have a winblows computer in my house, just a winblows 10 VM on my unRAID server to do Steam in-home streaming for games not compatible with Linux.  I gave up on MicroShaft after windblows 7.  I've been 100% Linux since 2012.

     

    For my future computer builds, can someone suggest a more Linux-friendly motherboard manufacturer?

     

    It's just odd that I only get machine check messages when I'm on unRAID 6.6.0 - 6.6.5.  I think if we figure that out, we will figure out why my system locks up when upgraded.
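
To check whether the machine check messages really only appear on 6.6.x, the syslogs from the attached diagnostics can be filtered for them and compared between versions. Here is a rough Python sketch of such a filter. It assumes the messages carry the kernel's "[Hardware Error]" tag or the words "machine check"; the pattern and the function name are illustrative and may need adjusting to the actual log.

```python
import re

# Assumption: hardware errors are tagged "[Hardware Error]" or mention
# "machine check"; widen the pattern if the log uses other wording.
MCE_PATTERN = re.compile(r"\[Hardware Error\]|machine check", re.IGNORECASE)


def machine_check_lines(log_text):
    """Return the lines of log_text that look like machine check reports."""
    return [line for line in log_text.splitlines() if MCE_PATTERN.search(line)]


# Usage with a syslog extracted from a diagnostics zip (path is hypothetical):
# with open("logs/syslog.txt", errors="replace") as f:
#     for line in machine_check_lines(f.read()):
#         print(line)
```

Running it over a 6.5.3 syslog and a 6.6.x syslog from the same hardware would show whether the messages are tied to the version or were simply overlooked before.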

    Link to comment
    2 minutes ago, Technikte said:

    Is there any solution yet? I am still getting complete lockups when using a Win10 VM (see my bug report) and can't find any solution. I have tried so many things already.

    I never tried to disable my Winblows 10 VM.  For you, does it work when you disable your VM?

    Link to comment

    Example:

    Well, when I start my VM and stay on the OS installation page (for the key or so) and wait, at some point it randomly crashes the VM and completely locks up the host (unRAID).

    Link to comment

    I have the exact same problems. From time to time the server just freezes and not even SSH is available. The cursor just stops blinking. Even directly on the server you can't type anything. I can't even save a log file because there is no response. I have a Supermicro board.

    Edited by cogliostro
    Link to comment

    I’ve had lockups after auto updates to dockers run. This is on an ML30 G9.

    Edited by 1812
    Link to comment
    On 11/26/2018 at 5:33 PM, 1812 said:

    I’ve had lockups after auto updates to dockers run. This is on an ML30 G9.

    I have these lockups also, after doing docker updates.

     

    Threadripper with the latest BIOS on an ASRock X399.

    Link to comment

    Installing 6.6.6 today and putting Fix Common Problems into diagnostics mode, just in case, since I haven't had a successful install of a 6.6.x update.  Really trying to capture the logs as it happens this time.  I will also power down the server for a couple of minutes and power it back on after the update instead of issuing a reboot.  Fingers crossed!

    Link to comment
    On 11/19/2018 at 6:20 PM, jonathanm said:

    Is there a list?

    ASRock is no longer responding to my messages because I'm using an OS not listed as supported.  This is unbelievable.

    Link to comment
    1 minute ago, Rudder2 said:

    ASRock is no longer responding to my messages because I'm using an OS not listed as supported.  This is unbelievable.

    Are any Linux variants listed? Unraid is running on Slackware, FWIW.

    Link to comment
    1 minute ago, jonathanm said:

    Are any Linux variants listed? Unraid is running on Slackware, FWIW.

    No, only Windows is listed for the board I have.  I bought this board 4 years ago for use with unRAID and it had been working GREAT from the day I installed it until the problems I'm having with 6.6.x.

    Link to comment
    16 hours ago, Rudder2 said:

    Lasted 30 minutes again.  Whatever happened, it's worse now.  Here are the logs again.  Not sure if they will help.

     

    Here are the logs from the 2nd try.

    rudder2-server-diagnostics-6.6.6-20181203-2029.zip

    FCPsyslog_tail-2018-12-03 6.6.6 2nd try.txt

     

    What's next to try?

    Earlier in the topic I said we could give you a version of 'bzroot' which does not include the "early microcode update" function; that is, whatever the BIOS loads (if anything) will be what the processor uses.  If you want to try this, send me an email and I'll give you a link with instructions.  [email protected]

     

    Here's the support issue for us though:  Suppose this works: no more crashes for you.  This pretty much proves there is a hardware/microcode/BIOS issue with that particular motherboard.  We can probably figure out a way to make the microcode update a boot option; however, people will then tend to use it (!).  Why are we in this mess to begin with?  It's because of the serious security issues known as Meltdown/Spectre (plus numerous variants).  Trust me when I say the changes in the Linux kernel are complex and extensive, and some depend on working with certain versions of microcode.  Not loading the latest microcode might seem to work, but you might be vulnerable to other kinds of crashes for which we'll be blamed as well.  Ultimately the correct solution, if the next few kernel updates still crash, is to buy a newer motherboard.

    Link to comment

    Another question: are other people who use the ASRock Z97 Extreme6 motherboard experiencing the same issue?

     

    I am using an ASRock Z87 Extreme6 motherboard (the predecessor) and an Intel Core i3 processor without issues.

    Link to comment

    I assume you may have to see whether any BIOS modders out there have updated the microcode for this board independently of ASRock.

     

    Asrock will not help you out.

     

    Many people on their forums are reporting issues with Windows 10's software updates to patch Spectre with this board too.

     

    Seems to be a bad implementation that isn't agreeing with Intel's microcode.

     

    I would lay blame on asrock for this one.

    Link to comment
    11 hours ago, Dazog said:

    I assume you may have to see whether any BIOS modders out there have updated the microcode for this board independently of ASRock.

     

    Asrock will not help you out.

     

    Many people on their forums are reporting issues with Windows 10's software updates to patch Spectre with this board too.

     

    Seems to be a bad implementation that isn't agreeing with intel's microcodes.

     

    I would lay blame on asrock for this one.

    Thank you for this information.  I was afraid of this when they stopped replying to me about BIOS info and pointed to the fact that I'm running Linux.  I'm trying an experimental 6.6.6 right now without the microcode added, and this should shed more light on the problem, we hope.

     

    I'm losing faith in ASRock.  If this proves to be their problem, what manufacturer would you recommend?  I always thought of ASRock as a great company.  As they said in Indiana Jones and the Last Crusade, "He Chose Poorly".

     

    Ironically, my Shuttle XPC barebones build running a 4770K hasn't run into issues, even though Shuttle isn't known for its awesomeness and stopped updating the BIOS in 2016.  It was built 6 months before my super expensive unRAID server, and it runs Kubuntu 18.04 with kernel 4.15.0-38, the latest for 18.04.  I just noticed that it is a much older kernel than unRAID's.

    Link to comment
    1 hour ago, Rudder2 said:

    Thank you for this information.  I was afraid of this when they stopped replying to me about BIOS info and pointed to the fact that I'm running Linux.  I'm trying an experimental 6.6.6 right now without the microcode added, and this should shed more light on the problem, we hope.

     

    I'm losing faith in ASRock.  If this proves to be their problem, what manufacturer would you recommend?  I always thought of ASRock as a great company.  As they said in Indiana Jones and the Last Crusade, "He Chose Poorly".

     

    Ironically, my Shuttle XPC barebones build running a 4770K hasn't run into issues, even though Shuttle isn't known for its awesomeness and stopped updating the BIOS in 2016.  It was built 6 months before my super expensive unRAID server, and it runs Kubuntu 18.04 with kernel 4.15.0-38, the latest for 18.04.  I just noticed that it is a much older kernel than unRAID's.

    Honestly I've had a lot of success with ASRock.  While I'm disappointed to hear that their support has been less than helpful, I still feel they are a good choice for motherboards generally speaking.  This seems more like a one-off issue with a particular model.  Just stay away from Gigabyte and other "cheaper" brands.  Generally speaking I tend to lean towards ASRock, Asus, and SuperMicro as my favorites (not in any particular order either).

    Link to comment
    On 12/4/2018 at 1:33 PM, limetech said:

    Earlier in the topic I said we could give you a version of 'bzroot' which does not include the "early microcode update" function; that is, whatever the BIOS loads (if anything) will be what the processor uses.

    Just for information in this thread, this didn't change the problem.  I installed the custom update that stops the kernel from loading the microcode.  My server still locks up after every 23 minutes of run time, as it has since 6.6.6.  It used to run 5 or 6 days before the first lockup on 6.6.0-6.6.5.

     

    Also, for shits and giggles, I ran a memory test again and it passed without issues.  Any suggestions for what else to try would be greatly appreciated.

     

    Here are my logs:

    FCPsyslog_tail-2018-12-05 6.6.6 custom BZRoot.txt

    rudder2-server-diagnostics-6.6.6-20181205-1124-custom BZRoot.zip

    Link to comment
    3 minutes ago, jonp said:

    Honestly I've had a lot of success with ASRock.  While I'm disappointed to hear that their support has been less than helpful, I still feel they are a good choice for motherboards generally speaking.  This seems more like a one-off issue with a particular model.  Just stay away from Gigabyte and other "cheaper" brands.  Generally speaking I tend to lean towards ASRock, Asus, and SuperMicro as my favorites (not in any particular order either).

    Thank you for the information.  I will not ax ASRock as a consideration for future upgrades.  They might also feel that since it's a 4-year-old board they shouldn't have to support it.  I bought ASRock because of friends who had 10-year-old computers running on their boards without a hitch.  I planned on getting 8 to 10 years out of my unRAID hardware before upgrading or replacing.

    Link to comment

    I tested my system without my VMs running.  It was 100% stable for the better part of a week.  Then I started my Windows 10 VM and it locked up the system.

     

    It's starting to look like a problem with the Windows 10 VM with nVidia GTX 960 passthrough when I upgrade to 6.6.x, not a hardware issue or a problem with core unRAID.  If it is, we need to figure out why.  I need my Windows 10 VM for college.  Luckily classes are ending, so I don't need it for the next 3 weeks and I will have more time to try to diagnose the problem.  I've had this Windows 10 VM with passthrough for over a year.

    2018-12-12 Windows 10 VM Configuration Page.pdf

    Windows Steam Streaming.xml

    Edited by Rudder2
    Adding VM XML and PDF of Configuration
    Link to comment

    I have now changed my RAM and changed the ports of the parity drives and SSDs. I am still getting these random lockup freezes. Even in a console window directly attached to the server there is no response at all. This error has persisted from 6.6.2 to 6.6.6.

     

    Is there any solution to this issue? 

    Link to comment




