6.10.0-rc2 Random Lockups


Solved by JorgeB


For the past few months I've been plagued by random lockups so severe that even the console becomes unresponsive: no ping, no response from keyboard or mouse, nothing.  Initially I was running the most recent version of 6.9, but out of desperation I migrated to 6.10.0-rc2 to see if it would make a difference.  The lockups continued, so to rule out the flash drive I migrated to a new device a few weeks ago, and I replaced the power supply with a new unit around the same time.  Today I came home to another lockup, so this is getting pretty frustrating.

 

The system is a self-built Mini-ITX machine based on an ASRock H470 Mini-ITX motherboard with an i7-10700K, 32 GB of RAM, an LSI SAS controller, a pair of 1TB WD Black NVMe drives for cache, and a combination of Hitachi 10TB SAS and 8TB SATA drives on a SAS backplane connected to the LSI HBA.

 

I do have syslogging enabled, and as far as I can see the system unexpectedly rebooted at 9:22 this morning and then completely locked up sometime between then and this afternoon.  I don't see any signs of a kernel panic or the like, but perhaps folks here will have a keener eye than I do.
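For anyone combing through a similar syslog, here is a rough Python sketch for spotting where the log goes quiet, which can help narrow down when a reboot or lockup happened. It assumes traditional syslog timestamps such as `Feb 11 09:22:03` at the start of each line (I have not confirmed that every Unraid syslog setup uses that format), and because those timestamps carry no year, the year is passed in. The names `parse_stamp` and `largest_gaps` and the sample log lines are made up for illustration.

```python
from datetime import datetime


def parse_stamp(line, year):
    """Parse a leading 'Mon DD HH:MM:SS' timestamp; return None if absent."""
    try:
        # The first 15 characters hold the timestamp in this assumed format.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def largest_gaps(lines, year, top=3):
    """Return up to `top` (gap, earlier, later) tuples, longest gap first."""
    stamps = [s for s in (parse_stamp(l, year) for l in lines) if s is not None]
    gaps = [(later - earlier, earlier, later)
            for earlier, later in zip(stamps, stamps[1:])]
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the attached diagnostics.
    sample = [
        "Feb 11 09:00:00 Tower kernel: example message",
        "Feb 11 09:05:00 Tower kernel: example message",
        "Feb 11 09:22:00 Tower kernel: example message",
    ]
    for gap, earlier, later in largest_gaps(sample, 2022):
        print(f"{gap} of silence between {earlier} and {later}")
```

A long silent stretch only shows when logging stopped, not why, so it is a starting point for reading the lines just before the gap rather than a diagnosis.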

 

Diagnostics zip attached.

 

Any ideas?

unraid-diagnostics-20220211-2354.zip


Unfortunately it locked up again last night.  I was actually in the middle of using the code-server docker when the whole Unraid system suddenly disappeared from the LAN... no response to ping or anything.  Switching over to the local console, I could still interact via keyboard in that it would let me enter root to log in, but after I hit Enter it would just return a blank line and never prompt for a password.  All the consoles (ALT-F1, ALT-F2, etc.) behaved the same way.

 

Ultimately I had to hit the reset button to force a reboot, and it's back up and running yet another parity check.

 

Is it possible that the corefreq plugin wasn't completely removed from use without a reboot, or does it sound like I have something else going on?

 

In any event, most recent diagnostics are attached just in case.

unraid-diagnostics-20220214-1347.zip


Interesting... that may be it, although I seldom do hardware transcoding.  From the sound of it, though, the i915-induced lockups can occur at any time, whether or not the iGPU is being taxed, so maybe that really is my problem.

 

For now I've blacklisted the i915 module, removed the Intel GPU Top and GPU Statistics plugins, removed /dev/dri from my Plex docker, and within the Plex settings disabled hardware acceleration.

 

lsmod shows no sign of i915, and neither dmesg nor syslog mentions it, so I think I've got the i915 module pretty well out of the loop now.
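In case it helps anyone repeat the same check, here is a small Python sketch of it. It reads `/proc/modules` (the file that lsmod formats) and looks for any line in a log that mentions the module name. The helper names `loaded_modules` and `lines_mentioning` are made up for illustration, and the `/var/log/syslog` path is an assumption about where the log lives on a given system.

```python
from pathlib import Path


def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text.

    Each non-empty line starts with the module name, followed by its size,
    use count, and other fields.
    """
    return {line.split()[0]
            for line in proc_modules_text.splitlines() if line.strip()}


def lines_mentioning(log_text, needle):
    """Return log lines containing `needle`, ignoring case."""
    needle = needle.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        modules = loaded_modules(proc_modules.read_text())
        print("i915 loaded:", "i915" in modules)

    syslog = Path("/var/log/syslog")  # assumed log location
    if syslog.exists():
        hits = lines_mentioning(syslog.read_text(errors="replace"), "i915")
        print(f"{len(hits)} syslog line(s) mention i915")
```

Note that a hit in the log is not proof the driver loaded; a line recording the blacklist itself would also match, so any matches are worth reading rather than just counting.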

 

Hopefully that finally does the trick for me.  I'll keep an eye on that thread for updates... thanks!

  • 3 weeks later...

Okay, it's been three weeks since I blacklisted i915 and stopped using the Intel GPU.  I am happy and relieved to report I have not experienced a single lockup since then.

 

Hopefully future RC versions of Unraid 6.10 implement a fix so I can put my GPU back to use, but in the meantime I am thrilled to have a reliable system again.

 

Thanks!

 

