Upgrade to 6.9.2, Getting freezes requiring hard reboot every week or so



Mine is still up and running.

 

Finished the parity without issue.

 

One other thing I did was turn off all my scheduled scripts in the user scripts plugin. I don't think that should have anything to do with anything though.

 

11 hours ago, muzo178 said:

 

just as before...

 

zero, zilch, zip, nada, nothing.

 

:(

Have you removed any remaining files from the /boot/config/modprobe.d folder?

Or is there anything in your go file?

Link to comment
37 minutes ago, jkirkcaldy said:

Have you removed any remaining files from the /boot/config/modprobe.d folder?

Or is there anything in your go file?

 

In my go file I'm only modprobing it87 drivers for getting the fan speed. Removed everything else pertaining to i915.

 

I just checked my flash backup from 6.10-rc2, and there was an i915.conf file in modprobe.d. Removing that after the upgrade might make a difference, but I really don't have the patience to upgrade and then have to downgrade again. :)
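For anyone who wants to check a flash backup for leftover i915 bits without eyeballing it, here is a minimal Python sketch. It assumes the backup has been extracted to a folder that mirrors the flash layout (a config/modprobe.d folder and a config/go file, as under /boot). The function name find_i915_references is made up for illustration and is not part of Unraid.

```python
from pathlib import Path
import tempfile


def find_i915_references(flash_root):
    """Return (conf_files, go_lines) that mention i915 under flash_root/config.

    flash_root is assumed to be an extracted flash backup (or /boot itself).
    """
    config = Path(flash_root) / "config"

    conf_files = []
    modprobe_dir = config / "modprobe.d"
    if modprobe_dir.is_dir():
        # The 6.9.x "touch" method leaves an (often empty) i915.conf here.
        conf_files = sorted(p.name for p in modprobe_dir.iterdir() if "i915" in p.name)

    go_lines = []
    go_file = config / "go"
    if go_file.is_file():
        for line in go_file.read_text(errors="replace").splitlines():
            stripped = line.strip()
            # Skip blank and commented-out lines; keep anything else mentioning i915.
            if stripped and not stripped.startswith("#") and "i915" in stripped:
                go_lines.append(stripped)

    return conf_files, go_lines


if __name__ == "__main__":
    # Demo against a throwaway folder that mimics an extracted flash backup.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "config"
        (cfg / "modprobe.d").mkdir(parents=True)
        (cfg / "modprobe.d" / "i915.conf").touch()
        (cfg / "go").write_text("#!/bin/bash\nmodprobe it87\n/usr/local/sbin/emhttp &\n")
        print(find_i915_references(tmp))
```

Pointing it at the real flash would look like find_i915_references("/boot"). It only reads files, so it should be safe to run against a live system or a backup.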

 

I'll try again with rc3 if it gets released, or with 6.10 final...

 

Do you have flash backups from 6.9 or 6.10 that you can check to see whether this file is in modprobe.d, @Tristankin?

 

On another note, if it is not needed, the upgrade process should be removing that. It's not like we do fresh flash setups on every upgrade....

Edited by muzo178
Link to comment
5 hours ago, muzo178 said:

Do you have flash backups from 6.9 or 6.10 that you can check to see whether this file is in modprobe.d, @Tristankin?

 

 

6.9.2 Backup file, let me know if you want it shared, but the following should answer your queries.

 

Go is empty, and I do have the touch file. I haven't tried 6.10, but I understand the touch operation is no longer required.

 

image.png.858196a1ef0f4aa513640ba5f9855935.png

 

image.thumb.png.38df24604a1c8a1ecd5390d321af61e7.png

Link to comment

In theory these are just two methods of loading the module. The only difference is that in 6.9.x it is loaded when a file exists, whereas in 6.10.x it is loaded when the system detects the relevant hardware. I think the main reason for this was to make sure the module was loaded for people booting with a GUI.

I think the main difference is potentially the kernel upgrade. Early 5.x kernels were plagued with Intel binary blob issues, but this should have been cleared up by the kernel in 6.9.2. Obviously not, though. I might try an upgrade when I get a free weekend and report back.

Link to comment

Having the same issues, and have been for months. It has been driving me nuts, especially since I created the setup to host a pfSense VM for routing, and every time my system dies the whole family loses their mind with the internet being out. Today I moved the i915.conf file out of /boot/config/modprobe.d, removed any GPU plugins I had installed, and restarted the system. After that, the device could no longer be added to the Docker container, so I added modprobe i915 to the go file and it showed back up. I will see if this route is better than the other, try a few variations of it, and report back.
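The change described above (moving i915.conf out of modprobe.d and loading the module from the go file instead) could be scripted roughly like this. It is only a sketch: the function name and the modprobe.d.disabled backup folder are hypothetical, it assumes the go file already exists under config/, and it is best tried on a copy of the flash contents first.

```python
from pathlib import Path
import shutil


def switch_to_go_modprobe(flash_root, backup_dir_name="modprobe.d.disabled"):
    """Move config/modprobe.d/i915.conf aside and add 'modprobe i915' to config/go.

    Returns (moved, added) so the caller can see what actually changed.
    backup_dir_name is a made-up folder name used only for this example.
    """
    config = Path(flash_root) / "config"
    go_file = config / "go"
    if not go_file.is_file():
        # Refuse to create a go file from scratch; that is outside this sketch.
        raise FileNotFoundError(f"no go file at {go_file}")

    moved = False
    conf = config / "modprobe.d" / "i915.conf"
    if conf.exists():
        backup_dir = config / backup_dir_name
        backup_dir.mkdir(exist_ok=True)
        # Keep the file rather than deleting it, so the change is easy to undo.
        shutil.move(str(conf), str(backup_dir / conf.name))
        moved = True

    added = False
    text = go_file.read_text()
    if not any(line.strip() == "modprobe i915" for line in text.splitlines()):
        if text and not text.endswith("\n"):
            text += "\n"
        go_file.write_text(text + "modprobe i915\n")
        added = True

    return moved, added
```

Running it a second time is harmless: nothing is left to move and the modprobe line is not duplicated, so it returns (False, False). A reboot would still be needed for the change to take effect.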

Edited by sleepinglion251
Link to comment

I have encountered the same issue. I ran 6.8.3 flawlessly for 462 days. I "upgraded" to 6.9.2 and almost immediately experienced system freezes, slow degradation in connectivity (i.e. webgui not responding, unable to connect via telnet), failure to shut down the array or the system gracefully, and an inability to get diagnostics. After a couple of days I rolled back to 6.8.3 and the system has been running great, so I know it isn't a hardware problem. Unfortunately, I can't get a diagnostic dump to help figure out the problems with 6.9.2.

My signature has my hardware configuration listed if that helps.

 

Dale

Link to comment

I will add my experience to this thread as another data point.

 

My system ran flawlessly for 18 months. Then in July of this year (with no hardware changes) it started locking up randomly after anywhere between 5 and 15 days of uptime. I had been running unRAID 6.9.2 since release (three months) with no problems, so I initially suspected a hardware issue.

 

I have done the following, yet the random lockups continue:

  • replaced power supply
  • ran a long memtest on my ECC RAM (no problems found)
  • rolled back the BIOS on the MB
  • installed unRAID 6.10.0 RC1
  • set Docker to use ipvlan instead of macvlan

Lockups only began while on 6.9.2 and have continued with 6.10 (both with Linux kernel 5.x).

 

Intel GPU Top and GPU Statistics plugins are installed.

 

I do have an Intel CPU with iGPU and use the i915 drivers.  I called modprobe i915 in the go file for 6.8.3, used the touch method (i915.conf in modprobe.d) for 6.9.2 and am now running 6.10.0-rc2 (upgraded from rc1 today).  I have not seen any lockups when Plex is hardware transcoding.  There is never anything useful in the syslog when it locks up.  However, the IPMI logs report OS Stop/Shutdown which makes me believe the issue is kernel related.
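If the syslog is only held in RAM, whatever was logged just before a hard freeze is gone after the reset. One way to keep those last lines is to copy new syslog data to persistent storage as it arrives and force each write to disk. A rough Python sketch of that idea follows. The paths in the usage comment (/var/log/syslog, /boot/logs/syslog-mirror.log) and the function names are assumptions for illustration, and frequent writes will add wear to a flash drive.

```python
import os
import time


def copy_new_lines(src_path, dst_path, offset=0):
    """Append bytes added to src_path since `offset` onto dst_path.

    Returns the new offset to pass in on the next call.
    """
    size = os.path.getsize(src_path)
    if size < offset:
        # Source was rotated or truncated; start again from the top.
        offset = 0
    if size == offset:
        return offset  # nothing new
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(offset)
        chunk = src.read(size - offset)
        dst.write(chunk)
        dst.flush()
        # Ask the OS to push the data to the device so that a hard
        # freeze loses as little as possible.
        os.fsync(dst.fileno())
    return offset + len(chunk)


def mirror_forever(src_path, dst_path, interval=5.0):
    """Poll src_path every `interval` seconds and mirror new data to dst_path."""
    offset = 0
    while True:
        offset = copy_new_lines(src_path, dst_path, offset)
        time.sleep(interval)


# Hypothetical usage (paths are assumptions, adjust to your system):
# mirror_forever("/var/log/syslog", "/boot/logs/syslog-mirror.log")
```

This will not capture anything the kernel never managed to log, but it at least shows how far the log got before the freeze.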

Edited by Hoopster
Link to comment

Yeah, seems to be many people in the same boat with the i915 module on 6.9.x and 6.10 releases.

Now would probably be a good time to submit a bug report. It would also be good if the mods and community stopped blaming these hangs on hardware issues when the system shows empty logs on Intel hardware with the i915 module loaded. These are all pretty easy touchpoints to pick up on in the diagnostics report.

Link to comment
  • 2 months later...

Hello, I have been experiencing some issues with Unraid lately. The primary problem is that my Plex server (container) shows as unavailable on devices that I've shared the server with. I'm not sure what the issue is, but when I try to restart the container, the GUI hangs and nothing happens. When I then try to restart the entire server via the web GUI, that also hangs. The only way to fix it is to go to the server and do a physical reset. I'll post my info below.

 

Unraid OS 6.9.2

Motherboard:

Intel DQ87PG, Version AAG74154-401
Version PGQ8710H.86A.0030.2013.0403.1355
BIOS dated: Wed 03 Apr 2013 12:00:00 AM PDT

Processor:

Intel Core i5-4670K

 

Please know that I am not very knowledgeable about linux/docker and I have not collected any logs yet.

Link to comment
  • 1 month later...

Just adding, I was on 6.9.2 and tried 6.10-rc4.

 

Both crash almost daily with hard lock-ups that require a full reboot (with a parity check flagged afterwards)… but only on my Intel iGPU system.

 

Same as users above - nothing in logs or syslog.

 

I migrated from a 28-core dual Xeon to an i7-7700K to use QuickSync, which works amazingly well when the system doesn’t lock up (almost unlimited transcodes… which even the 28-core dual Xeon couldn’t keep up with on CPU, in software).

 

The dual Xeon was solid on 6.9.2 for months.

Since the swap to the i7-7700K, hard reboots are required as per the thread above (the i7-7700K system has been running for multiple months on Windows, with multiple Memtest86 passes and multiple days of Prime95 stress testing… so it’s not a hardware issue).

 

I’ve got both systems, as well as two licenses for unRAID and am willing to troubleshoot further… but this largely seems like a kernel issue of some sort and requires some sort of lower-level logging to solve.

 

If you need any more data, or want me to add to the bug report, I'm happy to.

Link to comment
Looks like I'll be staying on 6.8.3 for a while. I thought about upgrading my processor and motherboard to an 11th or 12th generation Intel but it sounds like even that won't guarantee a solution.

Link to comment

@Tristankin… how was the rollback to 6.8.3 from 6.10-rc’s? You mentioned some complications with dockers updating and such.

 

My 6.10-rc4 system is fully updated from all aspects, plug-ins, dockers, etc.

 

Has it been a tough process to roll back for you?

 

If I have one more full crash… I may have to consider it.

Link to comment
On 4/3/2022 at 1:22 PM, dg6464 said:

@Tristankin… how was the rollback to 6.8.3 from 6.10-rc’s? You mentioned some complications with dockers updating and such.

 

My 6.10-rc4 system is fully updated from all aspects, plug-ins, dockers, etc.

 

Has it been a tough process to roll back for you?

 

If I have one more full crash… I may have to consider it.

Rolling back was easy. 6.8.3 has been the last rock solid release for me. My only complaint is the Dockers don't show if a new release is available but I am able to force update. 

Link to comment
  • 1 month later...

Was having the same issue on 6.9.2. Almost just gave up entirely and settled on non-HW transcoding until I saw this thread. Rolling back to 6.8.3 has worked flawlessly. My thought is that it has something to do with bugginess in the new kernel version, but I could be wrong. Wish the developers would look into it as it would be nice to be able to run the most current version of unraid, but if it works it works I guess.

Link to comment
  • 4 weeks later...

I seem to have run into a similar issue recently. My Unraid server is just a data server and doesn't run 24/7, with no VMs or Docker containers, but it randomly locks up and the screen goes black. My system also just uses the Intel iGPU, from a Pentium G4560 CPU. It ran perfectly fine for days last March when I added 2 drives, which, going by the timeline, was on Unraid 6.8.3. I hadn't run 6.9.2 for long hours until last week, when I started to notice the system freezing. I thought it might be bad RAM, but my memtest is all fine and I couldn't find any notable errors in the syslog either. And in 4 attempts I haven't been able to complete a parity check without the server freezing. At the moment I'm trying 6.9.2 in safe mode to see if I can finish the parity check. After this I will try to roll back to 6.8.3.

Edited by sleanzles
Link to comment
18 hours ago, Tristankin said:

 

Let us know how you fare.

6.9.2 in safe mode froze after a few hours. I tried 6.10.2, since I cannot find 6.8.3 to download, and it ran for 19 hours before it froze with my parity check at 48%. For a second I really thought I was getting there.... Sigh...

 

Where can I get unRaid 6.8.3? :(

Edited by sleanzles
Link to comment
