
unRAID 6.6 Lockup after upgrade


cammelspit


Every two or three days since upgrading to 6.6, my server locks up and the only way out is a reboot. My pfSense VM is STILL FUNCTIONING when this happens, so the machine hasn't locked up entirely, but the unRAID UI is completely unresponsive, all dockers are unresponsive, and no shares of any kind are accessible. I am then forced to log in to the server locally and reboot from the command line. The server had been up and running for months under the previous release. I read on Reddit that some other people were having lockups that seem to be triggered by NFS shares, so as a troubleshooting measure I have disabled NFS shares, since I don't strictly need them. It will be a couple of days before I can tell whether that really is the problem. I don't have access to the GUI when this happens, so I can't pull any relevant logs. Is there some form of extra logging I can turn on so I can try to get to the bottom of this? Also, what is the easiest way to downgrade back to 6.5.3? Seems 6.6 needs some more time in the oven.

 

Sorry, I'm still learning here and am not super great with Linux so any help in figuring this out would be greatly appreciated. Any diagnostics or whatever I can do?
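On the logging question, one rough idea (a sketch only, not an unRAID feature): copy the live syslog to storage that survives a reboot at regular intervals, so the last entries before a lockup are still there afterwards. This assumes Python is available on the server, that the log lives at `/var/log/syslog`, and that the flash drive is mounted at `/boot`. `snapshot_log` is a hypothetical helper name made up for this example.

```python
import shutil
import time
from pathlib import Path


def snapshot_log(log_path, dest_dir):
    """Copy log_path into dest_dir under a timestamped name and return the new path."""
    src = Path(log_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest / f"{src.name}-{stamp}"

    # Two snapshots taken within the same second must not overwrite each other.
    counter = 1
    while target.exists():
        target = dest / f"{src.name}-{stamp}-{counter}"
        counter += 1

    shutil.copyfile(src, target)
    return target


# Assumed usage on the server (paths are assumptions, adjust to your setup):
#   while True:
#       snapshot_log("/var/log/syslog", "/boot/logs")
#       time.sleep(600)  # every 10 minutes
```

Nothing here prunes old snapshots, so on a small flash drive you would want to clear the destination folder by hand once the problem is found.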


And just minutes ago I got the notification that there is now a 6.6.1 update. If I do this update and it doesn't fix my issue, will the option to downgrade to 6.5.3 still be there? I was searching the tools and saw that there is indeed a rollback function, but I don't want to install 6.6.1 if it only keeps the last version in there, which would be 6.6, and that's not quite working for me ATM. You know, I just need to have my bases covered before I do it. I have a backup of my USB and all, but restoring that would be a hassle compared to just rolling it back.

31 minutes ago, cammelspit said:

If I do this update and it doesn't fix my issue, will the option to downgrade to 6.5.3 still be there?

 

The rollback through Tools-->Update OS will only show the 6.6.0 version after upgrading to 6.6.1. When upgrading, the previous version's files are copied into the "previous" folder on the UNRAID flash drive, and Restore simply copies those files to the flash drive root to do a rollback. Only the previous version is kept.

 

After upgrading to 6.6.1, you will lose the ability to roll back to 6.5.3 through the GUI restore method. However, 6.5.3 is still posted for download at unraid.net, and you can roll back manually to this version from 6.6.1 by copying the appropriate files to the root of your flash drive. You could also copy the contents of the current "previous" folder (which currently contains 6.5.3) to some other folder on the flash drive before upgrading, and manually roll back to these files if needed, since "previous" will contain the 6.6.0 files after the upgrade.
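For what it's worth, saving the "previous" folder before upgrading could be scripted along these lines. This is only a sketch: it assumes Python is available and that the flash drive is mounted at `/boot`. The function name `backup_previous` and the folder name `previous-6.5.3` are made up for the example. Copying the folder in a file manager does the same job.

```python
import shutil
from pathlib import Path


def backup_previous(flash_root, backup_name="previous-6.5.3"):
    """Copy <flash_root>/previous to <flash_root>/<backup_name> and return the new path."""
    flash = Path(flash_root)
    src = flash / "previous"
    dst = flash / backup_name

    if not src.is_dir():
        raise FileNotFoundError(f"no 'previous' folder under {flash}")
    if dst.exists():
        # Refuse to overwrite an earlier backup by accident.
        raise FileExistsError(f"{dst} already exists")

    shutil.copytree(src, dst)
    return dst


# Assumed usage before running the upgrade (mount point is an assumption):
#   backup_previous("/boot")
```

The original "previous" folder is left untouched, so the GUI restore keeps working as usual.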

51 minutes ago, Hoopster said:

You could also copy the contents of the current "previous" folder (which currently contains 6.5.3) to some other folder on the flash drive and manually rollback to these files if needed as "previous" will contain 6.6.0 files after the upgrade.

Thanks for the tip, I'll just back it up and do a proper rollback only if needed. It'll be a couple days before I know if it's properly stable anyways. I'll post back next week and let you know how it went. 👍


Welp, it just locked up again. My pfSense VM is still working (otherwise I would not be able to write this right now), but that VM uses its own passed-through quad-port Intel NIC. If there is a patch, why wouldn't it get applied before release? I mean, I haven't the foggiest idea how to actually do that, so I kind of rely on LT to do that sort of thing for me. Damn, I guess rolling back to 6.5.3 is my only reasonable choice. It's a shame too, because I LOVE the new CPU pinning stuff, a LOT.

23 minutes ago, cammelspit said:

If there is a patch, why wouldn't it get applied before release?

Because there is not necessarily a patch and Limetech did not know there was a potential issue with Realtek NICs until after the release of 6.6.x.  Obviously, some Linux kernel update or driver included in 6.6.x has had a negative impact on Realtek NICs in unRAID servers (if this is indeed the source of your issue).

 

The Linux driver has historically not received much attention from Realtek.  Limetech does their best to find the most compatible driver, but, in the past, Realtek drivers have been problematic for other reasons as well.

 

There is a bug report on it now, which Limetech asked to be logged, so they will do their best to look into it and see if there is a patch or driver update that can help resolve this issue, if Realtek drivers are indeed the cause of these lockups and lost network connectivity.


I ran the diagnostics and have the ZIP, so I will have to chime in and post it for them. My server uses a board with a built-in Realtek NIC. I use the Realtek for unRAID and my 4-port Intel for pfSense, which is my guess as to why the internet kept working despite unRAID itself being inaccessible. I originally did this just to avoid having to fiddle with virtual network adapters, but now I am happy I did because it makes my whole setup more robust against this kind of failure. I would bet this is the cause of my failure too, since unRAID is obviously still running (the VM didn't stop working) and only the communication going over the Realtek NIC is lost.

 

If this remains stable, I may end up just adding another Intel NIC, either the same kind or something more advanced, to work around the issue.

 

I took your advice, BTW. I copied the previous folder from the backup I made while on 6.6.0 over the previous folder that 6.6.1 created, and it reverted perfectly to 6.5.3. It seems to be working properly at the moment, but I will still update if it happens again.

 

I know my first foray into the forums here was not a great one, I'm sure you remember that one, but I do understand. Realtek is just so common I find it a little odd that they are not supported better. I'm just glad that if this were to happen during the middle of the night while my wife is working it wouldn't have interfered with her work as she needs the internet to do it. Gotta celebrate the small victories I guess. 😩

 

Thanks guys!

40 minutes ago, cammelspit said:

Realtek is just so common I find it a little odd that they are not supported better

Realtek NICs on motherboards are very common, more common than Intel. That usually comes down to cost: the Realtek chip is simply a cheaper option for MB manufacturers. Other than cost, the biggest difference is that Realtek chipsets tend to offload more processing to the CPU (software), whereas the Intel chipsets process more on the chip (hardware).

 

Realtek seems to care more about Windows than Linux as far as driver support.  They frequently update their Windows drivers whereas the Linux driver updates are far, far less frequent. 

 

My first unRAID server had a Realtek NIC and I had no issues. I have since moved on to other motherboards, and I always look for one that does not have a Realtek NIC, simply to avoid potential issues, since they tend to crop up from time to time with Linux.

 

Although he is talking about add-on NICs, this is an interesting read that still applies to MB chipsets:

 

https://dfarq.homeip.net/intel-nic-vs-realtek/


Pretty good read, actually. I was generally aware of the reputation of Realtek NICs but never saw the need to get an add-in card until I started switching out my networking equipment for pfSense, rewired my whole house with multiple runs of higher-quality cabling for future expandability, and bought some better switches. I've just never encountered this kind of problem before, though my Linux experience is limited to an old install I did in the 90s just to play around with it and a dual boot on my main PC, which I kept for about 6 months before I decided that switching back and forth just to play a game or two was too much hassle. I am pretty convinced this has to be the problem, and I will likely be spending some cash on a new NIC. This server benefits two households, so spending a little cash to get it working smoothly and reliably is WELL worth it IMO. Heck, the only reason I have this specific MB is the fact that it was free. 😁 I guess I was just not so acutely aware of the reasons why everyone always recommends Intel brand NICs for applications like this. If it can reduce my headache by even a little, I'll prolly toss cash at it till it works. 😝


It looks like Limetech has identified a patch which may fix this issue.  See the bug report for details and link to the patch.

 

[PATCH net] r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO

 

r8169 is the Realtek Linux driver included in unRAID, so, this is a Realtek-specific patch.  Could be the solution once Limetech is able to get it into an unRAID release.
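If you are not sure whether a given server is actually using r8169, the kernel exposes the driver behind each network interface through sysfs. A minimal sketch, assuming Python is available on the box and the standard Linux `/sys/class/net` layout (`nic_drivers` is a name made up for this example):

```python
import os
from pathlib import Path


def nic_drivers(sysfs_net="/sys/class/net"):
    """Return {interface name: kernel driver name} for interfaces backed by a device."""
    drivers = {}
    for iface in sorted(Path(sysfs_net).iterdir()):
        link = iface / "device" / "driver"
        # Virtual interfaces such as lo have no device/driver symlink.
        if link.is_symlink():
            drivers[iface.name] = os.path.basename(os.readlink(link))
    return drivers


# Assumed usage on the server:
#   for name, driver in nic_drivers().items():
#       print(name, driver)
```

An interface that reports `r8169` is on the Realtek driver this patch targets. An Intel card will show a different driver name.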

