
Unraid Server Crashing Constantly After Updating to Latest Unraid OS


Mwidders


I am by no means an expert at Unraid; I would consider myself a novice at best.  I was a Windows Server admin, so I understand many of the concepts, but Linux has never been my strongest area.  That said, I love Unraid: once I got it going, it has been a breeze compared to Windows / Windows Server, as all Microsoft products are complete trash these days... but I digress.

 

I do not know what is happening, but after upgrading to the latest Unraid version my system crashes constantly, and I have to physically use the reset button to get it back up, only to have it crash again and again.  I even downgraded to the previous version, which WAS stable for me, but the issue has followed me back to it too.

 

I have attached the diagnostics. Can someone help?  There could be more than one issue, as I spent a lot of time tinkering over the past year figuring out how to make this work for me.  If you see any other issues in the diagnostics, I would not be surprised, and I would love any feedback on improvements or changes I need to make.  Thanks in advance to everyone who helps!

 

RJW

plexserver-diagnostics-20230906-0014.zip


I have the same problem. Upgraded to 6.12.4 and the system crashes repeatedly after about 12 hrs. The only solution is a hard restart.

Previous versions were stable and ran 24/7 constantly.

 

Attached are the diagnostics file and the syslog, including the latest crash.

The system stopped at 08:12 and was restarted at 13:57.

I am rolling back to see if it stabilises.

 

Any assistance gratefully received.

Thank you.

sentinel-diagnostics-20230906-1406.zip syslog-192.168.1.110 (4).log


No, it has been off for months.  Could it be using it even though I turned it off?  I promise you I turned that thing off more than a few months ago, and when I checked it was still off.  See screenshot.  I just turned on the logs you mentioned; it is resetting more often.  I'll get you those diagnostics next.

 

If the macvlan thing is an issue, would it be best to make a new flash drive and just redo my containers in Docker?  I have no VMs or anything else, and I've done it before, as it kind of 'cleaned' out my Unraid.  Not sure if that is a viable solution, but I did so many bad things while learning. ;)

MACVLAN.png

9 hours ago, Mwidders said:

I am not calling you a liar at all, BUT I swear I have not had that on for months. I used it for 5 minutes, realized it was a mistake, turned it off, and have not touched it since.

This is the problem with multiple users in the same thread: if you look at my first reply, the part about macvlan was addressed to the other user.

 

To you I asked:

On 9/6/2023 at 10:34 AM, JorgeB said:

Enable the syslog server and post that after a crash.

 


I'm having the same issue. I was completely stable on 6.11.5 and finally bit the bullet and upgraded to 6.12.4.

 

Now my server becomes unresponsive every day, and I'm forced to hold the power button down to shut it down.

 

Diagnostics and syslogs attached. Please help me. My family is becoming very annoyed that Plex/Chat/Mealie/HomeAssistant/etc. keep going down for them.

athena-syslog-20230913-1621.zip athena-diagnostics-20230913-0921.zip


Funny you mention that.  I have not turned on the pool at all since yesterday, and thus far it has not crashed.  If it still has not crashed by the end of today, what does that tell me?  That it is likely an HDD, or could it just as easily be a Docker app problem (on my end), in which case the next step is to try each Docker app one at a time?  Or should I just start the pool, wait and see, and only then try each app?

 

Is there no better way to get data on what is crashing? Just any way I can narrow this thing down some.  Was my syslog not helpful?  I believe I can get more logs if I rotate them more often?
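
One rough way to narrow this down from a saved syslog is to look for the biggest jump between consecutive timestamps: the last lines before that jump are what the server logged just before it hung. Below is a minimal Python sketch, assuming classic syslog lines that begin with a `Mon DD HH:MM:SS` timestamp. The function name `largest_gap`, the default year, and the sample lines are all made up for illustration and are not taken from any attached log.

```python
from datetime import datetime, timedelta


def largest_gap(lines, year=2023):
    """Return (gap, last_line_before, first_line_after) for the biggest time gap.

    Assumes each log line starts with a 15-character "Mon DD HH:MM:SS" stamp.
    The year is not stored in that format, so it is supplied as a parameter.
    """
    best = (timedelta(0), None, None)
    prev_time, prev_line = None, None
    for line in lines:
        text = line.rstrip("\n")
        try:
            stamp = datetime.strptime(f"{year} {text[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line; skip it
        if prev_time is not None and stamp - prev_time > best[0]:
            best = (stamp - prev_time, prev_line, text)
        prev_time, prev_line = stamp, text
    return best


# Invented sample lines (hypothetical host name and messages).
sample = [
    "Sep  6 08:11:58 tower kernel: example message A",
    "Sep  6 08:12:01 tower kernel: example message B",
    "Sep  6 13:57:10 tower kernel: example message C",
    "Sep  6 13:57:12 tower kernel: example message D",
]

gap, before, after = largest_gap(sample)
print("Longest silence:", gap)
print("Last line before it:", before)
print("First line after it:", after)
```

To try it on a real log, pass an open file object instead of the sample list. This only points at where in the log to look; it does not say what caused the hang, and a log that spans New Year would need extra handling.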

 

It would be nice if I had a clue as to which device is causing the issue, or whether it is an HDD at all, because if it is, I would start replacing the old drives (which needs to be done anyhow).


One more thing: I started running extended SMART self-tests just for the heck of it.  One of my cache drives reported this:

Warning: ATA error count 0 inconsistent with error log pointer 1
ATA Error Count: 0
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

 

Not sure if that matters, but none of the others had it.
