Unraid Server Crashing Constantly After Updating to Latest Unraid OS

Mwidders · September 6, 2023

I am by no means an expert at Unraid; I would consider myself a novice at best. I was a Windows Server admin, so I understand many of the concepts, but Linux has never been my strongest focus or understanding. That said, I love Unraid: once I got it going, it has been a breeze in comparison to Windows / Windows Server, as all Microsoft products are complete trash these days... but I digress.

I do not know what is happening, but after upgrading to the latest Unraid version my system is constantly crashing, and I have to physically use the reset button to get it back up, only to have it do it again and again. I even downgraded to the last version, which WAS stable for me, but now this issue has followed it backwards too.

I have attached the diagnostics; can someone help? There could be more than one issue, as I spent a lot of time tinkering over the past year figuring out how to make this work for me. If you see any other issues in these diagnostics I would not be surprised, and I would love any feedback on improvements or changes I need to make. Thanks in advance to everyone who helps!

RJW

plexserver-diagnostics-20230906-0014.zip

Rweng009 · September 6, 2023

I have the same problem. Upgraded to 12.4 and the system crashes repeatedly after about 12 hrs. The only solution is a hard restart.

Previous versions were stable and ran 24/7 constantly.

Attached are the diagnostic file and syslog, including the latest crash.

System stopped 08:12, restarted 13:57.

Am rolling back to see if it stabilises.

Any assistance gratefully received.

Thank you.

sentinel-diagnostics-20230906-1406.zip syslog-192.168.1.110 (4).log

JorgeB · September 6, 2023

2 hours ago, Rweng009 said:

I have the same problem. Upgraded to 12.4 and the system crashes repeatedly after about 12 hrs.

You are having macvlan call traces; change the Docker network type to ipvlan.
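For anyone who wants to check a saved syslog for this themselves, here is a minimal sketch in Python. It assumes kernel call traces show up as a line containing "Call Trace" and that macvlan-related ones mention "macvlan" within a few lines of it; the function name `find_macvlan_traces` and the `window` parameter are made up for illustration, not part of any Unraid tool.

```python
import sys


def find_macvlan_traces(lines, window=30):
    """Return indices of 'Call Trace' lines that have 'macvlan' nearby.

    Assumption: a macvlan-related trace mentions 'macvlan' within
    `window` lines before or after the 'Call Trace' line.
    """
    hits = []
    for i, line in enumerate(lines):
        if "Call Trace" in line:
            start = max(0, i - window)
            nearby = lines[start:i + window + 1]
            if any("macvlan" in entry.lower() for entry in nearby):
                hits.append(i)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py /path/to/saved/syslog
    with open(sys.argv[1], errors="replace") as fh:
        log = fh.read().splitlines()
    for idx in find_macvlan_traces(log):
        print(f"line {idx + 1}: {log[idx]}")
```

If this prints nothing, it only means no trace matched the assumed pattern, not that the network setting is fine; the diagnostics are still the better source.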

4 hours ago, Mwidders said:

I have attached the diagnostics; can someone help?

Enable the syslog server and post that after a crash.

Mwidders · September 9, 2023

Thanks, Jorge, for answering me. I do not believe I have macvlan; I turned that thing off months ago, but I will confirm it is gone. I will enable the syslog server now and get you the diagnostics from that tomorrow, as it now crashes at least once every 24 hours. Thanks again!

JorgeB · September 10, 2023

11 hours ago, Mwidders said:

I do not believe I have macvlan; I turned that thing off months ago

It was still using macvlan in the diags posted.

Mwidders · September 10, 2023

No, it has been off for months. Could it be using it even though I turned it off? I promise you I turned that thing off more than a few months ago, and when I checked it was still off. See screenshot. I just turned on the logs you mentioned; it is resetting more often. I'll get you those diagnostics next.

If the macvlan thing is an issue, would it be best to make a new flash drive and just redo my Docker containers? I have no VMs or anything else, and I've done it before, as it kind of 'cleaned' out my Unraid. Not sure if that is a viable solution, but I did so many bad things while learning.

Mwidders · September 10, 2023

I actually have the logs on already. Where do I get them?

Mwidders · September 10, 2023

Never mind, that was an idiotic question. I will send it as soon as it crashes again, which will not be long.

Mwidders · September 10, 2023

Jorge,

I am not calling you a liar at all, BUT I swear I have not had that on for months. I used it for 5 minutes, realized it was a mistake, turned it off, and have not touched it since. Here, let me post these diagnostics from 5 minutes ago; it is absolutely still off, I promise.

plexserver-diagnostics-20230910-1615.zip

JorgeB · September 11, 2023

9 hours ago, Mwidders said:

I am not calling you a liar at all, BUT I swear I have not had that on for months. I used it for 5 minutes, realized it was a mistake, turned it off, and have not touched it since.

This is the problem with multiple users in the same thread. If you look at my first reply, the part about macvlan was a reply to the other user.

To you I asked:

On 9/6/2023 at 10:34 AM, JorgeB said:

Enable the syslog server and post that after a crash.

Mwidders · September 12, 2023

Ahhhh, sorry buddy, totally missed that. Let me get you the proper files. I am not that dumb, I am just being lazy in my reading ;). Again, I appreciate the patience. It is happening 24/7 now. Getting logs.

Mwidders · September 12, 2023

I keep trying to set up syslog like you suggested, but it keeps crashing and nothing is being saved. It's on, I know it. I have the correct folder path, I tried a cache drive as suggested, and I even turned on the flash drive option, because I cannot get a log for the life of me. What do I do?

Mwidders · September 12, 2023

Not sure if this helps:

https://docs.google.com/document/d/1L6i5fQnfl0HpLrxReD6pa1wIVeVA0eO788VHJqJbbIE/edit?usp=sharing

It is the syslog I grabbed; it has a few red entries in there and a few repetitive yellow ones...

JorgeB · September 13, 2023

Is that from the syslog server? It looks like the standard syslog. Also, next time please attach the file here.

Gazeley · September 13, 2023

I'm having the same issue. I was completely stable on 6.11.5 and finally bit the bullet and upgraded to 6.12.4.

Now my server becomes unresponsive every day, and I'm forced to hold the power button down to shut it down.

Diagnostics and Syslogs attached. Please help me. My family is becoming very annoyed that Plex/Chat/Mealie/HomeAssistant/etc keep going down for them.

athena-syslog-20230913-1621.zip athena-diagnostics-20230913-0921.zip

JorgeB · September 13, 2023

14 minutes ago, Gazeley said:

I'm having the same issue.

Please create your own thread to avoid confusion, since this one is still active. Before that, enable the syslog server, then post the resulting log there after a crash.

Mwidders · September 15, 2023

Here we go. I had to use USB; it wasn’t saving on any other shares, including cache drives.

syslog

Mwidders · September 15, 2023

Again, Jorge, thank you so much for being so patient. I’m leaning towards hardware, but hopefully the logs will let you tell me what is up.

JorgeB · September 15, 2023

There are some call traces, and they do look more hardware related to me. One thing you can try is to boot the server in safe mode with all Docker/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning on the other services one by one.

Mwidders · September 15, 2023

Funny you mention that. I have not turned on the pool at all since yesterday, and thus far it has not crashed. If by the end of today it still has not crashed, what does that tell me? That it is likely an HDD, or could it just as easily be a Docker app problem (on my end), with the next step being to try each Docker app one at a time? Or should I just start the pool, wait and see, and then after that try each app?

Is there no better way to get data on what is crashing? Just any way I can narrow this thing down some. Was my syslog not helpful? I believe I can get more logs if I rotate them more often?

It would be nice to have a clue as to which device is causing the issue, or whether it is an HDD at all, because if it is then I would start replacing the old drives (which needed to be done anyhow).
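One way to get a bit more out of a persisted syslog is to look at what was logged right before each reboot. Below is a minimal Python sketch of that idea. It assumes the saved file covers several boots and that each boot begins with the kernel's "Linux version" banner line; the function name `last_lines_before_each_boot` and its parameters are hypothetical, chosen only for this example.

```python
import sys


def last_lines_before_each_boot(lines, tail=20, boot_marker="Linux version"):
    """Split a persisted syslog into boot sessions and return the last
    `tail` lines of every session that was followed by another boot.

    Assumption: a line containing `boot_marker` marks the start of a boot.
    """
    tails = []
    session = []
    for line in lines:
        if boot_marker in line and session:
            # A new boot started: keep the end of the previous session.
            tails.append(session[-tail:])
            session = []
        session.append(line)
    return tails


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py /path/to/persisted/syslog
    with open(sys.argv[1], errors="replace") as fh:
        log = fh.read().splitlines()
    for n, chunk in enumerate(last_lines_before_each_boot(log), start=1):
        print(f"--- last lines before reboot #{n} ---")
        print("\n".join(chunk))
```

After a hard lockup the final lines may never reach the file, so an empty or unremarkable tail does not rule anything out; it is just a quick way to see whether the same messages keep showing up before each crash.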

Mwidders · September 15, 2023

One more thing: I started running extended SMART self-tests just for the heck of it. One of my cache drives said this:

Warning: ATA error count 0 inconsistent with error log pointer 1
ATA Error Count: 0
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes, SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Not sure if that matters, but none of the others had it.

JorgeB · September 15, 2023

Post the SMART report.

Mwidders · September 15, 2023

I have two drives that appear to have errors...

unraidplexserve-smart-20230915-0919.zip unraidplexserve-smart-20230915-0918.zip

JorgeB · September 15, 2023

Both look fine to me. Note that you cannot run SMART tests on NVMe devices.

Mwidders · September 16, 2023

How about this one?

syslog
