unraid goes offline - starting to happen more frequently



Recently, my Unraid box has started going offline. I am unsure whether it is a hardware or software issue, and I don't know whether the whole system goes down or just the networking. It happens randomly, and I haven't been able to tie it to anything specific. My concern is that when this happens, the only solution is to power cycle the box, which I know isn't good for the health of the data array.

 

Today I hooked up a monitor so I can better determine what is causing this. I also figured I would post my diagnostics here in case they show something going on behind the scenes.

 

Any insight would be really helpful. 

unraid-diagnostics-20230531-1133.zip

Edited by dgtlman
2 hours ago, JorgeB said:

Enable the syslog server and post that after it happens again.

Hi All,

 

My server randomly started doing the same today, after completing a data rebuild that followed an HDD upgrade.

 

The server seems to remain powered on, but it shows as offline and the GUI is unreachable. Only a hard reboot solves it.

 

I enabled the syslog server after the second time it happened, and about an hour later it happened again.

 

I honestly have no idea what to look for or where to start. Any help would be much appreciated.

syslog

olympus-diagnostics-20230531-1927.zip

Edited by Olympus_Media

UPDATE

 

After the second outage and forced reboot, the server remained online for an hour, went offline again, and then came back after a couple of minutes. My internet did not go offline. The server threw an unclean shutdown notification, even though it didn't lose power, and started a parity check.

 

It is currently almost 4 hours into the parity check and has not gone offline yet. All I have done in the meantime is upgrade from 6.11.1 to 6.11.5.


Here is the syslog. I also noticed on the monitor I had hooked up that the system went down with a kernel panic. Hopefully this log shows what is causing that and the easiest way to resolve it.

 

The kernel panic happened on 6/1. I was unable to post this until now, so the log has captured more entries since then. Sorry if this adds to the complexity of figuring things out.

 

Thanks

syslog-192.168.1.50.log

Edited by dgtlman
clarification
Jun  2 01:41:10 iron kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Jun  2 01:41:10 iron kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]

 

Macvlan call traces are usually the result of having Docker containers with a custom IP address, and they will end up crashing the server. Switching to ipvlan should fix it: Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right).
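If you want to check a saved syslog for these traces yourself, something like the following Python sketch might help. It is only an illustration: the function name `find_macvlan_traces` is made up here, and it assumes the trace lines carry the `[macvlan]` module tag exactly as in the two lines quoted above.

```python
import re

# Matches kernel log lines whose stack frame is attributed to the macvlan
# module, e.g.
# "Jun  2 01:41:10 iron kernel: macvlan_broadcast+0x10a/0x150 [macvlan]"
MACVLAN_RE = re.compile(r"kernel:.*\[macvlan\]\s*$")


def find_macvlan_traces(lines):
    """Return the log lines that reference the macvlan kernel module."""
    return [line.rstrip("\n") for line in lines if MACVLAN_RE.search(line)]


if __name__ == "__main__":
    # The first two lines come from the posted syslog; the third is a
    # hypothetical unrelated line added only to show that it is filtered out.
    sample = [
        "Jun  2 01:41:10 iron kernel: macvlan_broadcast+0x10a/0x150 [macvlan]\n",
        "Jun  2 01:41:10 iron kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]\n",
        "Jun  2 01:41:11 iron kernel: hypothetical unrelated message\n",
    ]
    for hit in find_macvlan_traces(sample):
        print(hit)
```

To run it against a real file, pass an open file object instead of the sample list, for example `find_macvlan_traces(open("syslog-192.168.1.50.log"))`.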

11 hours ago, JorgeB said:

Last logged call trace is from June 2nd:

 

Jun  2 01:41:10 iron kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Jun  2 01:41:10 iron kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]

 

 

How could it have been macvlan causing it on 6/4 if the log you are referencing is from 6/2?


I meant that there aren't any call traces after that date, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
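To see for yourself where the last call trace sits in the log relative to a crash, a rough Python sketch like this could be used. The name `last_trace_line` is hypothetical, and the pattern assumes traces show up either as the kernel's "Call Trace:" header or as frames of the form `symbol+0xOFFSET/0xSIZE`, like the macvlan lines quoted earlier.

```python
import re

# A kernel stack frame looks like "symbol+0xOFFSET/0xSIZE", optionally
# followed by a "[module]" tag; "Call Trace:" is the header line the
# kernel prints before the frames.
TRACE_RE = re.compile(r"kernel:.*(Call Trace:|\S+\+0x[0-9a-f]+/0x[0-9a-f]+)")


def last_trace_line(lines):
    """Return the last line that belongs to a kernel call trace, or None."""
    last = None
    for line in lines:
        if TRACE_RE.search(line):
            last = line.rstrip("\n")
    return last


if __name__ == "__main__":
    # The first two lines are from the posted syslog; the third is a
    # hypothetical later line with no trace in it.
    sample = [
        "Jun  2 01:41:10 iron kernel: macvlan_broadcast+0x10a/0x150 [macvlan]\n",
        "Jun  2 01:41:10 iron kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]\n",
        "Jun  4 09:00:00 iron kernel: hypothetical message without a trace\n",
    ]
    print(last_trace_line(sample))
```

If the timestamp on the returned line is well before the most recent crash, the log holds no software trace for that crash.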
