Random crashing after long-term stability


nomisco


Looking for some pointers as to what may be causing my server to crash. At the moment, from memory, it will run for between a day and 7 to 10 days before crashing.

 

It's a headless machine, so as it's sat behind my TV, I've connected it to the HDMI port and added a keyboard to see if anything happens onscreen when it crashes. It just sits at the login prompt (untouched since boot), but that is also unresponsive to the keyboard.

 

There are plugins on the machine which may not have been on there when it was stable, but the nature of the crashing makes it difficult to rule any out by removing them, as I don't know which ones it could be - if any.

 

I'm a total Linux noob, so I set the log to be mirrored to the flash drive. Looking at the syslog file in the logs directory does not show any event when it crashes (I've taken the flash drive to my Windows PC to look at the file). Attached are the system diagnostics, plus the syslog file from just after it crashed, which I may have inadvertently deleted from the flash before rebooting the server.

 

Just to add that I've switched between the stable and RC releases with the same outcome.

 

unraid-diagnostics-20211222-0932.zip syslog

Edited by nomisco
Link to comment

Thank you Jorge. I only use that driver for Plex transcoding but can do without. I remember adding some lines to a config file or two, but can't remember how I'd remove them.

 

I suppose a reboot of the server would be required after they're removed?

 

Update: I removed some lines from the 'Go' file:

 

#enable module for iGPU and perms for the render device
modprobe i915
chown -R nobody:users /dev/dri
chmod -R 777 /dev/dri

 

And also the relevant lines from the Plex container.

 

I'll give this a try, and report back either way, but it may be a week! Any other suggestions in the meantime would be welcome.
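For anyone unsure how to undo this kind of change, the Go file edit described above can be sketched as a small shell function. This is only an illustration: the function name `disable_igpu_lines` is made up, and the default path `/boot/config/go` is an assumption about where the Go file sits on the flash drive. It comments the lines out rather than deleting them and keeps a `.bak` copy, so re-enabling the driver later is just a matter of restoring the backup.

```shell
#!/bin/bash
# Sketch: comment out the iGPU lines in an Unraid 'go' file, keeping a backup.
# The default path /boot/config/go is an assumption; pass another path to override it.

disable_igpu_lines() {
    local go_file="${1:-/boot/config/go}"

    # Refuse to run if a backup already exists, so an earlier backup
    # is never overwritten by an already-edited file.
    if [ -e "$go_file.bak" ]; then
        echo "backup $go_file.bak already exists, not touching $go_file" >&2
        return 1
    fi

    # Keep an untouched copy so the change is easy to undo.
    cp -- "$go_file" "$go_file.bak" || return 1

    # Read from the backup and rewrite the original, prefixing the
    # 'modprobe i915' line and any chown/chmod line that mentions /dev/dri
    # with '# '. All other lines are copied through unchanged.
    sed -E \
        -e '/^[[:space:]]*modprobe[[:space:]]+i915/ s/^/# /' \
        -e '/^[[:space:]]*ch(own|mod)[[:space:]].*\/dev\/dri/ s/^/# /' \
        "$go_file.bak" > "$go_file"
}
```

Calling `disable_igpu_lines` with no argument would act on the assumed default path; passing a copy of the file first is a safer way to check the result. Since the Go file is only run at boot, the edit has no effect until the next restart.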

Edited by nomisco
Update info
Link to comment

Unfortunately, the server crashed again in the night, just after 3AM. I've attached the log from the flash, which shows just over an hour and twenty minutes of errors. The midnight spin-up will be for the Plex library scan; the disks then go to sleep until an hour and ten minutes later, when I get this massive block of errors that continues until the box dies. I can't see what is causing the errors, but it may be more obvious to someone here. Hopefully.
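A large block of errors like this is easier to read once it is condensed. As a rough sketch, assuming the mirrored file uses the usual syslog layout (month, day, time, host, message), the function below counts suspicious lines per hour so the point where the trouble starts stands out. The function name `errors_per_hour` and the keyword list are my own assumptions, not anything Unraid-specific, and the keywords will need adjusting to whatever the log actually contains.

```shell
#!/bin/bash
# Sketch: count error-like syslog lines per hour.
# Assumes lines start with "Mon DD HH:MM:SS host ..." (classic syslog layout).

errors_per_hour() {
    local log_file="$1"
    # Keyword list is a guess; extend it to match the messages in your log.
    local pattern='error|fail|call trace'

    # Keep only matching lines (case-insensitive), then group them by
    # month, day and hour, printing the groups in order of first appearance.
    grep -iE "$pattern" -- "$log_file" |
        awk '{
            split($3, t, ":")
            key = $1 " " $2 " " t[1] ":00"
            if (!(key in count)) order[++n] = key
            count[key]++
        }
        END {
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }'
}
```

Running `errors_per_hour syslog.txt` on a copy of the file prints one line per hour that contains matches, with the number of matching lines at the end. An hour with a sudden jump in the count is a reasonable place to start reading the full log.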

 

Or perhaps someone can suggest a more verbose logging effort if they have a suspect in mind.

 

Thanks

 

 

 

syslog.txt

Link to comment

Two days ago a mate of mine's Unraid server became unresponsive, then this morning mine did. The same thing happened a couple of weeks ago: his server died, then the next day mine did. Very odd.

 

His is running an AMD CPU and mine an Intel. Both are on 6.10.0-rc2.

 

If it happens again, what should I grab? The syslog at the moment shows nothing interesting.
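Not an authoritative answer, but one low-risk thing to grab is a timestamped copy of the syslog while the box is still responsive, since the in-memory log is lost on reboot. A minimal sketch follows; the function name `save_syslog_snapshot` is hypothetical, and the defaults `/var/log/syslog` and `/boot/logs` are assumptions about where the live log and the flash drive's logs directory are found.

```shell
#!/bin/bash
# Sketch: copy the current syslog to a timestamped file.
# Default source and destination are assumptions; pass your own paths to override.

save_syslog_snapshot() {
    local src="${1:-/var/log/syslog}"
    local dest_dir="${2:-/boot/logs}"
    local stamp target

    # Timestamp in the file name keeps earlier snapshots from being overwritten.
    stamp="$(date +%Y%m%d-%H%M%S)"
    target="$dest_dir/syslog-$stamp.txt"

    mkdir -p -- "$dest_dir" || return 1
    cp -- "$src" "$target" || return 1

    # Print the path of the copy so it can be attached to a post.
    printf '%s\n' "$target"
}
```

Calling `save_syslog_snapshot` with no arguments uses the assumed defaults; `save_syslog_snapshot /var/log/syslog /tmp/snapshots` writes the copy elsewhere.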

 

Link to comment
  • 3 weeks later...

Thought it might be worth an update:

 

Now over two weeks of uptime without anything unusual in the logs. I re-enabled the i915 driver about a week ago and it's being used for Plex HW transcoding - the same as the last couple of years.

 

I may reinstall the MyServers Plugin to see what happens. I am still running 6.10 RC2.

Link to comment
