Server hangs completely at random times



UnRAID version: 6.5.3 

 

Plugins:

Community Applications: 2018.07.22

Fix Common Problems: 2018.07.28

Nerd Tools: 2018.02.17 (screen installed)

Unassigned Devices: 2018.06.01a

 

Docker apps:

binhex-krusader (not started)

deluge

nginx

nzbget

PlexMediaServer

radarr

sonarr

 

Hardware:

Motherboard: ASRock H110M-ITX

CPU: i3 6100T (stock cooler, not overclocked)

RAM: 2x 4GB 2133MHz

Hard drives:

* 3x 4TB Seagate Barracuda 3.5" (1 used for parity)

* 1x 4TB Seagate IronWolf

No GPU installed

 

---

 

Since I first set up my server, I've been seeing random, complete server hangs. None of the Docker containers are reachable, nor is the GUI, and I can't log in at the console: I enter the username and never get a password prompt. I have to perform a hard shutdown and turn it back on. It seems to happen every 3-4 days or so. The last time it happened I put it into Diagnostics mode, so I've attached the .zip and the syslog. I'm usually not using the machine (it sits untouched on a desk somewhere in my house), so I often don't realise it has hung until I go to do something with one of the Docker apps.

 

My network sits behind a pfSense device, so the only ways to access the server are via VPN or from a machine physically on the local network. As far as I can see, there are never any errors shown on the console: the IP address and username prompt are always the last things displayed.

 

I have not yet tried safe mode, and I haven't found any reproduction steps yet (sorry!).

FCPsyslog_tail.txt

htpc-diagnostics-20180730-1836.zip
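
One way to narrow down when a hang like this starts is to keep a recent copy of the syslog tail on storage that survives a hard reset. Below is a minimal Python sketch of that idea, not a tested Unraid tool. The function names (`tail_lines`, `snapshot`, `run`), the line count, and the interval are all hypothetical choices for illustration, and the source and destination paths are left to the caller because they depend on the system.

```python
import os
import time
from collections import deque
from pathlib import Path


def tail_lines(path, count=200):
    """Return the last `count` lines of a text file (newlines kept)."""
    with open(path, "r", errors="replace") as fh:
        return list(deque(fh, maxlen=count))


def snapshot(src, dest, count=200, now=None):
    """Write a timestamped copy of the log tail from `src` to `dest`."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    dest = Path(dest)
    tmp = dest.parent / (dest.name + ".tmp")
    with open(tmp, "w") as fh:
        fh.write(f"# snapshot taken {stamp}\n")
        fh.writelines(tail_lines(src, count))
        fh.flush()
        os.fsync(fh.fileno())  # ask the OS to write the data out now
    os.replace(tmp, dest)  # swap the finished file in for the old one


def run(src, dest, interval_s=300, rounds=None):
    """Take a snapshot every `interval_s` seconds (rounds=None: forever)."""
    done = 0
    while rounds is None or done < rounds:
        snapshot(src, dest)
        done += 1
        time.sleep(interval_s)
```

Something like `run("/var/log/syslog", "/boot/logs/syslog_tail.txt")` would be the obvious call, but both paths are assumptions about the setup and should be checked first. After a forced reboot, the header line of the last snapshot gives an upper bound on how long the machine had been hung. A long interval keeps writes to the flash drive low.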

Link to comment

Hi @Daniel Samuels,

 

Welcome to the forum! LOTS of great help here and folks are friendly.

So, I realize this will not be a direct answer, but it has been my experience that when things go "randomly" awry, it is almost always hardware related.

I recently had a similar situation on a new build and discovered that the USB port I had my flash drive in was defective.

The system "appeared" to boot, but then showed all kinds of weirdness, like what you are describing.

Until one of the Pros gets a chance to review your Diags file, you may want to take a look at some of the hardware.

Again, this is not a point and shoot answer, just experience in general saying, look to the hardware first when there is randomness in the error(s).

I'm sure it will all work itself out once the Pros get a chance to chime in, they really are great.

 

Link to comment

I had random crashes and reboots on my Ryzen 1800 desktop system running Windows 10. It could be idling with nothing running, or in the middle of a graphics-intensive game, and it would just reboot or crash. I was unable to reproduce it consistently. I stress tested it for hours, and sometimes it would crash and sometimes not. I started monitoring and logging CPU temperatures. I had recorded temperatures on this system a year earlier, and the current CPU temperatures were slightly higher at idle and significantly higher under load. So I pulled my CPU heat sink and applied new thermal paste. Afterwards the temperatures were slightly lower at idle and much lower under load.

 

I no longer have any random reboots or crashes. Thermal paste degrades over time? Misaligned CPU heat sink? Not sure, but it fixed it.
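
For a Linux-based server like the one in the original post, the same kind of temperature logging could be sketched in Python as below. This is only an illustration under assumptions: it relies on the kernel exposing sensors under `/sys/class/thermal` (not every board does, and which zone corresponds to the CPU varies), and the function names and CSV layout are made up for this example.

```python
import csv
import time
from pathlib import Path


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone."""
    temps = {}
    for temp_file in sorted(Path(base).glob("thermal_zone*/temp")):
        try:
            # The kernel reports the value in millidegrees Celsius.
            raw = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
        temps[temp_file.parent.name] = raw / 1000.0
    return temps


def append_sample(log_path, temps, now=None):
    """Append one timestamped CSV row per zone to `log_path`."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(log_path, "a", newline="") as fh:
        writer = csv.writer(fh)
        for zone, celsius in sorted(temps.items()):
            writer.writerow([stamp, zone, f"{celsius:.1f}"])


def run_logger(log_path, interval_s=60, samples=None):
    """Log all zones every `interval_s` seconds (samples=None: forever)."""
    taken = 0
    while samples is None or taken < samples:
        append_sample(log_path, read_zone_temps())
        taken += 1
        time.sleep(interval_s)
```

Calling `run_logger("cpu_temps.csv")` (a hypothetical file name) during both idle periods and a stress test gives a baseline to compare against later, which is how the rise in load temperatures described above was spotted.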

 

Link to comment
43 minutes ago, perPLEXed said:

Thermal paste degrades over time? Misaligned CPU Heat-sink?

 

Some thermal paste degrades a lot over time, but I think most thermal paste works quite well for the expected lifetime of the system - many industrial systems are expected to work well for 10-20 years without need for replacing any thermal paste.

 

It's more likely that there was an alignment issue or that not all of the chip was covered with thermal paste. Either of these could result in a big variation in temperature between different parts of the chip, potentially making one part hot enough to become unstable while the temperature sensor still sees a temperature that does not require throttling.

