
Strange crashes on Dell R510



About once a day (though not at a consistent time of day), I'm seeing a very strange kind of... crash? The server drops off the network: UnRAID's webUI and the web UIs of the Docker containers stop responding, and the server no longer responds to ping. From the server's console (which I can still reach over iDRAC), I can't even ping the gateway. Oddly, the ping command prints nothing while it's running, only the statistics when I hit Ctrl-C. When I run `diagnostics`, it prints "Starting diagnostics collection..." but never gets any further; it stays there indefinitely and never writes the diagnostics .zip file. My only recourse is to force a reboot.

 

Attached is the diagnostics file pulled from the machine after I rebooted it the last time this happened (about 30 minutes ago), though I know that's not nearly as useful as the diagnostics pulled immediately after it fails. I would pull those if I could!

 

The server is a Dell PowerEdge R510 with these specs:

 

2x Intel Xeon X5670 (purchased from ServerMonkey, originally 2x L5640)

128GB RAM (8x16GB purchased from ServerMonkey, originally an oddball 40GB config)

Dell H200 flashed to IT mode (by previous owner)

2 x WD 8TB WD80EMAZ drives shucked from WD Easystore enclosures, 1 data and 1 parity

Samsung 850 EVO 500GB, cache

OCZ Rally2 16GB connected to internal USB port, boot

iDRAC6 Enterprise in Shared NIC mode.

 

I've run an all-night memtest, which it passed. It has also passed Dell's onboard hardware diagnostics.

 

Docker containers (running; I also have a few that sit stopped):
abiosoft/caddy:php-no-stats
linuxserver/plex
linuxserver/sonarr
linuxserver/radarr
binhex/arch-delugevpn
tautulli/tautulli
linuxserver/jackett
siwatinc/homebridge_gui_unraid

 

I haven't yet tried running without the Docker containers; I just now disabled Docker to see whether that helps at all.


Plugins:

CA Auto Update Applications
CA Backup / Restore Appdata
CA Cleanup Appdata
Community Applications
ControlR
Dynamix SSD TRIM
Dynamix System Temperature
Fix Common Problems
Nerd Tools
Network Statistics
Preclear Disks
ssh Plugin
Statistics
Theme Engine
Unassigned Devices
Unassigned Devices Plus
unBALANCE
User Scripts

 

Thank you in advance for any help you can provide!

jarvis-diagnostics-20200419-1738.zip


OK, so I'm coming up on 48 straight hours with the box up and running fine with Docker turned off. Granted, the box has seen almost no use, given that Plex and my Caddy web server aren't running. Nothing at all out of the ordinary has appeared on my syslog server. 

I'm currently debating whether to spin all of my Docker containers back up and wait for something juicy to hit the syslog, or spin up just a few containers and gradually add them (with perhaps a day's wait in between) to see if I can narrow down what might be causing the issue.
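
For the gradual option, here is a minimal sketch in Python of how a staged plan could be written down so that a crash can be matched to the most recently added containers. It only builds a schedule and does not touch Docker. The function names `staged_plan` and `first_suspects`, the start date, the batch size, and the one-day wait are all illustrative assumptions, and the container order is arbitrary.

```python
from datetime import date, timedelta

# Running containers listed in the first post; the order is arbitrary.
CONTAINERS = [
    "abiosoft/caddy:php-no-stats",
    "linuxserver/plex",
    "linuxserver/sonarr",
    "linuxserver/radarr",
    "binhex/arch-delugevpn",
    "tautulli/tautulli",
    "linuxserver/jackett",
    "siwatinc/homebridge_gui_unraid",
]


def staged_plan(containers, batch_size=1, soak_days=1, start=None):
    """Return a list of (stage_start, newly_enabled, all_running) tuples.

    Each stage enables `batch_size` more containers and then waits
    `soak_days` days before the next stage begins.
    """
    start = start or date.today()
    plan = []
    for i in range(0, len(containers), batch_size):
        stage_index = i // batch_size
        stage_start = start + timedelta(days=stage_index * soak_days)
        newly_enabled = containers[i:i + batch_size]
        all_running = containers[:i + batch_size]
        plan.append((stage_start, newly_enabled, all_running))
    return plan


def first_suspects(plan, crash_date):
    """Return the containers enabled in the latest stage on or before crash_date.

    These are only the first candidates to look at, not proof of a cause.
    """
    suspects = []
    for stage_start, newly_enabled, _all_running in plan:
        if stage_start <= crash_date:
            suspects = newly_enabled
    return suspects


if __name__ == "__main__":
    # Hypothetical start date, two containers per stage, one day per stage.
    for stage_start, added, running in staged_plan(
        CONTAINERS, batch_size=2, soak_days=1, start=date(2020, 4, 22)
    ):
        print(stage_start, "enable:", ", ".join(added), f"({len(running)} running)")
```

Since the crashes happen only about once a day and at irregular times, a one-day wait per stage may be too short to clear a container; a larger `soak_days` trades speed for confidence.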
