UncleBacon

HARD power downs

Hello all,

 

I am very new to unRAID but not new to computers and servers. I am in the process of converting my buddy's Ubuntu Server over to unRAID. The server boots, I have created the array and tried installing and using Docker containers and plugins, and everything runs great. Love unRAID. I have yet to purchase a licence (19 days to go) because I am having a hell of a time with hard power downs on this thing. It literally powers off as if it were unplugged: no errors, nothing on the terminal (I recorded a video), nothing recorded by Fix Common Problems in troubleshooting mode, and nothing in a syslog tail either. Thus far, I've found this ONLY happens when copying large files to the server, in this case large media files. The server can run full parity checks/corrections with no problem and will run for days if left alone. All signs point to hardware, but I wanted some opinions or other ideas before digging into what's left (i.e. MoBo and CPU). Here is the server and my process thus far:

 

Components:
 - Motherboard: Supermicro C7SIM-Q
 - CPU: Intel Core i5 CPU 750 @ 2.67GHz
 - RAM: 4x G.Skill RipjawsX F3-12800CL9-4GBXL (16GB total)
 - SATA HBA: * Supermicro AOC-SAS2LP-MV8 * (I understand the implications of this but hear me out)
 - Parity Drive: 1x WD Green 3TB
 - Array HDDs: 1x WD Red 3TB, 3x WD Green 2TB
 - SSDs: 1x ADATA SP550 120GB (cache)
 - Flash: Sandisk Cruzer Fit 16GB (boot)
 - Video card: NVIDIA GeForce 7900 GT/GTO
 - PSU: Seasonic X Series fanless

 

Secondary Components tried:
 - RAM 2: 2x Crucial 8GB DDR3L-1600 UDIMM (CT102464BD160B)
 - PSU 2: Antec EA-750 750w

 

Combinations of hardware tested (no specific order and multiple combinations):
 - No video card
 - 8GB RAM
 - No SATA HBA card (SSD/HDDs direct to MoBo)
 - No cache drive
 - Updated BIOS to newest
 - Updated SATA HBA firmware to newest
 - Different PSU (see PSU 2)
 - Disconnect cache drive entirely
 - Flash drive in new port
 - Different RAM (see RAM 2)

 - New array with RED drive as parity

 

Tests and different settings:
 - Memtest86 for 6+ passes (24hrs), no errors on RAM 1

 - Virtualization enabled/disabled

 - C states enabled/disabled

 - BIOS to defaults

 

This has been an ongoing process for over a week. Diags attached. Appreciate ANY ideas/thoughts/comments/snide remarks.

lyserver-diagnostics-20190109-1716.zip
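
Since a plain syslog tail shows nothing at the moment of power loss, one option is to copy each new log line to persistent storage and force it to disk immediately, so that whatever was logged last survives the power-off. The following is only a minimal sketch, not an unRAID feature: the function names are hypothetical, and the paths `/var/log/syslog` and `/boot/logs/syslog-live.txt` in the usage note are assumptions about where the log lives and where the flash drive is mounted.

```python
import os
import time


def copy_new_lines(src_path, dst_path, offset=0):
    """Append everything written to src_path after `offset` to dst_path.

    The destination is flushed and fsync'ed so the data reaches the device
    before the function returns. Returns the new read offset.
    """
    with open(src_path, "rb") as src:
        src.seek(0, os.SEEK_END)
        if src.tell() < offset:
            # Source shrank (rotated or truncated): start again from the top.
            offset = 0
        src.seek(offset)
        data = src.read()
    if data:
        with open(dst_path, "ab") as dst:
            dst.write(data)
            dst.flush()
            os.fsync(dst.fileno())
    return offset + len(data)


def follow(src_path, dst_path, interval=1.0):
    """Poll src_path forever, mirroring new content to dst_path."""
    offset = 0
    while True:
        offset = copy_new_lines(src_path, dst_path, offset)
        time.sleep(interval)
```

Under those assumptions it would be started with `follow("/var/log/syslog", "/boot/logs/syslog-live.txt")` and left running while the copy is reproduced. If the kernel never gets a chance to log anything before power drops, the mirrored file will be just as empty, which in itself points further toward hardware.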

Are you absolutely sure you don't have a problem with the electricity? Is the server on a UPS?

5 minutes ago, UncleBacon said:

Sorry, yes. Forgot to mention it's on a CyberPower 1500AVR UPS.

Maybe a faulty UPS then? Have you tried without it?

3 minutes ago, trurl said:

Maybe a faulty UPS then? Have you tried without it?

Yup, with and without. The UPS is actually mine and the readouts appear normal. I use it for my own equipment. The server is at my house now, so we are thinking of moving it back to his and trying it there.

3 hours ago, UncleBacon said:

Appreciate ANY ideas/thoughts/comments/snide remarks.

Remove and reattach the CPU heatsink to verify it is seated squarely and solidly, with good contact.

Please check whether dust is blocking the airflow between the CPU fan and the heatsink.

Edited by Benson

Good suggestions here. I pulled the CPU heatsink and it might need a bit more paste. I'll clean it all off, reapply, test and report back. The case is dust free and airflow is good, with 2x 120mm fans on top and, I think, a 60mm in the back, plus fans in the front for the drives.

So I cleaned the CPU and heatsink and applied new thermal compound. It runs fine, but as soon as I start a copy to the server it powers down. It's also back at my buddy's place, so different power entirely. I'm really at a loss...

9 hours ago, UncleBacon said:

So I cleaned the CPU and heatsink and applied new thermal compound. It runs fine, but as soon as I start a copy to the server it powers down. It's also back at my buddy's place, so different power entirely. I'm really at a loss...

It is difficult to see how this can be anything other than hardware related. The only things that spring to mind are PSU problems or heating problems (causing a thermal shutdown). However, you seem to have checked for those as far as I can see. Have you checked that all fans are spinning? Some systems shut down if a fan appears jammed.

Edited by itimpi
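
One way to test the thermal-shutdown theory is to log temperatures to persistent storage during a copy and look at the last reading after the machine drops. The sketch below assumes the kernel exposes sensors under the standard Linux sysfs path `/sys/class/thermal/thermal_zone*/temp` (values in millidegrees Celsius); whether this board reports anything there is an assumption, and the function names and log path are hypothetical.

```python
import glob
import os
import time


def read_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                # sysfs reports millidegrees Celsius as an integer string.
                temps[zone] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
    return temps


def log_once(log_path, base="/sys/class/thermal", now=None):
    """Append one timestamped line of readings and force it to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    readings = " ".join(f"{z}={t:.1f}C" for z, t in read_temps(base).items())
    line = f"{stamp} {readings}\n"
    with open(log_path, "a") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())
    return line
```

Calling `log_once("/boot/logs/temps.txt")` in a loop every second or two while a large copy runs would leave a record on the flash drive (path assumed). A last line showing a sharp climb would support overheating; flat, moderate readings right up to the cut-off would point back toward the PSU or motherboard.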


I have checked and cleaned all the fans. There was one somewhat noisy and slow fan, so I removed it completely. I am leaning toward hardware too; it just seems strange that the previous OS (Ubuntu Server) ran with no issues on the exact same hardware. The only difference would be the flash drive, but I've tried multiple drives with unRAID.
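
Because the shutdown has only been seen during large copies over the network, it may help to separate network load from disk load by generating a sustained write locally on the server. This is a rough sketch under stated assumptions: `write_load` is a hypothetical helper, and the target and progress paths in the usage note are placeholders, not paths from this system.

```python
import os


def write_load(path, total_bytes, chunk_bytes=4 * 1024 * 1024, progress_path=None):
    """Write total_bytes of random data to path in fsync'ed chunks.

    If progress_path is given, the running byte count is appended there and
    fsync'ed after every chunk, so the last value survives a power loss.
    Returns the number of bytes written.
    """
    written = 0
    with open(path, "wb") as out:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            out.write(os.urandom(n))
            out.flush()
            os.fsync(out.fileno())
            written += n
            if progress_path:
                with open(progress_path, "a") as log:
                    log.write(f"{written}\n")
                    log.flush()
                    os.fsync(log.fileno())
    return written
```

For example, `write_load("/mnt/user/test/load.bin", 20 * 1024**3, progress_path="/boot/logs/load-progress.txt")` would write about 20 GiB to an assumed share named `test`. If the server survives this but still dies on a network copy, the network path (NIC, its slot, or the combined load) becomes the more interesting suspect; if it dies here too, the network is ruled out.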
