
Please Help! Server first unreachable, now won't turn on at all...


xthursdayx


Hi folks. For some reason my server has become unreachable, and I'm unsure how to move forward with fixing this. I didn't notice this until this evening when I tried to reach the server via Plex on another machine. When the server couldn't be reached I started to investigate and found that I can neither reach the server web UI page, nor telnet into the server. To be honest I'm kind of freaking out here. Any ideas how I can move forward with remedying this situation?

 

Thanks for any help and advice!

 

**Update**

I restarted my server using the button on the front of the machine, which is set to initiate a shutdown. However, after shutting down it will not restart. I took my case apart, unplugged the power cord, held the power button down to discharge any remaining power, and checked that nothing visible had happened to the physical structure of my motherboard, memory or processor - it hasn't. Now nothing happens when I push the power button on my case. If I unplug the power cord and then plug it back in, the server attempts to start up, but fails and shuts down after around 3-5 seconds.

 

Ideas?

 

Unplug the power connectors to all drives and anything else except the MB. See if it will start then. But @ashman70 is probably correct... It is most likely the PS. If you have to buy one, get a good-quality one that has a single 12V rail. There is a whole thread devoted to discussion of PS:

 

Failed power supply?
 


That's kind of what I was wondering, but the server was still running when the initial problem started and I couldn't access it... But maybe it was just giving enough power to stay on, but not enough to fully run the system.

Do you know any way that I can check if the problem is the power supply?
Just now, zandrsn said:

Do you know any way that I can check if the problem is the power supply?

Shotgun troubleshooting. Replace it and see if it fixes the problem. Bad PSUs can be REALLY unhealthy to everything they connect to; I've seen them blow up motherboards and HDD circuit boards. It's probably the most important piece in the machine quality-wise. You'll never notice a good one; a bad one will give you unending headaches.

12 hours ago, ashman70 said:

Agreed, it may appear to be functioning, but as we know, appearances can be deceiving.

 

Indeed they can... Thanks for your advice, guys. I guess I'll throw a new power supply in there and see how it goes. Let's hope that the failing power supply didn't cause any of the other issues jonathanm mentioned.

On 4/26/2017 at 3:13 PM, trurl said:

Overheating CPU can also have this symptom. Make sure CPU fan is running.

 

The CPU doesn't have its own dedicated fan, just cooling fins and a large case fan. However, it's been fairly cool here in Canada recently, so I don't think it overheated, but it's a possibility.
 

I ended up taking the entire system apart and seeing if the PSU would work just plugged in directly. It did, and the system ended up starting up as usual when I put it back together. However after around 24 hours the system exhibited the same problem (with the server inaccessible first, and then not starting after a shutdown). It seems like something is causing the PSU to fail at some point after the system is running. Ideas?

Does the PSU have a fan? Is it clean and running? PSUs have been known to fail and are usually an inexpensive item to replace. Just be sure you get one that has an appropriate power rating and preferably a single 12V rail.

 

My PSU is appropriately rated for my server's power draw, and has a single 12V rail. It has a fan, and is clean, and when it was removed from the system and plugged in this fan ran normally. It seems like the PSU is operating fine by itself and only fails in the context of the rest of my system, and then only after being used for a little while...


 

 

Most recently I have been able to get the server to run again, though I am never able to access the Unraid OS, either via web UI or telnet. I am able to access the motherboard's web UI and IPMI setup page. There weren't any existing error logs when I first checked, but when I requested that the server shut down and restart via the remote control option, these events were logged:

 

Event ID  Time Stamp           Sensor Name      Sensor Type  Description
810       04/30/2017 20:06:46  CPU Temperature  Temperature  Upper Non-Critical - Going High - Deasserted
809       04/30/2017 20:06:46  CPU Temperature  Temperature  Upper Critical - Going High - Deasserted
808       04/30/2017 20:06:46  CPU Temperature  Temperature  Upper Non-Recoverable - Going High - Deasserted
807       04/30/2017 20:06:40  CPU Temperature  Temperature  Upper Non-Recoverable - Going High - Asserted
806       04/30/2017 20:06:39  CPU Temperature  Temperature  Upper Critical - Going High - Asserted
805       04/30/2017 20:06:39  CPU Temperature  Temperature  Upper Non-Critical - Going High - Asserted
804       04/30/2017 20:04:50  CPU Temperature  Temperature  Upper Non-Critical - Going High - Deasserted
803       04/30/2017 20:04:50  CPU Temperature  Temperature  Upper Critical - Going High - Deasserted
802       04/30/2017 20:04:50  CPU

 

I don't really understand how the CPU could be so hot given that the computer has been off and is running with the case open at the moment. Any thoughts? I'm still unable to access Unraid or any of my files. At the moment, the server no longer fails immediately when I restart it, but I'm still not able to reach Unraid.
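
For anyone trying to read an event log like this one, a short script can make the pattern easier to see. This is only a sketch under stated assumptions: it assumes the rows have been copied out as tab-separated text in the same column order as the IPMI page (event ID, time stamp, sensor name, sensor type, description), and the names `parse_row` and `summarize_events` are made up for illustration. It pairs each "Asserted" threshold with its matching "Deasserted" event and reports how many seconds the alarm lasted.

```python
from datetime import datetime

# Six complete rows copied from the log in this thread (tab-separated).
SAMPLE_LOG = """\
810\t04/30/2017 20:06:46\tCPU Temperature\tTemperature\tUpper Non-Critical - Going High - Deasserted
809\t04/30/2017 20:06:46\tCPU Temperature\tTemperature\tUpper Critical - Going High - Deasserted
808\t04/30/2017 20:06:46\tCPU Temperature\tTemperature\tUpper Non-Recoverable - Going High - Deasserted
807\t04/30/2017 20:06:40\tCPU Temperature\tTemperature\tUpper Non-Recoverable - Going High - Asserted
806\t04/30/2017 20:06:39\tCPU Temperature\tTemperature\tUpper Critical - Going High - Asserted
805\t04/30/2017 20:06:39\tCPU Temperature\tTemperature\tUpper Non-Critical - Going High - Asserted
"""


def parse_row(line):
    """Split one tab-separated log row into a small dict (assumed layout)."""
    event_id, stamp, sensor, _sensor_type, description = line.split("\t")
    threshold, _direction, state = [p.strip() for p in description.split(" - ")]
    return {
        "id": int(event_id),
        "time": datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        "sensor": sensor,
        "threshold": threshold,
        "state": state,
    }


def summarize_events(text):
    """Return {threshold: seconds between its Asserted and Deasserted events}."""
    rows = sorted(
        (parse_row(line) for line in text.splitlines() if line.strip()),
        key=lambda row: row["id"],  # ascending event ID = chronological order
    )
    asserted = {}
    durations = {}
    for row in rows:
        key = (row["sensor"], row["threshold"])
        if row["state"] == "Asserted":
            asserted[key] = row["time"]
        elif row["state"] == "Deasserted" and key in asserted:
            elapsed = row["time"] - asserted.pop(key)
            durations[row["threshold"]] = elapsed.total_seconds()
    return durations


if __name__ == "__main__":
    for threshold, seconds in summarize_events(SAMPLE_LOG).items():
        print(f"{threshold}: alarm lasted {seconds:.0f} s")
```

On the six complete rows, all three thresholds trip within about a second of each other and clear six or seven seconds later. That pattern may be worth mentioning when asking for help, since it looks different from a slow, sustained temperature climb, though on its own it does not prove anything about the cause.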

 

Have you tried to access it by IP address?
 
Do you have an attached keyboard and monitor?


No keyboard or monitor attached. I've tried accessing it via the IP address (which is what I meant by web UI), and it won't connect. I can, however, connect to the motherboard's IP address config page.
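
The "can I reach it at all?" check described here can be scripted from any other machine on the network. The following is a minimal sketch, not anything specific to Unraid: it simply tries to open a TCP connection to the telnet and web UI ports. The port numbers (23 and 80) are the usual defaults and are an assumption, as are the helper names `port_open` and `check_host`.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_host(host, ports=(23, 80)):
    """Map each port (assumed defaults: 23 = telnet, 80 = web UI) to reachability."""
    return {port: port_open(host, port) for port in ports}
```

Calling `check_host("192.168.1.10")` (a placeholder address, substitute the server's real one) returns something like `{23: False, 80: False}` when neither service answers. Running the same check against the IPMI address helps separate "the network path is broken" from "the OS on the server never came up".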
2 hours ago, NAStyBox said:
 
A CPU will not stay cool at room temperature. Keep in mind, when I worked in a datacenter regularly we kept the room in the 60's. During the summer months I'd actually move my desk in there because our AC sucked. We still had ridiculously loud high rpm fans running on our CPUs. 

Is this a 1 or 2U rack mount case? If so, there's probably a cover that goes between the fans and the CPU. If that's not there, it will heat up. If it's a desktop, they may only have fins on the CPU but it's very likely there's another fan in the case ducted to the fins. What model machine are we talking about? 

 

 


This is a homebuilt server in a mini-ITX case. It has an ASRock C2750D4I motherboard (http://www.asrockrack.com/general/productdetail.asp?Model=C2750D4I#Specifications) and 6 WD Red 4TB drives. The case has two 90mm fans in the front and one 120mm fan in the back that pulls air over the CPU's fins. The only time I've had problems with temperature previously was during a heat spell last summer when my AC broke. It was much hotter than it is now, though.

As I mentioned before, it seems to start up normally now (other than those temp alerts); there is just no way for me to connect to Unraid...

 

 


Alright, I only had the cover off temporarily. It's back on now. The server is back on after resting two hours. I can access the IPMI config page via IP address or telnet. However, I can't ping or run ifconfig: for some reason Java won't let me run the console CLI, and through telnet BusyBox tells me that I don't have the correct permissions to ping (I'm logged in using the default root password). When I run ifconfig, I'm told that there is no such command...


Let's start back with the basics. I would pull that flash drive and see if I can get it to boot up on another computer. (You won't 'hurt' the other computer, as it won't write anything to any hard drive since it won't see the expected valid drive configuration.) With all of the issues that have been going on, there is a possibility that the flash drive has gotten clobbered.

 
Something is way off. Is the server really up, or are you in some sort of shell? Because I just verified, logged in as root, that there are no problems with any of those commands. Someone with more intimate knowledge of unRAID will have to step in here. Sorry, I wish I could offer more help.


Well the Unraid server isn't up, so I'm not logging into Unraid. What I'm logging into is the ASRock motherboard's base command line.


Okay, I'll try this out. Do I need to try to boot another computer from the flash drive, or just see if I can access it from another running computer?


So, I booted another computer from my Unraid USB and was able to boot into the Unraid OS GUI and see my normal server configuration (though of course the disks were missing). So it seems like my flash drive is working fine.

Now either (1) select the Memtest option from the boot screen, or (2) on the flash drive, make a backup copy of the /syslinux/syslinux.cfg file and then cut-and-paste the 'menu default' line so that it sits under the Memtest86+ option. Now boot to the Memtest program and allow it to run for 24 hours.
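
Option (2) above can also be done with a small script instead of a text editor. This is a rough sketch, not an official tool: the function names are hypothetical, the sample config is only an illustration of the general syslinux layout (the real file on the flash drive may have more entries), and it matches the Memtest entry by looking for "memtest" in the label line. It makes a `.bak` copy before writing anything.

```python
import shutil
from pathlib import Path

# Illustrative config only; the actual syslinux.cfg may differ.
SAMPLE_CFG = (
    "default menu.c32\n"
    "label unRAID OS\n"
    "  menu default\n"
    "  kernel /bzimage\n"
    "  append initrd=/bzroot\n"
    "label Memtest86+\n"
    "  kernel /memtest\n"
)


def move_menu_default(cfg_text, target="memtest"):
    """Return cfg_text with 'menu default' moved under the label matching target."""
    # Drop every existing 'menu default' line first.
    kept = [ln for ln in cfg_text.splitlines()
            if ln.strip().lower() != "menu default"]
    out = []
    placed = False
    for ln in kept:
        out.append(ln)
        stripped = ln.strip().lower()
        if not placed and stripped.startswith("label ") and target.lower() in stripped:
            out.append("  menu default")  # re-insert right under the target label
            placed = True
    if not placed:
        raise ValueError(f"no label containing {target!r} found")
    return "\n".join(out) + "\n"


def patch_file(path, target="memtest"):
    """Back up the config as <name>.bak, then rewrite it in place."""
    path = Path(path)
    new_text = move_menu_default(path.read_text(), target)  # fail before touching disk
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(new_text)
```

To undo the change later, copy the `.bak` file back over `syslinux.cfg`, or call `patch_file` again with a `target` that matches the normal boot entry's label.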



Okay, great. Just to clarify, I can run this Memtest using another machine, correct?

No. It has to run on your server hardware. We are testing the memory of the server to see if it is good. (This is not unRAID software but a recent version of freeware(?) software that has been around for more than a generation.) However, if the hardware locks up, that may be a clue as to what the problem is. It would be best to connect a monitor to the server so that you can see the output of Memtest as it runs. That might give a better picture of what is going on. (It will also minimize the load on the PS, as none of the drives will be in play.)




Okay, so I hooked a monitor and keyboard up to the server so that I could start Memtest and monitor the output. However, the monitor is not detecting a signal from the machine. It's plugged directly into the VGA output of the motherboard. Is it possible that something is wrong with the motherboard? I can still access the motherboard's IPMI config page via its internal IP address.

Archived

This topic is now archived and is closed to further replies.
