mbryanr Posted February 1, 2012

Warning: I'm breaking a rule that I believe in. I don't have a syslog capturing the crash.

Everything had been running perfectly until the monthly parity check last night, when the server just powered off. No SMART errors on any drives. I had one previous occurrence during the previous monthly parity check, but the server powered back up with no further issues, so I ignored the problem. Now every time I have my girlfriend power it back on, it starts a resync and powers off within the hour. I did get a sync notification that it was 4.7% complete @ 58745 KB/s, but it died shortly after. Almost like someone hit the power button.

What happens: The server begins the parity sync and then just powers off without warning. Nothing is reported in the syslog prior to the power-off (I have it sent to my Synology syslog server). The startup syslog is identical to every previous syslog <except now the replay transactions>.

Hardware setup: Corsair 430CX (v1); 8 drives: (3) 7200 rpm, (5) ~5400 rpm. See sig for the other components.

I'm thinking power supply or bad power connection... any help or suggestions?
kizer Posted February 1, 2012
To me it sounds like bad power: the power supply being overwhelmed with all of the drives coming on. Then again, that's just a first-blush guess.
mbryanr (Author) Posted February 1, 2012
My thoughts as well. When I get home, I'll open the case up and see if there is anything obvious. I have a new PSU on order; I need one anyway to expand. One more item: the server is on a UPS. Of course, I can't hear the beeps since I'm not there.
kizer Posted February 1, 2012
Have you added any new drives since the last time you ran a parity check? Also, whether you did or didn't, did you have an issue the last time?
mbryanr (Author) Posted February 1, 2012
1. I did have the problem once during the last monthly parity check, but the check was able to complete after restarting the server.
2. Checking my preclear reports, I last added a drive on 9/14/11. I have the following syslogs:
   9/25 - 11/28: No errors
   11/28 - 12/4: No errors
   12/4 - 1/1: Power out 1x, but able to complete the parity check on the next attempt
   1/1 - 1/24: <attached>
3. Based on a failure curve, 1/1/12 may have been my early warning...

syslog-20120124-212032.txt.zip
mbryanr (Author) Posted February 1, 2012
9/25 - 11/28 syslog: syslog-20111128-200239.txt.zip
mbryanr (Author) Posted February 1, 2012
Latest startup syslog. I stopped the parity check until I can investigate further and get the new PSU installed.
syslog-2012-02-01.txt
Rajahal Posted February 2, 2012
My money is on a bad power supply, but another helpful troubleshooting step is to disable all add-ons temporarily. Maybe something like APCUPSD is triggering the powerdown script. If the new power supply doesn't solve the problem, then you might want to take a look at your power button. If the button is sticking or otherwise defective, it could be powering down the server at random. An easy test of this would be to disconnect the power button from its motherboard header, then manually power on the server by briefly bridging the two power-switch pins with a screwdriver or other bit of metal (be careful to properly ground yourself).
mbryanr (Author) Posted February 2, 2012
Thanks Raj. Those are some great troubleshooting tips. I dislike putting all my hopes in one part swap. While it's easy, if the new PSU doesn't fix it, I'm back at the same point with no other ideas. Still, I put my money on the PSU. I'll update when I attempt those. Interesting that it has run for almost two hours now, since I stopped the monthly parity check.
pitchinwedge Posted February 2, 2012
mbryanr, I've got the same problem. Server just shuts down at random. Gonna try switching out my PSU tonight, but I'm skeptical that my 3-month-old Corsair AX750 is defective. So, as you just stated, any other ideas? Could it be bad SATA cables or the mobo chipset?
mbryanr (Author) Posted February 2, 2012
^^I noticed your thread and was going to hijack it with my issue. I'll follow yours as well to see if there is any progress or other ideas. Right now, the server is running, but I cancelled the parity check and everything is OK. The parity check is the only time the problem has shown up for me. When I get away from the TV, I'll open the case and check the connections and the power button. I have a PSU coming in Monday, so I have some time to troubleshoot. I'll give your thread a closer look and see if there could be anything else in common.
mbryanr (Author) Posted February 4, 2012
I have the server completely apart now. I found the CPU fan partially dislodged, and definitely not making full contact with the CPU. I'm in the process of cleaning, organizing, and installing the new PSU, a 3-in-2 cage, and 2 new drives. But I suspect the CPU fan was causing an overtemp and a subsequent shutdown during the parity check.
joshpond Posted February 4, 2012
I was reading through your posts and, just before I got to the last one, was about to ask you to have a look at temps. It's either the PSU or, more likely, an overtemp shutoff. Parity checks generate a lot of extra heat.
Josh
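A minimal sketch of how the temperatures Josh mentions could be watched during a parity check. It assumes a Linux system that exposes readings under /sys/class/thermal, where each zone's temp file holds millidegrees Celsius; the unRAID build discussed in this thread may not expose that path, so treat it as an assumption. The function names and the 70 °C limit are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def millidegrees_to_celsius(raw):
    """Convert a sysfs reading such as '45000\\n' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: temperature_in_celsius} for every readable zone.

    Assumption: the kernel exposes thermal_zone*/temp files under `base`.
    Zones that cannot be read or parsed are skipped rather than raising.
    """
    temps = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            temps[zone.name] = millidegrees_to_celsius((zone / "temp").read_text())
        except (OSError, ValueError):
            continue
    return temps


def over_limit(temps, limit_c=70.0):
    """Return only the zones at or above limit_c (hypothetical threshold)."""
    return {name: t for name, t in temps.items() if t >= limit_c}
```

Running `over_limit(read_zone_temps())` every minute or so while the parity check is in progress, and appending the result to a file on another machine, would leave a trail showing whether temperatures climbed before a shutdown that the syslog itself never records.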
mbryanr (Author) Posted February 4, 2012
I believe a partially bad PSU would have caused other problems as well, besides just a shutdown. Finishing up; I'll post back.
mbryanr (Author) Posted February 4, 2012
I've been running preclear on (2) new drives. I'm going to mark this as solved. Thanks for the assistance. BTW, I probably dislodged the CPU fan latch when I was cleaning the server a while back. Once again, unRAID has proven to be more reliable than the user!