[SOLVED] unRAID powers off during parity check. Help please

Warning: I'm breaking a rule that I believe in - I don't have a syslog capturing the crash.  :(


Prior to this everything has been running perfectly, until the monthly parity check last night...when the server just powered off.



No SMART errors on any drives. I had one previous occurrence during the previous monthly parity check, but it powered back up with no further issues, so I ignored the problem. Now every time I have my girlfriend power it back on, it goes to resync and powers off within the hour. I did get a sync notification that it was 4.7% complete @58745 KB/s, but it died shortly after. Almost like someone hit the power button.


What happens:


The server begins the parity sync, then just powers off without warning.  I have the syslog sent to my Synology syslog server, and nothing is reported in it prior to the power-off. The startup syslog is identical to every previous syslog <except now the replay transactions>


Hardware setup:

Corsair 430CX (v1)

8 drives - (3) 7200 rpm; (5) ~5400 rpm

See sig for the other components


I'm thinking power supply or bad power connection...any help or suggestions?


1.  I did have the problem 1x during the last monthly parity, but was able to complete after re-starting the server.

2.  Checking my preclear reports, I last added a drive on 9/14/11.

I see I have the following syslogs

  9/25 - 11/28 No Errors

  11/28 - 12/4 No Errors

  12/4 - 1/1 <<----Power Out 1x, but able to complete parity check on the next attempt.

  1/1 - 1/24 <attached>


3.  Based on a failure curve, 1/1/12 may have been my early warning...







My money is on a bad power supply, but another helpful troubleshooting step is to disable all add-ons temporarily.  Maybe something like APCUPSD is triggering the powerdown script.
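Before disabling add-ons one by one, it can help to see which ones are actually running. A minimal sketch, assuming Python 3 is available on the server (it may not be on a stock install) and that the add-on shows up under the process name `apcupsd`; `find_processes` is a hypothetical helper name, and any other names you pass in are your own guesses at what the add-on daemons are called:

```python
import os


def find_processes(names, proc_root="/proc"):
    """Return {pid: command_name} for running processes whose name is in names."""
    wanted = set(names)
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories under /proc are processes
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # process exited between listing and reading
        if comm in wanted:
            found[int(entry)] = comm
    return found
```

Calling `find_processes(["apcupsd"])` returns an empty dict if the daemon is not running, which would rule it out as the thing calling the powerdown script.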


If the new power supply doesn't solve the problem, then you might want to take a look at your power button.  If the button is sticking or otherwise defective, it could be powering down the server at random.  An easy test of this would be to disconnect the power button's motherboard headers, then manually power on the server with a screwdriver or other bit of metal (be careful to properly ground yourself).

Thanks Raj. Those are some great troubleshooting tips. I dislike putting all my hopes in one part swap. While easy, if the new PSU doesn't fix it, I'm back at the same point with no other ideas. Still, I put my money on the PSU. ;D


I'll update when I attempt those, interesting that it has run for almost two hours now - since I stopped the monthly parity check.

^^I noticed your thread and was going to hijack it with my issue. ;)  I'll follow yours as well to see if there is any progress or other ideas.  Right now, the server is running, but I cancelled the parity check and everything is ok.  That is the only time it has shown up for me.  When I get away from the TV, I'll open the case and check connections, and the power button. I have a PSU coming in Monday, so I have some time to troubleshoot.


I'll give your thread a closer look and see if there could be anything else in common.


Have the server completely apart now. I found the CPU fan partially dislodged, and def. not making full contact with the CPU.


I'm in the process of cleaning, organizing and installing the new PSU, 3-in-2 cage and 2 new drives.  But I now suspect the dislodged CPU fan was causing an overtemp and a subsequent shutdown during the parity check.
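Since the syslog showed nothing before the power-off, a temperature log written during the next parity check would help confirm or rule out the overtemp theory. A rough sketch, assuming Python 3 is available on the server and that the board exposes sensors through the Linux `/sys/class/thermal` interface (not every board does); the function names, the log path and the interval are placeholders, not anything unRAID provides:

```python
import glob
import os
import time


def read_zone_temps(base="/sys/class/thermal"):
    """Return {zone_name: degrees_C} for every readable thermal zone under base."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(base, "thermal_zone*", "temp"))):
        zone = os.path.basename(os.path.dirname(path))
        try:
            with open(path) as f:
                # The kernel reports millidegrees Celsius as a plain integer.
                temps[zone] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # zone unreadable or empty; skip it
    return temps


def log_temps(logfile, interval=30, samples=None, base="/sys/class/thermal"):
    """Append timestamped readings to logfile; samples=None means run until killed."""
    count = 0
    with open(logfile, "a") as out:
        while samples is None or count < samples:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            readings = " ".join(
                f"{zone}={temp:.1f}C" for zone, temp in read_zone_temps(base).items()
            )
            out.write(f"{stamp} {readings}\n")
            # Force each line to disk so the last reading survives a sudden power-off.
            out.flush()
            os.fsync(out.fileno())
            count += 1
            if samples is None or count < samples:
                time.sleep(interval)
```

Start it with something like `log_temps("/path/to/temps.log")` just before kicking off the parity check. If the last few lines before a power-off show the temperature climbing steadily, the fan is the likely culprit; if they stay flat, the PSU or power button is back on the suspect list.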



