October 20, 2013 Hello everybody, I've been experiencing some problems with my server lately. What I can tell you is this: somehow, at a certain time every day, the server shuts off completely. I have to reboot it twice to get it running properly. I have limited knowledge of troubleshooting unRAID, so I'll try to give you as many details as I can. I don't fiddle with my setup a lot: I set it up in January this year and have basically let it run since then. About a month ago I upgraded to the latest version and installed BTSync as well, to back up the pictures on my phone. I was a bit short on time when I did this, so the BTSync install was pretty basic (no passwords, no specific folder, etc.; I just had it running). A few days ago I decided to complete the BTSync install, and that is roughly when I began to notice the server shutting down after a day or so, I guess. I've already tried removing all the BTSync content, but that did not fix anything (I'm not even entirely sure BTSync, or my botched configuration of it, is to blame). I've attached a log file; I noticed the "Oct 20 10:42:20 DEIMOS emhttp: unclean shutdown detected" line from this morning, but that is as far as I can go. Thanks for your help! log.txt
October 20, 2013 Open up the server and verify that all of the fans and cooling fins are clean and working. All modern CPUs and power supplies have over-temperature sensors and will shut down when they overheat. If that does not fix the problem, report back and we can move on, having eliminated one possible cause.
October 20, 2013 The difficulty here is of course knowing what happened, and since the reboot wipes the syslog, maybe you can use BTSync to capture it in real time on a remote computer or smartphone. Try setting up BTSync to sync /var/log to another location; hopefully it'll capture an error before the shutdown. I guess you have a parity check running after each reboot, as it's an unclean shutdown, right? What other plugins are you using? Do you still use SimpleFeatures with 5.0? Did you modify your go script in any way? Did you have some kind of protection logic set up in SimpleFeatures? I think it was in the e-mail notification settings: you could tell unRAID to shut down if a certain condition arises, like Frank said, for example when a high temperature is detected.
October 20, 2013 Author Frank: yeah, I thought about that, but the server's clean. Plus, it's running in the basement, where the temperature is rarely above 18°C (especially at this time of the year).
Darts: that's one thing I forgot to mention: I have been unable to start the BTSync service at all since the problems started. I do have a parity check at each reboot.
- no other plugins running that I know of (i.e. that I installed manually myself)
- still using SimpleFeatures
- I did not modify any script
- SimpleFeatures sends me an email if the temperature goes above the 50°C mark; 65°C is the critical temperature. I have not received any such email notifications so far (and yes, the spam folder has been checked ^^)
October 20, 2013 If you are using the SimpleFeatures 'over-temperature' e-mail warning, that temperature is the disk temperature, not the CPU temperature. (The reason I stress opening up the case and visually checking the fans is that I had a CPU cooling fan fail in a Linux machine once. The symptoms that you describe are exactly what happened in that instance!) The next thing I would try is to back up the entire flash drive, blow everything away on it, install version 5.0 again, and see if you still have the problem. (You can easily go back to your present configuration by restoring the backup that you made of your flash drive.)
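For anyone following along, the flash drive backup suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name backup_flash is made up, the flash drive is assumed to be mounted at /boot (as on a stock unRAID install), and /mnt/disk1 is just an example target; a copy kept on another computer would be safer before a wipe.

```shell
#!/bin/bash
# Sketch: copy the whole flash drive into a time-stamped backup folder.
# ASSUMPTIONS (not from the thread): the function name, the /boot mount point
# and the /mnt/disk1/backups target are examples only.

backup_flash() {
    local flash="$1"    # flash drive mount point, e.g. /boot
    local target="$2"   # directory that will hold the backups
    local stamp
    stamp=$(date +%Y%m%d-%H%M%S)
    # Create a fresh folder, copy everything (hidden files included),
    # and print the backup location only if both steps succeeded.
    mkdir -p "$target/flash-$stamp" &&
        cp -a "$flash/." "$target/flash-$stamp/" &&
        echo "$target/flash-$stamp"
}

# Usage (commented out so that sourcing this file copies nothing):
# backup_flash /boot /mnt/disk1/backups
```

Restoring is the same copy in the other direction, once the flash drive has been prepared again.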
October 20, 2013 I personally had issues with SimpleFeatures and v5.0: shares were available, but there was no access to the web interface after some hours. Apparently there are some incompatibilities between SF and 5.0. Like Frank, I would recommend a full sweep of your flash drive; refresh it just to be sure. I had mine formatted without the quick format option, then ran checkdisk with the sector check enabled. (Just in case, to make a flash drive unRAID ready: http://lime-technology.com/wiki/index.php/Building_an_unRAID_Server#Prepare_the_USB_Flash_drive ) Also, have a look at your /config/plugins folder. For the test you could also clean out this folder; you'll get back to the original GUI (I recommend switching to the new webGUI from Tom afterwards, available here), and this will get rid of any SF vs. 5.0 glitches. And could you also please post the content of your /config/go file, just to make sure it's OK?
October 20, 2013 Author Frank: I get your point, but I tell ya, my case is clean. It was worth mentioning, however. Good call about wiping the flash drive, I'll give this a try! Darts: I did not know about a new webGUI, I might give this a try as well. I was about to post the content of the go file but hey, whaddayaknow... the server shut itself down again.
October 20, 2013 The case being clean doesn't necessarily mean the CPU fan is working correctly, unless you've already checked it.
October 20, 2013 Author ...granted. The CPU fan is working correctly, no problem there. Ditto the case fans, for that matter.
October 20, 2013 If a wipe of the flash drive doesn't do it, you can rule out SimpleFeatures as the cause as well. If that doesn't fix it, you're likely looking at a hardware problem. One thing you can do is leave a telnet window open and run
tail -f /var/log/syslog
Never mind, I just saw the log you added. EDIT: Actually, it looks like you took that log after a restart. Run
tail -f /var/log/syslog
and let it run until the server shuts down. That will capture what is written to the syslog when the server shuts down. How long does the server run before it shuts down?
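If keeping a telnet window open for hours is not practical, the same tail output can be appended to a file on storage that survives the reboot. This is only a sketch: the function name capture_log and the destination file name are made up for illustration, and it assumes the flash drive is mounted at /boot, as on a stock unRAID install. Writes are cached, so the last few seconds before a sudden power loss may still be missing.

```shell
#!/bin/bash
# Sketch: follow a log and append every new line to a persistent file.
# ASSUMPTIONS (not from the thread): the function name and the
# /boot/syslog-capture.txt destination are examples only.

capture_log() {
    local src="$1"    # log to follow, e.g. /var/log/syslog
    local dest="$2"   # file on storage that survives a reboot
    # -n 0 starts with new lines only; -F keeps following by name,
    # even if the log file is rotated or recreated.
    tail -n 0 -F "$src" >> "$dest" &
    # Print the PID of the background tail so it can be stopped later.
    echo $!
}

# Usage (commented out so that sourcing this file starts nothing):
# pid=$(capture_log /var/log/syslog /boot/syslog-capture.txt)
# kill "$pid"    # stop capturing
```

After the next unexpected shutdown, the captured file can be read from the flash drive on any computer.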
October 20, 2013 Author Yup, wiping the flash drive is the next thing on my list. As for how long the server runs before it shuts down, it is kind of hard to say. My guess would be between 8 and 10 hours or so. I turned it on this morning at around 10:30 AM, and it ran fine until I checked again at around 9:00 PM.
October 21, 2013 One more thing to check. If you have a UPS connected to the server, check to see if the battery is still good. Today, I just ordered a new one for my test bed server. The server had shut down without a known reason yesterday. As I went through the possible reasons, I realized that this was an old UPS and that it still had the original battery (30th week of 2008, as it turned out) installed. To check it, I shut the server down normally. I unplugged the server and plugged in a 150W lamp. When I pulled the UPS's plug from the wall, the lamp went out instantly. I have a new battery on order... I assume that I had a power anomaly that caused the UPS to switch to battery, and when the battery couldn't deliver the power required, the UPS shut down completely.
October 21, 2013 Author Hey Frank, that's also worth mentioning in case others need to troubleshoot the same issue, but my server is not connected to a UPS. Cheers!
October 21, 2013 "my server is not connected to a UPS" Which means any power blip will probably cause issues similar to what you are seeing.
October 21, 2013 Author I get that, but honestly, this may be my first machine running unRAID, but I've had a few computers and servers running over the years and I've never had a rig shut down completely every day because of voltage fluctuations. The PSU alone is able to regulate the electricity efficiently against power variations anyway. I don't reject your theory at all, but I'm certain there are other options to explore first.
October 21, 2013 "The PSU alone is able to regulate the electricity efficiently against power variations anyway. I don't reject your theory at all, but i'm certain there are other options to explore first" Other options like a failing PSU that can't regulate properly?
October 21, 2013 Author Yup, something along those lines, exactly. I'd rather find out whether my existing gear is working correctly than guess whether I need more gear, like a UPS I don't have.
October 22, 2013 Author So far the server is holding up: the uptime is now 19h42 since the last boot yesterday evening. In comparison with these last few days, this is a pretty darn good improvement, I'd say. Yesterday I wiped the whole USB flash drive and started from scratch, so I guess the update I did from 4.7 to 5.0 wasn't perfect (I just copied the new files over the old ones). I'm running without SimpleFeatures now, so that might be another reason for the stability. The only modification I made was to use the webGUI. Darts: do you still run BTSync?
October 22, 2013 Can't imagine my life without it. Great news for your server; my money is on another glitch between SF and 5.0, though.