
Unraid Freezes every night


LSL1337


Started using it 2 weeks ago.

In the first week, it didn't freeze for 4 days.

After that, when I was copying my data to the shares, it froze.

Now it freezes every night (even with no activity).

Last Friday it even froze during the day, with minimal activity (a 100kb/s torrent).

 

any idea?

 

Memtest doesn't show anything.

Today I installed Win10 on an SSD (bare metal, not as an Unraid VM) and ran a stress test: no freeze.

 

I have no idea what's causing it.

 

My next two guesses are to swap out the RAM or the USB flash drive.

 

I copied Unraid to a new USB flash drive I had lying around, and after boot it asks for a new registration key.

I don't want to blacklist my original Sandisk Fit USB drive, which I bought specifically for this purpose, just for this "investigation".

 

If I leave the Log tab open in the webGUI overnight, there is nothing new in it by morning.

I can't seem to find any logs from previous boots.

 

The mover doesn't cause this (it runs weekly). Maybe some scheduled task? But on Friday it died during the day.

I'm on the latest version, but it did the same on the previous one (6.2.3 maybe?)

 

So what now?

Unraid seemed cool and all, but if it can't fockin work for 24 hours with 0% CPU or disk activity, that is just ridiculous.

 

Thanks

Link to comment

This isn't a defect report as it's isolated to you. 

 

You're having an issue with your setup and this is the first post you've made on the topic.  I'm sure in time a mod will move it into the general support section. 

 

In the meantime, if you look at the guidelines in the general support section, you'll find some helpful stickied posts, particularly this one, which highlights the need to post diagnostics.

 

And no need to swear. Unraid can work for 24 hours. Mine has been up for over three months before now, and if it weren't for the more frequent upgrades we're seeing these days, I have no doubt I could surpass the 2 or 3 weeks I currently manage before rebooting to upgrade.

 

Bottom line: we don't know anything about your hardware or its specs, or what you've got running (dockers or VMs), so at the moment nobody can really help sort this out for you.

Link to comment

Okay, this may have been the wrong section, but how can I capture a syslog if the machine freezes? After a reboot, there is nothing there from the previous boot, is there?

It's not a hardware fault (except maybe the USB drive), because under Windows everything is fine, even with stress testing.

Two ways

 

Install the Fix Common Problems plugin and put it into troubleshooting mode, or run this from the local keyboard / monitor:

 

tail -f /var/log/syslog > /boot/syslog.txt

 

FCP would probably be the way to go: in troubleshooting mode it logs extra info every 10 minutes to try to capture a trend of what is going on (if it's software related). If it's hardware related, the extra info isn't really relevant.
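A hedged alternative to the `tail -f` one-liner, as a minimal sketch: it appends only the lines not yet saved and then calls `sync`, so whatever has been captured is flushed to the flash drive before a freeze. The function name `append_new_lines` and the 60-second interval in the usage note are illustrative assumptions, not part of Unraid or the Fix Common Problems plugin.

```bash
# Sketch: copy only the not-yet-saved lines of a log to a file on the flash drive.
# append_new_lines is a hypothetical helper name, not an Unraid tool.
append_new_lines() {
    src=$1    # log to follow, e.g. /var/log/syslog
    dest=$2   # copy on the flash drive, e.g. /boot/syslog.txt
    have=0
    # Count the lines already saved so they are not written twice.
    if [ -f "$dest" ]; then
        have=$(wc -l < "$dest")
    fi
    # tail -n +K prints from line K onward; it prints nothing if src has no new lines.
    tail -n "+$((have + 1))" "$src" >> "$dest"
    # Flush buffered writes so the data is on the stick if the machine locks up.
    sync
}
```

From the local console this could be looped, for example `while true; do append_new_lines /var/log/syslog /boot/syslog.txt; sleep 60; done`. It assumes the syslog only grows during a boot (no rotation in between).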

Link to comment

The NAS is still next to me, so I can hook up a monitor and keyboard to type it in. Thx, will do!

K. Nothing will appear to happen on the screen, but the syslog will be written to the flash drive as it changes. After the crash, hopefully something will appear in it. (And maybe something will also appear on the screen itself at the crash -> take a pic.)

 

When it freezes, does it still accept commands from that locally connected keyboard?

Very unlikely.
Link to comment

I wasn't 100% clear whether it was just the webUI freezing; that's why I was asking....

Link to comment

Hello

 

So, last night it crashed again.

This morning I didn't check anything new (I had to run to work), but I haven't changed anything since last time.

 

Last week, after every freeze:

The webUIs can't connect (Unraid, torrent, Plex).

From Windows cmd, no response when pinging the local IP.

On the local monitor (HDMI), the cursor was blinking, but I couldn't type anything (weird). The USB keyboard was connected to a front USB port; next time I will test a rear port, maybe. (I'll also take a pic of the local monitor, but last time there was nothing there; it looked like the same screen that is up after boot, though I didn't memorize it.)

 

My log file:

Booted around 8 PM, went to bed around 1 AM (Europe), and the last log entry is from after 2 AM.

I don't see anything special in there; it just stops.

 

https://www.dropbox.com/s/srqpjta9vwuipvw/log_20161121.txt?dl=0

 

Yesterday, when I was testing in Win10, it didn't freeze. I didn't leave it running overnight, but it was under a FULL AIDA stress test, which is much, much more than it will ever face with Unraid, especially at idle.

The server hovers around 1-5% CPU usage, so it can't really be a CPU error (my guess).

I ran memtest for an hour (2 passes) last week: nothing.

 

(I have the mover scheduled around 3:30, I think, and the daily scheduled jobs (plugin) around 4, so it crashed before that.)

 

What's next for troubleshooting?

The CPU is around 50C at idle with a big heatsink; the fans turn on at around 55C. (Just a side note: in W10, with 1-3% CPU usage, it downclocked to 0.9 GHz, and my temps were around 40C with no CPU fan spinning.)

There is another fan for the system at 5V (SSD, south bridge, RAM).

It's an i3 4170T, I think, so pretty low power/heat.

 

One more thing: when I reset it this morning, it seemed quieter right after the reset. Maybe something was spinning up one fan while it was frozen? I'll check next time. Still, if the fans spin up even a little, the CPU can't go above 60C, even during an extreme AIDA stress test.

There were no Plex transcodes/encodes in progress either, so CPU usage during the night should still be very low.

 

Any ideas?

Are these signs consistent with a flash drive error?

I just checked the Plex scheduled tasks, and they start at 2 AM. Today I will turn them off, maybe.

Thinking back: in the first few days, when it didn't crash EVERY night, I was using the linuxserver.io Plex docker, and after a week I changed to the binhex(?) docker. But I definitely had freezes in the first days as well, just not every night.

Link to comment

You already beat me to it, but I still think this applies.

 

Do yourself and all of us a favor. Take a few minutes and post everything about your system in your Forum Signature. That will help us help you in the future, because you only have to type it in once. ;)

 

https://lime-technology.com/forum/index.php?action=profile;area=forumprofile

 

Also, please give us a complete listing of all dockers/plugins that you have installed. Obviously something is crashing your system. Alternatively, you could simply run your machine with no dockers/plugins and see if that makes a difference.

 

 

Link to comment

It's a mid-high end brand new EVGA, semi passive, 550W...

This is more typing than would have been required to give me the exact model. The main reason we ask for the exact model is so we can make sure it is single +12V rail. But if you only have one drive it shouldn't matter for now, unless of course it is defective or being used improperly.
Link to comment

I'm at work; I don't know my PSU by heart.

G2 or GS, the cheapest semi-passive (eco) one I could find. Still 100 dollars+.

 

 

Ohh, nvm, I remembered: it's a Corsair RMx550W. They were out of EVGAs at the store.

 

My first test will be the Plex scheduler.

After that, a new 8 GB stick in either slot.

(BTW, as I said, memtest did not turn up anything.)

After that I will stop every docker.

After that I'll stop every plug-in.

Day by day, I hope I'll find the error.

Thanks, that gives me a week's worth of scenarios to "test".

Link to comment

Don't fret. It's probably something simple; for me it always is. I'd just work backwards through:

What have I changed since it worked flawlessly?

Are my drives full, and could that be causing the mover to fail?

What big dockers do I have running that could be causing memory or drive hang?

 

Just keep in mind we are asking questions simply because we aren't sitting at your machine and can't look, but want to help.

Link to comment

OK, edited my sig; now everything is in.

I just started with Unraid; it never really worked flawlessly for me.

In the first days, I thought it was the CPU, because I tried undervolting it straight away by -0.05V, just to try it out. After that, I went back to stock, and it froze on that too; then I went to +0.05V, and it still froze. After that I cranked up the cooling a little: still the same problem. (Nothing in the W10 STRESS test.)

 

I don't think it is hardware, but you can't be sure about RAM.

 

I think Plex (binhex plexpass) is my biggest docker, followed by deluge + qbittorrent.

Plex has scheduled tasks around the time my last freeze occurred, and that matches the weird every-night scenario, so that's my next best guess.

 

Can it be caused by a faulty USB drive?

Link to comment

Your sig doesn't mention the RAM, only the type of RAM... DDR3... yes, OK... and which manufacturer? Was it new? Old? Used? What is it?  ;)

Link to comment

Sooooo, good news everyone :)

 

It hadn't locked up by this morning.

Things I've changed:

 

I was checking the CPU clocks (over ssh), because the idle temp was a little higher than what I was seeing in Windows 10 (55C vs 40C).

It turns out that my CPU doesn't clock down from 3300 MHz, or only barely.

I turned off the P-State driver (in the cfg file), so it reverts to the ACPI driver or something like that.

After the reboot, my clocks stayed at 800 MHz almost all the time (torrent, SMB, etc.); they only go to 3300 MHz if I transcode in Plex (very, very rarely).
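For anyone wanting to repeat that clock check over ssh, here is a minimal sketch that reads the standard Linux cpufreq files under `/sys/devices/system/cpu`. The function name `show_cpufreq` is made up for illustration, and it assumes the kernel exposes `scaling_driver`, `scaling_governor` and `scaling_cur_freq` (in kHz) for each core.

```bash
# Sketch: print the active cpufreq driver, governor and current clock per core.
# show_cpufreq is a hypothetical helper name; the sysfs file names are the
# standard Linux cpufreq ones.
show_cpufreq() {
    base=${1:-/sys/devices/system/cpu}
    for dir in "$base"/cpu[0-9]*/cpufreq; do
        # Skip the unexpanded glob when no core exposes cpufreq information.
        [ -d "$dir" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/}
        driver=$(cat "$dir/scaling_driver")
        governor=$(cat "$dir/scaling_governor")
        khz=$(cat "$dir/scaling_cur_freq")
        # scaling_cur_freq is reported in kHz; convert to MHz for readability.
        echo "$cpu driver=$driver governor=$governor freq=$((khz / 1000))MHz"
    done
}
```

Calling `show_cpufreq` with no argument reads the live system. With the P-State driver active the driver field would typically read `intel_pstate`, and after disabling it, `acpi-cpufreq`.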

 

I started to check the running processes (top, over ssh).

My most CPU-hungry process was 'scrapemagnet' or something. Google didn't turn up much; it was something Plex related.

When I stopped the Plex docker, it stopped as well.

 

So, I started looking at Plex MS.

I noticed that I somehow had around 50-70 plug-ins installed (I only installed about 2 manually). Some of them were VoD plug-ins, or redtube and stuff like that. One was a bittorrent downloader (which had the scrapemagnet process in it).

 

So I deleted all of them, and today my server didn't lock up.

I changed the Plex schedule to 9am-11am (it's now 10am), and no freeze yet.

I think some of those Plex plug-ins had a problem with the daily schedule and would lock up my whole system. (Shouldn't that just kill the docker?)

Tomorrow I will temporarily revert to the Intel P-State CPU driver/governor to make sure it wasn't the root cause of my problems. Because of the timing I suspected Plex anyway, but it's better to make sure.

I hope my server will still be up tomorrow morning as well :)

 

thanks for the help/ideas guys!

Link to comment
  • 2 weeks later...

OK

 

Last week I had a 5-day uptime, but it froze 2 days ago.

 

Since then I turned on troubleshooting mode again, and after around 30 hours it froze again.

Now my freeze is outside of the daily scheduled window.

 

This time I was running the top command on the local machine, so you can see the processes and the time at the moment of the freeze.
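Instead of photographing the screen, the same information could be written to the flash drive. This is only a sketch: `log_snapshot` is a made-up helper name, and the log path and interval in the usage note are assumptions.

```bash
# Sketch: append a timestamped snapshot of any command's output to a log file.
# log_snapshot is a hypothetical helper name, not an Unraid or plugin command.
log_snapshot() {
    dest=$1   # log file, e.g. somewhere on the flash drive
    shift     # the remaining arguments are the command to run
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        "$@" 2>&1
    } >> "$dest"
    # Flush to disk so the last snapshot survives a hard freeze.
    sync
}
```

For example, `while true; do log_snapshot /boot/top.log top -b -n 1; sleep 60; done` would record one batch-mode `top` run per minute, so the last entry shows roughly what was running before the lock-up.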

 

any idea?

 

Is there anything better than killing all my plugins and dockers and, after a week, turning them back on one by one?
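One possible shortcut, offered only as a sketch: bisect instead of going one by one. Stop half of the containers, wait to see whether the freeze comes back, then repeat on whichever half is still suspect. `first_half` is a made-up helper, and the approach assumes a single container is to blame.

```bash
# Sketch: print the first half (rounded up) of the names read from stdin.
# first_half is a hypothetical helper for bisecting a list of containers.
first_half() {
    awk '{ line[NR] = $0 }
         END {
             n = int((NR + 1) / 2)
             for (i = 1; i <= n; i++) print line[i]
         }'
}
```

Something like `docker ps --format '{{.Names}}' | first_half | xargs docker stop` would stop the first half of the running containers. If the freeze still happens, the culprit is probably among the ones left running (or is not a docker at all); if it stops, it is probably among the ones just stopped.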

 

log:

https://www.dropbox.com/s/3xqhb3suzhltwrj/lsl-nas-diagnostics-20161204-1552.zip?dl=0

screen:

https://www.dropbox.com/s/6l6v95arh9d1uur/IMG_20161204_162826.jpg?dl=0

 

 

thanks.

 

PS: the emhttp process is weird. I definitely wasn't using the web GUI for 2 hours before the freeze. Is it normal that the process was running with CPU usage?

Link to comment
