Hardware Error



5 hours ago, Squid said:

Nothing to worry about.  Certain combinations of CPUs / motherboards / lack of fiber in your diet tend to issue an MCE when initializing the CPU at boot-up time.  This is what's happening to you and can be safely ignored (except the fiber)

Thank you for the response, I have some questions for you though.

 

Are you positive about this? Here's what went down. I wanted to know how to monitor temps in unRAID and found Dynamic Temps. I got that going without knowing 100% whether I had the correct probes set for CPU and mobo temps, and started getting CPU overheating and throttling warnings... I was in panic mode after seeing this for 3 weeks. So I swapped out the 7-year-old AIO, thinking that maybe the pump had died (although there was no noticeable warmth inside the case AT ALL, and I would expect the case to at least be blowing warm air if the CPU were reaching 80°C). Nothing changed: temps stayed exactly the same, and I'm still getting the same warnings 2 times per day.

 

Chatting around in an unRAID Discord channel, someone mentioned using IPMI. I wasn't aware that my mobo and hardware had to support IPMI, so I tried to see whether my hardware did support it; I don't think it does. But ever since then, I've been getting the MCE warnings. Another thing: you say "MCE when initializing the CPU at boot up time." However, my machine hasn't booted up since doing any of this; it's been running for weeks without rebooting. I did notice, though, that my server has stopped responding quite frequently over the last couple of days, and it seems to be "soft resetting" somehow, meaning the uptime counter has reset itself a couple of times without me actually rebooting the system or turning it off. My Plex users and I have also noticed severe interruptions over the last couple of days.

 

Someone, please? I'm a noob with unRAID but I'm TRYING. I'm just having some issues way out of my league right now. If I need to replace hardware I have no problem doing that, but I also don't have money to just keep wasting on trying different things.

12 minutes ago, BennyD said:

Are you positive about this?

Everything but the fiber.  That's between you and your doctor.

 

It's quite common for an MCE to be issued at initialization time, most likely due to a microcode update to the processor.

 

Random reboots (there is no other way for uptime to be reset).  The primary causes would be memory issues (run memtest for a few passes), the power supply, etc.
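One way to confirm that the uptime counter really did reset is to work out the last boot time from `/proc/uptime`, which on Linux holds the seconds since boot as its first field. The following is only a minimal sketch; the function names `parse_uptime_seconds` and `boot_time` are made up for illustration and are not part of unRAID.

```python
from datetime import datetime, timedelta


def parse_uptime_seconds(proc_uptime_text: str) -> float:
    """Return seconds since boot from the contents of /proc/uptime.

    The file holds two numbers: uptime and aggregate idle time, in seconds.
    """
    return float(proc_uptime_text.split()[0])


def boot_time(proc_uptime_text: str, now: datetime) -> datetime:
    """Estimate when the machine last booted, relative to `now`."""
    return now - timedelta(seconds=parse_uptime_seconds(proc_uptime_text))


# Example use on the server itself (hypothetical usage, run from a console):
#   with open("/proc/uptime") as fh:
#       print("Last boot:", boot_time(fh.read(), datetime.now()))
```

If the boot time this reports is later than the last time the server was deliberately powered on, the machine restarted on its own in between.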

17 minutes ago, BennyD said:

7 year old AIO

Can't really tell from the diagnostics, but AIO *implies* overclocking.  As I'm sure you're aware, all overclocking introduces instability, and that instability increases over time.  If you are overclocking (including running RAM at XMP/AMP profiles), then stop and see if the situation improves.


No overclocking here. Thinking about the XMP profile, though: there may actually be one set, but if there is, it's been set that way since day one of using unRAID over a year ago. And these "random reboots" are extremely strange, because the system doesn't seem to be actually rebooting: a real boot takes way longer than the time it takes me to regain access to the webui, if that makes sense. I usually can't access the webui for a couple of minutes after powering on the server, but with what's going on lately, the webui just becomes unresponsive and in just a minute it works again, which convinces me that it's not doing a full reboot. By unresponsive, I mean the server can't be accessed at all for a couple of minutes. Like I said, though, booting up normally takes longer than what is happening now. I'm so confused and paranoid... LOL

 

I would kind of like to just get some new hardware, but I don't know what I should really buy.

Edited by BennyD

@Squid


Did you happen to notice anything in my diagnostics that would be causing "Your CPU is overheating and has been throttled down (This may however be a transient occurance). You may need to clean your filters and/or increase your cooling capacity." from Fix Common Problems? I am getting this notification at least once per day.

Just now, Squid said:

Can't say I looked for / noticed it.  But if that message is showing up within FCP, then your CPU is definitely throttling itself down because of heat.

It just doesn't make sense to me... I've tried 2 different coolers, and there is PLENTY of airflow in the case. Hell, none of the exhaust air, nor anywhere on the mobo or around the CPU, is ever even warm; that's why I am asking. It's really confusing me: air-cooled and liquid-cooled temps are exactly the same.

1 minute ago, Squid said:

It's not in your posted diagnostics, but since the warning has appeared after you posted them, this message is within your syslog somewhere:

 


Package temperature above threshold
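For anyone wanting to check their own log, a rough sketch of pulling those lines out of a saved syslog follows. It only searches for the text quoted above; the function name `find_throttle_lines` and the file name in the usage comment are assumptions for illustration, not anything unRAID provides.

```python
from typing import Iterable, List

# The message text quoted in this thread; matched as a plain substring.
THROTTLE_TEXT = "Package temperature above threshold"


def find_throttle_lines(lines: Iterable[str], needle: str = THROTTLE_TEXT) -> List[str]:
    """Return every log line that contains the throttling message."""
    return [line.rstrip("\n") for line in lines if needle in line]


# Example use (hypothetical file name; point it at your own extracted syslog):
#   with open("syslog.txt", errors="replace") as fh:
#       for hit in find_throttle_lines(fh):
#           print(hit)
```

The timestamps at the start of the matching lines show when each throttling event happened, which can then be compared against when the warnings or the unresponsive periods occurred.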

 

If I were to post the proper log, would you be able to help pinpoint it? I'm trying not to go spend $800 if I don't NEED to. LOL


What happened at 1:23 am this morning, 21:16 yesterday, and 00:57 yesterday?  In all 3 cases, the CPU "unthrottled" itself immediately afterwards...  But it's a message directly from the CPU.  My only suggestion is that some BIOSes have the CPU overheating threshold set really low.


@Squid Sorry to keep tagging and bothering you... I seem to be having more random reboots now, and it is screwing up Dockers here and there; I've had to redo the setup and completely re-import and rebuild my Plex metadata and such... I'm not really sure what is going on... I have new hardware arriving Tuesday... Hoping that my current hardware is the problem... Attached is my syslog from right after the last random reboot....

 

Thank you so much for your time and helping this community out! I'll be buying you a few beers soon!

 

 

tower-syslog-20200426-0334.zip

Edited by BennyD
