(SOLVED) server random restart/freeze


forbi


Hi to all.
 

First of all, I apologize for my English; it's not my native language.
Second, thanks for this awesome OS!

And now, my problem: I assembled a new server from scratch (specifications below), and from the very first day I noticed that it had restarted overnight. I didn't give it much weight, thinking it was a "physiological" restart (I come from the Windows world), but the problem kept coming back: reboots or freezes that occur anywhere from less than an hour to 8-10 hours of uptime. When it freezes, the server stays physically powered on, but I can't reach it via the GUI or the mapped network folders, and the only way to regain access is a hard shutdown. Each restart obviously triggers a 9-10 hour parity check, which (fortunately) never finds errors on the disks. From the logs I can't tell what the problem is: they contain only the startup data, because the earlier entries are kept in RAM and are lost on reboot. I also tried writing the logs to the USB drive, but I only get a practically empty .txt.
I also installed NerdPack and mcelog as suggested on this forum, but mcelog doesn't work on AMD and I can't find an equivalent for this processor. The only dockers I installed are Krusader, JDownloader and Plex, plus the recommended apps (CA, Auto Update, Backup, Cleanup, Dynamix, etc.), but the first freeze happened on the fresh installation with only Krusader as a docker, so I don't think the apps are the cause.

P.S. In the logs I sometimes find hardware errors that point to an issue on one core of the processor. Since the CPU is new, I believe (and hope) that's not it, also because I don't always find the same error. I have also already changed the RAM (from 2x4 GB to 1x8 GB), so I'm pretty sure the RAM isn't at fault either.

help me pls :(

 

M/B: Gigabyte Technology Co., Ltd. B450 I AORUS PRO WIFI

BIOS: American Megatrends Inc. Version F50. Dated: 11/27/2019

CPU: AMD Ryzen 3 1200 Quad-Core @ 3100 MHz

HVM: Disabled

IOMMU: Enabled

Cache: 384 KiB, 2048 KiB, 8192 KiB

Memory: 8 GiB DDR4 (max. installable capacity 128 GiB)

Network: bond0: fault-tolerance (active-backup), mtu 1500
 eth0: 100 Mbps, full duplex, mtu 1500

Kernel: Linux 4.19.94-Unraid x86_64

OpenSSL: 1.1.1d

PSU: Corsair VS350 (350 W)

A random video card, used just to boot the system (the CPU is not an APU); I can't remember which one I put in XD.


EDIT:
Crashed again just now (about an hour of uptime), with a full system restart this time; added another startup log.
 

thevault-syslog-20200118-1317.zip thevault-syslog-20200120-2011.zip thevault-syslog-20200119-2254.zip thevault-syslog-20200119-1046.zip

thevault-syslog-20200120-2110.zip

Edited by forbi
Link to comment

Power should be enough for this system (the specs posted at the top, plus 4 HDDs and 1 NVMe for cache), and cooling too: the system always reports a temperature of 35-45°.
I did not run a memtest, but I replaced the RAM entirely, and both the original sticks and the replacement are brand new.

edit: added the syslog (I forgot I had set this up too)

syslog.log
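For anyone reading a mirrored syslog like the one attached above, a rough way to see what was logged just before each restart is to pair every rsyslogd start line with the line that precedes it. This is a minimal sketch that assumes the line format of the excerpts quoted in this thread; `find_restarts` is a hypothetical helper name, not an Unraid tool. Note that rsyslogd also logs a start line when the daemon is restarted cleanly, so not every hit is a crash.

```python
def find_restarts(lines):
    """Pair each rsyslogd start line with the last line logged before it.

    Returns a list of (previous_line, start_line) tuples. previous_line is
    None when the start line is the first non-empty line in the input.
    """
    restarts = []
    previous = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Assumption: start markers look like the rsyslogd line quoted in
        # this thread, i.e. they contain "rsyslogd:" and end in "] start".
        if "rsyslogd:" in line and line.endswith("] start"):
            restarts.append((previous, line))
        previous = line
    return restarts
```

To use it, pass an open file object for the syslog copy and look at the first element of each tuple: the timestamp there is roughly when the server last managed to write anything.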

Edited by forbi
Link to comment

Oh, and after a hard restart, Fix Common Problems showed me this:
Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged
but I can't see any mcelog output because it does not support AMD XD
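When mcelog gives no output, the raw machine-check lines that the kernel writes to the syslog can still be pulled out by hand. The sketch below is only an illustration: the regular expression is modeled on the single mce line quoted in this thread, so other kernels or CPUs may log a different layout, and `find_mce_events` is a hypothetical helper name.

```python
import re

# Pattern modeled on the one mce line quoted in this thread (assumption:
# other kernel versions may format the message differently).
MCE_RE = re.compile(
    r"mce: \[Hardware Error\]: CPU (?P<cpu>\d+): "
    r"Machine Check: \S+ Bank (?P<bank>\d+): (?P<status>[0-9a-fA-F]+)"
)


def find_mce_events(lines):
    """Return one dict per machine-check line found in an iterable of lines."""
    events = []
    for line in lines:
        match = MCE_RE.search(line)
        if match:
            events.append({
                "cpu": int(match.group("cpu")),
                "bank": int(match.group("bank")),
                "status": int(match.group("status"), 16),  # hex status word
                "raw": line.rstrip("\n"),
            })
    return events
```

Counting how often each CPU and bank number shows up across several syslogs gives a quick check of whether the errors really move around or keep hitting the same core.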

5 minutes ago, trurl said:

Do memtest just to eliminate that as a possibility. You really should want to know your RAM is OK. Everything goes through memory, your data, the executable code, everything.

i will do it asap!

 

 

Link to comment

 

ok, that's strange, the instant I click "memtest86+" the system reboots, in a loop.

 

 

4 minutes ago, PeteUnraid said:

I googled this error and it led me to a thread where some people say they fixed it by changing the PSU:

mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000000000108

 

https://community.amd.com/thread/216084#comment
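The hex value at the end of that mce line is a machine-check status word, and its top bits can be read without mcelog. The following is a minimal sketch assuming the standard x86 MCA status layout (flag bits 63 down to 57, error code in the low 16 bits); everything between those fields is vendor-specific and is ignored here. `decode_mci_status` is a hypothetical helper name.

```python
# Architectural flag bits of an x86 machine-check status register.
STATUS_FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # the MISC register holds extra information
    "ADDRV": 58,  # the ADDR register holds an address
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(status):
    """Split a status word into its architectural flags and error code."""
    flags = {name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()}
    flags["error_code"] = status & 0xFFFF  # low 16 bits: MCA error code
    return flags


if __name__ == "__main__":
    for name, value in decode_mci_status(0xBEA0000000000108).items():
        print(name, value)
```

For the value quoted above, UC and PCC both come out set, meaning an uncorrected error with possibly corrupt processor context. That fits a hard reset, but on its own it does not say whether the PSU, the C-states or the CPU is to blame.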

 

 

I had also read that thread, but my server also restarts at idle or in any case under very light loads (15-20% CPU load)

Link to comment
3 minutes ago, forbi said:

Jan 19 11:55:34 TheVault rsyslogd: [origin software="rsyslogd" swVersion="8.1908.0" x-pid="12497" x-info="https://www.rsyslog.com"] start
Jan 19 13:27:59 TheVault kernel: igb 0000:09:00.0 eth0: igb: eth0 NIC Link is Down
...

...

...

Jan 20 22:17:12 TheVault root: CPU is unsupported
Jan 20 22:17:13 TheVault root: Fix Common Problems: Warning: Syslog mirrored to flash

 

Was that wall of text supposed to be a response to this request?

28 minutes ago, trurl said:

Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post.

 

In the Unraid webUI, click on Tools on the main menu, then click on Diagnostics. From that page, you can download the zipped Diagnostics file I am asking for.

 

Attach that zip file to your NEXT post.

Link to comment
3 minutes ago, forbi said:

ok, that's strange, the instant I click "memtest86+" the system reboots, in a loop.

I had also read that thread, but my server also restarts at idle or in any case under very light loads (15-20% CPU load)

If you read the entire thread, they say that doesn't matter: it can still be the PSU.

Link to comment
12 minutes ago, forbi said:

no sorry @constructor, I misclicked the copy-paste, but the zip you asked for is literally 3 posts above yours :D

sorry "@Newbie", but I don't see what I asked for anywhere in this thread. And the reason I asked for it to be attached to your NEXT post is so I wouldn't have to go looking for it in a previously edited post. But I did look and I still don't see it. All I see are a lot of syslogs.

 

The diagnostics I am asking for includes syslog, but it also includes a lot of other information that gives a more complete understanding of your situation, such as SMART for all disks, the output of some simple commands, the settings you have made in the webUI, and other things.

Link to comment

ok now i'm at work with a memtest running on the server at home, i will update asap.

 

1 hour ago, johnnie.black said:

Also, Ryzen on Linux can lock up due to issues with C-states. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "Typical Current Idle" (or similar), or completely disable C-states.

 

More info here:
https://forums.unraid.net/bug-reports/prereleases/670-rc1-system-hard-lock-r354/

 

The BIOS is up to date (F50 for the B450 I AORUS PRO WIFI, early December release) and I will try that too, thanks!
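After changing the C-state options, one way to see which idle states the Linux kernel still exposes is to read the cpuidle entries in sysfs. This is a rough sketch assuming the usual `/sys/devices/system/cpu/cpu0/cpuidle` layout; `list_idle_states` is a hypothetical helper name. It only shows the kernel's view, so the BIOS "Power Supply Idle Control" setting itself will not appear here.

```python
from pathlib import Path

# Assumption: standard Linux cpuidle sysfs location for the first CPU.
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def list_idle_states(base=CPUIDLE_DIR):
    """Return (directory, name, disabled) for each idle state under base."""
    states = []
    if not base.is_dir():
        return states  # no cpuidle driver loaded, or not a Linux system
    # Sort numerically so state10 would come after state9.
    dirs = sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    for state_dir in dirs:
        name = (state_dir / "name").read_text().strip()
        disable_file = state_dir / "disable"
        disabled = disable_file.exists() and disable_file.read_text().strip() == "1"
        states.append((state_dir.name, name, disabled))
    return states


if __name__ == "__main__":
    for entry in list_idle_states():
        print(entry)
```

If the deeper states are missing from the list after the BIOS change, the kernel is no longer being offered them.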

Edited by forbi
Link to comment
  • forbi changed the title to (SOLVED) server random restart/freeze
  • 2 months later...
On 1/21/2020 at 1:32 AM, johnnie.black said:

Also, Ryzen on Linux can lock up due to issues with C-states. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "Typical Current Idle" (or similar), or completely disable C-states.

 

More info here:
https://forums.unraid.net/bug-reports/prereleases/670-rc1-system-hard-lock-r354/

 

This may be resolved, but in case anyone else runs into this issue: I have to agree with this. I am just starting to research Unraid, to be sure I would not have these issues.

I have the same board with BIOS F50. I had to disable these options just to get Linux to install. Windows 10 works fine.

AMD Support: stated that the new 3000 generation no longer has the freezing issue. Willing to swap my CPU.

Gigabyte Support: "We do not support Linux (open source)." They even had the nerve to say "If Windows works, use it". Not willing to do anything.

In either case, with these options disabled in the BIOS my system no longer freezes in Linux. I am okay for now. Now I just want to not need Wine to play a game.

Link to comment
