January 20, 2020
Hi to all. First of all, I apologize for my English; it's not my native language. Second, thanks for this awesome OS!

Now, as for my problem: I assembled a new server from scratch (specifications attached below), and already on the first day I noticed that the server had restarted overnight. I didn't give it any weight, thinking it was a "physiological" restart (I come from the Windows world), but then the problem came back, with reboots/freezes that occur after anywhere from less than an hour to 8-10 hours of crash-free uptime. The server remains physically powered on, but I can't access it via the GUI or a mapped network folder, and the only way to regain access is to forcibly shut it down.

At each restart a 9-10 hour parity check is of course performed, which never finds errors on the disks (fortunately). From the logs I can't understand what the problem is, because they contain only the startup data: the log is kept in RAM, so I "lose" whatever was written before the crash. I also tried writing the logs to USB, but I find only a practically empty .txt. I also installed the nerdtools and mcelog as suggested on this forum, but mcelog doesn't work on AMD and I can't find an equivalent for this processor.

I only installed Krusader, JDownloader and Plex as Docker containers, plus the recommended apps (CA, Auto Update, Backup, Cleanup, Dynamix, etc.), but the first freeze occurred on the fresh installation with only Krusader as a Docker container, so I don't think the problem is caused by the apps.

PS: in the logs I sometimes find hardware errors that indicate an issue on one core of the processor, but, being new, I believe and hope it is not that, also because I do not always find the same error. I have also already changed the RAM (2x4 GB to 1x8 GB), so in this case too I'm pretty sure it's not at fault. Help me pls.

M/B: Gigabyte Technology Co., Ltd. B450 I AORUS PRO WIFI
BIOS: American Megatrends Inc. Version F50. Dated: 11/27/2019
CPU: AMD Ryzen 3 1200 Quad-Core @ 3100 MHz
HVM: Disabled
IOMMU: Enabled
Cache: 384 KiB, 2048 KiB, 8192 KiB
Memory: 8 GiB DDR4 (max. installable capacity 128 GiB)
Network: bond0: fault-tolerance (active-backup), mtu 1500
eth0: 100 Mbps, full duplex, mtu 1500
Kernel: Linux 4.19.94-Unraid x86_64
OpenSSL: 1.1.1d
PSU: Corsair VS350 (350 W)
Video card: a random one, used just to boot the system (the CPU is not an APU); I can't remember what I put in XD.

EDIT: Crashed again just now (about an hour of uptime), and with a full system restart this time; added another startup log.
thevault-syslog-20200118-1317.zip
thevault-syslog-20200120-2011.zip
thevault-syslog-20200119-2254.zip
thevault-syslog-20200119-1046.zip
thevault-syslog-20200120-2110.zip
Edited January 20, 2020 by forbi
January 20, 2020 Community Expert
Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post. Then set up Syslog Server so you can save syslogs: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=781601
January 20, 2020 Community Expert
Are you sure you don't have a power or cooling issue? Have you done memtest?
January 20, 2020 Author
Power should be enough for this system (the specs posted above plus 4 HDDs and 1 NVMe for cache), and cooling too: the system always reports a temperature of 35-45°. I did not run a memtest, but I entirely changed the RAM, and both the sticks used first and the replacement are brand new.
Edit: added the syslog (I forgot I had set this up too).
syslog.log
Edited January 20, 2020 by forbi
January 20, 2020 Community Expert
Do memtest just to eliminate that as a possibility. You really should want to know your RAM is OK. Everything goes through memory, your data, the executable code, everything.
January 20, 2020 Author
Oh, and after a forced restart, Fix Common Problems gave me this message:
"Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged"
but I can't see the mcelog output, because mcelog does not support AMD XD
5 minutes ago, trurl said: Do memtest just to eliminate that as a possibility. You really should want to know your RAM is OK. Everything goes through memory, your data, the executable code, everything.
I will do it asap!
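For anyone in the same spot (mcelog not usable, but a saved syslog available), the raw kernel machine-check lines can be pulled out of the file by hand. The following is only a rough sketch: the pattern is modeled on the one MCE line quoted in this thread and may not match what other kernels print, and `find_mce_events` is a made-up helper name, not part of any Unraid tool.

```python
import re

# Pattern modeled on the single MCE line quoted in this thread (assumption):
#   mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000000000108
MCE_LINE = re.compile(
    r"mce: \[Hardware Error\]: CPU (?P<cpu>\d+): Machine Check: \d+ "
    r"Bank (?P<bank>\d+): (?P<status>[0-9a-fA-F]{16})"
)


def find_mce_events(lines):
    """Return a (cpu, bank, status) tuple for every line that matches."""
    events = []
    for line in lines:
        match = MCE_LINE.search(line)
        if match:
            events.append(
                (int(match["cpu"]), int(match["bank"]), int(match["status"], 16))
            )
    return events
```

It could be run over a saved log with something like `find_mce_events(open("syslog.log", errors="replace"))`. Seeing which CPU and bank numbers repeat is at least a starting point for the question of whether it is always the same error.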
January 20, 2020 Community Expert
20 minutes ago, trurl said: Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post.
January 20, 2020
I googled this error and it led me to some people who say they changed the PSU:
mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000000000108
https://community.amd.com/thread/216084#comment
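As a side note, the 64-bit value at the end of that line is a machine-check status register, and its top bits are flag bits that are laid out the same way on Intel and AMD. Here is a minimal sketch that splits them out; `decode_mci_status` is a made-up name, and everything below the flag bits (apart from the low 16-bit error code field) is vendor-specific and deliberately left undecoded.

```python
# Architectural flag bits in the upper part of an x86 MCi_STATUS register.
STATUS_FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # the MISC register holds additional information
    "ADDRV": 58,  # the ADDR register holds an address
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(status):
    """Split a 64-bit status value into its flag bits and 16-bit error code."""
    decoded = {
        name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()
    }
    decoded["error_code"] = status & 0xFFFF
    return decoded
```

For `0xbea0000000000108` this reports VAL, UC, EN, MISCV, ADDRV and PCC set, OVER clear, and an error code of `0x0108`. The flags say the error was uncorrected; on their own they do not say whether the PSU, the CPU or something else caused it.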
January 20, 2020 Author
OK, that's strange: the instant I click "memtest86+" the system reboots, in a loop.
4 minutes ago, PeteUnraid said: I googled this error and it led me to some people who say they changed the PSU: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 5: bea0000000000108 https://community.amd.com/thread/216084#comment
I had also read that thread, but my server also restarts at idle, or in any case under very light loads (15-20% CPU load).
January 20, 2020 Community Expert
3 minutes ago, forbi said:
Jan 19 11:55:34 TheVault rsyslogd: [origin software="rsyslogd" swVersion="8.1908.0" x-pid="12497" x-info="https://www.rsyslog.com"] start
Jan 19 13:27:59 TheVault kernel: igb 0000:09:00.0 eth0: igb: eth0 NIC Link is Down
... ... ...
Jan 20 22:17:12 TheVault root: CPU is unsupported
Jan 20 22:17:13 TheVault root: Fix Common Problems: Warning: Syslog mirrored to flash
Was that wall of text supposed to be a response to this request?
28 minutes ago, trurl said: Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post.
In the Unraid webUI, click on Tools on the main menu, then click on Diagnostics. From that page, you can download the zipped Diagnostics file I am asking for. Attach that zip file to your NEXT post.
January 20, 2020 Author
No, sorry @constructor, I misclicked the copy-paste, but the zip you asked for is literally 3 posts above yours.
January 20, 2020
3 minutes ago, forbi said: ok, that's strange, the instant I click "memtest86+" the system reboots, in a loop. I had also read that thread, but it also restarts at idle or in any case with very light loads (15-20% CPU load)
If you read the entire thread, they say that doesn't matter; it can still be the PSU.
January 20, 2020 Community Expert
12 minutes ago, forbi said: no sorry @constructor, i misclicked the copy-paste but the zip you asked for is literally 3 posts above yours
Sorry "@Newbie", but I don't see what I asked for anywhere in this thread. And the reason I asked for it to be attached to your NEXT post is so I wouldn't have to go looking for it in a previously edited post. But I did look and I still don't see it. All I see are a lot of syslogs.
The diagnostics I am asking for include the syslog, but they also include a lot of other information that gives a more complete understanding of your situation, such as SMART for all disks, the output of some simple commands, the settings you have made in the webUI, and other things.
January 20, 2020 Author
Ehmmm, sorry trurl... I'm very, very tired and I read one thing and understand another... here it is.
PS: any idea why memtest instantly reboots the PC?
thevault-diagnostics-20200120-2303.zip
January 20, 2020 Community Expert
3 minutes ago, forbi said: any idea why memtest instantly reboots the PC?
The version of memtest that ships with Unraid only works when booting in legacy mode. If you want a version that works in UEFI mode then you need to download it from the memtest86+ web site.
January 20, 2020 Community Expert
I have seen other reports of that. I think it had something to do with UEFI. I think the recommendation was to download memtest to another flash and boot it from that.
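To tell which of the two boot modes a running Linux system was started in, the usual check is whether the kernel exposes `/sys/firmware/efi`, which it only does after a UEFI boot. A tiny sketch of that check follows; `booted_in_uefi` is a made-up name, and the `sysfs_root` parameter is there only so the function can be tried against a fake directory tree.

```python
from pathlib import Path


def booted_in_uefi(sysfs_root="/sys"):
    """Return True if the kernel exposes the EFI directory under sysfs."""
    # Linux creates <sysfs>/firmware/efi only when it was started via UEFI;
    # on a legacy (BIOS/CSM) boot the directory is absent.
    return (Path(sysfs_root) / "firmware" / "efi").is_dir()
```

If this returns True on the server, that would be consistent with the bundled memtest rebooting instead of starting, as described above.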
January 21, 2020 Community Expert
Also, Ryzen on Linux can lock up due to issues with C-states. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "Typical Current Idle" (or similar), or completely disable C-states. More info here: https://forums.unraid.net/bug-reports/prereleases/670-rc1-system-hard-lock-r354/
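The change itself is made in the BIOS, but Linux can show which idle states the kernel currently sees, which is a way to check the setting from the running system. The sketch below reads the standard cpuidle sysfs files; it assumes a cpuidle driver is loaded (otherwise the directory is missing and the list comes back empty), and `list_idle_states` is a made-up helper name.

```python
from pathlib import Path


def list_idle_states(cpu=0, sysfs_root="/sys"):
    """Return (state_dir, name, disabled) for each idle state of one CPU."""
    base = (
        Path(sysfs_root) / "devices" / "system" / "cpu" / f"cpu{cpu}" / "cpuidle"
    )
    if not base.is_dir():
        # No cpuidle driver loaded, or cpuidle not exposed on this system.
        return []
    states = []
    # Sort numerically so that state10 would come after state2.
    for state_dir in sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:])):
        name = (state_dir / "name").read_text().strip()
        disabled = (state_dir / "disable").read_text().strip() == "1"
        states.append((state_dir.name, name, disabled))
    return states
```

Calling `list_idle_states()` before and after the BIOS change gives a rough way to see whether the deeper states are still offered to the kernel. What the names look like depends on the platform and driver.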
January 21, 2020 Author
OK, now I'm at work with a memtest running on the server at home; I will update asap.
1 hour ago, johnnie.black said: Also, Ryzen on Linux can lock up due to issues with C-states. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "Typical Current Idle" (or similar), or completely disable C-states. More info here: https://forums.unraid.net/bug-reports/prereleases/670-rc1-system-hard-lock-r354/
The BIOS is up to date (F50 for the AORUS B450 PRO WIFI, early December release) and I will try that too, thanks!
Edited January 21, 2020 by forbi
January 21, 2020 Author
Some updates: memtest ran for about 12 hours with 0 errors. Now I'm trying with C-states disabled: 2.5 hours of uptime so far. I'm keeping my fingers crossed, hoping that "just" this solves the problem.
January 23, 2020 Author
50 continuous hours of uptime: I consider the problem solved by johnnie.black's suggestion! Thank you!
April 2, 2020
On 1/21/2020 at 1:32 AM, johnnie.black said: Also, Ryzen on Linux can lock up due to issues with C-states. Make sure the BIOS is up to date, then look for "Power Supply Idle Control" (or similar) and set it to "Typical Current Idle" (or similar), or completely disable C-states. More info here: https://forums.unraid.net/bug-reports/prereleases/670-rc1-system-hard-lock-r354/
This may be resolved, but in case anyone else runs into this issue: I have to agree with this. I am just starting to research Unraid to be sure I would not have these issues. I have the same board with BIOS F50. I had to disable these options just to get Linux to install. W10 works fine.
AMD Support: states that the new 3000 generation no longer has the freezing issue. Willing to swap my CPU.
GB Support: "We do not support Linux (open source)." And had the nerve to say "If Windows works, use it". Not willing to do anything.
In either case, with these options disabled in the BIOS my system no longer freezes in Linux. I am okay for now. Now I just want to not need Wine to play a game.