March 31, 2020
So I built an UnRaid server after watching SpaceinvaderOne's videos. I watched all his videos for a month (more than once) and decided to build my own UnRaid server as a replacement for my Synology NAS + Mac Pro running Plex. It's been online for 4 days and has crashed twice. Totally unresponsive: no response on any IP or to anything else. Krusader was transferring files when it happened, but I am not convinced that this is the problem. I deleted Binhex-Krusader and installed a different version, but it still crashed with the new version of Krusader.
My config is as follows:
Power: Corsair HX850i
MB: ASRock X570 Extreme4
CPU: Ryzen 9 3900X
GPU: Nvidia Quadro P2000
RAM: 4 x 16GB Corsair Vengeance LPX DDR4 3600 MHz
Network card: Dual 10GbE Solarflare
HDD:
2 x XPG SX8200 Pro 2TB 3D NAND NVMe Gen3x4 PCIe M.2 (cache)
3 x 14 TB WD Red
1 x 6 TB WD Red (unregistered for CCTV)
Dunno where to find logs after reboot to check the reason for the crash?
H.E.L.P.
Edited March 31, 2020 by elcapitano
March 31, 2020
Very first thing is to read this post: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/?tab=comments#comment-819173
March 31, 2020 Community Expert
4 minutes ago, elcapitano said:
RAM: 4 x 16GB Corsair Vengeance LPX DDR4 3600 MHz

In some cases Ryzen is known to be unstable with overclocked RAM, see here for more info.
March 31, 2020 Community Expert
Also

18 minutes ago, elcapitano said:
H.E.L.P.

Go to Tools - Diagnostics and attach the complete diagnostics zip file to your NEXT post.

Then

19 minutes ago, elcapitano said:
Dunno where to find logs after reboot to check reason for crash?

Set up the Syslog Server so your logs are saved somewhere you can retrieve them after a crash: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=781601
March 31, 2020 Community Expert
Looks like you are running the Nvidia Unraid build. Does it happen with the standard Unraid build?
March 31, 2020 Author
Would that still make Plex GPU transcoding possible? Maybe I didn't get the memo. If not, my incentive to move across to UnRaid is reduced. Will look it up . . .
March 31, 2020 Author
I truly appreciate your answer. Are you saying there is no need for the Nvidia build?
March 31, 2020
The Nvidia Build is required to use an Nvidia GPU with Docker (the alternative being hardware passthrough to a VM). The Nvidia Build is, however, a community-supported plugin and not an officially supported unRAID build. One of the normal steps in debugging is to disable all plugins (and revert to a stock unRAID build) to determine whether the issue is with unRAID itself or with one or more of the plugins.
March 31, 2020 Author
Thanks . . I removed the Nvidia Build Plugin. It went 24 hours before the last crash, so I guess we will know soon. But, I gotta say, wow, very disappointing if I can't use this plugin. It was my entire incentive to move across . . .
March 31, 2020 Community Expert
1 minute ago, elcapitano said:
Thanks . . I removed the Nvidia Build Plugin. Went 24 hours before crash, so I guess we will know soon. But, I gotta say, wow, very disappointing if I can't use this plugin. My entire incentive to move across . . .

When you say you removed the plugin, do you mean that you reverted to a standard Unraid build? If you only removed the plugin itself, that would not revert you to the standard Unraid build.
March 31, 2020
10 minutes ago, elcapitano said:
Thanks . . I removed the Nvidia Build Plugin. Went 24 hours before crash, so I guess we will know soon. But, I gotta say, wow, very disappointing if I can't use this plugin. My entire incentive to move across . . .

I suspect your problem is caused by hardware stability issues in general (BIOS settings, overclocked RAM, etc.) and has nothing to do with the UnRaid Nvidia plugin/build. There are many, many unRAID users running that build with hardware similar to yours. As @johnnie.black pointed out, with all four RAM slots on the MB populated, the fastest RAM speed a 3rd Gen Ryzen can support is DDR4-2667. If you are attempting to run the RAM at its rated (overclocked) speed of DDR4-3600, that will cause crashes.
March 31, 2020 Author
Thanks, that actually makes sense. Will check the BIOS... All I did was update the BIOS before building the server.
April 1, 2020 Author
Thanks for the heads up.
This is how the RAM is installed:
This is from the vendor's website:
Do you have any suggestion as to how it should be set in the BIOS?
April 1, 2020 Author
Reckon the nvidia suggestion was correct. I had multiple entries like this before the system became unresponsive:
Apr 1 08:06:30 MASTER kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Apr 1 08:06:30 MASTER kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
I removed the GPU Statistics Plugin, and the number of these log entries dropped.
April 1, 2020 Community Expert
6 hours ago, elcapitano said:
Reckon the nvidia suggestion was correct. I had multiple entries like this before the system became unresponsive:
Apr 1 08:06:30 MASTER kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Apr 1 08:06:30 MASTER kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Removed the GPU Statistics Plugin, and the log entries reduced.

Are you still having problems? If so, post new diagnostics.
April 1, 2020 Author
Thanks, but for now it seems that it was the GPU Statistics Plugin that introduced the multiple warnings. I don't see many of the nvidia warnings any more. Syslog is being recorded off the server, so if it happens again, I should have better info.
April 2, 2020 Author
Froze again . . Krusader disconnected shortly before the server froze. Nothing else in the syslog. I was in the process of transferring media from the NAS to the UnRaid array using Krusader.
master-diagnostics-20200402-1240.zip
April 2, 2020 Community Expert
Not related, but I see some things in the syslog that suggest you are trying to add an admin user. Only root has access to the webUI and command line, and only users you create in the webUI have access to shares over the network.
April 2, 2020 Community Expert
Apr 2 08:38:04 MASTER kernel: WARNING: CPU: 8 PID: 223 at net/netfilter/nf_conntrack_core.c:945 __nf_conntrack_confirm+0xa0/0x69e
Apr 2 08:38:04 MASTER kernel: Modules linked in: nvidia_uvm(O) xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost tap xt_nat macvlan ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs md_mod nct6775 hwmon_vid k10temp bonding sfc mdio igb(O) nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) edac_mce_amd crc32_pclmul pcbc aesni_intel aes_x86_64 glue_helper crypto_simd ghash_clmulni_intel cryptd drm_kms_helper drm kvm_amd kvm syscopyarea sysfillrect sysimgblt fb_sys_fops rsnvme(PO) agpgart i2c_piix4 ccp ahci i2c_core libahci wmi_bmof pcc_cpufreq nvme crct10dif_pclmul nvme_core wmi crc32c_intel button acpi_cpufreq [last unloaded: mdio]
Apr 2 08:38:04 MASTER kernel: CPU: 8 PID: 223 Comm: kworker/8:1 Tainted: P O 4.19.107-Unraid #1
Apr 2 08:38:04 MASTER kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Extreme4, BIOS P2.30 02/03/2020
Apr 2 08:38:04 MASTER kernel: Workqueue: events macvlan_process_broadcast [macvlan]
Apr 2 08:38:04 MASTER kernel: RIP: 0010:__nf_conntrack_confirm+0xa0/0x69e
Apr 2 08:38:04 MASTER kernel: Code: 04 e8 56 fb ff ff 44 89 f2 44 89 ff 89 c6 41 89 c4 e8 7f f9 ff ff 48 8b 4c 24 08 84 c0 75 af 48 8b 85 80 00 00 00 a8 08 74 26 <0f> 0b 44 89 e6 44 89 ff 45 31 f6 e8 95 f1 ff ff be 00 02 00 00 48
Apr 2 08:38:04 MASTER kernel: RSP: 0018:ffff888fde803d90 EFLAGS: 00010202
Apr 2 08:38:04 MASTER kernel: RAX: 0000000000000188 RBX: ffff8889945e9b00 RCX: ffff888e4bddf618
Apr 2 08:38:04 MASTER kernel: RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffff81e08fb4
Apr 2 08:38:04 MASTER kernel: RBP: ffff888e4bddf5c0 R08: 00000000e88d3a6f R09: ffffffff81c8aa80
Apr 2 08:38:04 MASTER kernel: R10: 0000000000000098 R11: ffff888f634f9400 R12: 000000000000ca6d
Apr 2 08:38:04 MASTER kernel: R13: ffffffff81e91080 R14: 0000000000000000 R15: 000000000000ac8f
Apr 2 08:38:04 MASTER kernel: FS: 0000000000000000(0000) GS:ffff888fde800000(0000) knlGS:0000000000000000
Apr 2 08:38:04 MASTER kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 2 08:38:04 MASTER kernel: CR2: 0000146f70289000 CR3: 0000000fd4aa6000 CR4: 0000000000340ee0
Apr 2 08:38:04 MASTER kernel: Call Trace:
Apr 2 08:38:04 MASTER kernel: <IRQ>
Apr 2 08:38:04 MASTER kernel: ipv4_confirm+0xaf/0xb9
Apr 2 08:38:04 MASTER kernel: nf_hook_slow+0x3a/0x90
Apr 2 08:38:04 MASTER kernel: ip_local_deliver+0xad/0xdc
Apr 2 08:38:04 MASTER kernel: ? ip_sublist_rcv_finish+0x54/0x54
Apr 2 08:38:04 MASTER kernel: ip_rcv+0xa0/0xbe
Apr 2 08:38:04 MASTER kernel: ? ip_rcv_finish_core.isra.0+0x2e1/0x2e1
Apr 2 08:38:04 MASTER kernel: __netif_receive_skb_one_core+0x53/0x6f
Apr 2 08:38:04 MASTER kernel: process_backlog+0x77/0x10e
Apr 2 08:38:04 MASTER kernel: net_rx_action+0x107/0x26c
Apr 2 08:38:04 MASTER kernel: __do_softirq+0xc9/0x1d7
Apr 2 08:38:04 MASTER kernel: do_softirq_own_stack+0x2a/0x40
Apr 2 08:38:04 MASTER kernel: </IRQ>
Apr 2 08:38:04 MASTER kernel: do_softirq+0x4d/0x5a
Apr 2 08:38:04 MASTER kernel: netif_rx_ni+0x1c/0x22
Apr 2 08:38:04 MASTER kernel: macvlan_broadcast+0x111/0x156 [macvlan]
Apr 2 08:38:04 MASTER kernel: macvlan_process_broadcast+0xea/0x128 [macvlan]
Apr 2 08:38:04 MASTER kernel: process_one_work+0x16e/0x24f
Apr 2 08:38:04 MASTER kernel: worker_thread+0x1e2/0x2b8
Apr 2 08:38:04 MASTER kernel: ? rescuer_thread+0x2a7/0x2a7
Apr 2 08:38:04 MASTER kernel: kthread+0x10c/0x114
Apr 2 08:38:04 MASTER kernel: ? kthread_park+0x89/0x89
Apr 2 08:38:04 MASTER kernel: ret_from_fork+0x22/0x40
Apr 2 08:38:04 MASTER kernel: ---[ end trace eba31347ec0cb1fc ]---

Looks like something to do with nvidia again
April 2, 2020 Community Expert
Macvlan call traces are usually related to dockers with custom IP addresses.
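
For anyone reading this later, here is a rough Python sketch of one way to check which kernel module a call trace actually passes through, rather than guessing from the "Modules linked in" list. It is only an illustration: the function name `summarize_traces`, the regular expression, and the shortened sample lines are assumptions made for this example and are not part of Unraid or any of its plugins. It counts the `[module]` tags on the stack frames between a `WARNING:` line and the matching `end trace` line.

```python
import re
from collections import Counter

# Matches a stack frame such as "kernel: macvlan_broadcast+0x111/0x156 [macvlan]".
# Group 2 is the module tag, present only for frames inside a loadable module.
FRAME_RE = re.compile(
    r"kernel:\s+(?:\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?\s*$"
)

# Shortened sample taken from the trace posted in this thread.
SAMPLE_LINES = [
    "Apr 2 08:38:04 MASTER kernel: WARNING: CPU: 8 PID: 223 at "
    "net/netfilter/nf_conntrack_core.c:945 __nf_conntrack_confirm+0xa0/0x69e",
    "Apr 2 08:38:04 MASTER kernel: Workqueue: events macvlan_process_broadcast [macvlan]",
    "Apr 2 08:38:04 MASTER kernel: ipv4_confirm+0xaf/0xb9",
    "Apr 2 08:38:04 MASTER kernel: macvlan_broadcast+0x111/0x156 [macvlan]",
    "Apr 2 08:38:04 MASTER kernel: macvlan_process_broadcast+0xea/0x128 [macvlan]",
    "Apr 2 08:38:04 MASTER kernel: ---[ end trace eba31347ec0cb1fc ]---",
]


def summarize_traces(lines):
    """Return one summary dict per kernel WARNING block found in syslog lines."""
    traces = []
    current = None
    for line in lines:
        if "kernel: WARNING:" in line:
            # A new warning block starts; remember where the kernel raised it.
            current = {
                "warning": line.split("kernel: ", 1)[1].strip(),
                "modules": Counter(),
            }
        elif current is not None:
            if "---[ end trace" in line:
                # The block is complete.
                traces.append(current)
                current = None
                continue
            match = FRAME_RE.search(line)
            if match and match.group(2):
                # Count only frames tagged with a module name.
                current["modules"][match.group(2)] += 1
    return traces


if __name__ == "__main__":
    for trace in summarize_traces(SAMPLE_LINES):
        print(trace["warning"])
        print(dict(trace["modules"]))
```

To run it against a saved log, pass the lines of the file instead of the sample, for example `summarize_traces(open("syslog").read().splitlines())`, where the path is wherever your syslog server stored the file.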
April 2, 2020 Author
Right . . I have switched the dockers with custom IPs to bridge mode . . except PiHole . . Couldn't get it to work without its own IP. Will look into it later.
I do have 1 entry in syslog:
Apr 2 16:22:53 MASTER kernel: igb 0000:09:00.0 eth1: mixed HW and IP checksum settings.
How do I fix that?