• [6.9.0 beta 30] Server hard lock up


    nickp85
    • Urgent

    Updated to 6.9 beta 30 last night and sometime overnight my whole server locked up.  My Windows 10 VM was running along with 4 docker containers, Plex, Sonarr, Radarr and Nzbget, nothing else. I have my logs going to a local syslog so they are on a share.  There is nothing in the log between when I went to bed and when I had to force reboot the machine.

     

    Couldn't ping the machine, local console was blank, keyboard unresponsive, totally dead.  Force power off and on brought it back. After the reboot a parity check started automatically.

     

    Server had been working 100% fine with 6.8.3 for months. Since going to 6.9 beta 30, I reformatted my cache to the 1MiB partition alignment and also changed Docker to use an XFS image to control the high writes I was getting to my cache SSDs.  The machine was working fine just before bed; I was playing Overwatch with no issue.

     

    ***UPDATE*** It hasn't happened since the first night of being on 6.9 beta 30

    nicknas2-diagnostics-20201011-1053.zip







    I just had my 6.9 beta 30 server hard lock as well.  I couldn't SSH in, and all VMs were gone.  I couldn't get to the web console either.  Resetting the box got me back in.  The VMs and Docker containers look to be back.

     

    I've never had Unraid hard freeze... that I can even remember.  Maybe in the 5.x days.  This particular system was set up with 6.9 beta 25, then upgraded to 29, and then to 30 about a week ago.  Now that I'm back in, is there any forensic data that I can pull from the system?

    Link to comment

    Just got to 9 days of uptime, then boom! The syslog from the crash is below.

     

    Nov 14 00:16:25 10.0.0.31 crond[3558]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
    Nov 14 00:46:17 10.0.0.31 kernel: general protection fault, probably for non-canonical address 0x2e5084006edb2028: 0000 [#1] SMP NOPTI
    Nov 14 00:46:17 10.0.0.31 kernel: CPU: 3 PID: 63854 Comm: Plex DLNA Serve Tainted: G        W  O      5.8.13-Unraid #1
    Nov 14 00:46:17 10.0.0.31 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./TRX40 Creator, BIOS P1.70 05/29/2020
    Nov 14 00:46:17 10.0.0.31 kernel: RIP: 0010:nf_nat_setup_info+0x129/0x652 [nf_nat]
    Nov 14 00:46:17 10.0.0.31 kernel: Code: ff 48 8b 15 c9 5a 00 00 89 c0 48 8d 04 c2 48 8b 10 48 85 d2 74 80 48 81 ea 98 00 00 00 48 85 d2 0f 84 70 ff ff ff 8a 44 24 46 <38> 42 46 74 09 48 8b 92 98 00 00 00 eb d9 48 8b 4a 20 48 8b 42 28
    Nov 14 00:46:17 10.0.0.31 kernel: RSP: 0018:ffffc90036ceb8b8 EFLAGS: 00010206
    Nov 14 00:46:17 10.0.0.31 kernel: RAX: 7d39969285431311 RBX: ffff88859d5a4f00 RCX: fd3bbef3955777bf
    Nov 14 00:46:17 10.0.0.31 kernel: RDX: 2e5084006edb2028 RSI: 000000003e21e21f RDI: ffffc90036ceb8d8
    Nov 14 00:46:17 10.0.0.31 kernel: RBP: ffffc90036ceb980 R08: 00000000e8312846 R09: ffff889f7c421020
    Nov 14 00:46:17 10.0.0.31 kernel: R10: ffff8880162c4388 R11: ffffffff815daffc R12: 0000000000000000
    Nov 14 00:46:17 10.0.0.31 kernel: R13: ffffc90036ceb8d8 R14: ffffc90036ceb994 R15: ffffffff82095e40
    Nov 14 00:46:17 10.0.0.31 kernel: FS:  00001513139e2700(0000) GS:ffff889fdd0c0000(0000) knlGS:0000000000000000
    Nov 14 00:46:17 10.0.0.31 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Nov 14 00:46:17 10.0.0.31 kernel: CR2: 0000387b74524008 CR3: 0000000a640a4000 CR4: 0000000000340ee0
    Nov 14 00:46:17 10.0.0.31 kernel: Call Trace:
    Nov 14 00:46:17 10.0.0.31 kernel: ? update_cfs_rq_load_avg+0x14b/0x154
    Nov 14 00:46:17 10.0.0.31 kernel: ? fib_table_lookup+0x339/0x361
    Nov 14 00:46:17 10.0.0.31 kernel: ? __ksize+0x15/0x64
    Nov 14 00:46:17 10.0.0.31 kernel: ? krealloc+0x2d/0x81
    Nov 14 00:46:17 10.0.0.31 kernel: nf_nat_masquerade_ipv4+0x10a/0x130 [nf_nat]
    Nov 14 00:46:17 10.0.0.31 kernel: masquerade_tg+0x44/0x5e [xt_MASQUERADE]
    Nov 14 00:46:17 10.0.0.31 kernel: ? ipt_do_table+0x4b6/0x5bb [ip_tables]
    Nov 14 00:46:17 10.0.0.31 kernel: ipt_do_table+0x515/0x5bb [ip_tables]
    Nov 14 00:46:17 10.0.0.31 kernel: ? ipt_do_table+0x56b/0x5bb [ip_tables]
    Nov 14 00:46:17 10.0.0.31 kernel: nf_nat_inet_fn+0xe9/0x182 [nf_nat]
    Nov 14 00:46:17 10.0.0.31 kernel: nf_nat_ipv4_out+0xf/0x88 [nf_nat]
    Nov 14 00:46:17 10.0.0.31 kernel: nf_hook_slow+0x39/0x8e
    Nov 14 00:46:17 10.0.0.31 kernel: nf_hook+0xa8/0xc3
    Nov 14 00:46:17 10.0.0.31 kernel: ? __ip_finish_output+0x142/0x142
    Nov 14 00:46:17 10.0.0.31 kernel: ip_output+0x7d/0x8a
    Nov 14 00:46:17 10.0.0.31 kernel: ? __ip_finish_output+0x142/0x142
    Nov 14 00:46:17 10.0.0.31 kernel: ip_send_skb+0x10/0x32
    Nov 14 00:46:17 10.0.0.31 kernel: udp_send_skb+0x24e/0x2b0
    Nov 14 00:46:17 10.0.0.31 kernel: udp_sendmsg+0x60e/0x838
    Nov 14 00:46:17 10.0.0.31 kernel: ? ip_dont_fragment+0x33/0x33
    Nov 14 00:46:17 10.0.0.31 kernel: ? rt_add_uncached_list+0x23/0x51
    Nov 14 00:46:17 10.0.0.31 kernel: ? _raw_spin_unlock_bh+0x5/0x13
    Nov 14 00:46:17 10.0.0.31 kernel: ? rt_set_nexthop.constprop.0+0x214/0x22b
    Nov 14 00:46:17 10.0.0.31 kernel: ? sock_sendmsg_nosec+0x2b/0x3c
    Nov 14 00:46:17 10.0.0.31 kernel: sock_sendmsg_nosec+0x2b/0x3c
    Nov 14 00:46:17 10.0.0.31 kernel: __sys_sendto+0xce/0x109
    Nov 14 00:46:17 10.0.0.31 kernel: ? move_addr_to_user+0x46/0x6c
    Nov 14 00:46:17 10.0.0.31 kernel: ? __sys_getpeername+0x84/0xac
    Nov 14 00:46:17 10.0.0.31 kernel: ? kern_select+0xab/0xce
    Nov 14 00:46:17 10.0.0.31 kernel: ? __sys_setsockopt+0x97/0xb9
    Nov 14 00:46:17 10.0.0.31 kernel: __x64_sys_sendto+0x20/0x23
    Nov 14 00:46:17 10.0.0.31 kernel: do_syscall_64+0x7a/0x94
    Nov 14 00:46:17 10.0.0.31 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    Nov 14 00:46:17 10.0.0.31 kernel: RIP: 0033:0x15137b3739ff
    Nov 14 00:46:17 10.0.0.31 kernel: Code: 53 49 89 f4 89 fb 48 83 ec 10 e8 7c f7 ff ff 45 31 c9 89 c5 45 31 c0 4d 63 d5 4c 89 f2 4c 89 e6 48 63 fb b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 1e 89 ef 48 89 44 24 08 e8 ad f7 ff ff 48 8b
    Nov 14 00:46:17 10.0.0.31 kernel: RSP: 002b:00001513139e1b50 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    Nov 14 00:46:17 10.0.0.31 kernel: RAX: ffffffffffffffda RBX: 000000000000003b RCX: 000015137b3739ff
    Nov 14 00:46:17 10.0.0.31 kernel: RDX: 0000000000000155 RSI: 00001510fc0266e0 RDI: 000000000000003b
    Nov 14 00:46:17 10.0.0.31 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
    Nov 14 00:46:17 10.0.0.31 kernel: R10: 0000000000004000 R11: 0000000000000246 R12: 00001510fc0266e0
    Nov 14 00:46:17 10.0.0.31 kernel: R13: 0000000000004000 R14: 0000000000000155 R15: 00001510fc0266e0
    Nov 14 00:46:17 10.0.0.31 kernel: Modules linked in: xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat iptable_mangle ip6table_filter ip6_tables vhost_net vhost vhost_iotlb tap tun macvlan xt_nat veth xt_MASQUERADE iptable_filter iptable_nat nf_nat ip_tables xfs nfsd lockd grace sunrpc md_mod nct6683 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha bonding r8169 realtek atlantic igb i2c_algo_bit wmi_bmof mxm_wmi edac_mce_amd kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper r8125(O) rapl ahci nvme libahci nvme_core ccp k10temp i2c_piix4 i2c_core wmi button acpi_cpufreq [last unloaded: realtek]
    Nov 14 00:46:17 10.0.0.31 kernel: ---[ end trace ce1e2ba9ecf5c847 ]---
    Nov 14 00:46:17 10.0.0.31 kernel: RIP: 0010:nf_nat_setup_info+0x129/0x652 [nf_nat]
    Nov 14 00:46:17 10.0.0.31 kernel: Code: ff 48 8b 15 c9 5a 00 00 89 c0 48 8d 04 c2 48 8b 10 48 85 d2 74 80 48 81 ea 98 00 00 00 48 85 d2 0f 84 70 ff ff ff 8a 44 24 46 <38> 42 46 74 09 48 8b 92 98 00 00 00 eb d9 48 8b 4a 20 48 8b 42 28
    Nov 14 00:46:17 10.0.0.31 kernel: RSP: 0018:ffffc90036ceb8b8 EFLAGS: 00010206
    Nov 14 00:46:17 10.0.0.31 kernel: RAX: 7d39969285431311 RBX: ffff88859d5a4f00 RCX: fd3bbef3955777bf
    Nov 14 00:46:17 10.0.0.31 kernel: RDX: 2e5084006edb2028 RSI: 000000003e21e21f RDI: ffffc90036ceb8d8
    Nov 14 00:46:17 10.0.0.31 kernel: RBP: ffffc90036ceb980 R08: 00000000e8312846 R09: ffff889f7c421020
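
    For anyone reading the trace above: the "non-canonical address" in the first kernel line is the same value that shows up in RDX. The check below is only a rough sketch of what "canonical" means on x86-64, assuming 48-bit virtual addresses (4-level paging); the function name and the `va_bits` parameter are made up for illustration and are not part of any kernel tool.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 virtual address.

    With va_bits-wide virtual addresses, bits 63 down to (va_bits - 1)
    must all be equal, i.e. the upper bits are a sign extension.
    va_bits=48 is an assumption (4-level paging).
    """
    upper = addr >> (va_bits - 1)                # bits 63..47 when va_bits is 48
    all_ones = (1 << (64 - va_bits + 1)) - 1     # 17 one-bits when va_bits is 48
    return upper == 0 or upper == all_ones


# Values copied from the trace above.
reported = 0x2E5084006EDB2028   # address in the "general protection fault" line (also RDX)
kernel_rsp = 0xFFFFC90036CEB8B8  # RSP of the faulting kernel context
user_rip = 0x15137B3739FF        # user-space RIP of the Plex DLNA thread

for name, value in [("reported", reported), ("kernel_rsp", kernel_rsp), ("user_rip", user_rip)]:
    print(f"{name}: {value:#018x} canonical={is_canonical(value)}")
```

    A pointer like that is not something valid code would normally produce, so it suggests the kernel followed a corrupted pointer inside nf_nat; on its own it does not say whether the corruption came from software or hardware.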

     

    Link to comment

    Just found the posts about power supply idle and C-states, so I'm trying that to see what happens.

     

     

    Edited by turnipisum
    Link to comment

    Think I’m suffering from the same thing.

     

    First boot of an updated server results in a complete hard lock.


    Subsequent start from cold absolutely fine. 
     

    I’m now thinking it might be caused by a restart, as I think that when I updated to beta30 it hard locked and a power down was required, and then I had 40+ days of uptime.

     

    Updated to beta35 and got a hard lock within a few hours; I then powered off and started again, and it’s fine.

     

    I might restart (i.e. not a total power off) beta35 and see what happens...

    Link to comment

    3 days of uptime! So I'm hoping the C-state or power supply idle setting has done the trick.  I will come back with an update if I can get to 20 days of uptime or if it locks up again.

    Link to comment

    Nope, that didn't work. I just had one again.

    ------------[ cut here ]------------
    Nov 27 00:24:47 10.0.0.31 kernel: WARNING: CPU: 6 PID: 7646 at drivers/iommu/dma-iommu.c:471 __iommu_dma_unmap+0x7a/0xe8
    Nov 27 00:24:47 10.0.0.31 kernel: Modules linked in: nfsd lockd grace sunrpc md_mod nct6683 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha bonding r8169 realtek atlantic igb i2c_algo_bit wmi_bmof mxm_wmi edac_mce_amd kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd r8125(O) glue_helper ahci rapl nvme libahci nvme_core ccp k10temp i2c_piix4 i2c_core wmi button acpi_cpufreq [last unloaded: realtek]
    Nov 27 00:24:47 10.0.0.31 kernel: CPU: 6 PID: 7646 Comm: ethtool Tainted: G           O      5.8.18-Unraid #1
    Nov 27 00:24:47 10.0.0.31 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./TRX40 Creator, BIOS P1.70 05/29/2020
    Nov 27 00:24:47 10.0.0.31 kernel: RIP: 0010:__iommu_dma_unmap+0x7a/0xe8
    Nov 27 00:24:47 10.0.0.31 kernel: Code: 46 28 4c 8d 60 ff 48 8d 54 18 ff 49 21 ec 48 f7 d8 4c 29 e5 49 01 d4 49 21 c4 48 89 ee 4c 89 e2 e8 c3 de ff ff 4c 39 e0 74 02 <0f> 0b 49 83 be 68 07 00 00 00 75 32 49 8b 45 08 48 8b 40 48 48 85
    Nov 27 00:24:47 10.0.0.31 kernel: RSP: 0018:ffffc90001afba40 EFLAGS: 00010206
    Nov 27 00:24:47 10.0.0.31 kernel: RAX: 0000000000002000 RBX: 0000000000001000 RCX: 0000000000000001
    Nov 27 00:24:47 10.0.0.31 kernel: RDX: ffff889fd571ae20 RSI: ffffffffffffe000 RDI: 0000000000000009
    Nov 27 00:24:47 10.0.0.31 kernel: RBP: 00000000fed8e000 R08: ffff889fd571ae20 R09: ffff889f86573c70
    Nov 27 00:24:47 10.0.0.31 kernel: R10: 0000000000000009 R11: ffff888000000000 R12: 0000000000001000
    Nov 27 00:24:47 10.0.0.31 kernel: R13: ffff889fd571ae10 R14: ffff889f9c979800 R15: ffffffffa0170600
    Nov 27 00:24:47 10.0.0.31 kernel: FS:  000014a5efd0b740(0000) GS:ffff889fdd180000(0000) knlGS:0000000000000000
    Nov 27 00:24:47 10.0.0.31 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Nov 27 00:24:47 10.0.0.31 kernel: CR2: 000014a5efd9ff30 CR3: 0000001f80272000 CR4: 0000000000340ee0
    Nov 27 00:24:47 10.0.0.31 kernel: Call Trace:
    Nov 27 00:24:47 10.0.0.31 kernel: iommu_dma_free+0x1a/0x2b
    Nov 27 00:24:47 10.0.0.31 kernel: aq_ptp_ring_free+0x31/0x60 [atlantic]
    Nov 27 00:24:47 10.0.0.31 kernel: aq_nic_deinit+0x4e/0xa4 [atlantic]
    Nov 27 00:24:47 10.0.0.31 kernel: aq_ndev_close+0x26/0x2d [atlantic]
    Nov 27 00:24:47 10.0.0.31 kernel: __dev_close_many+0xa1/0xb5
    Nov 27 00:24:47 10.0.0.31 kernel: dev_close_many+0x48/0xa6
    Nov 27 00:24:47 10.0.0.31 kernel: dev_close+0x42/0x64
    Nov 27 00:24:47 10.0.0.31 kernel: aq_set_ringparam+0x4c/0xc8 [atlantic]
    Nov 27 00:24:47 10.0.0.31 kernel: ethnl_set_rings+0x1fc/0x252
    Nov 27 00:24:47 10.0.0.31 kernel: genl_rcv_msg+0x1d9/0x251
    Nov 27 00:24:47 10.0.0.31 kernel: ? genlmsg_multicast_allns+0xea/0xea
    Nov 27 00:24:47 10.0.0.31 kernel: netlink_rcv_skb+0x7d/0xd1
    Nov 27 00:24:47 10.0.0.31 kernel: genl_rcv+0x1f/0x2c
    Nov 27 00:24:47 10.0.0.31 kernel: netlink_unicast+0x10c/0x1a5
    Nov 27 00:24:47 10.0.0.31 kernel: netlink_sendmsg+0x29d/0x2d3
    Nov 27 00:24:47 10.0.0.31 kernel: sock_sendmsg_nosec+0x32/0x3c
    Nov 27 00:24:47 10.0.0.31 kernel: __sys_sendto+0xce/0x109
    Nov 27 00:24:47 10.0.0.31 kernel: ? exc_page_fault+0x3e2/0x40c
    Nov 27 00:24:47 10.0.0.31 kernel: __x64_sys_sendto+0x20/0x23
    Nov 27 00:24:47 10.0.0.31 kernel: do_syscall_64+0x7a/0x94
    Nov 27 00:24:47 10.0.0.31 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    Nov 27 00:24:47 10.0.0.31 kernel: RIP: 0033:0x14a5efe25bc6
    Nov 27 00:24:47 10.0.0.31 kernel: Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb bc 0f 1f 80 00 00 00 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 c3 90 55 48 83 ec 30 44 89 4c 24 2c 4c 89
    Nov 27 00:24:47 10.0.0.31 kernel: RSP: 002b:00007ffc7437e458 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    Nov 27 00:24:47 10.0.0.31 kernel: RAX: ffffffffffffffda RBX: 00007ffc7437e4d0 RCX: 000014a5efe25bc6
    Nov 27 00:24:47 10.0.0.31 kernel: RDX: 000000000000002c RSI: 000000000046f3a0 RDI: 0000000000000004
    Nov 27 00:24:47 10.0.0.31 kernel: RBP: 000000000046f2a0 R08: 000014a5efef61a0 R09: 000000000000000c
    Nov 27 00:24:47 10.0.0.31 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000000000046f340
    Nov 27 00:24:47 10.0.0.31 kernel: R13: 000000000046f330 R14: 0000000000000000 R15: 000000000043504b
    Nov 27 00:24:47 10.0.0.31 kernel: ---[ end trace 87428ae110e59bcb ]---
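
    With a remote syslog that runs for weeks, it can help to pull out just the crash lines and see which kernel modules appear in the call traces. The following is a minimal sketch, not an Unraid tool: the function name and the marker list are assumptions. The first two markers come from the traces in this thread; "BUG:" and "Oops" are other common kernel markers added as a guess at what else might show up.

```python
import re
from collections import Counter

# First two markers appear in this thread; the last two are assumed extras.
CRASH_MARKERS = ("general protection fault", "WARNING: CPU:", "BUG:", "Oops")

# A call-trace frame that names a module looks like:
#   kernel: nf_nat_masquerade_ipv4+0x10a/0x130 [nf_nat]
#   kernel: ? ipt_do_table+0x4b6/0x5bb [ip_tables]
FRAME_RE = re.compile(r"kernel: (?:\? )?[\w.]+\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")


def summarize(lines):
    """Return (crash_lines, module_counts) for an iterable of syslog lines."""
    crashes = []
    modules = Counter()
    for line in lines:
        if any(marker in line for marker in CRASH_MARKERS):
            crashes.append(line.strip())
        match = FRAME_RE.search(line)
        if match:
            modules[match.group(1)] += 1
    return crashes, modules


# A few lines copied from the Nov 27 trace above.
sample = [
    "Nov 27 00:24:47 10.0.0.31 kernel: WARNING: CPU: 6 PID: 7646 at drivers/iommu/dma-iommu.c:471 __iommu_dma_unmap+0x7a/0xe8",
    "Nov 27 00:24:47 10.0.0.31 kernel: iommu_dma_free+0x1a/0x2b",
    "Nov 27 00:24:47 10.0.0.31 kernel: aq_ptp_ring_free+0x31/0x60 [atlantic]",
    "Nov 27 00:24:47 10.0.0.31 kernel: aq_nic_deinit+0x4e/0xa4 [atlantic]",
    "Nov 27 00:24:47 10.0.0.31 kernel: __dev_close_many+0xa1/0xb5",
]
crashes, modules = summarize(sample)
print(len(crashes), "crash line(s); modules in frames:", modules.most_common())
```

    To run it over a whole log, pass an open file object instead of the sample list. In the two traces posted here the frames point at different modules (nf_nat and ip_tables in the first, atlantic in the second), so a count like this is a starting point for comparison rather than a diagnosis.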

    Link to comment

    Hi @ich777, details as below. I'm starting to think it could maybe be an i440fx (currently in use) vs Q35 issue, but I'm just guessing really. I have 3 x Win 10 VMs: 1 x VNC only, and the other two have 2070 Supers and USB 3 passed through.

     

    Case: Corsair Obsidian 750d | MB: Asrock Trx40 Creator | CPU: AMD Threadripper 3970X | Cooler: Noctua NH-U14S | RAM: Corsair LPX 128GB DDR4 C16 | GPU: 2 x MSI  RTX 2070 Super's | Cache: Intel 660p Series 1TB M.2 X2 in 2TB Pool | Parity: Ironwolf 6TB | Array Storage: Ironwolf 6TB + Ironwolf 4TB | Unassigned Devices: Corsair 660p M.2 1TB + Kingston 480GB SSD + Skyhawk 2TB  | NIC: Intel 82576 Chip, Dual RJ45 Ports, 1Gbit PCI | PSU: Corsair RM1000i

    • Thanks 1
    Link to comment

    Do you also have hardware acceleration enabled for Docker?

     

    Also, is the power supply sufficient for your build? Threadripper and 2x 2070?

    If something spikes your system, it may draw too much power for a short amount of time (please note that components can briefly draw more power than advertised; for example, a single 3090 can crash a system with a 700W supply and a beefy CPU).

     

    You got that many components so I don't know where to start. :D

     

    I would try to run only one VM. If it crashes again, I would disable that VM and try it with the other. If it also crashes, disable both VMs and try that for a few days (only if that's possible).

     

    Link to comment

    It's possible, @ich777, that it's a bit close on the PSU, but I have the UPS monitor on in the Unraid GUI and I've not seen it above 560 watts as yet. Rounding the spec draw figures: the 3970X at max 300W plus 300W each for the two 2070s = 900W, so it could maybe do with more headroom after adding other bits into the mix. I might put the 2070s on a separate 750W PSU that I have spare and then see what happens, as 1200-1600 watt PSUs are a fair few quid more! 🤪
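
    The headroom arithmetic in the post above can be written out as a quick sketch. The per-component wattages are the poster's rounded worst-case figures, not measurements, and the 1000W rating is taken from the RM1000i in the hardware list; the function name is made up for illustration.

```python
def psu_headroom(psu_watts, component_watts):
    """Return (total_draw, remaining_margin) in watts."""
    total = sum(component_watts.values())
    return total, psu_watts - total


# Rounded worst-case figures quoted in the post, not measured values.
worst_case = {
    "Threadripper 3970X": 300,
    "RTX 2070 Super #1": 300,
    "RTX 2070 Super #2": 300,
}
psu_rating = 1000      # Corsair RM1000i
observed_peak = 560    # highest UPS monitor reading reported in the post

total, margin = psu_headroom(psu_rating, worst_case)
print(f"worst case: {total} W, margin on PSU: {margin} W")
print(f"observed peak: {observed_peak} W, margin on PSU: {psu_rating - observed_peak} W")
```

    This leaves out drives, fans, RAM and the motherboard, and a UPS reading averages over time, so short spikes of the kind mentioned earlier would not show up in the 560W figure.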

     

    It's a bit of a pain not to have the other VMs running, as they are all needed, so I'm sort of living with the short downtime when it happens while I try to track it down.

    Thanks for the suggestion.

    • Like 1
    Link to comment

    I am having a similar issue.  It started with beta 29; beta 25 was good.  Beta 25 was the Nvidia build with a Quadro P2000, using hardware acceleration in my Docker containers.  It never crashed.

     

    Now, on beta 29 and higher, I get crashes every few hours.  I've been running for about 3 days now and have had 12 crashes already.  The server is completely unresponsive, I can't access anything, and I need to do a power cycle.

     

    I'm using the Nvidia driver on beta 35, but on beta 29 I was using the Nvidia build.  There is absolutely nothing in my logs... The server is pretty much useless now; I have no idea when it dies and don't get any notifications or anything.  So the only way I know it's down is when someone messages me saying they can't access Plex.

     

    Attaching Diagnostics and Enhanced Syslogs

    Archive.zip

    Link to comment
    38 minutes ago, sittingmongoose said:

    I am having a similar issue.  It started with beta 29; beta 25 was good.  Beta 25 was the Nvidia build with a Quadro P2000, using hardware acceleration in my Docker containers.  It never crashed.

     

    Now, on beta 29 and higher, I get crashes every few hours.  I've been running for about 3 days now and have had 12 crashes already.  The server is completely unresponsive, I can't access anything, and I need to do a power cycle.

     

    I'm using the Nvidia driver on beta 35, but on beta 29 I was using the Nvidia build.  There is absolutely nothing in my logs... The server is pretty much useless now; I have no idea when it dies and don't get any notifications or anything.  So the only way I know it's down is when someone messages me saying they can't access Plex.

     

    Attaching Diagnostics and Enhanced Syslogs

    Archive.zip 354 kB · 0 downloads

    Oh crap, I don't get it that many times a day. Revert back to a beta that worked for you. Mine is random: it can be 3-5 days or like 12 days. The longest uptime on beta 35 is 17 days, I think.

     

    What is your server hardware?

    Link to comment
    59 minutes ago, trurl said:

    Have you seen this FAQ?

     

    Yep, given it a go! I've disabled C-states and set power supply idle to typical, but still no cigar 🚬🤪

    Link to comment
    4 hours ago, turnipisum said:

    Oh crap, I don't get it that many times a day. Revert back to a beta that worked for you. Mine is random: it can be 3-5 days or like 12 days. The longest uptime on beta 35 is 17 days, I think.

     

    What is your server hardware?

    ASRock Z390, 9900K, Supermicro 846, 2x LSI 9207 (one's an external model and the other is internal), P2000, 3x 1TB 970 Evo Plus SSDs, dual 1200 watt server PSUs. I'm using 1 Windows 10 VM but it's just idling; I haven't touched the VM at all since I upgraded.  I will try to disable the VM to see if that's it.

     

    I can't pin a rhyme or reason to it.  There are no errors in the logs and it seems random; it seems to stay alive through the night and die a few times during the day... And again, beta 25 is stable.  I am going to roll back, but I figured I would stay on this for a few days to let @limetech gather some data to fix it in the next release.

     

    Edit: Disabling the VMs didn't change anything.  It just crashed again.  Going to try disabling hardware accel, i.e. not using the P2000.

     

    Edit 2: Crashed again with no P2000... So it's not GPU related...

    Edited by sittingmongoose
    Link to comment
    15 hours ago, turnipisum said:

    Yep, given it a go! I've disabled C-states and set power supply idle to typical, but still no cigar 🚬🤪

    What about the RAM recommendations at that link?

    Link to comment

    Another update... it's crashing about every hour, randomly.  I tried to see whether Plex might be the cause, either through disk access or the GPU, and found nothing.  I tried about 10 streams (direct play, transcoded and hardware transcoded) and that didn't trigger it.  I tried them all at the same time, tried mixing disks, and tried only accessing movies on one SAS card versus the other: nothing.

    Link to comment
    5 hours ago, trurl said:

    What about the RAM recommendations at that link?

    I've got 8x 16GB sticks running at 2133MHz, so well within spec. I was running them at 2666MHz, which is still within the suggested max. Dropping the RAM speed was one of the first things I did when I started getting issues, as well as running a memory test.

    image.png.3b686d4a37596a22888260baa0a21c68.png

     

    Link to comment
    15 minutes ago, turnipisum said:

    I've got 8x 16GB sticks running at 2133MHz, so well within spec. I was running them at 2666MHz, which is still within the suggested max. Dropping the RAM speed was one of the first things I did when I started getting issues, as well as running a memory test.

    image.png.3b686d4a37596a22888260baa0a21c68.png

     

    I am also within spec: 4x 16GB of DDR4 at 3000MHz.  Memtest is not showing any errors.

    Link to comment




