relink

Everything posted by relink

  1. I think I might have got it, and shame on ASUS for this one. For whatever reason they decided to give the top PCIe slot two different modes that you have to change manually: “X16” and “PCIe RAID Mode”. What’s confusing is that the help text at the bottom of the screen describes this option as being for a single specific M.2 expander card; nowhere does it say it’s used for anything else. To make matters worse, not only is the change not reflected in the manual on the ASUS website, the entire section is missing from the manual, and none of this was documented in the changelogs for the BIOS updates. I can’t believe I just spent 4 hours troubleshooting every inch of my server, only to find out it was an undocumented change to the BIOS... Since I am far from the only person with an AM4 ASUS board, and I’m sure I’m not the only one who doesn’t put their transcoding GPU in slot one, the solution is below. In your BIOS go to Advanced > Onboard Devices Configuration, then change PCIEX16_1 to PCIe RAID Mode.
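After changing the slot mode, one quick way to confirm the card is back is to check from the Unraid terminal whether the controller shows up on the PCI bus. This is only a rough sketch: `hba_visible` is a made-up helper name, and it assumes the card's `lspci` description contains the word "Adaptec" (pass a different string as the first argument if yours reads differently).

```shell
#!/bin/bash
# Hypothetical helper (not part of Unraid or any Adaptec tool):
# reads device descriptions on stdin and succeeds if any line mentions
# the vendor string, compared case-insensitively. Defaults to "Adaptec".
hba_visible() {
    local vendor="${1:-Adaptec}"
    grep -qi -- "$vendor"
}

# Usage on the server (assumes lspci is installed):
#   lspci | hba_visible && echo "HBA detected" || echo "HBA missing"
# The aacraid module named in the kernel logs in this thread can be checked too:
#   lsmod | grep -q '^aacraid' && echo "aacraid loaded"
```

If the card is missing here as well as in the BIOS, the problem sits below the OS (slot mode, seating, or the card itself), not in Unraid.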
  2. Swapped it back and it’s gone again. Something changed with this BIOS update and I cannot figure out what it is. I’m currently reading the mobo manual and it’s not very helpful so far. I’m going to read through all the release notes for the BIOS versions. Something is different, I just need to find out what...
  3. Literally just finished trying that, and it does seem to work in another PCIe slot. But I had to swap it with my GPU, and the GPU is also still working. I keep my HBA in the top PCIe x16 slot so it’ll get all 8 lanes it needs, then my NVMe, GPU, and NIC all get x4, and it’s been like that for over a month now. The BIOS update has to have done SOMETHING that is causing it to not initialize the RAID card when it’s in slot one. I’m going to try swapping back and see what happens.
  4. Oh man, this seems to be getting worse and worse. I made a FreeDOS bootable USB and got a copy of the card’s latest BIOS and the flashing utility. I load it up and get the error “no known controllers detected. Quitting”. Has anyone ever had something similar happen? I find it very hard to believe my RAID controller just died out of the blue like that. Are there any BIOS settings on my mobo I should look for?
  5. I have finally found some more info, including the missing “readme”. However, when it comes to instructions for resetting, everything I have found says to just contact Adaptec... not very helpful. Of course, I didn’t do anything AFAIK that should have messed with the RAID card’s BIOS...
  6. I have been troubleshooting some crashes on my Unraid server for the last couple of weeks. After checking everything software-related I could think of, I moved on to the hardware. After re-seating my RAM I decided to update my motherboard’s BIOS. The BIOS on my board was from 2019, and there are updates all the way up to 2021. The update went fine. Except for some weird reason my Adaptec RAID card is now just... gone. It’s like it’s not even in the system. It should show in the BIOS, and it doesn’t. It also doesn’t show up anywhere in Unraid. I did notice that immediately following the mobo BIOS update I saw the Adaptec “Press CTRL + A” screen, which was odd because I had disabled it back when I first bought the card. The boot failed because I didn’t realize the update had reset my settings, and it was actually trying to boot from the HBA. After putting my BIOS settings back to boot from my Unraid USB, I tried again. This time I saw another Adaptec boot screen with the message “BIOS not installed”, and my system then booted into Unraid. There has been no sign of my Adaptec card since, other than that its LEDs do still blink when booting. I read the manual for the card and found a section on how to reset it, but it just tells you to refer to “the read me”... and I have no idea what readme it’s talking about. Mobo: ASUS ROG Strix B450-F Gaming. CPU: Ryzen 5 2600. HBA: Adaptec 71605 1G.
  7. OK, so I finally got a chance to rerun the memtest, and I did get a few errors on the first run. I shut down and reseated the DIMMs and it passed just fine. Maybe something got knocked loose at some point. I also updated my motherboard to the latest BIOS. Unfortunately, ever since updating the BIOS, Unraid doesn’t see any of the disks attached to my Adaptec HBA, and I am not seeing its BIOS on boot either. In my motherboard’s BIOS I made sure quick boot is disabled, show logo is disabled, show OPROM is enabled, and allow interrupt 19 is enabled... now the panic starts to set in... EDIT: That was a great use of almost 4 hours. The RAID card not showing up was due to a new BIOS option in the update that ASUS didn’t bother to document... Back to the original issue. RAM is good, BIOS is updated, and I’m going to continue monitoring and see what happens this week. I really hope I can get some stability back; things have been rough since 6.9.
  8. Well I can certainly try updating the BIOS, so I can at least rule it out. Whatever this is, I hope we can get it sorted soon because I cringe every time I have to hard shutdown my server.
  9. Ever since upgrading to the 6.9.x versions of Unraid I have been getting kernel panics at least once a week, sometimes more. I have no idea what to check next to get this fixed. So far I have checked the RAM and CPU; C-states are disabled in the BIOS, temps are fine, and the PSU is fine. I had heard it may have been the GPU Stats plugin, so I added the below script to run on first array start:
     #!/bin/bash
     nvidia-persistenced
     fuser -v /dev/nvidia*
     I have also removed any plugins I don't actively use, and made sure the ones I do use stay updated. The only thing I haven't tried yet is a BIOS update, which I would like to avoid unless I completely run out of options. Diagnostics are from after a reboot. The logs below are from my syslog server, newest entries first.
     Date Time Level Host Name Category Program Messages 4/21/21 12:18:43 Warning SERVERUS kern kernel CR2: 0000000000000034 4/21/21 12:18:43 Warning SERVERUS kern kernel Modules linked in: md4 sha512_ssse3 sha512_generic cmac cifs libarc4 nfsv3 nfs nfs_ssc nvidia_uvm(PO) xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap macvlan veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter ext4 mbcache jbd2 xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper drm backlight agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding e1000e igb i2c_algo_bit edac_mce_amd kvm_amd kvm crct10dif_pclmul mxm_wmi wmi_bmof crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd nvme glue_helper input_leds nvme_core rapl ahci aacraid ccp i2c_piix4 led_class k10temp wmi i2c_core libahci button acpi_cpufreq [last unloaded: e1000e] 4/21/21 12:18:43 Warning SERVERUS kern kernel R13: 000014dc4a9acfe0 R14: 0000000000000010 R15: 
000014dc4a7814e8 4/21/21 12:18:43 Warning SERVERUS kern kernel R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002a 4/21/21 12:18:43 Warning SERVERUS kern kernel RBP: 000014dc490e0340 R08: 0000000000000000 R09: 0000000000000000 4/21/21 12:18:43 Warning SERVERUS kern kernel RDX: 0000000000000010 RSI: 000014dc495705e0 RDI: 000000000000000e 4/21/21 12:18:43 Warning SERVERUS kern kernel RAX: ffffffffffffffda RBX: 000014dc49574b38 RCX: 000014dc4e5a4352 4/21/21 12:18:43 Warning SERVERUS kern kernel RSP: 002b:000014dc49570538 EFLAGS: 00000246 ORIG_RAX: 000000000000002a 4/21/21 12:18:43 Warning SERVERUS kern kernel Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 8a d2 ff ff 41 54 b8 02 00 00 00 49 89 f4 be 00 88 08 00 55 4/21/21 12:18:43 Warning SERVERUS kern kernel RIP: 0033:0x14dc4e5a4352 4/21/21 12:18:43 Warning SERVERUS kern kernel entry_SYSCALL_64_after_hwframe+0x44/0xa9 4/21/21 12:18:43 Warning SERVERUS kern kernel do_syscall_64+0x5d/0x6a 4/21/21 12:18:43 Warning SERVERUS kern kernel __x64_sys_connect+0x11/0x14 4/21/21 12:18:43 Warning SERVERUS kern kernel __sys_connect+0x62/0x9d 4/21/21 12:18:43 Warning SERVERUS kern kernel inet_stream_connect+0x34/0x49 4/21/21 12:18:43 Warning SERVERUS kern kernel __inet_stream_connect+0xd3/0x2b6 4/21/21 12:18:43 Warning SERVERUS kern kernel tcp_v4_connect+0x3fc/0x455 4/21/21 12:18:43 Warning SERVERUS kern kernel tcp_connect+0x76d/0x7f4 4/21/21 12:18:43 Warning SERVERUS kern kernel __tcp_transmit_skb+0x845/0x8ba 4/21/21 12:18:43 Warning SERVERUS kern kernel __ip_queue_xmit+0x2a3/0x2df 4/21/21 12:18:43 Warning SERVERUS kern kernel ? 
ipv4_mtu+0x3d/0x64 4/21/21 12:18:43 Warning SERVERUS kern kernel ip_finish_output2+0x2ec/0x31f 4/21/21 12:18:43 Warning SERVERUS kern kernel __local_bh_enable_ip+0x3b/0x43 4/21/21 12:18:43 Warning SERVERUS kern kernel do_softirq+0x3a/0x44 4/21/21 12:18:43 Warning SERVERUS kern kernel do_softirq_own_stack+0x2c/0x39 4/21/21 12:18:43 Warning SERVERUS kern kernel </IRQ> 4/21/21 12:18:43 Warning SERVERUS kern kernel asm_call_irq_on_stack+0xf/0x20 4/21/21 12:18:43 Warning SERVERUS kern kernel __do_softirq+0xc4/0x1c2 4/21/21 12:18:43 Warning SERVERUS kern kernel net_rx_action+0xf4/0x29d 4/21/21 12:18:43 Warning SERVERUS kern kernel process_backlog+0xa3/0x13b 4/21/21 12:18:43 Warning SERVERUS kern kernel __netif_receive_skb_one_core+0x3d/0x95 4/21/21 12:18:43 Warning SERVERUS kern kernel __netif_receive_skb_core+0x335/0x4e7 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_pass_frame_up+0xda/0xda 4/21/21 12:18:43 Warning SERVERUS kern kernel br_handle_frame+0x25e/0x2a6 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel br_nf_pre_routing+0x229/0x239 [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel NF_HOOK+0xd7/0xf7 [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_handle_frame_finish+0x351/0x351 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? nf_nat_ipv4_pre_routing+0x1e/0x4a [nf_nat] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_handle_frame_finish+0x351/0x351 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_pass_frame_up+0xda/0xda 4/21/21 12:18:43 Warning SERVERUS kern kernel br_nf_pre_routing_finish+0x23d/0x264 [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? 
br_pass_frame_up+0xda/0xda 4/21/21 12:18:43 Warning SERVERUS kern kernel br_nf_hook_thresh+0xa3/0xc3 [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_pass_frame_up+0xda/0xda 4/21/21 12:18:43 Warning SERVERUS kern kernel ? ipt_do_table+0x570/0x5c0 [ip_tables] 4/21/21 12:18:43 Warning SERVERUS kern kernel br_handle_frame_finish+0x30d/0x351 4/21/21 12:18:43 Warning SERVERUS kern kernel netif_receive_skb+0x79/0xa1 4/21/21 12:18:43 Warning SERVERUS kern kernel __netif_receive_skb_one_core+0x74/0x95 4/21/21 12:18:43 Warning SERVERUS kern kernel ip_rcv+0x41/0x61 4/21/21 12:18:43 Warning SERVERUS kern kernel ? l3mdev_l3_rcv.constprop.0+0x50/0x50 4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook.constprop.0+0xb1/0xd8 4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook_slow+0x39/0x8e 4/21/21 12:18:43 Warning SERVERUS kern kernel ip_sabotage_in+0x43/0x4d [br_netfilter] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? ip_check_defrag+0x18f/0x18f 4/21/21 12:18:43 Warning SERVERUS kern kernel ip_forward+0x3f1/0x420 4/21/21 12:18:43 Warning SERVERUS kern kernel ? __ip_finish_output+0x146/0x146 4/21/21 12:18:43 Warning SERVERUS kern kernel ip_output+0x7d/0x8a 4/21/21 12:18:43 Warning SERVERUS kern kernel ? __ip_finish_output+0x146/0x146 4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook+0xab/0xd3 4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook_slow+0x39/0x8e 4/21/21 12:18:43 Warning SERVERUS kern kernel nf_nat_ipv4_out+0xf/0x88 [nf_nat] 4/21/21 12:18:43 Warning SERVERUS kern kernel nf_nat_inet_fn+0xe9/0x183 [nf_nat] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? fib_validate_source+0xb0/0xda 4/21/21 12:18:43 Warning SERVERUS kern kernel ? ipt_do_table+0x570/0x5c0 [ip_tables] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? __dev_queue_xmit+0x4d9/0x501 4/21/21 12:18:43 Warning SERVERUS kern kernel ipt_do_table+0x51a/0x5c0 [ip_tables] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? 
netdev_start_xmit+0x1b/0x38 4/21/21 12:18:43 Warning SERVERUS kern kernel masquerade_tg+0x44/0x5e [xt_MASQUERADE] 4/21/21 12:18:43 Warning SERVERUS kern kernel nf_nat_masquerade_ipv4+0x10b/0x131 [nf_nat] 4/21/21 12:18:43 Warning SERVERUS kern kernel ? krealloc+0x26/0x7a 4/21/21 12:18:43 Warning SERVERUS kern kernel ? __ksize+0x15/0x64 4/21/21 12:18:43 Warning SERVERUS kern kernel ? sch_direct_xmit+0x16/0x1de 4/21/21 12:18:43 Warning SERVERUS kern kernel <IRQ> 4/21/21 12:18:43 Warning SERVERUS kern kernel Call Trace: 4/21/21 12:18:43 Warning SERVERUS kern kernel CR2: 0000000000000034 CR3: 0000000216e82000 CR4: 00000000003506f0 4/21/21 12:18:43 Warning SERVERUS kern kernel CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 4/21/21 12:18:43 Warning SERVERUS kern kernel FS: 000014dc49574b38(0000) GS:ffff8887fe800000(0000) knlGS:0000000000000000 4/21/21 12:18:43 Warning SERVERUS kern kernel R13: ffffc90000003720 R14: ffffc900000037dc R15: ffffffff8210b440 4/21/21 12:18:43 Warning SERVERUS kern kernel R10: ffff88821d56c388 R11: ffffffff815cbe4b R12: 0000000000000000 4/21/21 12:18:43 Warning SERVERUS kern kernel RBP: ffffc900000037c8 R08: 00000000d638ba09 R09: ffff88818fa62b80 4/21/21 12:18:43 Warning SERVERUS kern kernel RDX: ffffffffffffffee RSI: 000000007418f38d RDI: ffffc90000003720 4/21/21 12:18:43 Warning SERVERUS kern kernel RAX: ffff88819ec50206 RBX: ffff88816107f0c0 RCX: 0000000000000000 4/21/21 12:18:43 Warning SERVERUS kern kernel RSP: 0018:ffffc90000003700 EFLAGS: 00010286 4/21/21 12:18:43 Warning SERVERUS kern kernel Code: ff 48 8b 15 ef 6a 00 00 89 c0 48 8d 04 c2 48 8b 10 48 85 d2 74 80 48 81 ea 98 00 00 00 48 85 d2 0f 84 70 ff ff ff 8a 44 24 46 <38> 42 46 74 09 48 8b 92 98 00 00 00 eb d9 48 8b 4a 20 48 8b 42 28 4/21/21 12:18:43 Warning SERVERUS kern kernel RIP: 0010:nf_nat_setup_info+0x129/0x6aa [nf_nat] 4/21/21 12:18:43 Warning SERVERUS kern kernel Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 2301 04/19/2019 4/21/21 
12:18:43 Warning SERVERUS kern kernel CPU: 0 PID: 13620 Comm: python3 Tainted: P W O 5.10.28-Unraid #1 4/21/21 12:18:43 Warning SERVERUS kern kernel Oops: 0000 [#1] SMP NOPTI 4/21/21 12:18:43 Information SERVERUS kern kernel PGD 0 P4D 0 4/21/21 12:18:43 Alert SERVERUS kern kernel #PF: error_code(0x0000) - not-present page 4/21/21 12:18:43 Alert SERVERUS kern kernel #PF: supervisor read access in kernel mode 4/21/21 12:18:43 Alert SERVERUS kern kernel BUG: kernel NULL pointer dereference, address: 0000000000000034 4/21/21 12:00:44 Information SERVERUS kern kernel eth0: renamed from vethbb3a427 4/21/21 12:00:44 Information SERVERUS kern kernel veth37fa1d8: renamed from eth0 4/21/21 12:00:44 Information SERVERUS kern kernel oom_reaper: reaped process 3059 (xteve), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB 4/21/21 12:00:44 Error SERVERUS kern kernel Memory cgroup out of memory: Killed process 3059 (xteve) total-vm:2784340kB, anon-rss:2087212kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:4176kB oom_score_adj:0 4/21/21 12:00:44 Information SERVERUS kern kernel oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,mems_allowed=0,oom_memcg=/docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,task_memcg=/docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,task=xteve,pid=3059,uid=0 4/21/21 12:00:44 Information SERVERUS kern kernel [ 3059] 0 3059 696085 521803 4276224 0 0 xteve 4/21/21 12:00:44 Information SERVERUS kern kernel [ 3054] 0 3054 396 11 36864 0 0 crond 4/21/21 12:00:44 Information SERVERUS kern kernel [ 2968] 0 2968 554 53 40960 0 0 entrypoint.sh 4/21/21 12:00:44 Information SERVERUS kern kernel [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name 4/21/21 12:00:44 Information SERVERUS kern kernel Tasks state (memory values in pages): 4/21/21 12:00:44 Information SERVERUS kern kernel thp_collapse_alloc 0 4/21/21 12:00:44 
Information SERVERUS kern kernel thp_fault_alloc 66 4/21/21 12:00:44 Information SERVERUS kern kernel pglazyfreed 0 4/21/21 12:00:44 Information SERVERUS kern kernel pglazyfree 0 4/21/21 12:00:44 Information SERVERUS kern kernel pgdeactivate 104645 4/21/21 12:00:44 Information SERVERUS kern kernel pgactivate 732171 4/21/21 12:00:44 Information SERVERUS kern kernel pgsteal 272820 4/21/21 12:00:44 Information SERVERUS kern kernel pgscan 20914208 4/21/21 12:00:44 Information SERVERUS kern kernel pgrefill 107383 4/21/21 12:00:44 Information SERVERUS kern kernel pgmajfault 231 4/21/21 12:00:44 Information SERVERUS kern kernel pgfault 764247 4/21/21 12:00:44 Information SERVERUS kern kernel workingset_nodereclaim 0 4/21/21 12:00:44 Information SERVERUS kern kernel workingset_restore_file 2013 4/21/21 12:00:44 Information SERVERUS kern kernel workingset_restore_anon 0 4/21/21 12:00:44 Information SERVERUS kern kernel workingset_activate_file 3927 4/21/21 12:00:44 Information SERVERUS kern kernel workingset_activate_anon 0 4/21/21 12:00:44 Information SERVERUS kern kernel workingset_refault_file 212124 4/21/21 12:00:44 Information SERVERUS kern kernel workingset_refault_anon 0 4/21/21 12:00:44 Information SERVERUS kern kernel slab 297176 4/21/21 12:00:44 Information SERVERUS kern kernel slab_unreclaimable 1336 4/21/21 12:00:44 Information SERVERUS kern kernel slab_reclaimable 295840 4/21/21 12:00:44 Information SERVERUS kern kernel unevictable 0 4/21/21 12:00:44 Information SERVERUS kern kernel active_file 77824 4/21/21 12:00:44 Information SERVERUS kern kernel inactive_file 2076672 4/21/21 12:00:44 Information SERVERUS kern kernel active_anon 270336 4/21/21 12:00:44 Information SERVERUS kern kernel inactive_anon 2139475968 4/21/21 12:00:44 Information SERVERUS kern kernel anon_thp 69206016 4/21/21 12:00:44 Information SERVERUS kern kernel file_writeback 675840 4/21/21 12:00:44 Information SERVERUS kern kernel file_dirty 0 4/21/21 12:00:44 Information SERVERUS kern kernel 
file_mapped 270336 4/21/21 12:00:44 Information SERVERUS kern kernel shmem 0 4/21/21 12:00:44 Information SERVERUS kern kernel sock 0 4/21/21 12:00:44 Information SERVERUS kern kernel percpu 0 4/21/21 12:00:44 Information SERVERUS kern kernel kernel_stack 147456 4/21/21 12:00:44 Information SERVERUS kern kernel file 3108864 4/21/21 12:00:44 Information SERVERUS kern kernel anon 2138357760 4/21/21 12:00:44 Information SERVERUS kern kernel Memory cgroup stats for /docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89: 4/21/21 12:00:44 Information SERVERUS kern kernel kmem: usage 5288kB, limit 9007199254740988kB, failcnt 0 4/21/21 12:00:44 Information SERVERUS kern kernel memory+swap: usage 2097156kB, limit 4194304kB, failcnt 0 4/21/21 12:00:44 Information SERVERUS kern kernel memory: usage 2097156kB, limit 2097152kB, failcnt 76745 4/21/21 12:00:44 Warning SERVERUS kern kernel R13: 0000000000bcd4e0 R14: 0000000000000000 R15: 000000000046c560 4/21/21 12:00:44 Warning SERVERUS kern kernel R10: 000014f1d1566eb0 R11: ed386059261e7389 R12: 0000000000000002 4/21/21 12:00:44 Warning SERVERUS kern kernel RBP: 000000c00079b888 R08: 0000000000000009 R09: ffffffffffffffff 4/21/21 12:00:44 Warning SERVERUS kern kernel RDX: 00000000009ea5f4 RSI: 000000000041a345 RDI: 000000c00079bac0 4/21/21 12:00:44 Warning SERVERUS kern kernel RAX: 0000000000bcd4e0 RBX: 0000000000000003 RCX: 000000c000288f00 4/21/21 12:00:44 Warning SERVERUS kern kernel RSP: 002b:000000c00079b878 EFLAGS: 00010216 4/21/21 12:00:44 Warning SERVERUS kern kernel Code: Unable to access opcode bytes at RIP 0x4548ff. 4/21/21 12:00:44 Warning SERVERUS kern kernel RIP: 0033:0x454929 4/21/21 12:00:44 Warning SERVERUS kern kernel asm_exc_page_fault+0x1e/0x30 4/21/21 12:00:44 Warning SERVERUS kern kernel ? asm_exc_page_fault+0x8/0x30 4/21/21 12:00:44 Warning SERVERUS kern kernel exc_page_fault+0x259/0x373 4/21/21 12:00:44 Warning SERVERUS kern kernel ? 
__raw_spin_unlock_irq+0x5/0xd 4/21/21 12:00:44 Warning SERVERUS kern kernel handle_mm_fault+0xb83/0xec3 4/21/21 12:00:44 Warning SERVERUS kern kernel __do_fault+0x49/0x64 4/21/21 12:00:44 Warning SERVERUS kern kernel ? filemap_map_pages+0x1ec/0x217 4/21/21 12:00:44 Warning SERVERUS kern kernel ? xas_find+0xa9/0x121 4/21/21 12:00:44 Warning SERVERUS kern kernel filemap_fault+0x2a4/0x478 4/21/21 12:00:44 Warning SERVERUS kern kernel pagecache_get_page+0x108/0x13e 4/21/21 12:00:44 Warning SERVERUS kern kernel add_to_page_cache_lru+0x42/0xa7 4/21/21 12:00:44 Warning SERVERUS kern kernel ? lruvec_page_state+0x2f/0x2f 4/21/21 12:00:44 Warning SERVERUS kern kernel __add_to_page_cache_locked+0xab/0x274 4/21/21 12:00:44 Warning SERVERUS kern kernel mem_cgroup_charge+0xfe/0x17e 4/21/21 12:00:44 Warning SERVERUS kern kernel ? mem_cgroup_charge+0x170/0x17e 4/21/21 12:00:44 Warning SERVERUS kern kernel try_charge+0x3ec/0x501 4/21/21 12:00:44 Warning SERVERUS kern kernel mem_cgroup_out_of_memory+0x79/0xae 4/21/21 12:00:44 Warning SERVERUS kern kernel out_of_memory+0x3dd/0x410 4/21/21 12:00:44 Warning SERVERUS kern kernel oom_kill_process+0x7b/0xf6 4/21/21 12:00:44 Warning SERVERUS kern kernel dump_header+0x45/0x1e8 4/21/21 12:00:44 Warning SERVERUS kern kernel dump_stack+0x6b/0x83 4/21/21 12:00:44 Warning SERVERUS kern kernel Call Trace: 4/21/21 12:00:44 Warning SERVERUS kern kernel Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 2301 04/19/2019 4/21/21 12:00:44 Warning SERVERUS kern kernel CPU: 4 PID: 3653 Comm: xteve Tainted: P W O 5.10.28-Unraid #1 4/21/21 12:00:44 Warning SERVERUS kern kernel xteve invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0 4/21/21 9:44:28 Notice SERVERUS user flash_backup adding task: php /usr/local/emhttp/plugins/dynamix.unraid.net/include/UpdateFlashBackup.php update 4/21/21 9:43:20 Alert SERVERUS local7 nginx 2021/04/21 09:43:20 [alert] 29892#29892: worker process 7356 exited 
on signal 6 serverus-diagnostics-20210421-1308.zip
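A dump like the one above is easier to read once it is cut down to the lines that mark where things went wrong. The helpers below are a rough sketch, not an official tool: the function names are made up, and they assume the syslog export has been split back into one entry per line. In this log they would surface the `BUG: kernel NULL pointer dereference` entry, the kernel-mode `RIP: 0010:nf_nat_setup_info` line, and the earlier `xteve invoked oom-killer` entry.

```shell
#!/bin/bash
# Hypothetical helper: read syslog text on stdin (one entry per line) and
# print only entries that usually mark a kernel crash or an OOM kill.
# "RIP: 0010:" is the faulting instruction when the CPU was in kernel mode;
# user-space addresses show up as "RIP: 0033:" and are skipped here.
crash_markers() {
    grep -E 'BUG: |Oops: |invoked oom-killer|Kernel panic|RIP: 0010:'
}

# Hypothetical helper: pull the BIOS version and date out of the
# "Hardware name:" entries the kernel prints with every trace.
bios_from_log() {
    sed -n 's/.*Hardware name:.*BIOS \([^ ]*\) \([0-9/]*\).*/\1 \2/p' | sort -u
}

# Usage (syslog.txt is a placeholder file name):
#   crash_markers < syslog.txt
#   bios_from_log < syslog.txt
```

Run against the entries above, `bios_from_log` would report `2301 04/19/2019`, which matches the 2019 BIOS the board was still running at the time.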
  10. So far removing the GPU Stats plugin seems to have been the fix. EDIT: After some reading in the Nvidia driver thread, it would appear this issue can be resolved by running the below script on array start. I’m going to try it out and see what happens.
      #!/bin/bash
      nvidia-persistenced
      fuser -v /dev/nvidia*
  11. Thank you for this, I believe you may have helped me solve another issue I was having.
  12. I’ll try running another RAM test, but I won’t be able to take my rig down for that long until Saturday. Is there anything else I could try looking into in the meantime? Also, I’ve had no issues since removing the GPU Stats plugin, but I am still within the window for another crash to happen, so I’m not calling it solved just yet.
  13. I can certainly try it. Yup, it was fine just a few months ago. I have 4 modules, and everything in this system is stock speeds. I did read in another topic that it could possibly be the GPU Stats plugin, so I have also uninstalled that for now. I did not have that plugin prior to 6.9
  14. I have had constant crashes (every 12-48 hours) ever since updating to 6.9.0, and I’m now at 6.9.2. I have not been able to figure out why. Every time, I have no choice but to do a hard reboot; I can’t even SSH in. The crashes seem to be random as far as I can tell. I attached diagnostics, but they are from after a reboot. However, I do run a syslog server on my Synology, and I have attached what I believe are the logs from when it crashed (~1:30 AM) to when I did a hard reboot (~6:15 AM). serverus-diagnostics-20210414-0639.zip All_2021-4-14-6_38_29.csv
  15. Man life can get hectic. I finally got around to checking and the only option for C-states in my bios is a single option called “Global C-states control” and it was set to auto, I changed it to disabled. I completely forgot about that when I reset my BIOS about a month ago. Hopefully that’s all it was.
  16. I should already have C-states disabled; I read the threads on Ryzen before buying the board I have. Plus I have had this board and CPU for around 2 years now and have never had this issue before. I can double check, but I won’t be able to get physical access to my rig until later this afternoon.
  17. I went back into the logs, this seems to be when the crash started, maybe a little before, all the way up to me having to hard reset. 4/4/21 17:46:06 Warning SERVERUS kern kernel CR2: 0000000000000010 4/4/21 17:46:06 Warning SERVERUS kern kernel Modules linked in: md4 sha512_ssse3 sha512_generic cmac cifs libarc4 nfsv3 nfs nfs_ssc nvidia_uvm(PO) xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap macvlan veth xt_nat xt_MASQUERADE iptable_nat nf_nat ext4 mbcache jbd2 xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper drm backlight agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) ip6table_filter ip6_tables iptable_filter ip_tables bonding e1000e igb i2c_algo_bit edac_mce_amd kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd nvme glue_helper mxm_wmi wmi_bmof ahci nvme_core rapl aacraid i2c_piix4 k10temp input_leds wmi i2c_core led_class ccp libahci button acpi_cpufreq [last unloaded: e1000e] 4/4/21 17:46:06 Warning SERVERUS kern kernel secondary_startup_64_no_verify+0xb0/0xbb 4/4/21 17:46:06 Warning SERVERUS kern kernel cpu_startup_entry+0x18/0x1a 4/4/21 17:46:06 Warning SERVERUS kern kernel do_idle+0x1a6/0x214 4/4/21 17:46:06 Warning SERVERUS kern kernel cpuidle_enter+0x25/0x31 4/4/21 17:46:06 Warning SERVERUS kern kernel cpuidle_enter_state+0xba/0x1c4 4/4/21 17:46:06 Warning SERVERUS kern kernel acpi_idle_enter+0x9a/0xa9 4/4/21 17:46:06 Warning SERVERUS kern kernel acpi_idle_do_entry+0x25/0x37 4/4/21 17:46:06 Warning SERVERUS kern kernel arch_safe_halt+0x5/0x8 4/4/21 17:46:06 Warning SERVERUS kern kernel ? 
native_safe_halt+0x5/0x8 4/4/21 17:46:06 Warning SERVERUS kern kernel R13: ffff888100b0cc64 R14: ffffffff820cada8 R15: 0000000000000000 4/4/21 17:46:06 Warning SERVERUS kern kernel R10: 00000000000003e4 R11: 071c71c71c71c71c R12: 0000000000000001 4/4/21 17:46:06 Warning SERVERUS kern kernel RBP: ffff888104dd1c00 R08: ffff888100b0cc00 R09: 00000000000003e4 4/4/21 17:46:06 Warning SERVERUS kern kernel RDX: ffff8887feac0000 RSI: ffffffff820cad40 RDI: ffff888100b0cc64 4/4/21 17:46:06 Warning SERVERUS kern kernel RAX: 0000000000004000 RBX: 0000000000000001 RCX: 000000000000001f 4/4/21 17:46:06 Warning SERVERUS kern kernel RSP: 0018:ffffc90000177e78 EFLAGS: 00000246 4/4/21 17:46:06 Warning SERVERUS kern kernel Code: 60 02 df f0 83 44 24 fc 00 48 8b 00 a8 08 74 0b 65 81 25 b1 26 92 7e ff ff ff 7f c3 e8 4e ca 94 ff f4 c3 e8 47 ca 94 ff fb f4 <c3> 53 e8 a6 6b 9a ff e8 ba f2 97 ff 65 48 8b 1c 25 c0 7b 01 00 48 4/4/21 17:46:06 Warning SERVERUS kern kernel RIP: 0010:native_safe_halt+0x7/0x8 4/4/21 17:46:06 Warning SERVERUS kern kernel asm_common_interrupt+0x1e/0x40 4/4/21 17:46:06 Warning SERVERUS kern kernel common_interrupt+0xa5/0x12e 4/4/21 17:46:06 Warning SERVERUS kern kernel </IRQ> 4/4/21 17:46:06 Warning SERVERUS kern kernel asm_call_irq_on_stack+0xf/0x20 4/4/21 17:46:06 Warning SERVERUS kern kernel handle_edge_irq+0xb0/0xd0 4/4/21 17:46:06 Warning SERVERUS kern kernel handle_irq_event+0x34/0x51 4/4/21 17:46:06 Warning SERVERUS kern kernel handle_irq_event_percpu+0x2c/0x6f 4/4/21 17:46:06 Warning SERVERUS kern kernel __handle_irq_event_percpu+0x36/0xcb 4/4/21 17:46:06 Warning SERVERUS kern kernel aac_src_intr_message+0x321/0x35d [aacraid] 4/4/21 17:46:06 Warning SERVERUS kern kernel aac_intr_normal+0x2dc/0x2ff [aacraid] 4/4/21 17:46:06 Warning SERVERUS kern kernel aac_srb_callback+0x67/0x30d [aacraid] 4/4/21 17:46:06 Warning SERVERUS kern kernel <IRQ> 4/4/21 17:46:06 Warning SERVERUS kern kernel Call Trace: 4/4/21 17:46:06 Warning SERVERUS kern kernel CR2: 
0000000000000010 CR3: 000000022f1ac000 CR4: 00000000003506e0 4/4/21 17:46:06 Warning SERVERUS kern kernel CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 4/4/21 17:46:06 Warning SERVERUS kern kernel FS: 0000000000000000(0000) GS:ffff8887feac0000(0000) knlGS:0000000000000000 4/4/21 17:46:06 Warning SERVERUS kern kernel R13: ffff8881016340b8 R14: ffff888103380a9c R15: 0000000000000000 4/4/21 17:46:06 Warning SERVERUS kern kernel R10: 0000000000000000 R11: 0000000000000000 R12: 000000000000001f 4/4/21 17:46:06 Warning SERVERUS kern kernel RBP: ffff88813703ca10 R08: 0000000000000000 R09: 0000000000000000 4/4/21 17:46:06 Warning SERVERUS kern kernel RDX: 0000000000000020 RSI: 0000000000000000 RDI: 0000000000000000 4/4/21 17:46:06 Warning SERVERUS kern kernel RAX: ffffffff81436445 RBX: 0000000000000000 RCX: 0000000000000002 4/4/21 17:46:06 Warning SERVERUS kern kernel RSP: 0018:ffffc90000400e88 EFLAGS: 00010012 4/4/21 17:46:06 Warning SERVERUS kern kernel Code: 00 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e c3 41 57 45 31 ff 41 56 41 55 49 89 fd 48 89 f7 41 54 41 89 d4 55 41 ff cc 53 48 89 f3 <4c> 8b 76 10 e8 42 8e eb ff 48 89 c5 45 39 fc 7e 06 83 7d 18 00 75 4/4/21 17:46:06 Warning SERVERUS kern kernel RIP: 0010:iommu_dma_unmap_sg+0x1c/0x68 4/4/21 17:46:06 Warning SERVERUS kern kernel Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 2301 04/19/2019 4/4/21 17:46:06 Warning SERVERUS kern kernel CPU: 11 PID: 0 Comm: swapper/11 Tainted: P W O 5.10.21-Unraid #1 4/4/21 17:46:06 Warning SERVERUS kern kernel Oops: 0000 [#1] SMP NOPTI 4/4/21 17:46:06 Information SERVERUS kern kernel PGD 22b484067 P4D 22b484067 PUD 0 4/4/21 17:46:06 Alert SERVERUS kern kernel #PF: error_code(0x0000) - not-present page 4/4/21 17:46:06 Alert SERVERUS kern kernel #PF: supervisor read access in kernel mode 4/4/21 17:46:06 Alert SERVERUS kern kernel BUG: kernel NULL pointer dereference, address: 0000000000000010 4/4/21 17:46:05 Error SERVERUS kern kernel aacraid: 
Outstanding commands on (2,1,13,0): 4/4/21 17:46:05 Error SERVERUS kern kernel aacraid: Host adapter abort request. 4/4/21 17:45:37 Error SERVERUS kern kernel aacraid: Outstanding commands on (2,1,13,0): 4/4/21 17:45:37 Error SERVERUS kern kernel aacraid: Host adapter abort request.
  18. Actually, I have a syslog server on my Synology that records everything. What should I be looking for in the logs? I can go pull it and post it here. I remember around what time this all happened.
  19. Well, that solved that issue. Apparently everything was up to date except that one plugin. I updated it, and now that error is gone. But I’m still concerned about the kernel panic; that’s never happened to me before. I’m also not sure what the write cache being disabled on parity 2 is all about.
  20. So I’m not sure what’s going on, but completely out of the blue my Unraid server locked up. I wasn’t able to get diagnostics as I couldn’t even SSH in. I connected a monitor and saw that it had a kernel panic. The last line read: “not syncing: fatal exception in interrupt”. The only other indication I saw was a reference to “iommu”, but I don’t currently have anything passed through to a VM; in fact I don’t even have any VMs running. After unfortunately doing a hard reboot, a parity check started as expected, and everything seems to be OK except the following at the very bottom of the Unraid GUI: Parse error: syntax error, unexpected '{', expecting '(' in /usr/local/emhttp/plugins/parity.check.tuning/parity.check.tuning.php on line 1396. There is also a message from Fix Common Problems that says “write cache is disabled on parity 2”. There have been no major hardware changes recently other than adding some new drives and a new cache pool. I’m not seeing any errors coming from any disks, and everything on both of my cache pools is running fine. The new drives were added about 2 weeks ago and I have had no issues between then and now. The last major change was changing my HBA, but that was nearly a year ago at this point. Unraid and all plugins are fully updated.
  21. I am having the exact same issue. I only have tabs open on one computer at the moment. I am trying to get the diagnostics but I can’t get it to download right now. EDIT: Added diagnostics. serverus-diagnostics-20210331-0914.zip
  22. So, as a temporary workaround, would there be an issue if I copied files over to the cache from the array? As I understand it, everything would function like normal, except that the duplicated files won’t get moved.
  23. Well, it was a total success! After quadruple the amount of time I thought it was going to take, my new cache drive is up and running. I think next time I’ll take the advice above and just restore from my backup.
  24. Good thing I ended up not doing that at the last second. Instead I decided to add my new NVMe drive as a second cache, use MC to move everything from the old one to the new one, and reassign the cache for each share. However, this is still ungodly slow. Midnight Commander “claims” it’s transferring at 15MB/s... it’s not, there’s no way. It’s maybe 15Mb/s, if that. I calculated the transfer time beforehand and this should have been done hours ago, yet it’s still going.
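The gap between 15MB/s and 15Mb/s is a factor of eight, which is enough to turn an afternoon job into a multi-day one. The sketch below just does that arithmetic. `eta_seconds` is a made-up helper, the 500 GB figure in the usage note is a hypothetical size (the post does not say how much data was moved), and it assumes decimal units (1 GB = 1000 MB).

```shell
#!/bin/bash
# Hypothetical helper: estimated transfer time in whole seconds.
#   $1 = data size in gigabytes (decimal, 1 GB = 1000 MB)
#   $2 = transfer rate as a whole number
#   $3 = unit of the rate: "MB" (megabytes/s, default) or "Mb" (megabits/s)
eta_seconds() {
    local size_gb="$1" rate="$2" unit="${3:-MB}"
    local size_mbit=$(( size_gb * 1000 * 8 ))   # convert the size to megabits
    local rate_mbit
    case "$unit" in
        MB) rate_mbit=$(( rate * 8 )) ;;        # 1 byte = 8 bits
        Mb) rate_mbit=$rate ;;
        *)  echo "unknown unit: $unit" >&2; return 1 ;;
    esac
    echo $(( size_mbit / rate_mbit ))           # integer division, rounds down
}

# Usage:
#   eta_seconds 500 15 MB
#   eta_seconds 500 15 Mb
```

For the hypothetical 500 GB, that works out to 33333 seconds (a bit over 9 hours) at 15MB/s, versus 266666 seconds (roughly 3 days) at 15Mb/s.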