Harblar

  1. 6 hours in and 21% through the rebuild, and EVERYTHING is running smoother and cooler. CPU is sitting at 25C and 1-2% utilization (was 35-40C and 7-10% yesterday). RAM usage is sitting tight at 3GB used/6GB cached. With twice the memory installed yesterday, it was sitting at nearly double that. There also hasn't been a single hiccup in the syslog yet. One other thing I disabled in the BIOS was C-states on the CPU. The setting was hidden in an obscure overclocking menu, or I would have had it disabled from the first night. Not saying it was causing issues, but it's pretty much not needed for a server that runs 24/7, and I've seen where some people with AMD-based systems have had freezing issues with it enabled. Granted, I have an Intel setup, but better to get rid of all the useless crap that could potentially be getting in the way. 🤘
  2. Gotcha. Didn't mean to come across as sarcastic. I'd genuinely take any recommendations you might have for combos that have proven good. I did just check my memory against the MSI QVL for this board, and this Corsair module is only technically qualified for 1 or 2 sticks (not 4). That said, I'm still nearly certain that A1 is faulty, since it's the only slot that gave any errors when I did the single-stick, slot-by-slot tests. I was going to run a dual-channel (A1/B1) test to further verify, but A2/B2 came back clean, so I just left it and decided to give the rebuild another shot and see what happens.
  3. Getting errors on A1 when it is the ONLY slot filled. Tested the same stick in all other slots with no issue. Going to run a couple more tests to confirm things, but I've already started a return request with MSI. Any recommendations on boards that WILL run with all slots filled? I'm looking at getting an MSI Pro Z790-P WIFI, since it will allow expansion to 256GB RAM and provides PCIe 4.0 x4 through the chipset (for the HBA) in addition to the PCIe 5.0 x16 through the CPU lanes, whereas my current board only allows PCIe 3.0 x4 through the chipset. I'd prefer to get one without "wifi", but that's actually not easy to find these days. lol
  4. Quick update... Memtest86 threw errors and failed on tests 3-4 of a single pass with all 4 sticks installed. So far I've tested each individual stick in the A2 slot, and all have successfully completed a single pass with no errors. So... the memory itself might not be bad. I am now testing each individual slot on the motherboard. So far A2 is obviously good, and A1 is currently testing but looking good as well. That means either B1 or B2 is faulty, or one of the sticks wasn't 100% seated or something. All the sticks were fully locked into place, so all I can think is that maybe the fan from the CPU cooler caused an issue when I swapped CPUs (the fan sits directly over A1), or something got bumped or shifted slightly out of alignment. I've moved the fan to the opposite side of the cooling tower for now, and we'll see where I'm at post testing. I'm now 99% certain my testing will reveal NO errors or definitive answers. It'll probably run 24/7 for the next 10 years and never give me another single issue... which will piss me off to no end. 😂 edit: OR the A1 slot is bad... errors on tests #6 and #8 so far. I'll see how B1 and B2 do, but I'm guessing it's a motherboard problem. Good to know, but only mildly less annoying. lol
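The stick-by-stick, slot-by-slot elimination described in this post can be written down as a small lookup. This is only a sketch: it assumes each single-stick Memtest86 run is recorded by hand as a (stick, slot, passed) tuple, and the function name `suspects`, the stick labels S1-S4, and the sample results are hypothetical stand-ins that mirror the outcomes reported in the posts, not output produced by Memtest86.

```python
from collections import defaultdict


def suspects(results):
    """Return (bad_sticks, bad_slots) from single-stick test runs.

    A stick is a suspect if it failed in every slot it was tried in;
    a slot is a suspect if it failed with every stick tried in it.
    """
    by_stick = defaultdict(list)
    by_slot = defaultdict(list)
    for stick, slot, passed in results:
        by_stick[stick].append(passed)
        by_slot[slot].append(passed)
    bad_sticks = sorted(s for s, runs in by_stick.items() if not any(runs))
    bad_slots = sorted(s for s, runs in by_slot.items() if not any(runs))
    return bad_sticks, bad_slots


# Hypothetical record of the runs described in the posts:
# every stick passed in A2, and one stick failed only in A1.
RESULTS = [
    ("S1", "A2", True),
    ("S2", "A2", True),
    ("S3", "A2", True),
    ("S4", "A2", True),
    ("S1", "A1", False),
    ("S1", "B1", True),
    ("S1", "B2", True),
]

print(suspects(RESULTS))
```

With a single failing run per slot this only flags a suspect; repeating the A1 test with a second stick would make the slot diagnosis firmer.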
  5. OK, I addressed that in my second post above. Tried two different slots and, while I don't have a fan mounted directly to the heat sink of the HBA, I do have 4x140mm fans in a push-pull configuration pulling air through the front, over the HDD stack, and directly onto the HBA card. It's warmer than the other components in my system, but I can easily put my hand on the heatsink without discomfort/burning. I'll grab a laser temp gauge later today to get an actual reading. That said... I had another crash after going to bed last night! The big issue is that when it crashes, no event is added to the log, as the system just randomly hangs out of the blue. After my last post yesterday I got a bit more in the log and then nothing all the way till the system froze 5-6 hours later.

     Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load pte:c2bb4f2a9f02a87f pmd:1546ec067
     Apr 21 17:39:51 DVD kernel: addr:0000000000400000 vm_flags:00000071 anon_vma:0000000000000000 mapping:ffff888108c1a898 index:0
     Apr 21 17:39:51 DVD kernel: file:bash fault:shmem_fault mmap:shmem_mmap read_folio:0x0
     Apr 21 17:39:51 DVD kernel: CPU: 10 PID: 7966 Comm: disk_load Tainted: P O 6.1.79-Unraid #1
     Apr 21 17:39:51 DVD kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A WIFI (MS-7D25), BIOS A.H0 03/29/2024
     Apr 21 17:39:51 DVD kernel: Call Trace:
     Apr 21 17:39:51 DVD kernel: <TASK>
     Apr 21 17:39:51 DVD kernel: dump_stack_lvl+0x44/0x5c
     Apr 21 17:39:51 DVD kernel: print_bad_pte+0x1bc/0x1d6
     Apr 21 17:39:51 DVD kernel: vm_normal_page+0x81/0x9b
     Apr 21 17:39:51 DVD kernel: unmap_page_range+0x384/0x67b
     Apr 21 17:39:51 DVD kernel: ? prep_new_page+0x1c/0x4c
     Apr 21 17:39:51 DVD kernel: unmap_vmas+0xb6/0x100
     Apr 21 17:39:51 DVD kernel: exit_mmap+0xdb/0x22e
     Apr 21 17:39:51 DVD kernel: ? finish_task_switch.isra.0+0x140/0x218
     Apr 21 17:39:51 DVD kernel: __mmput+0x43/0xe3
     Apr 21 17:39:51 DVD kernel: do_exit+0x31b/0x923
     Apr 21 17:39:51 DVD kernel: ? _raw_spin_lock_irqsave+0x2c/0x37
     Apr 21 17:39:51 DVD kernel: do_group_exit+0x7a/0x7a
     Apr 21 17:39:51 DVD kernel: get_signal+0x622/0x65a
     Apr 21 17:39:51 DVD kernel: arch_do_signal_or_restart+0x36/0x607
     Apr 21 17:39:51 DVD kernel: ? __do_sys_wait4+0x37/0x8a
     Apr 21 17:39:51 DVD kernel: ? do_sigaction+0x1c4/0x1ee
     Apr 21 17:39:51 DVD kernel: exit_to_user_mode_prepare+0x58/0x112
     Apr 21 17:39:51 DVD kernel: syscall_exit_to_user_mode+0x18/0x2c
     Apr 21 17:39:51 DVD kernel: do_syscall_64+0x77/0x81
     Apr 21 17:39:51 DVD kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
     Apr 21 17:39:51 DVD kernel: RIP: 0033:0x1477feee5c63
     Apr 21 17:39:51 DVD kernel: Code: Unable to access opcode bytes at 0x1477feee5c39.
     Apr 21 17:39:51 DVD kernel: RSP: 002b:00007ffc7a031608 EFLAGS: 00000202 ORIG_RAX: 000000000000003d
     Apr 21 17:39:51 DVD kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00001477feee5c63
     Apr 21 17:39:51 DVD kernel: RDX: 0000000000000000 RSI: 00007ffc7a031638 RDI: 00000000ffffffff
     Apr 21 17:39:51 DVD kernel: RBP: 0000000000538c48 R08: 0000000000000001 R09: 0000000000000008
     Apr 21 17:39:51 DVD kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: R13: 0000000000528e0c R14: 0000000000538c48 R15: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: </TASK>
     Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load pte:e66f50f55cccfd27 pmd:1546ec067
     Apr 21 17:39:51 DVD kernel: addr:0000000000401000 vm_flags:00000071 anon_vma:0000000000000000 mapping:ffff888108c1a898 index:1
     Apr 21 17:39:51 DVD kernel: file:bash fault:shmem_fault mmap:shmem_mmap read_folio:0x0
     Apr 21 17:39:51 DVD kernel: CPU: 10 PID: 7966 Comm: disk_load Tainted: P B O 6.1.79-Unraid #1
     Apr 21 17:39:51 DVD kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A WIFI (MS-7D25), BIOS A.H0 03/29/2024
     Apr 21 17:39:51 DVD kernel: Call Trace:
     Apr 21 17:39:51 DVD kernel: <TASK>
     Apr 21 17:39:51 DVD kernel: dump_stack_lvl+0x44/0x5c
     Apr 21 17:39:51 DVD kernel: print_bad_pte+0x1bc/0x1d6
     Apr 21 17:39:51 DVD kernel: vm_normal_page+0x81/0x9b
     Apr 21 17:39:51 DVD kernel: unmap_page_range+0x384/0x67b
     Apr 21 17:39:51 DVD kernel: ? prep_new_page+0x1c/0x4c
     Apr 21 17:39:51 DVD kernel: unmap_vmas+0xb6/0x100
     Apr 21 17:39:51 DVD kernel: exit_mmap+0xdb/0x22e
     Apr 21 17:39:51 DVD kernel: ? finish_task_switch.isra.0+0x140/0x218
     Apr 21 17:39:51 DVD kernel: __mmput+0x43/0xe3
     Apr 21 17:39:51 DVD kernel: do_exit+0x31b/0x923
     Apr 21 17:39:51 DVD kernel: ? _raw_spin_lock_irqsave+0x2c/0x37
     Apr 21 17:39:51 DVD kernel: do_group_exit+0x7a/0x7a
     Apr 21 17:39:51 DVD kernel: get_signal+0x622/0x65a
     Apr 21 17:39:51 DVD kernel: arch_do_signal_or_restart+0x36/0x607
     Apr 21 17:39:51 DVD kernel: ? __do_sys_wait4+0x37/0x8a
     Apr 21 17:39:51 DVD kernel: ? do_sigaction+0x1c4/0x1ee
     Apr 21 17:39:51 DVD kernel: exit_to_user_mode_prepare+0x58/0x112
     Apr 21 17:39:51 DVD kernel: syscall_exit_to_user_mode+0x18/0x2c
     Apr 21 17:39:51 DVD kernel: do_syscall_64+0x77/0x81
     Apr 21 17:39:51 DVD kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
     Apr 21 17:39:51 DVD kernel: RIP: 0033:0x1477feee5c63
     Apr 21 17:39:51 DVD kernel: Code: Unable to access opcode bytes at 0x1477feee5c39.
     Apr 21 17:39:51 DVD kernel: RSP: 002b:00007ffc7a031608 EFLAGS: 00000202 ORIG_RAX: 000000000000003d
     Apr 21 17:39:51 DVD kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00001477feee5c63
     Apr 21 17:39:51 DVD kernel: RDX: 0000000000000000 RSI: 00007ffc7a031638 RDI: 00000000ffffffff
     Apr 21 17:39:51 DVD kernel: RBP: 0000000000538c48 R08: 0000000000000001 R09: 0000000000000008
     Apr 21 17:39:51 DVD kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: R13: 0000000000528e0c R14: 0000000000538c48 R15: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: </TASK>
     Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load pte:d8d336e62a531f07 pmd:1546ec067
     Apr 21 17:39:51 DVD kernel: addr:0000000000402000 vm_flags:00000071 anon_vma:0000000000000000 mapping:ffff888108c1a898 index:2
     Apr 21 17:39:51 DVD kernel: file:bash fault:shmem_fault mmap:shmem_mmap read_folio:0x0
     Apr 21 17:39:51 DVD kernel: CPU: 10 PID: 7966 Comm: disk_load Tainted: P B O 6.1.79-Unraid #1
     Apr 21 17:39:51 DVD kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A WIFI (MS-7D25), BIOS A.H0 03/29/2024
     Apr 21 17:39:51 DVD kernel: Call Trace:
     Apr 21 17:39:51 DVD kernel: <TASK>
     Apr 21 17:39:51 DVD kernel: dump_stack_lvl+0x44/0x5c
     Apr 21 17:39:51 DVD kernel: print_bad_pte+0x1bc/0x1d6
     Apr 21 17:39:51 DVD kernel: vm_normal_page+0x81/0x9b
     Apr 21 17:39:51 DVD kernel: unmap_page_range+0x384/0x67b
     Apr 21 17:39:51 DVD kernel: ? prep_new_page+0x1c/0x4c
     Apr 21 17:39:51 DVD kernel: unmap_vmas+0xb6/0x100
     Apr 21 17:39:51 DVD kernel: exit_mmap+0xdb/0x22e
     Apr 21 17:39:51 DVD kernel: ? finish_task_switch.isra.0+0x140/0x218
     Apr 21 17:39:51 DVD kernel: __mmput+0x43/0xe3
     Apr 21 17:39:51 DVD kernel: do_exit+0x31b/0x923
     Apr 21 17:39:51 DVD kernel: ? _raw_spin_lock_irqsave+0x2c/0x37
     Apr 21 17:39:51 DVD kernel: do_group_exit+0x7a/0x7a
     Apr 21 17:39:51 DVD kernel: get_signal+0x622/0x65a
     Apr 21 17:39:51 DVD kernel: arch_do_signal_or_restart+0x36/0x607
     Apr 21 17:39:51 DVD kernel: ? __do_sys_wait4+0x37/0x8a
     Apr 21 17:39:51 DVD kernel: ? do_sigaction+0x1c4/0x1ee
     Apr 21 17:39:51 DVD kernel: exit_to_user_mode_prepare+0x58/0x112
     Apr 21 17:39:51 DVD kernel: syscall_exit_to_user_mode+0x18/0x2c
     Apr 21 17:39:51 DVD kernel: do_syscall_64+0x77/0x81
     Apr 21 17:39:51 DVD kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
     Apr 21 17:39:51 DVD kernel: RIP: 0033:0x1477feee5c63
     Apr 21 17:39:51 DVD kernel: Code: Unable to access opcode bytes at 0x1477feee5c39.
     Apr 21 17:39:51 DVD kernel: RSP: 002b:00007ffc7a031608 EFLAGS: 00000202 ORIG_RAX: 000000000000003d
     Apr 21 17:39:51 DVD kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00001477feee5c63
     Apr 21 17:39:51 DVD kernel: RDX: 0000000000000000 RSI: 00007ffc7a031638 RDI: 00000000ffffffff
     Apr 21 17:39:51 DVD kernel: RBP: 0000000000538c48 R08: 0000000000000001 R09: 0000000000000008
     Apr 21 17:39:51 DVD kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: R13: 0000000000528e0c R14: 0000000000538c48 R15: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: </TASK>
     Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load pte:88810cf8a39f9f00 pmd:1546ec067
     Apr 21 17:39:51 DVD kernel: addr:0000000000403000 vm_flags:00000071 anon_vma:0000000000000000 mapping:ffff888108c1a898 index:3
     Apr 21 17:39:51 DVD kernel: file:bash fault:shmem_fault mmap:shmem_mmap read_folio:0x0
     Apr 21 17:39:51 DVD kernel: CPU: 10 PID: 7966 Comm: disk_load Tainted: P B O 6.1.79-Unraid #1
     Apr 21 17:39:51 DVD kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A WIFI (MS-7D25), BIOS A.H0 03/29/2024
     Apr 21 17:39:51 DVD kernel: Call Trace:
     Apr 21 17:39:51 DVD kernel: <TASK>
     Apr 21 17:39:51 DVD kernel: dump_stack_lvl+0x44/0x5c
     Apr 21 17:39:51 DVD kernel: print_bad_pte+0x1bc/0x1d6
     Apr 21 17:39:51 DVD kernel: vm_normal_page+0x81/0x9b
     Apr 21 17:39:51 DVD kernel: unmap_page_range+0x384/0x67b
     Apr 21 17:39:51 DVD kernel: ? prep_new_page+0x1c/0x4c
     Apr 21 17:39:51 DVD kernel: unmap_vmas+0xb6/0x100
     Apr 21 17:39:51 DVD kernel: exit_mmap+0xdb/0x22e
     Apr 21 17:39:51 DVD kernel: ? finish_task_switch.isra.0+0x140/0x218
     Apr 21 17:39:51 DVD kernel: __mmput+0x43/0xe3
     Apr 21 17:39:51 DVD kernel: do_exit+0x31b/0x923
     Apr 21 17:39:51 DVD kernel: ? _raw_spin_lock_irqsave+0x2c/0x37
     Apr 21 17:39:51 DVD kernel: do_group_exit+0x7a/0x7a
     Apr 21 17:39:51 DVD kernel: get_signal+0x622/0x65a
     Apr 21 17:39:51 DVD kernel: arch_do_signal_or_restart+0x36/0x607
     Apr 21 17:39:51 DVD kernel: ? __do_sys_wait4+0x37/0x8a
     Apr 21 17:39:51 DVD kernel: ? do_sigaction+0x1c4/0x1ee
     Apr 21 17:39:51 DVD kernel: exit_to_user_mode_prepare+0x58/0x112
     Apr 21 17:39:51 DVD kernel: syscall_exit_to_user_mode+0x18/0x2c
     Apr 21 17:39:51 DVD kernel: do_syscall_64+0x77/0x81
     Apr 21 17:39:51 DVD kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
     Apr 21 17:39:51 DVD kernel: RIP: 0033:0x1477feee5c63
     Apr 21 17:39:51 DVD kernel: Code: Unable to access opcode bytes at 0x1477feee5c39.
     Apr 21 17:39:51 DVD kernel: RSP: 002b:00007ffc7a031608 EFLAGS: 00000202 ORIG_RAX: 000000000000003d
     Apr 21 17:39:51 DVD kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00001477feee5c63
     Apr 21 17:39:51 DVD kernel: RDX: 0000000000000000 RSI: 00007ffc7a031638 RDI: 00000000ffffffff
     Apr 21 17:39:51 DVD kernel: RBP: 0000000000538c48 R08: 0000000000000001 R09: 0000000000000008
     Apr 21 17:39:51 DVD kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: R13: 0000000000528e0c R14: 0000000000538c48 R15: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: </TASK>
     Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load pte:e7e6fde0edee0451 pmd:1546ec067
     Apr 21 17:39:51 DVD kernel: addr:0000000000404000 vm_flags:00000071 anon_vma:0000000000000000 mapping:ffff888108c1a898 index:4
     Apr 21 17:39:51 DVD kernel: file:bash fault:shmem_fault mmap:shmem_mmap read_folio:0x0
     Apr 21 17:39:51 DVD kernel: CPU: 10 PID: 7966 Comm: disk_load Tainted: P B O 6.1.79-Unraid #1
     Apr 21 17:39:51 DVD kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A WIFI (MS-7D25), BIOS A.H0 03/29/2024
     Apr 21 17:39:51 DVD kernel: Call Trace:
     Apr 21 17:39:51 DVD kernel: <TASK>
     Apr 21 17:39:51 DVD kernel: dump_stack_lvl+0x44/0x5c
     Apr 21 17:39:51 DVD kernel: print_bad_pte+0x1bc/0x1d6
     Apr 21 17:39:51 DVD kernel: vm_normal_page+0x81/0x9b
     Apr 21 17:39:51 DVD kernel: unmap_page_range+0x384/0x67b
     Apr 21 17:39:51 DVD kernel: ? prep_new_page+0x1c/0x4c
     Apr 21 17:39:51 DVD kernel: unmap_vmas+0xb6/0x100
     Apr 21 17:39:51 DVD kernel: exit_mmap+0xdb/0x22e
     Apr 21 17:39:51 DVD kernel: ? finish_task_switch.isra.0+0x140/0x218
     Apr 21 17:39:51 DVD kernel: __mmput+0x43/0xe3
     Apr 21 17:39:51 DVD kernel: do_exit+0x31b/0x923
     Apr 21 17:39:51 DVD kernel: ? _raw_spin_lock_irqsave+0x2c/0x37
     Apr 21 17:39:51 DVD kernel: do_group_exit+0x7a/0x7a
     Apr 21 17:39:51 DVD kernel: get_signal+0x622/0x65a
     Apr 21 17:39:51 DVD kernel: arch_do_signal_or_restart+0x36/0x607
     Apr 21 17:39:51 DVD kernel: ? __do_sys_wait4+0x37/0x8a
     Apr 21 17:39:51 DVD kernel: ? do_sigaction+0x1c4/0x1ee
     Apr 21 17:39:51 DVD kernel: exit_to_user_mode_prepare+0x58/0x112
     Apr 21 17:39:51 DVD kernel: syscall_exit_to_user_mode+0x18/0x2c
     Apr 21 17:39:51 DVD kernel: do_syscall_64+0x77/0x81
     Apr 21 17:39:51 DVD kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
     Apr 21 17:39:51 DVD kernel: RIP: 0033:0x1477feee5c63
     Apr 21 17:39:51 DVD kernel: Code: Unable to access opcode bytes at 0x1477feee5c39.
     Apr 21 17:39:51 DVD kernel: RSP: 002b:00007ffc7a031608 EFLAGS: 00000202 ORIG_RAX: 000000000000003d
     Apr 21 17:39:51 DVD kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00001477feee5c63
     Apr 21 17:39:51 DVD kernel: RDX: 0000000000000000 RSI: 00007ffc7a031638 RDI: 00000000ffffffff
     Apr 21 17:39:51 DVD kernel: RBP: 0000000000538c48 R08: 0000000000000001 R09: 0000000000000008
     Apr 21 17:39:51 DVD kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: R13: 0000000000528e0c R14: 0000000000538c48 R15: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: </TASK>
     Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load pte:b883ef5228c7e09d pmd:1546ec067
     Apr 21 17:39:51 DVD kernel: addr:0000000000407000 vm_flags:00000071 anon_vma:0000000000000000 mapping:ffff888108c1a898 index:7
     Apr 21 17:39:51 DVD kernel: file:bash fault:shmem_fault mmap:shmem_mmap read_folio:0x0
     Apr 21 17:39:51 DVD kernel: CPU: 10 PID: 7966 Comm: disk_load Tainted: P B O 6.1.79-Unraid #1
     Apr 21 17:39:51 DVD kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D25/PRO Z690-A WIFI (MS-7D25), BIOS A.H0 03/29/2024
     Apr 21 17:39:51 DVD kernel: Call Trace:
     Apr 21 17:39:51 DVD kernel: <TASK>
     Apr 21 17:39:51 DVD kernel: dump_stack_lvl+0x44/0x5c
     Apr 21 17:39:51 DVD kernel: print_bad_pte+0x1bc/0x1d6
     Apr 21 17:39:51 DVD kernel: vm_normal_page+0x81/0x9b
     Apr 21 17:39:51 DVD kernel: unmap_page_range+0x384/0x67b
     Apr 21 17:39:51 DVD kernel: ? prep_new_page+0x1c/0x4c
     Apr 21 17:39:51 DVD kernel: unmap_vmas+0xb6/0x100
     Apr 21 17:39:51 DVD kernel: exit_mmap+0xdb/0x22e
     Apr 21 17:39:51 DVD kernel: ? finish_task_switch.isra.0+0x140/0x218
     Apr 21 17:39:51 DVD kernel: __mmput+0x43/0xe3
     Apr 21 17:39:51 DVD kernel: do_exit+0x31b/0x923
     Apr 21 17:39:51 DVD kernel: ? _raw_spin_lock_irqsave+0x2c/0x37
     Apr 21 17:39:51 DVD kernel: do_group_exit+0x7a/0x7a
     Apr 21 17:39:51 DVD kernel: get_signal+0x622/0x65a
     Apr 21 17:39:51 DVD kernel: arch_do_signal_or_restart+0x36/0x607
     Apr 21 17:39:51 DVD kernel: ? __do_sys_wait4+0x37/0x8a
     Apr 21 17:39:51 DVD kernel: ? do_sigaction+0x1c4/0x1ee
     Apr 21 17:39:51 DVD kernel: exit_to_user_mode_prepare+0x58/0x112
     Apr 21 17:39:51 DVD kernel: syscall_exit_to_user_mode+0x18/0x2c
     Apr 21 17:39:51 DVD kernel: do_syscall_64+0x77/0x81
     Apr 21 17:39:51 DVD kernel: entry_SYSCALL_64_after_hwframe+0x64/0xce
     Apr 21 17:39:51 DVD kernel: RIP: 0033:0x1477feee5c63
     Apr 21 17:39:51 DVD kernel: Code: Unable to access opcode bytes at 0x1477feee5c39.
     Apr 21 17:39:51 DVD kernel: RSP: 002b:00007ffc7a031608 EFLAGS: 00000202 ORIG_RAX: 000000000000003d
     Apr 21 17:39:51 DVD kernel: RAX: fffffffffffffe00 RBX: 0000000000000000 RCX: 00001477feee5c63
     Apr 21 17:39:51 DVD kernel: RDX: 0000000000000000 RSI: 00007ffc7a031638 RDI: 00000000ffffffff
     Apr 21 17:39:51 DVD kernel: RBP: 0000000000538c48 R08: 0000000000000001 R09: 0000000000000008
     Apr 21 17:39:51 DVD kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: R13: 0000000000528e0c R14: 0000000000538c48 R15: 00000000005393a0
     Apr 21 17:39:51 DVD kernel: </TASK>
     Apr 21 17:39:51 DVD kernel: BUG: Bad rss-counter state mm:000000009a70e951 type:MM_SHMEMPAGES val:8
     Apr 21 18:00:01 DVD crond[1461]: exit status 127 from user root /usr/sbin/speedtest-xml &> /dev/null
     Apr 21 21:00:01 DVD crond[1461]: exit status 127 from user root /usr/sbin/speedtest-xml &> /dev/null
     Apr 22 00:00:01 DVD crond[1461]: exit status 127 from user root /usr/sbin/speedtest-xml &> /dev/null

     Got this around 5:40pm and then nothing till it froze sometime after midnight. On a whim, I started a memtest this morning after seeing elsewhere that "Bad page map" errors might be indicative of bad memory. Got 5% of the way through the first pass and it started popping errors and fails! *Head -> Desk... lots of creative mumbling of inappropriate phrases* I'm now doing a stick-by-stick test to see which one is borked! Honestly... it'd been running perfectly for a week and only started giving issues after I brought the HBA online. Maybe that just put the right kind of strain on the memory to expose problems that were already there. Who knows. Either way, I'm fairly certain now that this is the majority of my problem atm. Once I isolate the bad stick, I'll run things with half the memory while I get the one kit replaced and see if anything else weird pops up.
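For anyone skimming a syslog like the one above, a short script can pull out just the "Bad page map" headers instead of reading every call trace. This is a minimal sketch, not an Unraid tool: the regular expression is written against the message format quoted in this post, and the names `bad_page_maps`, `summarize`, and `SAMPLE` are made up for illustration.

```python
import re

# Matches the header line of each report as it appears in the excerpt above.
BAD_PAGE_MAP = re.compile(
    r"BUG: Bad page map in process (\S+)\s+pte:([0-9a-f]+)\s+pmd:([0-9a-f]+)"
)


def bad_page_maps(text):
    """Return a list of (process, pte, pmd) tuples, one per report."""
    return BAD_PAGE_MAP.findall(text)


def summarize(text):
    """Count the reports and list the distinct processes and pmd values."""
    reports = bad_page_maps(text)
    return {
        "count": len(reports),
        "processes": sorted({proc for proc, _, _ in reports}),
        "pmds": sorted({pmd for _, _, pmd in reports}),
    }


# Two header lines copied from the excerpt above.
SAMPLE = (
    "Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load "
    "pte:c2bb4f2a9f02a87f pmd:1546ec067\n"
    "Apr 21 17:39:51 DVD kernel: BUG: Bad page map in process disk_load "
    "pte:e66f50f55cccfd27 pmd:1546ec067\n"
)

print(summarize(SAMPLE))
```

Run against the full excerpt, this kind of summary makes it easy to see that every report comes from the same process and the same pmd while the pte values differ each time.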
  6. Ok, 7 hours in now and still going, though this did pop up in the syslog a while ago. Looks like some kind of fault related to the HBA. Does this tell us anything more specific?

     Apr 21 12:18:14 DVD kernel: mpt3sas_cm0 fault info from func: mpt3sas_base_make_ioc_ready
     Apr 21 12:18:14 DVD kernel: mpt3sas_cm0: fault_state(0x5862)!
     Apr 21 12:18:14 DVD kernel: mpt3sas_cm0: sending diag reset !!
     Apr 21 12:18:15 DVD kernel: mpt3sas_cm0: diag reset: SUCCESS
     Apr 21 12:18:15 DVD kernel: mpt3sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
     Apr 21 12:18:15 DVD kernel: mpt3sas_cm0: _base_display_fwpkg_version: complete
     Apr 21 12:18:15 DVD kernel: mpt3sas_cm0: LSISAS3008: FWVersion(16.00.10.00), ChipRevision(0x02), BiosVersion(18.00.00.00)
     Apr 21 12:18:15 DVD kernel: mpt3sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
     Apr 21 12:18:15 DVD kernel: mpt3sas_cm0: sending port enable !!
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: port enable: SUCCESS
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: search for end-devices: start
     Apr 21 12:18:23 DVD kernel: scsi target9:0:0: handle(0x0009), sas_addr(0x4433221100000000)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:0: enclosure logical id(0x500062b202a05640), slot(3)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:1: handle(0x000a), sas_addr(0x4433221101000000)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:1: enclosure logical id(0x500062b202a05640), slot(2)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:5: handle(0x000b), sas_addr(0x5000c500c9d8f0d5)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:5: enclosure logical id(0x500062b202a05640), slot(6)
     Apr 21 12:18:23 DVD kernel: #011handle changed from(0x000c)!!!
     Apr 21 12:18:23 DVD kernel: scsi target9:0:2: handle(0x000c), sas_addr(0x4433221102000000)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:2: enclosure logical id(0x500062b202a05640), slot(0)
     Apr 21 12:18:23 DVD kernel: #011handle changed from(0x000b)!!!
     Apr 21 12:18:23 DVD kernel: scsi target9:0:3: handle(0x000d), sas_addr(0x4433221103000000)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:3: enclosure logical id(0x500062b202a05640), slot(1)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:6: handle(0x000e), sas_addr(0x4433221107000000)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:6: enclosure logical id(0x500062b202a05640), slot(5)
     Apr 21 12:18:23 DVD kernel: #011handle changed from(0x000f)!!!
     Apr 21 12:18:23 DVD kernel: scsi target9:0:4: handle(0x000f), sas_addr(0x4433221104000000)
     Apr 21 12:18:23 DVD kernel: scsi target9:0:4: enclosure logical id(0x500062b202a05640), slot(7)
     Apr 21 12:18:23 DVD kernel: #011handle changed from(0x000e)!!!
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: search for end-devices: complete
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: search for end-devices: start
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: search for PCIe end-devices: complete
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: search for expanders: start
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: search for expanders: complete
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: mpt3sas_base_hard_reset_handler: SUCCESS
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: _base_fault_reset_work: hard reset: success
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: removing unresponding devices: start
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: removing unresponding devices: end-devices
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: Removing unresponding devices: pcie end-devices
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: removing unresponding devices: expanders
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: removing unresponding devices: complete
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: scan devices: start
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011scan devices: expanders start
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011break from expander scan: ioc_status(0x0022), loginfo(0x310f0400)
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011scan devices: expanders complete
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011scan devices: end devices start
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011break from end device scan: ioc_status(0x0022), loginfo(0x310f0400)
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011scan devices: end devices complete
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011scan devices: pcie end devices start
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: log_info(0x3003011d): originator(IOP), code(0x03), sub_code(0x011d)
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011break from pcie end device scan: ioc_status(0x0021), loginfo(0x3003011d)
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: #011pcie devices: pcie end devices complete
     Apr 21 12:18:23 DVD kernel: mpt3sas_cm0: scan devices: complete
     Apr 21 12:18:23 DVD kernel: sd 9:0:5:0: Mode parameters changed
     Apr 21 12:18:23 DVD kernel: sd 9:0:0:0: Power-on or device reset occurred
     Apr 21 12:18:23 DVD kernel: sd 9:0:4:0: Power-on or device reset occurred
     Apr 21 12:18:23 DVD kernel: sd 9:0:2:0: Power-on or device reset occurred
     Apr 21 12:18:23 DVD kernel: sd 9:0:6:0: Power-on or device reset occurred
     Apr 21 12:18:23 DVD kernel: sd 9:0:1:0: Power-on or device reset occurred
     Apr 21 12:18:23 DVD kernel: sd 9:0:3:0: Power-on or device reset occurred

     Thoughts? Anybody?
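To keep track of how often the HBA faults and which disks get reset along with it, the two relevant message types in the excerpt above can be picked out with a pair of patterns. This is a rough sketch that assumes the syslog keeps the exact wording shown in this post; `summarize_hba_events` and `SAMPLE` are illustrative names, and the script does not decode what a fault code such as 0x5862 means.

```python
import re

# Patterns written against the kernel messages quoted in the post above.
FAULT = re.compile(r"(mpt3sas_cm\d+): fault_state\((0x[0-9a-fA-F]+)\)")
RESET = re.compile(r"sd (\d+:\d+:\d+:\d+): Power-on or device reset occurred")


def summarize_hba_events(text):
    """Collect controller fault codes and the SCSI devices that were reset."""
    return {
        "faults": FAULT.findall(text),         # list of (controller, code)
        "reset_devices": RESET.findall(text),  # list of H:C:T:L strings
    }


# Three lines copied from the excerpt above.
SAMPLE = (
    "Apr 21 12:18:14 DVD kernel: mpt3sas_cm0: fault_state(0x5862)!\n"
    "Apr 21 12:18:23 DVD kernel: sd 9:0:0:0: Power-on or device reset occurred\n"
    "Apr 21 12:18:23 DVD kernel: sd 9:0:4:0: Power-on or device reset occurred\n"
)

print(summarize_hba_events(SAMPLE))
```

Counting faults per day this way gives a simple before/after comparison when changing firmware, slots, or memory.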
  7. As a follow-up... The rebuild has been running for 4-5 hours now with no issue. I realized that after loading the downgraded firmware for the HBA, I had not restarted. I just started the array and proceeded with the rebuild, which then crashed again. Where I'm at currently is the first reboot post-downgrade, so maybe it was the 16.00.12.00 firmware causing the issue, and it just needed a reboot to clear things up after going back to 16.00.10.00. Is that plausible? I don't fully buy it, but so far so good. Also wanted to mention that the freeze-ups occurred in both normal and safe mode, with Docker both enabled and disabled. I've also only noted the issue while the array is started/disks mounted. I haven't mounted a fan on the LSI card yet, but it doesn't seem to be getting that hot. I've got great airflow through the case, and drive, NVMe, and MB temps rarely get over 30 C. The HBA is also connected to the PSU via the 6-pin PCIe power cable. Final item of note: I've tried the HBA in both the PCIe 5.0 x16 (CPU lanes) slot and a PCIe 3.0 x4 (MB lanes) slot... both experienced the crash. Hopefully that covers everything. Hope someone can give me some solid advice based on that.
  8. Like the title says. I just went through a server hardware upgrade: new MB/CPU/RAM/PSU. I'll post the details of the system below along with the latest diagnostics zip and syslog, but first a brief description of what has been happening. I performed the upgrade a week ago Friday, and the server was running rock solid until 2 days ago, when I decided to add an HBA, 2 SAS drives, a different CPU, and a new Pioneer UHD drive. The reason for the new CPU was that the previous one didn't have integrated graphics, which meant cannibalizing the GPU from my wife's PC every time I needed to make a BIOS change. Easier to swap the 14400F chip for the 14500.

     The system is currently as follows:
     UnRAID OS Pro ver. 6.12.10
     CPU - Intel i5 14500
     MB - MSI Pro Z690-A
     RAM - 4x32GB Corsair Vengeance DDR5-5200
     HBA - LSI SAS9300-16i (currently running the 16.00.10.00 IT firmware and associated BIOS/EFI from Broadcom's website. I also had the TrueNAS/Broadcom collaboration 16.00.12.00 IT FW installed, but downgraded as a troubleshooting step... still got a crash)
     Connections - 2x8643 mini-SAS to 4x 8482 connectors for a total of 8 drives connected.
     PSU - Corsair HX1000i
     Cache - Samsung 970 Evo Plus SSD 1TB NVME M.2

     Drive configuration post update (ignore the missing disc numbers. I removed several unused drives and was going to rebuild the parity/drive config after preclearing/installing the new SAS drives. *Additional note: I did rebuild the parity after removing those drives and prior to the Skyhawk failing):
     Parity - Seagate EXOS X16 ST16000NM001G 16TB
     Disc 3 - Seagate EXOS X16 ST16000NM001G 16TB
     Disc 6 - Seagate Skyhawk ST10000VX0004 10TB (after the second or third crash this drive began returning numerous reallocated sector errors, over 200 at one point. Currently trying to rebuild with a new EXOS X16 16TB SAS I purchased. Had been trying to run a preclear on it prior to this, but the server kept freezing during preread)
     Disc 7 - Seagate Desktop ST4000DM000 4TB
     Disc 8 - Seagate Desktop ST4000DM000 4TB
     Disc 9 - Seagate Desktop ST4000DM000 4TB
     Disc 10 - Seagate EXOS X10 ST10000NM0086 10TB

     I was seeing I/O errors associated with the Pioneer drive in a log at one point, so it has been temporarily removed. The crashes still remain. I also swapped to a new (old) flash drive for the OS. I had been using a 2GB Sony drive for the last 15 years, but I was having issues with my laptop reading it, so I replaced it with a 32GB USB 2.0 SanDisk Cruzer Glide, just in case that was the issue. No problems with the backup/key transfer, and the drive appears to be working perfectly. Tried it in multiple USB ports on the server; it's currently in a 2.0 rear panel port.

     I've set up the syslog server to save locally to my cache drive and mirror to the flash. So far I haven't been able to capture much. As of my typing this, the server has been running a rebuild on the new drive for 20-some minutes, but nothing has been added to the syslog since I changed where the syslog server saves the file. I downloaded the latest diagnostics zip and it is attached below. No idea what I should be looking for there. I had the Skyhawk removed from the array but still attached as an unassigned device. I've now removed that as well, so my current theory is a hardware issue with the LSI HBA card or the 14500 CPU. If there's any indication for either of these in the diagnostics, it'd be great to know which. At this point I'm mostly stumped and, short of pulling the HBA card/SAS drives and going back to the onboard SATA, not sure what to try next. Any help would be great! Thanks.

     dvd-diagnostics-20240421-1049.zip syslog-previous
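Since a hard freeze leaves nothing in the log, one way to narrow down when it happened is to look for long silent stretches between consecutive syslog entries in the mirrored file. The sketch below is only an illustration: it assumes the classic "Mon DD HH:MM:SS" syslog prefix (which carries no year, so the year is passed in), the one-hour threshold is arbitrary, and `log_gaps` and `SAMPLE_LINES` are made-up names. The sample lines are copied from the syslog excerpt quoted elsewhere in this thread.

```python
from datetime import datetime, timedelta


def log_gaps(lines, year, threshold=timedelta(hours=1)):
    """Return (previous, next) timestamp pairs that are more than
    `threshold` apart. Lines without a parseable timestamp are skipped."""
    gaps = []
    previous = None
    for line in lines:
        prefix = line.lstrip()[:15]  # e.g. "Apr 21 17:39:51"
        try:
            stamp = datetime.strptime(f"{year} {prefix}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue
        if previous is not None and stamp - previous > threshold:
            gaps.append((previous, stamp))
        previous = stamp
    return gaps


SAMPLE_LINES = [
    "Apr 21 17:39:51 DVD kernel: </TASK>",
    "Apr 21 18:00:01 DVD crond[1461]: exit status 127 from user root /usr/sbin/speedtest-xml &> /dev/null",
    "Apr 21 21:00:01 DVD crond[1461]: exit status 127 from user root /usr/sbin/speedtest-xml &> /dev/null",
    "Apr 22 00:00:01 DVD crond[1461]: exit status 127 from user root /usr/sbin/speedtest-xml &> /dev/null",
]

for start, end in log_gaps(SAMPLE_LINES, 2024):
    print(f"quiet from {start} to {end}")
```

On an otherwise idle server a quiet stretch is not proof of a freeze; the last timestamp before the reboot is the more useful number.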
  9. Yep, that's the guide I referenced in my first post. I did a bit more digging and found the slightly updated firmware that FreeNAS and Broadcom collaborated on and, after some trial and error getting a directory set up on my flash drive and copied over to RAM, I got everything to work. Can confirm all firmware and BIOS are up to date. System starts with minimal issue (it seems like if I reboot/power down, I have to enter the MB's UEFI BIOS and reboot from there before UnRAID will load. No idea why, though it might be my 15+ year old Sony USB flash drive... Just getting it to show up in my laptop's file manager was tricky, so I'll probably be updating that as well in the coming weeks). Plugged all my drives in via the breakout cable, and every single drive spun up and was properly detected. The first of my new 16TB SAS drives is currently preclearing and everything seems to be running well! 🤘
  10. ALLLLllLlLllll- Righty then... Got my LSI card installed and, honestly, I think it was good to go from before I bought it. It apparently has an EFI BIOS, since it showed up in my motherboard's UEFI BIOS. According to the data there, it already has IT firmware on both controllers. Sooo... good to go there, and it shows up in my device list in UnRAID. So far so good. The only thing of note is that the firmware version is 7.0.1.0 and the version I found on Broadcom's website is 16.00.10.00. Obviously it's pretty out of date, but does it really matter much for this? Any significant reason to upgrade it to the current firmware? Any other settings I should be adjusting for this? I haven't plugged a drive into it yet (damn Amazon is taking forever getting me my breakout cables, but I should have them later today). Anyway... This feels like it's going too easy. NOTHING seemingly simple EVER goes this easy for me. I've gotta be missing something, but I have no idea what. Lol
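Comparing the installed 7.0.1.0 against Broadcom's 16.00.10.00 is easy to get wrong if the versions are compared as plain strings, because "7" sorts after "1". A small sketch of a numeric comparison follows; it assumes both version strings use the same dotted, all-numeric scheme, and `parse_version` and `is_older` are illustrative helper names, not part of any Broadcom tool.

```python
def parse_version(text):
    """Turn a dotted version such as '16.00.10.00' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_older(installed, available):
    """True if `installed` is numerically lower than `available`."""
    return parse_version(installed) < parse_version(available)


print(is_older("7.0.1.0", "16.00.10.00"))      # numeric comparison
print("7.0.1.0" < "16.00.10.00")               # naive string comparison
print(is_older("16.00.10.00", "16.00.12.00"))  # the two later firmware builds
```

The first and third lines print True while the naive string comparison prints False, which is the pitfall the tuple conversion avoids.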
  11. Sooooo... I'm just going to assume from that overwhelming response that I'm on the right track with the above then. Should things go horribly wrong, I'll be sure to be back with some Joe Exotic gifs and a whole bunch more questions! 😂
  12. Hi all. Looking for someone with a lot more knowledge on the subject to double check my upgrade process and tell me if I've missed anything important. Last week I undertook the once-a-decade process of upgrading my UnRAID server. New mobo, new CPU, new PSU, and 4 times the RAM. Basically I had a 15 year old drive (14.5 years' worth of power-on time) die on me. Luckily I was anticipating this and had it and 3 other drives still in my array, but with zero files on them. So I figured it was an opportune time to do a lot of cleanup and expansion. I removed 4 old drives (1x 1TB and 3x 2TB drives, all with over a decade of power-on hours) and that left me with 7 SATA III drives in my array, as follows:
     Parity - Seagate EXOS X16 ST16000NM001G 16TB
     Disc 1 - Seagate EXOS X16 ST16000NM001G 16TB
     Disc 2 - Seagate Skyhawk ST10000VX0004 10TB
     Disc 3 - Seagate Desktop ST4000DM000 4TB
     Disc 4 - Seagate Desktop ST4000DM000 4TB
     Disc 5 - Seagate Desktop ST4000DM000 4TB
     Disc 6 - Seagate EXOS X10 ST10000NM0086 10TB
     I've ordered a couple more used/refurbished EXOS 16TB SAS drives and a used LSI SAS9300-16i HBA. My hardware after this final round of upgrades will be as follows:
     UnRAID OS Pro ver. 6.12.10
     CPU - Intel i5 14500
     MB - MSI Pro Z690-A
     RAM - 4x32GB Corsair Vengeance DDR5-5200
     HBA - LSI SAS9300-16i (unsure of currently installed firmware/BIOS... will update to latest IT firmware and BIOS)
     Connections - 2x 8643 mini-SAS to 4x 8482 connectors for a total of 8 drives connected.
PSU - Corsair HX1000i
Cache - Samsung 970 Evo Plus SSD 1TB NVMe M.2
Drive configuration post update:
Parity - Seagate Exos X16 ST16000NM002G 16TB SAS 12Gb/s
Disc 1 - Seagate Exos X16 ST16000NM002G 16TB SAS 12Gb/s
Disc 2 - Seagate EXOS X16 ST16000NM001G 16TB SATA 6Gb/s
Disc 3 - Seagate EXOS X16 ST16000NM001G 16TB SATA 6Gb/s
Disc 4 - Seagate Skyhawk ST10000VX0004 10TB SATA 6Gb/s
Disc 5 - Seagate Desktop ST4000DM000 4TB SATA 6Gb/s
Disc 6 - Seagate Desktop ST4000DM000 4TB SATA 6Gb/s
Disc 7 - Seagate EXOS X10 ST10000NM0086 10TB SATA 6Gb/s
SHOULD the LSI HBA not already be flashed to the latest IT firmware and BIOS, I have downloaded the following 3 files for the process via UnRAID's terminal following this forum post:
Linux Executable: sas3flash_linux_x64_rel/sas3flash (I had to pull this executable from the Broadcom firmware downloads for the 9300-8i since the 16i downloads only include the firmware/BIOS zip... I assume it'll work for this as well?)
SAS9300_16i_IT.bin
mptsas3.rom
Now, given my UEFI based MB, should I be looking at doing an EFI upgrade to the BIOS on the LSI card instead? (So, fair warning: I have just enough knowledge on this stuff to get in the ballpark, but I'm still slightly confused by some of the technicalities, such as this bit about BIOS vs EFI and exactly what that means as far as getting this all working correctly, so any help or clarifications when I obviously get something confused or flat out wrong would be super appreciated.) Finally, I've been running my own UnRAID server for over 15 years now. That said, it has been 99% set-and-forget storage for my extensive collection of 1:1 MKV ripped movies. Minimal to no transcoding and usually only 1 user at a time, soooo... most of the above is massive overkill for my needs. I've started hosting a game server, but even that isn't very taxing. As far as the movies go, I rip my movies using an internal drive and ripping directly to a specified HDD.
In other words, the SAS drives and HBA are probably pointless for me in terms of performance, but whatever. Maybe it'll make the monthly parity checks go a bit quicker at least. lol Anyway... to that end, are there any specific settings/utilities/plugins/etc... that I can implement to maximize the performance from my new hardware? I've seen mention of ZFS drive pools and stuff, but a lot of that is outside of my wheelhouse (Honestly, I only know I need to flash IT firmware, but not exactly why. lol) Anyway, TLDR: Check my hardware setup above and tell me if I missed anything significant in my planned move to an HBA and SAS drives. Thanks!
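One quick sanity check on the planned drive configuration can be sketched in a few lines of Python. This is only a rough sketch, assuming the usual UnRAID rule that the parity drive must be at least as large as the largest data disk, and that usable space is simply the sum of the data disks (nominal TB, ignoring filesystem overhead). The names `check_layout`, `PARITY_TB` and `DATA_DISKS_TB` are made up for illustration; the sizes are copied from the post-update list above.

```python
# Planned post-upgrade layout, nominal sizes in TB, copied from the drive list.
# All names here are illustrative, not part of any UnRAID tool.
PARITY_TB = 16
DATA_DISKS_TB = {
    "Disc 1": 16, "Disc 2": 16, "Disc 3": 16, "Disc 4": 10,
    "Disc 5": 4, "Disc 6": 4, "Disc 7": 10,
}

def check_layout(parity_tb, data_disks_tb):
    """Return (parity_ok, usable_tb) for a single-parity array.

    Assumption: parity must be >= the largest data disk, and usable
    capacity is the plain sum of the data disks.
    """
    largest = max(data_disks_tb.values())
    parity_ok = parity_tb >= largest
    usable_tb = sum(data_disks_tb.values())
    return parity_ok, usable_tb

parity_ok, usable_tb = check_layout(PARITY_TB, DATA_DISKS_TB)
print(f"parity large enough: {parity_ok}, usable: {usable_tb} TB")
```

With the sizes listed, the 16TB parity drive covers the largest data disk, and the seven data disks add up to 76 TB nominal.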
  13. Funny you should mention that. My next plan was to sacrifice a live chicken! lol
  14. Unbelievable... I finally got it to work by completely disabling SMB2 and SMB3. Disable/enable SMB2/3 Now everything is fine again.... till the next time Microsoft has a bright idea!
  15. So any answers for the other questions? Like I said before, I can see the server just fine, but when I try to access it I get a network error (see attached photo). Any suggestions on how to get past this?