Sniper00X

  1. @JorgeB no, with versions later than 6.9.2, when it crashes it's inaccessible even on the local terminal (frozen).
  2. So I've downgraded back to 6.9.2 for now, and the array did start. Interestingly, I've been running a ping test against the server every time I start the array to see if it goes down. It seems that even with 6.9.2 the interface goes offline for a moment but comes back up. I don't know if it's related, but is it possible that my network cards or network config are what's causing the crashes, and I didn't notice because it recovers with 6.9.2 but not with the later versions? Here's a screenshot of the ping test as the array was starting, and I've included the latest syslog as well (weird). syslog
  3. Here's the latest syslog after the most recent reboot and crash. syslog
  4. Thanks for the reply @Squid. Yes, I've double-checked those settings. I've even toggled C-States to see if it would make a difference; no luck.
  5. Hey @JorgeB, I've captured and attached the latest syslog. It's still crashing upon array startup. syslog
  6. I recently upgraded from version 6.9.2, but all the versions I've tried have been crashing after array startup. With some versions (like 6.11.5) the array was able to start (sporadically) but would then crash during parity checks. I've had to downgrade to 6.9.2 to get back to an operational state.
     Here's what I've tried so far:
     - Booted into Safe Mode (no plugins), made no difference
     - Fixed all Common Issues reported
     - Removed old plugins
     - Upgraded all plugins to the latest version
     - Looked at logs to see if anything stood out that might be causing a kernel panic (couldn't find anything meaningful)
     - Messed around with C-state and ACPI settings in the BIOS (no difference)
     What might I be missing in order to figure out what's causing the issue? Thanks, S
  7. Hello, this weekend I went into the BIOS and disabled Global C-States. However, this morning the server crashed again. This time, though, I had a monitor hooked up to the server and was able to take a picture of the screen:
     <code>
     a(PO) drm backlight agpgart w83795 i5500_temp ip6table_filter ip6_tables iptable_filter ip_tables x_tables be2net igb i2c_algo_bit atlantic edac_mce_amd kvm_amd kvm btusb btrtl crct10dif_pclmul crc32_pclmul btbcm crc32c_intel btintel ghash_clmulni_intel aesni_intel crypto_simd bluetooth mxm_wmi cryptd wmi_bmof glue_helper mpt3sas i2c_piix4 raid_class ecdh_generic input_leds ecc nvme rapl scsi_transport_sas ahci i2c_core led_class nvme_core k10temp wmi ccp libahci button acpi_cpufreq [last unloaded: be2net]
     ---[ end trace 306897cb8606c0c5 ]---
     RIP: 0010:nf_nat_setup_info+0x129/0x6aa [nf_nat]
     Code: ff 48 8b 15 ef 6a 00 00 89 c0 48 8d 04 c2 48 8b 10 48 85 d2 74 80 48 81 ea 98 00 00 00 48 85 d2 0f 84 70 ff ff ff 8a 44 24 46 <38> 42 46 74 09 48 8b 92 98 00 00 00 eb d9 48 8b 4a 20 48 8b 42 28
     RSP: 0018:ffffc9000075c700 EFLAGS: 00010202
     RAX: ffff8881bf5e5d11 RBX: ffff8884cb2d6940 RCX: 0000000000000000
     RDX: 41a68045a11c4623 RSI: 0000000017213a45 RDI: ffffc9000075c720
     RBP: ffffc9000075c7c8 R08: 00000000251a42d2 R09: ffff8881bec61680
     R10: ffff8881432ec388 R11: ffffffff815cbe4b R12: 0000000000000000
     R13: ffffc9000075c720 R14: ffffc9000075c7dc R15: ffffffff8210b440
     FS: 0000150126d5b640(0000) GS:ffff889ffd340000(0000) knlGS:0000000000000000
     CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 0000148053281800 CR3: 000000013e810000 CR4: 00000000003506e0
     Kernel panic - not syncing: Fatal exception in interrupt
     Kernel Offset: disabled
     ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
     </code>
     Any clue what this is referring to?
  8. @ChatNoir no, I wasn't aware of that, but I'll certainly check and try it. I do have a Gen 1 Threadripper processor, so it's quite possible that some BIOS update or tuning is required. I'll take a look at that as a start and report back. Question: is there any way to make the syslog persistent? That way, if I experience another crash, we'd still have access to the log information and not lose it on reboot.
  9. Hello, I've been experiencing some random server crashes. I'm not exactly sure what could be causing them, as they happen at different intervals. Sometimes I can go days without any issues. When it happens, I lose network access, access to the web interface, and access to SSH or the local terminal (at the machine). Can someone help take a look at my logs to see if I'm missing something obvious? Thanks. thetower-diagnostics-20220328-1953.zip
  10. Yes, that's the only controller. The onboard SATA ports are unused. OK, I think it's time I unrack the server, crack it open, and do some troubleshooting. Will report back with the steps.
  11. Ok it says the test completed without error:
      Num  Test_Description   Status                     Remaining  LifeTime(hours)  LBA_of_first_error
      # 1  Extended offline   Completed without error    00%        1962             -
      # 2  Extended offline   Interrupted (host reset)   00%        1929             -
      # 3  Short offline      Completed without error    00%        1900             -
      # 4  Extended offline   Completed without error    00%        332              -
      I've attached the new diag file. Drive is still spun down and in disabled state, I haven't changed anything. Awaiting advice. thetower-diagnostics-20211210-1625.zip
  12. OK, did that and restarted the test about 2 hours ago. It still says it's running and is at 10%. Will let it run its course and report back.
  13. Woke up this morning to the following error: "Last SMART test result: Interrupted (host reset)". I've attached the new diag files. thetower-diagnostics-20211209-0815.zip
  14. Should I run the extended SMART test again? This time while mounted in Unraid?