renobles

Members
  • Posts

    10

  1. Thanks for the responses. I moved that SSD over to the motherboard, and haven't had a lockup since then.
  2. Thanks for the quick response. Attached is the diagnostics file. The server has generally been quite stable. As far as I can remember, this is the first time it has had a kernel panic and frozen hard. About a month ago, there were some controller issues, and I put in an LSI controller and moved all disks to it. Since then, there were no issues until the first kernel panic around mid-week.
     tower-diagnostics-20190209-2240.zip
  3. So, the memtest passed for 24 hours without any issue. What else can I try?
  4. Good call. I'll get that started now.
  5. Recently my server has started having a kernel panic around once every 12 hours or so. The only way to get it back up is a hard reboot. I've enabled the troubleshooting mode in CA Fix Common Problems, and captured the output of the panic (below). Any ideas?
     Feb 8 02:21:51 Tower kernel: general protection fault: 0000 [#1] SMP PTI
     Feb 8 02:21:51 Tower kernel: CPU: 3 PID: 17930 Comm: sleep Not tainted 4.18.20-unRAID #1
     Feb 8 02:21:51 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q77M vPro, BIOS P1.00 04/06/2012
     Feb 8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
     Feb 8 02:21:51 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8
     Feb 8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
     Feb 8 02:21:51 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
     Feb 8 02:21:51 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
     Feb 8 02:21:51 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
     Feb 8 02:21:51 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
     Feb 8 02:21:51 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     Feb 8 02:21:51 Tower kernel: FS: 0000150f6c708540(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
     Feb 8 02:21:51 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 8 02:21:51 Tower kernel: CR2: 000014bf648da000 CR3: 0000000117b00001 CR4: 00000000001606e0
     Feb 8 02:21:51 Tower kernel: Call Trace:
     Feb 8 02:21:51 Tower kernel: ? do_nanosleep+0x81/0x161
     Feb 8 02:21:51 Tower kernel: ? hrtimer_nanosleep+0x99/0xf9
     Feb 8 02:21:51 Tower kernel: ? hrtimer_init+0x2/0x2
     Feb 8 02:21:51 Tower kernel: ? __se_sys_nanosleep+0x79/0x94
     Feb 8 02:21:51 Tower kernel: ? do_syscall_64+0x57/0xe6
     Feb 8 02:21:51 Tower kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
     Feb 8 02:21:51 Tower kernel: Modules linked in: veth xt_nat ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd mpt3sas e1000e glue_helper raid_class scsi_transport_sas intel_cstate intel_uncore ahci i2c_i801 intel_rapl_perf i2c_core libahci video backlight ie31200_edac button pcc_cpufreq
     Feb 8 02:21:51 Tower kernel: ---[ end trace 636252fd9269676d ]---
     Feb 8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
     Feb 8 02:21:51 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8
     Feb 8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
     Feb 8 02:21:51 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
     Feb 8 02:21:51 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
     Feb 8 02:21:51 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
     Feb 8 02:21:51 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
     Feb 8 02:21:51 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     Feb 8 02:21:51 Tower kernel: FS: 0000150f6c708540(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
     Feb 8 02:21:51 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 8 02:21:51 Tower kernel: CR2: 000014bf648da000 CR3: 0000000117b00001 CR4: 00000000001606e0
     Feb 8 02:21:51 Tower kernel: general protection fault: 0000 [#2] SMP PTI
     Feb 8 02:21:51 Tower kernel: CPU: 3 PID: 17937 Comm: awk Tainted: G D 4.18.20-unRAID #1
     Feb 8 02:21:51 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q77M vPro, BIOS P1.00 04/06/2012
     Feb 8 02:21:51 Tower kernel: RIP: 0010:show_map_vma.isra.5+0x99/0x134
     Feb 8 02:21:51 Tower kernel: Code: d8 81 4c 89 e6 48 89 ef e8 76 1c fd ff eb 67 48 8b 83 90 00 00 00 48 85 c0 75 0f 48 89 df e8 56 ae eb ff 48 85 c0 75 7b eb 18 <48> 8b 40 58 48 85 c0 74 e8 48 89 df e8 4d 83 87 00 48 85 c0 75 63
     Feb 8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbe7da0 EFLAGS: 00010206
     Feb 8 02:21:51 Tower kernel: RAX: 2000000000000000 RBX: ffff880212852b40 RCX: 0000000000000e06
     Feb 8 02:21:51 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff88008ee0a980
     Feb 8 02:21:51 Tower kernel: RBP: ffff88008ee0a980 R08: 0000000000000000 R09: 0000000000000001
     Feb 8 02:21:51 Tower kernel: R10: 0000000000000000 R11: ffff88006e396e04 R12: 0000000000000000
     Feb 8 02:21:51 Tower kernel: R13: ffff880210ed72c0 R14: ffff8801ed17cb00 R15: 0000000000000dd6
     Feb 8 02:21:51 Tower kernel: FS: 0000151d42013a80(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
     Feb 8 02:21:51 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 8 02:21:51 Tower kernel: CR2: 00000000006a83f8 CR3: 0000000117b00003 CR4: 00000000001606e0
     Feb 8 02:21:51 Tower kernel: Call Trace:
     Feb 8 02:21:51 Tower kernel: show_pid_map+0xd/0x1d
     Feb 8 02:21:51 Tower kernel: seq_read+0x2a1/0x38b
     Feb 8 02:21:51 Tower kernel: __vfs_read+0x2e/0x133
     Feb 8 02:21:51 Tower kernel: ? vm_mmap_pgoff+0xa4/0xe2
     Feb 8 02:21:51 Tower kernel: vfs_read+0x9a/0x11f
     Feb 8 02:21:51 Tower kernel: ksys_read+0x58/0xa6
     Feb 8 02:21:51 Tower kernel: do_syscall_64+0x57/0xe6
     Feb 8 02:21:51 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
     Feb 8 02:21:51 Tower kernel: RIP: 0033:0x151d423485e1
     Feb 8 02:21:51 Tower kernel: Code: fe ff ff 50 48 8d 3d 2e 27 0a 00 e8 49 21 02 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 19 c1 0d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
     Feb 8 02:21:51 Tower kernel: RSP: 002b:00007ffe761b8e18 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
     Feb 8 02:21:51 Tower kernel: RAX: ffffffffffffffda RBX: 00007ffe761b8e50 RCX: 0000151d423485e1
     Feb 8 02:21:51 Tower kernel: RDX: 0000000000002000 RSI: 0000151d42f06000 RDI: 0000000000000004
     Feb 8 02:21:51 Tower kernel: RBP: 00007ffe761b8eec R08: 00000000ffffffff R09: 0000000000000000
     Feb 8 02:21:51 Tower kernel: R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000004
     Feb 8 02:21:51 Tower kernel: R13: 0000000000001000 R14: 00007ffe761b8ef0 R15: 0000000000002000
     Feb 8 02:21:51 Tower kernel: Modules linked in: veth xt_nat ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd mpt3sas e1000e glue_helper raid_class scsi_transport_sas intel_cstate intel_uncore ahci i2c_i801 intel_rapl_perf i2c_core libahci video backlight ie31200_edac button pcc_cpufreq
     Feb 8 02:21:51 Tower kernel: ---[ end trace 636252fd9269676e ]---
     Feb 8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
     Feb 8 02:21:51 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8
     Feb 8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
     Feb 8 02:21:51 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
     Feb 8 02:21:51 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
     Feb 8 02:21:51 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
     Feb 8 02:21:51 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
     Feb 8 02:21:51 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     Feb 8 02:21:51 Tower kernel: FS: 0000151d42013a80(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
     Feb 8 02:21:51 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 8 02:21:51 Tower kernel: CR2: 00000000006a83f8 CR3: 0000000117b00003 CR4: 00000000001606e0
     Feb 8 02:22:01 Tower kernel: general protection fault: 0000 [#3] SMP PTI
     Feb 8 02:22:01 Tower kernel: CPU: 3 PID: 17947 Comm: monitor Tainted: G D 4.18.20-unRAID #1
     Feb 8 02:22:01 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q77M vPro, BIOS P1.00 04/06/2012
     Feb 8 02:22:01 Tower kernel: RIP: 0010:vma_interval_tree_insert+0x2d/0x7b
     Feb 8 02:22:01 Tower kernel: Code: 08 48 89 f8 49 89 f0 45 31 c9 48 2b 17 4c 8b 97 98 00 00 00 48 c1 ea 0c 49 8d 7c 12 ff ba 01 00 00 00 49 8b 08 48 85 c9 74 1f <48> 39 79 18 73 04 48 89 79 18 4c 3b 51 40 4c 8d 41 10 72 06 4c 8d
     Feb 8 02:22:01 Tower kernel: RSP: 0018:ffffc9000fbf7d68 EFLAGS: 00010206
     Feb 8 02:22:01 Tower kernel: RAX: ffff880211664cc0 RBX: ffff8801fac03da8 RCX: 2000000000000000
     Feb 8 02:22:01 Tower kernel: RDX: 0000000000000000 RSI: ffff8801fac03dc8 RDI: 0000000000000000
     Feb 8 02:22:01 Tower kernel: RBP: ffff880210ed50c0 R08: ffff8801ee169120 R09: ffff8801ee169118
     Feb 8 02:22:01 Tower kernel: R10: 0000000000000000 R11: ffff8802116646e0 R12: ffff880211664cc0
     Feb 8 02:22:01 Tower kernel: R13: ffff8802116646e0 R14: ffff8802116646f0 R15: 0000000000000000
     Feb 8 02:22:01 Tower kernel: FS: 0000151d67f93740(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
     Feb 8 02:22:01 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 8 02:22:01 Tower kernel: CR2: 00000000011ec028 CR3: 0000000193198003 CR4: 00000000001606e0
     Feb 8 02:22:01 Tower kernel: Call Trace:
     Feb 8 02:22:01 Tower kernel: vma_link+0x63/0x7e
     Feb 8 02:22:01 Tower kernel: mmap_region+0x313/0x412
     Feb 8 02:22:01 Tower kernel: do_mmap+0x3e9/0x43f
     Feb 8 02:22:01 Tower kernel: vm_mmap_pgoff+0x99/0xe2
     Feb 8 02:22:01 Tower kernel: ksys_mmap_pgoff+0x6c/0x94
     Feb 8 02:22:01 Tower kernel: do_syscall_64+0x57/0xe6
     Feb 8 02:22:01 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
     Feb 8 02:22:01 Tower kernel: RIP: 0033:0x151d6bd850a3
     Feb 8 02:22:01 Tower kernel: Code: 54 41 89 d4 55 48 89 fd 53 4c 89 cb 48 85 ff 74 56 49 89 d9 45 89 f8 45 89 f2 44 89 e2 4c 89 ee 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7d 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f
     Feb 8 02:22:01 Tower kernel: RSP: 002b:00007ffeedbfd858 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
     Feb 8 02:22:01 Tower kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000151d6bd850a3
     Feb 8 02:22:01 Tower kernel: RDX: 0000000000000001 RSI: 0000000000000022 RDI: 0000000000000000
     Feb 8 02:22:01 Tower kernel: RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
     Feb 8 02:22:01 Tower kernel: R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000001
     Feb 8 02:22:01 Tower kernel: R13: 0000000000000022 R14: 0000000000000002 R15: 0000000000000004
     Feb 8 02:22:01 Tower kernel: Modules linked in: veth xt_nat ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd mpt3sas e1000e glue_helper raid_class scsi_transport_sas intel_cstate intel_uncore ahci i2c_i801 intel_rapl_perf i2c_core libahci video backlight ie31200_edac button pcc_cpufreq
     Feb 8 02:22:01 Tower kernel: ---[ end trace 636252fd9269676f ]---
     Feb 8 02:22:01 Tower kernel: RIP: 0010:__schedule+0x541/0x542
     Feb 8 02:22:01 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8
     Feb 8 02:22:01 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
     Feb 8 02:22:01 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
     Feb 8 02:22:01 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
     Feb 8 02:22:01 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
     Feb 8 02:22:01 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
     Feb 8 02:22:01 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
     Feb 8 02:22:01 Tower kernel: FS: 0000151d67f93740(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
     Feb 8 02:22:01 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     Feb 8 02:22:01 Tower kernel: CR2: 00000000011ec028 CR3: 0000000193198003 CR4: 00000000001606e0
  6. So this is resolved now. When I got home last night, I hard powered down the server because nothing else would shut it down. Afterwards, it came up OK with the array stopped (by design). Started it without the failed drive, stopped, and re-added the drive. It started to rebuild, but the speed was all over the place. Looked at the syslog, and noticed this a few times every second:
     Feb 10 09:48:44 Nobles-Home kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
     Feb 10 09:48:46 Nobles-Home kernel: ata3.00: configured for UDMA/33
     Feb 10 09:48:46 Nobles-Home kernel: ata3: EH complete
     Feb 10 09:48:46 Nobles-Home kernel: ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x10200 action 0xe frozen
     Feb 10 09:48:46 Nobles-Home kernel: ata3.00: irq_stat 0x00400000, PHY RDY changed
     Feb 10 09:48:46 Nobles-Home kernel: ata3: SError: { Persist PHYRdyChg }
     Feb 10 09:48:46 Nobles-Home kernel: ata3.00: failed command: READ DMA EXT
     Feb 10 09:48:46 Nobles-Home kernel: ata3.00: cmd 25/00:40:10:1b:c6/00:05:04:00:00/e0 tag 11 dma 688128 in
     Feb 10 09:48:46 Nobles-Home kernel: res 50/00:00:8f:ee:09/00:00:05:00:00/e0 Emask 0x10 (ATA bus error)
     Feb 10 09:48:46 Nobles-Home kernel: ata3.00: status: { DRDY }
     Feb 10 09:48:46 Nobles-Home kernel: ata3: hard resetting link
     The rebuild did eventually complete, though. I then started a parity check, which was running, but extremely slowly. Also, mid-morning, my cache drive simply dropped out of Unraid. At this point I began to suspect the power supply. Replaced the PSU tonight, and the errors are all gone. Running a parity check now in any event.
  7. The current status of my server is that I can't get into the WebGUI or browse shares. I can, however, get in via SSH/SCP. I'm running Unraid v6, but I'm not certain of the exact version. I don't believe that there were any system updates available, so that should make it 6.1.7, I think. It's on whitebox hardware: AMD FX-6300, MSI AM3 MB, 4xSATA HDDs. As far as add-ons and plugins go, it is mainly docker templates, with speedtest and powerdown. I wish that I could be more descriptive here, but without access to the WebGUI, I'm going from memory.
     Here's the somewhat longer version of the issue, and what I've done. First, I was moving some files around within a share, and noticed that some files disappeared suddenly. I then went and checked the WebGUI, which showed disk 2 being offline. From here, I attempted to stop the array, with the intent to reboot to see if the drive came back. The array began stopping, but then hung at something like "Stopping User Shares...". I tried going back to the main WebGUI page, but it would never load. Nor would any other part of the WebGUI load.
     From here, I went in via SSH, and ran the diagnostic command (link below). I went ahead and issued the powerdown -r command, which stated that the system was going down for reboot. The funny thing is that it never actually did go down. I can still open SSH sessions to it without issue, just not the WebGUI or SMB shares. This is the message that appears to be repeating multiple times per second in the syslog:
     Feb 9 14:55:02 Nobles-Home kernel: XFS (md2): xfs_log_force: error -5 returned.
     The archive was too large to attach, so here is a link to it on Google Drive: https://drive.google.com/file/d/0B7qWXJTelq0LellaMEVnVy1tTTg/view?usp=sharing
     Before I go nuclear and hard power cycle the box, what else can I do? Thanks for any assistance!
  8. I'm also seeing the same thing. Here is what my log says currently:
     -----------------------------------
     GID/UID
     -----------------------------------
     User uid: 99
     User gid: 100
     -----------------------------------
     Useing STABLE branch: Not up-to-date\Installed
     Checking som config options
     Oct 13 10:58:28 36631ad01c85 syslog-ng[50]: syslog-ng starting up; version='3.5.3'
     Looking at the GitHub repository, it looks like there were a handful of syntax changes this morning, so I'm wondering if one of them caused this to start.
  9. "I have added this to our todo list." You are a wizard. Thank you.
  10. Thank you for this. Works great for me. I do have one request though. Can you add the bits required for SSL to it? This is the warning generated when trying to enable SSL:
      The pyOpenSSL module is missing. Install this module to enable HTTPS. HTTPS will be disabled.
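
The kernel messages quoted in posts 5 to 7 above repeat many times in a syslog, so a quick tally can show which fault dominates. Below is a minimal Python sketch, not part of the original posts: the function name `tally_syslog` and the category labels are hypothetical, and the patterns are taken only from the excerpts quoted above, so differently worded messages in a real log would be missed.

```python
import re
from collections import Counter

# Patterns copied from the log excerpts quoted in the posts above.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "general_protection_fault": re.compile(r"kernel: general protection fault"),
    "ata_hard_reset": re.compile(r"kernel: ata\d+: hard resetting link"),
    "ata_bus_error": re.compile(r"\(ATA bus error\)"),
    "xfs_log_force_error": re.compile(r"xfs_log_force: error -?\d+ returned"),
}


def tally_syslog(lines):
    """Count how many lines match each pattern in PATTERNS.

    `lines` can be any iterable of strings, e.g. an open file object.
    """
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

To run it against a saved log, something like `with open(path, errors="replace") as fh: print(tally_syslog(fh))` should work, where `path` is wherever the syslog was saved (the posts do not give a path, so that is left to the reader).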