
Kernel Panic on 6.6.6


renobles


Recently my server has started having a kernel panic roughly once every 12 hours. The only way to get it back up is a hard reboot. I've enabled troubleshooting mode in CA Fix Common Problems and captured the output of the panic (below). Any ideas?

 

Feb  8 02:21:51 Tower kernel: general protection fault: 0000 [#1] SMP PTI
Feb  8 02:21:51 Tower kernel: CPU: 3 PID: 17930 Comm: sleep Not tainted 4.18.20-unRAID #1
Feb  8 02:21:51 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q77M vPro, BIOS P1.00 04/06/2012
Feb  8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
Feb  8 02:21:51 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8 
Feb  8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
Feb  8 02:21:51 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
Feb  8 02:21:51 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
Feb  8 02:21:51 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
Feb  8 02:21:51 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
Feb  8 02:21:51 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Feb  8 02:21:51 Tower kernel: FS:  0000150f6c708540(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
Feb  8 02:21:51 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  8 02:21:51 Tower kernel: CR2: 000014bf648da000 CR3: 0000000117b00001 CR4: 00000000001606e0
Feb  8 02:21:51 Tower kernel: Call Trace:
Feb  8 02:21:51 Tower kernel: ? do_nanosleep+0x81/0x161
Feb  8 02:21:51 Tower kernel: ? hrtimer_nanosleep+0x99/0xf9
Feb  8 02:21:51 Tower kernel: ? hrtimer_init+0x2/0x2
Feb  8 02:21:51 Tower kernel: ? __se_sys_nanosleep+0x79/0x94
Feb  8 02:21:51 Tower kernel: ? do_syscall_64+0x57/0xe6
Feb  8 02:21:51 Tower kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb  8 02:21:51 Tower kernel: Modules linked in: veth xt_nat ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd mpt3sas e1000e glue_helper raid_class scsi_transport_sas intel_cstate intel_uncore ahci i2c_i801 intel_rapl_perf i2c_core libahci video backlight ie31200_edac button pcc_cpufreq
Feb  8 02:21:51 Tower kernel: ---[ end trace 636252fd9269676d ]---
Feb  8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
Feb  8 02:21:51 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8 
Feb  8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
Feb  8 02:21:51 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
Feb  8 02:21:51 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
Feb  8 02:21:51 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
Feb  8 02:21:51 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
Feb  8 02:21:51 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Feb  8 02:21:51 Tower kernel: FS:  0000150f6c708540(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
Feb  8 02:21:51 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  8 02:21:51 Tower kernel: CR2: 000014bf648da000 CR3: 0000000117b00001 CR4: 00000000001606e0
Feb  8 02:21:51 Tower kernel: general protection fault: 0000 [#2] SMP PTI
Feb  8 02:21:51 Tower kernel: CPU: 3 PID: 17937 Comm: awk Tainted: G      D           4.18.20-unRAID #1
Feb  8 02:21:51 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q77M vPro, BIOS P1.00 04/06/2012
Feb  8 02:21:51 Tower kernel: RIP: 0010:show_map_vma.isra.5+0x99/0x134
Feb  8 02:21:51 Tower kernel: Code: d8 81 4c 89 e6 48 89 ef e8 76 1c fd ff eb 67 48 8b 83 90 00 00 00 48 85 c0 75 0f 48 89 df e8 56 ae eb ff 48 85 c0 75 7b eb 18 <48> 8b 40 58 48 85 c0 74 e8 48 89 df e8 4d 83 87 00 48 85 c0 75 63 
Feb  8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbe7da0 EFLAGS: 00010206
Feb  8 02:21:51 Tower kernel: RAX: 2000000000000000 RBX: ffff880212852b40 RCX: 0000000000000e06
Feb  8 02:21:51 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000020 RDI: ffff88008ee0a980
Feb  8 02:21:51 Tower kernel: RBP: ffff88008ee0a980 R08: 0000000000000000 R09: 0000000000000001
Feb  8 02:21:51 Tower kernel: R10: 0000000000000000 R11: ffff88006e396e04 R12: 0000000000000000
Feb  8 02:21:51 Tower kernel: R13: ffff880210ed72c0 R14: ffff8801ed17cb00 R15: 0000000000000dd6
Feb  8 02:21:51 Tower kernel: FS:  0000151d42013a80(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
Feb  8 02:21:51 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  8 02:21:51 Tower kernel: CR2: 00000000006a83f8 CR3: 0000000117b00003 CR4: 00000000001606e0
Feb  8 02:21:51 Tower kernel: Call Trace:
Feb  8 02:21:51 Tower kernel: show_pid_map+0xd/0x1d
Feb  8 02:21:51 Tower kernel: seq_read+0x2a1/0x38b
Feb  8 02:21:51 Tower kernel: __vfs_read+0x2e/0x133
Feb  8 02:21:51 Tower kernel: ? vm_mmap_pgoff+0xa4/0xe2
Feb  8 02:21:51 Tower kernel: vfs_read+0x9a/0x11f
Feb  8 02:21:51 Tower kernel: ksys_read+0x58/0xa6
Feb  8 02:21:51 Tower kernel: do_syscall_64+0x57/0xe6
Feb  8 02:21:51 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb  8 02:21:51 Tower kernel: RIP: 0033:0x151d423485e1
Feb  8 02:21:51 Tower kernel: Code: fe ff ff 50 48 8d 3d 2e 27 0a 00 e8 49 21 02 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 19 c1 0d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48 
Feb  8 02:21:51 Tower kernel: RSP: 002b:00007ffe761b8e18 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Feb  8 02:21:51 Tower kernel: RAX: ffffffffffffffda RBX: 00007ffe761b8e50 RCX: 0000151d423485e1
Feb  8 02:21:51 Tower kernel: RDX: 0000000000002000 RSI: 0000151d42f06000 RDI: 0000000000000004
Feb  8 02:21:51 Tower kernel: RBP: 00007ffe761b8eec R08: 00000000ffffffff R09: 0000000000000000
Feb  8 02:21:51 Tower kernel: R10: 000000000000000a R11: 0000000000000246 R12: 0000000000000004
Feb  8 02:21:51 Tower kernel: R13: 0000000000001000 R14: 00007ffe761b8ef0 R15: 0000000000002000
Feb  8 02:21:51 Tower kernel: Modules linked in: veth xt_nat ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd mpt3sas e1000e glue_helper raid_class scsi_transport_sas intel_cstate intel_uncore ahci i2c_i801 intel_rapl_perf i2c_core libahci video backlight ie31200_edac button pcc_cpufreq
Feb  8 02:21:51 Tower kernel: ---[ end trace 636252fd9269676e ]---
Feb  8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
Feb  8 02:21:51 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8 
Feb  8 02:21:51 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
Feb  8 02:21:51 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
Feb  8 02:21:51 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
Feb  8 02:21:51 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
Feb  8 02:21:51 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
Feb  8 02:21:51 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Feb  8 02:21:51 Tower kernel: FS:  0000151d42013a80(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
Feb  8 02:21:51 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  8 02:21:51 Tower kernel: CR2: 00000000006a83f8 CR3: 0000000117b00003 CR4: 00000000001606e0
Feb  8 02:22:01 Tower kernel: general protection fault: 0000 [#3] SMP PTI
Feb  8 02:22:01 Tower kernel: CPU: 3 PID: 17947 Comm: monitor Tainted: G      D           4.18.20-unRAID #1
Feb  8 02:22:01 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Q77M vPro, BIOS P1.00 04/06/2012
Feb  8 02:22:01 Tower kernel: RIP: 0010:vma_interval_tree_insert+0x2d/0x7b
Feb  8 02:22:01 Tower kernel: Code: 08 48 89 f8 49 89 f0 45 31 c9 48 2b 17 4c 8b 97 98 00 00 00 48 c1 ea 0c 49 8d 7c 12 ff ba 01 00 00 00 49 8b 08 48 85 c9 74 1f <48> 39 79 18 73 04 48 89 79 18 4c 3b 51 40 4c 8d 41 10 72 06 4c 8d 
Feb  8 02:22:01 Tower kernel: RSP: 0018:ffffc9000fbf7d68 EFLAGS: 00010206
Feb  8 02:22:01 Tower kernel: RAX: ffff880211664cc0 RBX: ffff8801fac03da8 RCX: 2000000000000000
Feb  8 02:22:01 Tower kernel: RDX: 0000000000000000 RSI: ffff8801fac03dc8 RDI: 0000000000000000
Feb  8 02:22:01 Tower kernel: RBP: ffff880210ed50c0 R08: ffff8801ee169120 R09: ffff8801ee169118
Feb  8 02:22:01 Tower kernel: R10: 0000000000000000 R11: ffff8802116646e0 R12: ffff880211664cc0
Feb  8 02:22:01 Tower kernel: R13: ffff8802116646e0 R14: ffff8802116646f0 R15: 0000000000000000
Feb  8 02:22:01 Tower kernel: FS:  0000151d67f93740(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
Feb  8 02:22:01 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  8 02:22:01 Tower kernel: CR2: 00000000011ec028 CR3: 0000000193198003 CR4: 00000000001606e0
Feb  8 02:22:01 Tower kernel: Call Trace:
Feb  8 02:22:01 Tower kernel: vma_link+0x63/0x7e
Feb  8 02:22:01 Tower kernel: mmap_region+0x313/0x412
Feb  8 02:22:01 Tower kernel: do_mmap+0x3e9/0x43f
Feb  8 02:22:01 Tower kernel: vm_mmap_pgoff+0x99/0xe2
Feb  8 02:22:01 Tower kernel: ksys_mmap_pgoff+0x6c/0x94
Feb  8 02:22:01 Tower kernel: do_syscall_64+0x57/0xe6
Feb  8 02:22:01 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Feb  8 02:22:01 Tower kernel: RIP: 0033:0x151d6bd850a3
Feb  8 02:22:01 Tower kernel: Code: 54 41 89 d4 55 48 89 fd 53 4c 89 cb 48 85 ff 74 56 49 89 d9 45 89 f8 45 89 f2 44 89 e2 4c 89 ee 48 89 ef b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7d 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 
Feb  8 02:22:01 Tower kernel: RSP: 002b:00007ffeedbfd858 EFLAGS: 00000246 ORIG_RAX: 0000000000000009
Feb  8 02:22:01 Tower kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000151d6bd850a3
Feb  8 02:22:01 Tower kernel: RDX: 0000000000000001 RSI: 0000000000000022 RDI: 0000000000000000
Feb  8 02:22:01 Tower kernel: RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
Feb  8 02:22:01 Tower kernel: R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000001
Feb  8 02:22:01 Tower kernel: R13: 0000000000000022 R14: 0000000000000002 R15: 0000000000000004
Feb  8 02:22:01 Tower kernel: Modules linked in: veth xt_nat ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs md_mod x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd mpt3sas e1000e glue_helper raid_class scsi_transport_sas intel_cstate intel_uncore ahci i2c_i801 intel_rapl_perf i2c_core libahci video backlight ie31200_edac button pcc_cpufreq
Feb  8 02:22:01 Tower kernel: ---[ end trace 636252fd9269676f ]---
Feb  8 02:22:01 Tower kernel: RIP: 0010:__schedule+0x541/0x542
Feb  8 02:22:01 Tower kernel: Code: 74 08 4c 89 e7 e8 c2 17 a3 ff 48 8b 45 d0 65 48 33 04 25 28 00 00 00 74 05 e8 28 59 a1 ff 58 5a 5b 41 5c 41 5d 41 5e 41 5f 5d <c3> 65 48 8b 04 25 00 5c 01 00 48 8b 50 10 48 85 d2 74 42 48 83 b8 
Feb  8 02:22:01 Tower kernel: RSP: 0018:ffffc9000fbd7e50 EFLAGS: 00010246
Feb  8 02:22:01 Tower kernel: RAX: ffff880212da7d80 RBX: ffffc9000fbd7ea8 RCX: ffff880214dff600
Feb  8 02:22:01 Tower kernel: RDX: 13ee998aded91800 RSI: 000000006c708540 RDI: ffff88021e3a0c00
Feb  8 02:22:01 Tower kernel: RBP: ffffc9000fbd7e98 R08: 000077ff80000000 R09: 0000000000000000
Feb  8 02:22:01 Tower kernel: R10: 0000000000000004 R11: ffff88021e3a0c80 R12: ffff880214dfec00
Feb  8 02:22:01 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Feb  8 02:22:01 Tower kernel: FS:  0000151d67f93740(0000) GS:ffff88021e380000(0000) knlGS:0000000000000000
Feb  8 02:22:01 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  8 02:22:01 Tower kernel: CR2: 00000000011ec028 CR3: 0000000193198003 CR4: 00000000001606e0
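
For anyone skimming a trace this long, it can help to boil each fault down to one line. The sketch below is only an illustration, not an unRAID tool: the function name and regexes are made up here, and it assumes the syslog format shown above. It records the fault number, CPU, PID, command and the first kernel RIP for each fault. Run over the full log above, it would show three faults within about ten seconds, all on CPU 3, hitting three unrelated processes (sleep, awk, monitor).

```python
import re

# Patterns match the syslog lines shown above; other kernels may format these differently.
FAULT_RE = re.compile(r"general protection fault: \d+ \[#(\d+)\]")
TASK_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")
RIP_RE = re.compile(r"RIP: 0010:(\S+)")  # 0010 is the kernel code segment


def summarize_faults(lines):
    """Return one dict per fault: number, cpu, pid, comm, and first kernel RIP."""
    faults = []
    current = None
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            current = {"n": int(m.group(1)), "cpu": None, "pid": None,
                       "comm": None, "rip": None}
            faults.append(current)
            continue
        if current is None:
            continue
        m = TASK_RE.search(line)
        if m and current["cpu"] is None:
            current["cpu"] = int(m.group(1))
            current["pid"] = int(m.group(2))
            current["comm"] = m.group(3)
            continue
        m = RIP_RE.search(line)
        # Keep only the first kernel RIP; the trace repeats an earlier RIP after "end trace".
        if m and current["rip"] is None:
            current["rip"] = m.group(1)
    return faults


# A few lines copied from the log above, trimmed for brevity.
SAMPLE = """\
Feb  8 02:21:51 Tower kernel: general protection fault: 0000 [#1] SMP PTI
Feb  8 02:21:51 Tower kernel: CPU: 3 PID: 17930 Comm: sleep Not tainted 4.18.20-unRAID #1
Feb  8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
Feb  8 02:21:51 Tower kernel: general protection fault: 0000 [#2] SMP PTI
Feb  8 02:21:51 Tower kernel: CPU: 3 PID: 17937 Comm: awk Tainted: G      D           4.18.20-unRAID #1
Feb  8 02:21:51 Tower kernel: RIP: 0010:show_map_vma.isra.5+0x99/0x134
Feb  8 02:21:51 Tower kernel: RIP: 0033:0x151d423485e1
Feb  8 02:21:51 Tower kernel: RIP: 0010:__schedule+0x541/0x542
"""

if __name__ == "__main__":
    for f in summarize_faults(SAMPLE.splitlines()):
        print(f"#{f['n']} cpu={f['cpu']} pid={f['pid']} comm={f['comm']} rip={f['rip']}")
```

To run it against a real log, pass the lines of the saved syslog file instead of SAMPLE.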

 

Link to comment

Thanks for the quick response. Attached is the diagnostics file. The server has generally been quite stable. As far as I can remember, this is the first time it has had a kernel panic and frozen hard. About a month ago there were some controller issues, so I put in an LSI controller and moved all the disks to it. Since then there were no issues until the first kernel panic around mid-week.

tower-diagnostics-20190209-2240.zip

Link to comment

I don't see anything alarming in your diagnostics. The only thing out of the ordinary is this at the very end of your syslog:

Feb  9 22:41:49 Tower kernel: sd 7:0:2:0: attempting task abort! scmd(00000000d4352fa5)
Feb  9 22:41:49 Tower kernel: sd 7:0:2:0: [sdd] tag#0 CDB: opcode=0x85 85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00
Feb  9 22:41:49 Tower kernel: scsi target7:0:2: handle(0x000a), sas_address(0x4433221101000000), phy(1)
Feb  9 22:41:49 Tower kernel: scsi target7:0:2: enclosure logical id(0x500605b001600880), slot(2) 
Feb  9 22:41:49 Tower kernel: sd 7:0:2:0: task abort: SUCCESS scmd(00000000d4352fa5)
Feb  9 22:41:49 Tower kernel: sd 7:0:2:0: Power-on or device reset occurred

which coincides with your request for the diagnostics dump. /dev/sdd is your Crucial BX SSD, which hasn't produced a SMART report, so my guess is that the reset was a result of the failed SMART request. It may not be a problem, but it's worth investigating. I've never used a Crucial SSD in any system I've built, but I vaguely remember reports on this forum of them sometimes behaving oddly, so a bit of research might pay off. It might also be worth moving it to a motherboard SATA port. Your SanDisk SSD is fine; I've used them a lot and never had any issues.
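
As a rough check on the SMART guess, the CDB from the log line above can be decoded by hand. Opcode 0x85 is ATA PASS-THROUGH (16) in the SCSI/ATA Translation spec, and the sketch below pulls out the ATA command and feature bytes. The function names are invented for illustration and the byte offsets are my reading of that spec, so treat it as a sketch rather than a reference decoder. For this CDB the command byte is 0xB0 (SMART) and the feature byte is 0xD0 (SMART READ DATA), with the usual 0x4F/0xC2 signature in the LBA mid/high bytes, which is consistent with the aborted command being a SMART read.

```python
def decode_ata_passthrough16(cdb_hex):
    """Decode the fields of interest from an ATA PASS-THROUGH (16) CDB.

    cdb_hex is the space-separated hex string printed by the kernel.
    Byte offsets are assumed from the SAT layout for opcode 0x85.
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x85:
        raise ValueError("not a 16-byte ATA PASS-THROUGH CDB")
    return {
        "protocol": (cdb[1] >> 1) & 0x0F,   # 4 = PIO Data-In
        "extend": cdb[1] & 0x01,            # 1 = 48-bit command
        "features": (cdb[3] << 8) | cdb[4],
        "count": (cdb[5] << 8) | cdb[6],
        "lba_mid": cdb[10],                 # LBA bits 15:8
        "lba_high": cdb[12],                # LBA bits 23:16
        "device": cdb[13],
        "command": cdb[14],
    }


def is_smart_read_data(fields):
    """True if the decoded fields look like an ATA SMART READ DATA request."""
    return (
        fields["command"] == 0xB0               # SMART
        and (fields["features"] & 0xFF) == 0xD0  # READ DATA subcommand
        and fields["lba_mid"] == 0x4F            # SMART signature bytes
        and fields["lba_high"] == 0xC2
    )


if __name__ == "__main__":
    # CDB copied from the "[sdd] tag#0 CDB" line above.
    fields = decode_ata_passthrough16(
        "85 08 0e 00 d0 00 01 00 00 00 4f 00 c2 00 b0 00"
    )
    for name, value in fields.items():
        print(f"{name}: 0x{value:02x}")
    print("SMART READ DATA:", is_smart_read_data(fields))
```

This only shows what the host asked for; it doesn't say why the drive failed to answer, so checking the drive on a motherboard SATA port is still a useful test.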

Link to comment

