Zeron Posted June 4, 2012

I tried upgrading from beta-6a to 5c3 and get the panic below at boot, just as the mpt2sas driver is detecting the drives. It appears that this bug is fixed by a patch in 3.0.33 (see below). Is there any chance of 5.0rc4 using 3.0.33?

Panic:

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c101f8dd>] __wake_up_common+0x17/0x5c
*pdpt = 000000000c934001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in: e1000 mpt2sas(+) scsi_transport_sas raid_class piix(+)
Pid: 833, comm: modprobe Not tainted 3.0.31-unRAID #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
EIP: 0060:[<c101f8dd>] EFLAGS: 00010093 CPU: 0
EIP is at __wake_up_common+0x17/0x5c
EAX: dd6c84e8 EBX: fffffff4 ECX: 00000001 EDX: 00000003
ESI: 00000092 EDI: 00000003 EBP: dd40ff4c ESP: dd40ff34
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process modprobe (pid: 833, ti=dd40e000 task=cc9014a0 task.ti=d911e000)
Stack:
c14c9600 dd6d3fa8 00000001 dd6c84e4 00000092 dd6c84e0 dd40ff68 c10202e5
00000000 00000000 dd6c837c d9800020 dd602620 dd40ff78 e0869c9e d9708000
0000000d dd40ffb8 e086a720 19800000 19800000 dd40ffa4 dd6c8390 00000000
Call Trace:
[<c10202e5>] complete+0x2b/0x3e
[<e0869c9e>] mpt2sas_base_done+0x6e/0x74 [mpt2sas]
[<e086a720>] _base_interrupt+0x133/0x2f8 [mpt2sas]
[<c104e61b>] handle_irq_event_percpu+0x24/0x100
[<c105038e>] ? irq_set_chip_and_handler_name+0x24/0x24
[<c104e71b>] handle_irq_event+0x24/0x3b
[<c105038e>] ? irq_set_chip_and_handler_name+0x24/0x24
[<c105043c>] handle_edge_irq+0xae/0xce
<IRQ>
[<c1003622>] ? do_IRQ+0x37/0x90
[<c130fa69>] ? common_interrupt+0x29/0x30
[<e086858c>] ? _base_event_notification+0xef/0x1af [mpt2sas]
[<e0868b0a>] ? _base_make_ioc_operational+0x2f1/0x36e [mpt2sas]
[<e086bb47>] ? mpt2sas_base_attach+0x447/0x558 [mpt2sas]
[<e08703a5>] ? _scsih_probe+0x310/0x3fe [mpt2sas]
[<c11af52a>] ? local_pci_probe+0x3e/0x81
[<c11afc89>] ? pci_device_probe+0x43/0x66
[<c1207e52>] ? really_probe+0x72/0xed
[<c1207ef9>] ? driver_probe_device+0x2c/0x41
[<c1207f51>] ? __driver_attach+0x43/0x5f
[<c1207838>] ? bus_for_each_dev+0x3d/0x67
[<c1207d19>] ? driver_attach+0x14/0x16
[<c1207f0e>] ? driver_probe_device+0x41/0x41
[<c12072cb>] ? bus_add_driver+0x9d/0x1cd
[<c12083ca>] ? driver_register+0x7c/0xe3
[<c11ff477>] ? misc_register+0x91/0xe6
[<c11afe4e>] ? __pci_register_driver+0x38/0x95
[<e088610c>] ? _scsih_init+0x10c/0x12d [mpt2sas]
[<c1001159>] ? do_one_initcall+0x71/0x113
[<e0886000>] ? 0xe0885fff
[<c104cae6>] ? sys_init_module+0x61/0x18b
[<c130ed05>] ? syscall_call+0x7/0xb
Code: 7e 08 01 5a 30 11 72 34 eb 06 01 5a 28 11 72 2c 5b 5e 5d c3 55 89 e5 57 89 d7 56 53 83 ec 0c 89 4d f0 8b 58 04 83 c0 04 83 eb 0c <8b> 73 0c 89 45 e8 83 ee 0c eb 2a 8b 03 89 fa ff 75 0c 8b 4d 08
EIP: [<c101f8dd>] __wake_up_common+0x17/0x5c SS:ESP 0068:dd40ff34
CR2: 0000000000000000
---[ end trace 22b2c2d4edffda35 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 833, comm: modprobe Tainted: G D 3.0.31-unRAID #1
Call Trace:
[<c130ce2f>] panic+0x50/0x13a
[<c100490a>] oops_end+0x6e/0x7c
[<c101b2d9>] no_context+0xac/0xb6
[<c101b3cb>] __bad_area_nosemaphore+0xe8/0xf0
[<c101b582>] ? mm_fault_error+0x129/0x129
[<c101b3e0>] bad_area_nosemaphore+0xd/0x10
[<c101b6d2>] do_page_fault+0x150/0x332
[<c10202ee>] ? complete+0x34/0x3e
[<c107ace1>] ? dma_pool_free+0xcd/0xd5
[<c101b582>] ? mm_fault_error+0x129/0x129
[<c130f302>] error_code+0x5a/0x60
[<c103007b>] ? task_ns_capable+0x1/0x16
[<c101b582>] ? mm_fault_error+0x129/0x129
[<c101f8dd>] ? __wake_up_common+0x17/0x5c
[<c10202e5>] complete+0x2b/0x3e
[<e0869c9e>] mpt2sas_base_done+0x6e/0x74 [mpt2sas]
[<e086a720>] _base_interrupt+0x133/0x2f8 [mpt2sas]
[<c104e61b>] handle_irq_event_percpu+0x24/0x100
[<c105038e>] ? irq_set_chip_and_handler_name+0x24/0x24
[<c104e71b>] handle_irq_event+0x24/0x3b
[<c105038e>] ? irq_set_chip_and_handler_name+0x24/0x24
[<c105043c>] handle_edge_irq+0xae/0xce
<IRQ>
[<c1003622>] ? do_IRQ+0x37/0x90
[<c130fa69>] ? common_interrupt+0x29/0x30
[<e086858c>] ? _base_event_notification+0xef/0x1af [mpt2sas]
[<e0868b0a>] ? _base_make_ioc_operational+0x2f1/0x36e [mpt2sas]
[<e086bb47>] ? mpt2sas_base_attach+0x447/0x558 [mpt2sas]
[<e08703a5>] ? _scsih_probe+0x310/0x3fe [mpt2sas]
[<c11af52a>] ? local_pci_probe+0x3e/0x81
[<c11afc89>] ? pci_device_probe+0x43/0x66
[<c1207e52>] ? really_probe+0x72/0xed
[<c1207ef9>] ? driver_probe_device+0x2c/0x41
[<c1207f51>] ? __driver_attach+0x43/0x5f
[<c1207838>] ? bus_for_each_dev+0x3d/0x67
[<c1207d19>] ? driver_attach+0x14/0x16
[<c1207f0e>] ? driver_probe_device+0x41/0x41
[<c12072cb>] ? bus_add_driver+0x9d/0x1cd
[<c12083ca>] ? driver_register+0x7c/0xe3
[<c11ff477>] ? misc_register+0x91/0xe6
[<c11afe4e>] ? __pci_register_driver+0x38/0x95
[<e088610c>] ? _scsih_init+0x10c/0x12d [mpt2sas]
[<c1001159>] ? do_one_initcall+0x71/0x113
[<e0886000>] ? 0xe0885fff
[<c104cae6>] ? sys_init_module+0x61/0x18b
[<c130ed05>] ? syscall_call+0x7/0xb

Fix from 3.0.33:

commit 35d73fe5e3d8c72a41c2eaf285a9bfb7b6c66aee
Author: [email protected] <[email protected]>
Date: Tue Mar 20 12:10:01 2012 +0530

SCSI: mpt2sas: Fix for panic happening because of improper memory allocation

commit e42fafc25fa86c61824e8d4c5e7582316415d24f upstream.

The ioc->pfacts member in the IOC structure is getting set to zero following a call to _base_get_ioc_facts due to the memset in that routine. So if the ioc->pfacts was read after a host reset, there would be a NULL pointer dereference. The routine _base_get_ioc_facts is called from context of host reset.

The problem in _base_get_ioc_facts is the size of Mpi2IOCFactsReply is 64, whereas the sizeof "struct mpt2sas_facts" is 60, so there is a four byte overflow resulting from the memset.

Also, there is memset in _base_get_port_facts using the incorrect structure, it should be "struct mpt2sas_port_facts" instead of Mpi2PortFactsReply.

Signed-off-by: Nagalakshmi Nandigama <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
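To illustrate the overflow the commit message describes, here is a minimal user-space sketch in C. It is not the driver's code: `struct demo_ioc`, `neighbour`, and `clear_facts` are hypothetical names invented for this example, and only the two sizes (64 and 60 bytes) come from the commit message. A `uint32_t` stands in for whatever field sits directly after the facts structure, presumably a 32-bit pointer such as ioc->pfacts on the i386 kernel shown in the panic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REPLY_SIZE 64 /* size of Mpi2IOCFactsReply, per the commit message */
#define FACTS_SIZE 60 /* sizeof(struct mpt2sas_facts), per the commit message */

/* Hypothetical stand-in for the IOC structure: a 60-byte facts area
 * followed directly by another field. */
struct demo_ioc {
    unsigned char facts[FACTS_SIZE];
    uint32_t neighbour; /* stands in for the field that follows facts */
};

/* Zero `len` bytes starting at the facts area and return the value of the
 * neighbouring field afterwards. The write goes through a byte pointer to
 * the whole structure, so it stays inside the demo object even for len 64. */
static uint32_t clear_facts(struct demo_ioc *ioc, size_t len)
{
    /* Layout assumptions of this sketch: no padding after facts, and the
     * requested length never leaves the demo structure. */
    assert(offsetof(struct demo_ioc, neighbour) == FACTS_SIZE);
    assert(offsetof(struct demo_ioc, facts) + len <= sizeof(*ioc));

    memset((unsigned char *)ioc + offsetof(struct demo_ioc, facts), 0, len);
    return ioc->neighbour;
}
```

With `neighbour` set to a non-zero value, `clear_facts(&ioc, FACTS_SIZE)` leaves it untouched, while `clear_facts(&ioc, REPLY_SIZE)` zeroes it. That second case mirrors the four-byte overflow from the commit message: clearing with the size of the larger reply structure wipes the field next to the smaller one.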
limetech Posted June 4, 2012

LOL I was just looking at kernel.org... That fix may or may not fix the issue you are seeing, but I can't upgrade to 3.0.33 because they haven't posted the full source yet. I always upgrade kernels from full source; I've been burned by faulty patches in the past.
Zeron Posted June 4, 2012

The file is on kernel.org, just not linked from the front page yet: http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.0.33.tar.gz
limetech Posted June 4, 2012

It's there now... we have been looking at it simultaneously with whoever is posting it.
Zeron Posted July 12, 2012

5rc6-r8168-test has fixed this issue.