  • [6.9.1] Kernel Panic(?) on array start (after cache config modification)


    johner
    • Urgent

    So I had everything running nicely: a 23-disk Unraid array, a 2x NVMe RAID1 cache pool, and a single-SSD second pool (cache2).

     

    My original config had 4 cache slots but only 2 drives. I stopped the array and shrank the slot count to 2; after restarting the array, the machine froze. I couldn't access the logs via the web portal, but I could see on the console (and via SSH) that something serious had happened.

     

    I had to reboot to free it up. Fortunately I have it set to not start the array automatically, because every start of the array froze it again with the same errors on the console (lots of hex pairs, I think).

     

    I removed the cache config, recreated it (2 slots!), formatted again and now it works fine.

     

    I can no longer see the log entries, and don't have a copy - I assume due to the reboot? Is there an archive of them anywhere? I've not changed any logging settings.

     

    I would try to reproduce this and capture the logs but I'm getting pressure from the family to get it back up and running 🙂

     

    Just happened again:

     

     

    Mar 25 10:00:46 Tower kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
    Mar 25 10:00:46 Tower kernel: #PF: supervisor read access in kernel mode
    Mar 25 10:00:46 Tower kernel: #PF: error_code(0x0000) - not-present page
    Mar 25 10:00:46 Tower kernel: PGD 8000000169cbd067 P4D 8000000169cbd067 PUD 16a028067 PMD 0
    Mar 25 10:00:46 Tower kernel: Oops: 0000 [#1] SMP PTI
    Mar 25 10:00:46 Tower kernel: CPU: 0 PID: 1402 Comm: btrfs Not tainted 5.10.21-Unraid #1
    Mar 25 10:00:46 Tower kernel: Hardware name: Supermicro Super Server/X10SRH-CLN4F, BIOS 3.2 11/22/2019
    Mar 25 10:00:46 Tower kernel: RIP: 0010:strcmp+0x2/0x1a
    Mar 25 10:00:46 Tower kernel: Code: ef 4c 89 c0 c3 48 89 f8 48 89 fa 8a 0a 48 89 d7 48 8d 52 01 84 c9 75 f3 31 d2 8a 0c 16 88 0c 17 48 ff c2 84 c9 75 f3 c3 31 c0 <8a> 14 07 3a 14 06 74 06 19 c0 83 c8 01 c3 48 ff c0 84 d2 75 eb 31
    Mar 25 10:00:46 Tower kernel: RSP: 0018:ffffc900017f7d78 EFLAGS: 00010246
    Mar 25 10:00:46 Tower kernel: RAX: 0000000000000000 RBX: ffff8881559c5800 RCX: 0000000000000000
    Mar 25 10:00:46 Tower kernel: RDX: 0000000000000001 RSI: ffffffff81d9d9b5 RDI: 0000000000000000
    Mar 25 10:00:46 Tower kernel: RBP: fffffffffffffffe R08: 0000000000000001 R09: 0000000000004000
    Mar 25 10:00:46 Tower kernel: R10: ffffc900017f7f08 R11: 0000000000000000 R12: 0000000000000006
    Mar 25 10:00:46 Tower kernel: R13: ffff88816a984000 R14: 0000000000000000 R15: 00000000000037b4
    Mar 25 10:00:46 Tower kernel: FS:  00001498a6a99d40(0000) GS:ffff88903fa00000(0000) knlGS:0000000000000000
    Mar 25 10:00:46 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 25 10:00:46 Tower kernel: CR2: 0000000000000000 CR3: 0000000163544001 CR4: 00000000001706f0
    Mar 25 10:00:46 Tower kernel: Call Trace:
    Mar 25 10:00:46 Tower kernel: btrfs_rm_device+0x10b/0x4ad
    Mar 25 10:00:46 Tower kernel: btrfs_ioctl+0xced/0x2c28
    Mar 25 10:00:46 Tower kernel: ? getname_flags+0x44/0x146
    Mar 25 10:00:46 Tower kernel: ? vfs_statx+0x72/0x105
    Mar 25 10:00:46 Tower kernel: ? vfs_ioctl+0x19/0x26
    Mar 25 10:00:46 Tower kernel: vfs_ioctl+0x19/0x26
    Mar 25 10:00:46 Tower kernel: __do_sys_ioctl+0x51/0x74
    Mar 25 10:00:46 Tower kernel: do_syscall_64+0x5d/0x6a
    Mar 25 10:00:46 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    Mar 25 10:00:46 Tower kernel: RIP: 0033:0x1498a6bb1417
    Mar 25 10:00:46 Tower kernel: Code: 00 00 90 48 8b 05 79 2a 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 49 2a 0d 00 f7 d8 64 89 01 48
    Mar 25 10:00:46 Tower kernel: RSP: 002b:00007ffe3629bf38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    Mar 25 10:00:46 Tower kernel: RAX: ffffffffffffffda RBX: 00007ffe3629e0f0 RCX: 00001498a6bb1417
    Mar 25 10:00:46 Tower kernel: RDX: 00007ffe3629cf60 RSI: 000000005000943a RDI: 0000000000000004
    Mar 25 10:00:46 Tower kernel: RBP: 0000000000000001 R08: 1999999999999999 R09: 0000000000000000
    Mar 25 10:00:46 Tower kernel: R10: 00001498a6c33ac0 R11: 0000000000000246 R12: 0000000000000002
    Mar 25 10:00:46 Tower kernel: R13: 00007ffe3629cf60 R14: 0000000000000004 R15: 00007ffe3629e138
    Mar 25 10:00:46 Tower kernel: Modules linked in: md_mod xt_MASQUERADE iptable_nat nf_nat ip6table_filter ip6_tables iptable_filter ip_tables igb i2c_algo_bit sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel a>
    Mar 25 10:00:46 Tower kernel: CR2: 0000000000000000
    Mar 25 10:00:46 Tower kernel: ---[ end trace 7e2ba978fa8ddfc8 ]---
    Mar 25 10:00:46 Tower kernel: RIP: 0010:strcmp+0x2/0x1a
    Mar 25 10:00:46 Tower kernel: Code: ef 4c 89 c0 c3 48 89 f8 48 89 fa 8a 0a 48 89 d7 48 8d 52 01 84 c9 75 f3 31 d2 8a 0c 16 88 0c 17 48 ff c2 84 c9 75 f3 c3 31 c0 <8a> 14 07 3a 14 06 74 06 19 c0 83 c8 01 c3 48 ff c0 84 d2 75 eb 31
    Mar 25 10:00:46 Tower kernel: RSP: 0018:ffffc900017f7d78 EFLAGS: 00010246
    Mar 25 10:00:46 Tower kernel: RAX: 0000000000000000 RBX: ffff8881559c5800 RCX: 0000000000000000
    Mar 25 10:00:46 Tower kernel: RDX: 0000000000000001 RSI: ffffffff81d9d9b5 RDI: 0000000000000000
    Mar 25 10:00:46 Tower kernel: RBP: fffffffffffffffe R08: 0000000000000001 R09: 0000000000004000
    Mar 25 10:00:46 Tower kernel: R10: ffffc900017f7f08 R11: 0000000000000000 R12: 0000000000000006
    Mar 25 10:00:46 Tower kernel: R13: ffff88816a984000 R14: 0000000000000000 R15: 00000000000037b4
    Mar 25 10:00:46 Tower kernel: FS:  00001498a6a99d40(0000) GS:ffff88903fa00000(0000) knlGS:0000000000000000
    Mar 25 10:00:46 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Mar 25 10:00:46 Tower kernel: CR2: 0000000000000000 CR3: 0000000163544001 CR4: 00000000001706f0
    Mar 25 10:00:48 Tower emhttpd: shcmd (2307): mkdir -p /mnt/scratch
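
For anyone skimming the trace above, the key facts are the faulting function (the first kernel-mode `RIP:` line) and the confirmed frames under `Call Trace:`. Here that reads as `strcmp` being reached from `btrfs_rm_device` with a zero address in RDI and CR2, which looks like a NULL string pointer, though that is only my reading of the registers. The snippet below is a rough sketch of pulling those facts out of a saved syslog excerpt. The function name `summarize_oops` and the trimmed `SAMPLE` text are my own for illustration, and the regexes assume the line layout shown in this post.

```python
import re

# Trimmed copy of the trace from this post (only the lines the sketch needs).
SAMPLE = """\
Mar 25 10:00:46 Tower kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Mar 25 10:00:46 Tower kernel: RIP: 0010:strcmp+0x2/0x1a
Mar 25 10:00:46 Tower kernel: Call Trace:
Mar 25 10:00:46 Tower kernel: btrfs_rm_device+0x10b/0x4ad
Mar 25 10:00:46 Tower kernel: btrfs_ioctl+0xced/0x2c28
Mar 25 10:00:46 Tower kernel: ? getname_flags+0x44/0x146
Mar 25 10:00:46 Tower kernel: vfs_ioctl+0x19/0x26
Mar 25 10:00:46 Tower kernel: RIP: 0033:0x1498a6bb1417
"""

# Strips the "<date> <host> kernel: " prefix (assumed syslog layout).
PREFIX = re.compile(r"^.*? kernel: ")
# A stack frame such as "btrfs_ioctl+0xced/0x2c28", optionally marked "? ".
FRAME = re.compile(r"^(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+$")
# Kernel-mode instruction pointer (code segment 0010) with a symbol name.
RIP = re.compile(r"^RIP: 0010:([A-Za-z_][\w.]*)\+0x")


def summarize_oops(text):
    """Return (faulting_function, confirmed_call_trace_frames)."""
    fault = None
    frames = []
    in_trace = False
    for raw in text.splitlines():
        msg = PREFIX.sub("", raw.strip(), count=1)
        if fault is None:
            m = RIP.match(msg)
            if m:
                fault = m.group(1)
        if msg == "Call Trace:":
            in_trace = True
            continue
        if in_trace:
            m = FRAME.match(msg)
            if m is None:
                in_trace = False          # first non-frame line ends the trace
            elif m.group(1) is None:
                frames.append(m.group(2))  # skip "?" (unreliable) frames
    return fault, frames


if __name__ == "__main__":
    fault, frames = summarize_oops(SAMPLE)
    print("faulting function:", fault)
    print("call chain:", " <- ".join(frames))
```

Frames prefixed with `?` are skipped because the kernel marks them as unreliable guesses from the stack scan, so the remaining chain is the part worth quoting in a report.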

     

    tower-diagnostics-20210325-1500.zip




