• Unraid 6.5.3RC1 failed unmount


    Jerky_san
    • Closed

    Jun 10 11:46:28 Tower kernel: ------------[ cut here ]------------
    Jun 10 11:46:28 Tower kernel: WARNING: CPU: 2 PID: 8996 at fs/namespace.c:1169 cleanup_mnt+0x11/0x5c
    Jun 10 11:46:28 Tower kernel: Modules linked in: md_mod xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables tun xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs bonding edac_mce_amd crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper igb mpt3sas wmi_bmof mxm_wmi cryptd ptp i2c_piix4 pps_core i2c_algo_bit i2c_core raid_class scsi_transport_sas ccp ahci nvme libahci nvme_core wmi button acpi_cpufreq [last unloaded: kvm]
    Jun 10 11:46:28 Tower kernel: CPU: 2 PID: 8996 Comm: umount Tainted: G      D W       4.14.41-unRAID #1
    Jun 10 11:46:28 Tower kernel: Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 4011 04/19/2018
    Jun 10 11:46:28 Tower kernel: task: ffff880575bc8000 task.stack: ffffc9000d4d8000
    Jun 10 11:46:28 Tower kernel: RIP: 0010:cleanup_mnt+0x11/0x5c
    Jun 10 11:46:28 Tower kernel: RSP: 0018:ffffc9000d4dbef8 EFLAGS: 00010202
    Jun 10 11:46:28 Tower kernel: RAX: 0000000000000001 RBX: ffff88048325b780 RCX: 0000000000000000
    Jun 10 11:46:28 Tower kernel: RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffffff81ca04c0
    Jun 10 11:46:28 Tower kernel: RBP: 0000000000000000 R08: ffffffffffff0000 R09: 000000000000ffff
    Jun 10 11:46:28 Tower kernel: R10: 0000000000000000 R11: ffff880575bc8080 R12: ffffffff81fd8390
    Jun 10 11:46:28 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    Jun 10 11:46:28 Tower kernel: FS:  0000153bb2bc9780(0000) GS:ffff88101ec80000(0000) knlGS:0000000000000000
    Jun 10 11:46:28 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Jun 10 11:46:28 Tower kernel: CR2: 000014b7b5803be0 CR3: 0000000fd409e000 CR4: 00000000003406e0
    Jun 10 11:46:28 Tower kernel: Call Trace:
    Jun 10 11:46:28 Tower kernel: task_work_run+0x77/0x8b
    Jun 10 11:46:28 Tower kernel: exit_to_usermode_loop+0x46/0x75
    Jun 10 11:46:28 Tower kernel: do_syscall_64+0xf7/0xfe
    Jun 10 11:46:28 Tower kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    Jun 10 11:46:28 Tower kernel: RIP: 0033:0x153bb1e3bec7
    Jun 10 11:46:28 Tower kernel: RSP: 002b:00007fffc7d72988 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
    Jun 10 11:46:28 Tower kernel: RAX: 0000000000000000 RBX: 00000000006072b0 RCX: 0000153bb1e3bec7
    Jun 10 11:46:28 Tower kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000060a3a0
    Jun 10 11:46:28 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000153bb1e85a30
    Jun 10 11:46:28 Tower kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 000000000060a3a0
    Jun 10 11:46:28 Tower kernel: R13: 0000153bb29b4ed0 R14: 0000000000607490 R15: 0000000000000000
    Jun 10 11:46:28 Tower kernel: Code: c8 49 8b 14 24 48 8b 0c cd c0 83 b9 81 03 5c 0a 04 eb d9 89 d8 5b 5d 41 5c c3 53 48 89 fb 48 83 c7 48 e8 b2 ff ff ff 85 c0 74 02 <0f> 0b 48 83 bb 28 01 00 00 00 74 08 48 89 df e8 a3 d0 00 00 48
    Jun 10 11:46:28 Tower kernel: ---[ end trace 97d33381545a92b2 ]---
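    For readers skimming the trace, the header line carries the key facts: which source file and line raised the warning, and in which function. The sketch below pulls those fields out with a regular expression. The pattern and the `parse_warning` helper are illustrative assumptions written for this one log layout, not part of any Unraid or kernel tooling.

```python
import re

# Header line copied verbatim from the trace above.
WARNING_LINE = (
    "Jun 10 11:46:28 Tower kernel: WARNING: CPU: 2 PID: 8996 "
    "at fs/namespace.c:1169 cleanup_mnt+0x11/0x5c"
)

# Hypothetical pattern for this layout; other kernel versions may format it differently.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_warning(line):
    """Return the fields of a kernel WARNING header as a dict, or None."""
    match = WARNING_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    for key in ("cpu", "pid", "line"):
        info[key] = int(info[key])
    for key in ("offset", "size"):
        info[key] = int(info[key], 16)  # hex offsets such as 0x11
    return info


info = parse_warning(WARNING_LINE)
print(info["file"], info["line"], info["func"])

# ORIG_RAX in the user-space register dump holds the syscall number.
syscall_nr = int("00000000000000a6", 16)
print(syscall_nr)
```

    On x86-64, syscall 166 is `umount2`, which matches the `Comm: umount` field in the trace.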

    tower-diagnostics-20180610-1153.zip




    User Feedback

    Recommended Comments

    You have a problem with your cache pool: it's failing to add a second device.

     

    Jun 10 11:38:02 Tower kernel: BTRFS info (device sdq1): relocating block group 1821087105024 flags data
    Jun 10 11:38:03 Tower kernel: BTRFS info (device sdq1): found 1412 extents
    Jun 10 11:38:05 Tower kernel: BTRFS info (device sdq1): found 1412 extents
    Jun 10 11:38:05 Tower kernel: BTRFS info (device sdq1): relocating block group 1820013363200 flags metadata
    Jun 10 11:38:07 Tower kernel: BTRFS: Transaction aborted (error -28)
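    The number in "error -28" is a negated errno value. As a rough sketch, it can be decoded with Python's standard `errno` module, assuming the script runs on a Linux system where the errno numbering matches the kernel's. The `decode_kernel_error` helper is a hypothetical name used only for illustration.

```python
import errno
import os
import re

# Log line copied verbatim from the excerpt above.
LOG_LINE = "Jun 10 11:38:07 Tower kernel: BTRFS: Transaction aborted (error -28)"


def decode_kernel_error(line):
    """Return (symbolic name, description) for an '(error -N)' log line."""
    match = re.search(r"\(error (-?\d+)\)", line)
    if match is None:
        return None
    code = abs(int(match.group(1)))  # the kernel logs the errno negated
    # Assumption: the local platform's errno table matches the Linux kernel's.
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


name, message = decode_kernel_error(LOG_LINE)
print(name, "-", message)
```

    On Linux this maps 28 to `ENOSPC`, meaning the filesystem ran out of space while relocating the metadata block group.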

     

    It's usually easiest to back up and redo the pool; you can use this thread if you need help with the backup.

     

    You're also having issues with your NVMe device:

     

    Jun 10 11:07:39 Tower kernel: nvme nvme0: I/O 768 QID 7 timeout, aborting
    Jun 10 11:07:39 Tower kernel: nvme nvme0: I/O 769 QID 7 timeout, aborting
    Jun 10 11:07:39 Tower kernel: nvme nvme0: I/O 770 QID 7 timeout, aborting
    Jun 10 11:07:39 Tower kernel: nvme nvme0: I/O 771 QID 7 timeout, aborting
    Jun 10 11:07:39 Tower kernel: nvme nvme0: I/O 772 QID 7 timeout, aborting
    Jun 10 11:08:09 Tower kernel: nvme nvme0: I/O 768 QID 7 timeout, reset controller
    Jun 10 11:08:40 Tower kernel: nvme nvme0: I/O 16 QID 0 timeout, reset controller
    Jun 10 11:10:11 Tower kernel: nvme nvme0: Device not ready; aborting reset
    Jun 10 11:10:11 Tower kernel: nvme nvme0: Abort status: 0x7
    Jun 10 11:10:11 Tower kernel: nvme nvme0: Abort status: 0x7
    Jun 10 11:10:11 Tower kernel: nvme nvme0: Abort status: 0x7
    Jun 10 11:10:11 Tower kernel: nvme nvme0: Abort status: 0x7
    Jun 10 11:10:11 Tower kernel: nvme nvme0: Abort status: 0x7
    Jun 10 11:11:12 Tower kernel: nvme nvme0: Device not ready; aborting reset
    Jun 10 11:11:12 Tower kernel: nvme nvme0: Removing after probe failure status: -19
    Jun 10 11:11:12 Tower kernel: nvme0n1: detected capacity change from 1024209543168 to 0
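    The sequence above (timeouts, controller resets, then removal) can be summarized mechanically. The following is a minimal sketch, assuming the log layout shown here; it uses a subset of the lines copied from the excerpt, and the `summarize` helper is a hypothetical name. The errno lookup again assumes Linux numbering.

```python
import errno
import re
from collections import Counter

# A subset of the lines above, copied verbatim.
LOG = """\
Jun 10 11:07:39 Tower kernel: nvme nvme0: I/O 768 QID 7 timeout, aborting
Jun 10 11:08:09 Tower kernel: nvme nvme0: I/O 768 QID 7 timeout, reset controller
Jun 10 11:08:40 Tower kernel: nvme nvme0: I/O 16 QID 0 timeout, reset controller
Jun 10 11:11:12 Tower kernel: nvme nvme0: Removing after probe failure status: -19
Jun 10 11:11:12 Tower kernel: nvme0n1: detected capacity change from 1024209543168 to 0
"""


def summarize(lines):
    """Count timeout actions, decode the probe status, and read the capacity change."""
    actions = Counter()
    status = None
    capacity = None
    for line in lines:
        match = re.search(r"I/O \d+ QID \d+ timeout, (.+)$", line)
        if match:
            actions[match.group(1)] += 1
            continue
        match = re.search(r"probe failure status: (-?\d+)", line)
        if match:
            # Assumption: local errno numbering matches the Linux kernel's.
            status = errno.errorcode.get(abs(int(match.group(1))))
            continue
        match = re.search(r"capacity change from (\d+) to (\d+)", line)
        if match:
            capacity = (int(match.group(1)), int(match.group(2)))
    return actions, status, capacity


actions, status, capacity = summarize(LOG.splitlines())
print(dict(actions))
print(status)
print(f"{capacity[0] / 1e12:.2f} TB -> {capacity[1]} bytes")
```

    Status -19 decodes to `ENODEV` on Linux, and the capacity dropping to 0 is consistent with the kernel having removed the device.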

     

    Edited by johnnie.black

    Yeah, I am trying to get my big cache drive replaced. It only made it to 50 TBW before it started to die. Also, yeah, I removed the M.2 after I saw the abort messages during the next restart. Is that what caused a crash though? I was stopping the array when it crashed. Never had that one happen before.

    Edited by Jerky_san
    8 minutes ago, Jerky_san said:

    Is that what caused a crash though?

    I believe so; there were previous btrfs-related call traces, and the last one was when trying to unmount the cache.

    2 minutes ago, johnnie.black said:

    I believe so; there were previous btrfs-related call traces, and the last one was when trying to unmount the cache.

    Welp, that makes sense then. I don't get why the drive had such a short life when it was nearly a terabyte.




