prytzen Posted October 27, 2020
I have had this issue a few times in the past and do not know how to resolve it.
Symptoms: The server becomes unresponsive to anything and everything and has to be hard rebooted. When it comes back up and I start the array (it is not set to start automatically), it immediately locks up again.
Problem: I have traced it to the cache drive being corrupted in some way. In the past I have just swapped out the cache drive for another one, formatted it, and let it rebuild (I have backups and don't really lose anything). It is just a pain, and I would rather fix the underlying issue than go through that process again. I believe the file system on the cache drive is getting corrupted, but I don't know why. It is a btrfs drive, which I understand can be temperamental.
Want: To fix the error on the current cache drive and get it back up and running as is. Or, if possible, take the drive that is currently there and having issues, reformat it to XFS, and use that as the cache drive in the hope that XFS will be a more stable filesystem.
Below are logs that I was able to capture when starting the array. They show the cache drive (sdh) having issues.
Oct 27 15:22:21 UNRAID emhttpd: shcmd (505): mkdir -p /mnt/cache
Oct 27 15:22:21 UNRAID emhttpd: cache uuid: f5cd23ae-1a51-4349-a01d-65b2335b7b7c
Oct 27 15:22:21 UNRAID emhttpd: cache TotDevices: 1
Oct 27 15:22:21 UNRAID emhttpd: cache NumDevices: 1
Oct 27 15:22:21 UNRAID emhttpd: cache NumFound: 1
Oct 27 15:22:21 UNRAID emhttpd: cache NumMissing: 0
Oct 27 15:22:21 UNRAID emhttpd: cache NumMisplaced: 0
Oct 27 15:22:21 UNRAID emhttpd: cache NumExtra: 0
Oct 27 15:22:21 UNRAID emhttpd: cache LuksState: 0
Oct 27 15:22:21 UNRAID emhttpd: shcmd (506): mount -t btrfs -o noatime,nodiratime -U f5cd23ae-1a51-4349-a01d-65b2335b7b7c /mnt/cache
Oct 27 15:22:21 UNRAID kernel: BTRFS info (device sdh1): disk space caching is enabled
Oct 27 15:22:21 UNRAID kernel: BTRFS info (device sdh1): has skinny extents
Oct 27 15:22:21 UNRAID kernel: BTRFS info (device sdh1): enabling ssd optimizations
Oct 27 15:22:21 UNRAID kernel: BTRFS info (device sdh1): start tree-log replay
Oct 27 15:22:21 UNRAID kernel: ------------[ cut here ]------------
Oct 27 15:22:21 UNRAID kernel: kernel BUG at fs/btrfs/extent-tree.c:6862!
Oct 27 15:22:21 UNRAID kernel: invalid opcode: 0000 [#1] SMP PTI
Oct 27 15:22:21 UNRAID kernel: CPU: 0 PID: 13450 Comm: mount Tainted: P O 4.19.107-Unraid #1
Oct 27 15:22:21 UNRAID kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Extreme4, BIOS P2.90 07/11/2013
Oct 27 15:22:21 UNRAID kernel: RIP: 0010:__btrfs_free_extent+0x558/0x90f
Oct 27 15:22:21 UNRAID kernel: Code: ba c8 1a 00 00 e9 36 fd ff ff 48 81 3c 24 ff 00 00 00 76 18 48 8b 74 24 48 48 89 ef e8 a8 79 ff ff 3b 84 24 b8 00 00 00 74 02 <0f> 0b 48 83 7c 24 48 00 8b 45 40 74 0c 39 44 24 08 0f 84 0c ff ff
Oct 27 15:22:21 UNRAID kernel: RSP: 0018:ffffc9000210f828 EFLAGS: 00010202
Oct 27 15:22:21 UNRAID kernel: RAX: 0000000004ca6098 RBX: 0000000000000001 RCX: ffff8883e3100000
Oct 27 15:22:21 UNRAID kernel: RDX: 0000000000002000 RSI: 0000000000000002 RDI: ffff8883e3126230
Oct 27 15:22:21 UNRAID kernel: RBP: ffff8883e25a3070 R08: ffffc9000210f7a8 R09: ffffc9000210f7b0
Oct 27 15:22:21 UNRAID kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Oct 27 15:22:21 UNRAID kernel: R13: ffff8883e7a69f08 R14: ffff8883e3126230 R15: 0000000072ac2000
Oct 27 15:22:21 UNRAID kernel: FS: 000015436552b480(0000) GS:ffff88843f400000(0000) knlGS:0000000000000000
Oct 27 15:22:21 UNRAID kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 27 15:22:21 UNRAID kernel: CR2: 000015311239b3a0 CR3: 00000003e4388003 CR4: 00000000001606f0
Oct 27 15:22:21 UNRAID kernel: Call Trace:
Oct 27 15:22:21 UNRAID kernel: __btrfs_run_delayed_refs+0xa77/0xbf4
Oct 27 15:22:21 UNRAID kernel: ? generic_bin_search.constprop.0+0x163/0x1ab
Oct 27 15:22:21 UNRAID kernel: btrfs_run_delayed_refs+0x5d/0x16d
Oct 27 15:22:21 UNRAID kernel: ? btrfs_set_path_blocking+0x20/0x44
Oct 27 15:22:21 UNRAID kernel: btrfs_commit_transaction+0x54/0x79c
Oct 27 15:22:21 UNRAID kernel: btrfs_recover_log_trees+0x332/0x39b
Oct 27 15:22:21 UNRAID kernel: ? replay_one_dir_item+0x15c/0x15c
Oct 27 15:22:21 UNRAID kernel: open_ctree+0x182e/0x1c94
Oct 27 15:22:21 UNRAID kernel: btrfs_mount_root+0x388/0x509
Oct 27 15:22:21 UNRAID kernel: ? pcpu_alloc_area+0xeb/0xff
Oct 27 15:22:21 UNRAID kernel: ? cpumask_next+0x15/0x16
Oct 27 15:22:21 UNRAID kernel: ? pcpu_alloc+0x37d/0x489
Oct 27 15:22:21 UNRAID kernel: mount_fs+0x10/0x77
Oct 27 15:22:21 UNRAID kernel: vfs_kern_mount+0x66/0x100
Oct 27 15:22:21 UNRAID kernel: btrfs_mount+0x16b/0x7f9
Oct 27 15:22:21 UNRAID kernel: ? pcpu_block_update_hint_alloc+0x73/0x183
Oct 27 15:22:21 UNRAID kernel: ? pcpu_chunk_relocate+0x8/0x5a
Oct 27 15:22:21 UNRAID kernel: ? pcpu_alloc_area+0xeb/0xff
Oct 27 15:22:21 UNRAID kernel: ? pcpu_next_unpop+0x31/0x3c
Oct 27 15:22:21 UNRAID kernel: ? cpumask_next+0x15/0x16
Oct 27 15:22:21 UNRAID kernel: ? pcpu_alloc+0x37d/0x489
Oct 27 15:22:21 UNRAID kernel: ? mount_fs+0x10/0x77
Oct 27 15:22:21 UNRAID kernel: ? btrfs_remount+0x3eb/0x3eb
Oct 27 15:22:21 UNRAID kernel: mount_fs+0x10/0x77
Oct 27 15:22:21 UNRAID kernel: vfs_kern_mount+0x66/0x100
Oct 27 15:22:21 UNRAID kernel: do_mount+0x7b5/0xa22
Oct 27 15:22:21 UNRAID kernel: ? _copy_from_user+0x2f/0x4d
Oct 27 15:22:21 UNRAID kernel: ? memdup_user+0x3a/0x57
Oct 27 15:22:21 UNRAID kernel: ksys_mount+0x71/0x99
Oct 27 15:22:21 UNRAID kernel: __x64_sys_mount+0x1c/0x1f
Oct 27 15:22:21 UNRAID kernel: do_syscall_64+0x57/0xf2
Oct 27 15:22:21 UNRAID kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Oct 27 15:22:21 UNRAID kernel: RIP: 0033:0x15436566525a
Oct 27 15:22:21 UNRAID kernel: Code: 48 8b 0d 39 7c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 06 7c 0c 00 f7 d8 64 89 01 48
Oct 27 15:22:21 UNRAID kernel: RSP: 002b:00007fffca32e4c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
Oct 27 15:22:21 UNRAID kernel: RAX: ffffffffffffffda RBX: 00001543657ecf64 RCX: 000015436566525a
Oct 27 15:22:21 UNRAID kernel: RDX: 000000000040d500 RSI: 000000000040d6c0 RDI: 000000000040d740
Oct 27 15:22:21 UNRAID kernel: RBP: 000000000040d2f0 R08: 0000000000000000 R09: 0000000000000003
Oct 27 15:22:21 UNRAID kernel: R10: 0000000000000c00 R11: 0000000000000202 R12: 0000000000000000
Oct 27 15:22:21 UNRAID kernel: R13: 000000000040d740 R14: 000000000040d500 R15: 000000000040d2f0
Oct 27 15:22:21 UNRAID kernel: Modules linked in: xfs nfsd lockd grace sunrpc md_mod nct6775 hwmon_vid bonding nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) crc32_pclmul intel_rapl_perf intel_uncore pcbc aesni_intel aes_x86_64 glue_helper crypto_simd ghash_clmulni_intel cryptd kvm_intel kvm intel_cstate coretemp drm_kms_helper crct10dif_pclmul intel_powerclamp crc32c_intel x86_pkg_temp_thermal drm mxm_wmi tg3 syscopyarea sysfillrect sysimgblt fb_sys_fops agpgart i2c_i801 i2c_core ahci libahci pcc_cpufreq video button ie31200_edac wmi backlight
Oct 27 15:22:21 UNRAID kernel: ---[ end trace b4dd9901cbdd221e ]---
Oct 27 15:22:21 UNRAID kernel: RIP: 0010:__btrfs_free_extent+0x558/0x90f
Oct 27 15:22:21 UNRAID emhttpd: shcmd (506): exit status: 139
Oct 27 15:22:21 UNRAID emhttpd: /mnt/cache mount error: No file system
Oct 27 15:22:21 UNRAID emhttpd: shcmd (507): umount /mnt/cache
Oct 27 15:22:21 UNRAID kernel: Code: ba c8 1a 00 00 e9 36 fd ff ff 48 81 3c 24 ff 00 00 00 76 18 48 8b 74 24 48 48 89 ef e8 a8 79 ff ff 3b 84 24 b8 00 00 00 74 02 <0f> 0b 48 83 7c 24 48 00 8b 45 40 74 0c 39 44 24 08 0f 84 0c ff ff
Oct 27 15:22:21 UNRAID kernel: RSP: 0018:ffffc9000210f828 EFLAGS: 00010202
Oct 27 15:22:21 UNRAID kernel: RAX: 0000000004ca6098 RBX: 0000000000000001 RCX: ffff8883e3100000
Oct 27 15:22:21 UNRAID kernel: RDX: 0000000000002000 RSI: 0000000000000002 RDI: ffff8883e3126230
Oct 27 15:22:21 UNRAID kernel: RBP: ffff8883e25a3070 R08: ffffc9000210f7a8 R09: ffffc9000210f7b0
Oct 27 15:22:21 UNRAID kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Oct 27 15:22:21 UNRAID kernel: R13: ffff8883e7a69f08 R14: ffff8883e3126230 R15: 0000000072ac2000
Oct 27 15:22:21 UNRAID kernel: FS: 000015436552b480(0000) GS:ffff88843f400000(0000) knlGS:0000000000000000
Oct 27 15:22:21 UNRAID kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 27 15:22:21 UNRAID kernel: CR2: 000015311239b3a0 CR3: 00000003e4388003 CR4: 00000000001606f0
Oct 27 15:22:21 UNRAID root: umount: /mnt/cache: not mounted.
Oct 27 15:22:21 UNRAID emhttpd: shcmd (507): exit status: 32
Oct 27 15:22:21 UNRAID emhttpd: shcmd (508): rmdir /mnt/cache
Oct 27 15:22:21 UNRAID emhttpd: shcmd (509): sync
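For anyone triaging a similar syslog, the lines that matter in the capture above are the "kernel BUG at" marker, the btrfs device name, and the non-zero "exit status" entries from emhttpd. The following is a minimal Python sketch that pulls those out of raw log text. The function name summarize and the sample string are illustrative assumptions and are not part of Unraid; the patterns are based only on the log lines shown in this post.

```python
import re

# Hypothetical helper (not an Unraid tool): the patterns below are derived
# only from the log lines quoted in this thread.
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
BTRFS_DEV_RE = re.compile(r"BTRFS \w+ \(device (\w+)\)")
EXIT_RE = re.compile(r"shcmd \((\d+)\): exit status: (\d+)")


def summarize(log_text):
    """Return the kernel BUG locations, btrfs devices, and failed shcmd entries."""
    bugs = [(path, int(line)) for path, line in BUG_RE.findall(log_text)]
    devices = sorted(set(BTRFS_DEV_RE.findall(log_text)))
    failed = [
        (int(cmd), int(status))
        for cmd, status in EXIT_RE.findall(log_text)
        if int(status) != 0
    ]
    return {"bugs": bugs, "devices": devices, "failed_cmds": failed}


# Sample text copied from the log above; matching does not depend on line breaks.
sample = (
    "Oct 27 15:22:21 UNRAID kernel: BTRFS info (device sdh1): start tree-log replay "
    "Oct 27 15:22:21 UNRAID kernel: kernel BUG at fs/btrfs/extent-tree.c:6862! "
    "Oct 27 15:22:21 UNRAID emhttpd: shcmd (506): exit status: 139 "
    "Oct 27 15:22:21 UNRAID emhttpd: shcmd (507): exit status: 32"
)

if __name__ == "__main__":
    print(summarize(sample))
```

Read this way, the capture shows the mount command (shcmd 506) dying during btrfs tree-log replay. An exit status of 139 conventionally means the process was killed by signal 11 (128 + 11), which is consistent with the kernel BUG trace rather than an ordinary mount error.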
JorgeB Posted October 28, 2020
Full diags might give some clues, but btrfs going corrupt multiple times without a reason usually indicates a hardware problem, like bad RAM.
prytzen Posted October 30, 2020
And how would I go about determining that?
trurl Posted October 30, 2020
On 10/28/2020 at 3:44 AM, JorgeB said: Full diags might give some clues
Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.
prytzen Posted October 31, 2020
Attached: unraid-diagnostics-20201031-1054.zip
trurl Posted October 31, 2020
Have you run memtest?
prytzen Posted November 1, 2020
Memtest is 24 hours in with no errors, so I do not believe the RAM is the issue at this point. Is there a way that I can force a new cache drive to be XFS rather than btrfs?
trurl Posted November 1, 2020
Stop the array, reduce the number of cache slots to 1, and reformat the cache as XFS.
prytzen Posted November 1, 2020
That should be easy since the array is already stopped. Every time I have done this in the past, it has automatically formatted as btrfs. Is there an option that I am missing somewhere?
itimpi Posted November 1, 2020
1 hour ago, prytzen said: That should be easy since the array is already stopped. Every time I have done this in the past, it has automatically formatted as btrfs. Is there an option that I am missing somewhere?
As long as you have only a single slot allocated to cache, you can click on the drive on the Main tab to set the format to be used. The default is btrfs (and if more than 1 cache slot is allowed, it is the only option). With 1 slot you can pick other options.
prytzen Posted November 1, 2020
Great. Thanks. Looks like I am up and running.