February 17, 2022
Hello, I've been running into a few issues with Unraid recently:
- The server will randomly crash with a kernel panic stuck on the console screen.
- The XFS cache pool performs what I imagine is many, many more writes than normal.
- A machine check error generally occurs 1-2 days before an imminent crash. I'm not entirely sure how to review the actual MCE events, but they are being logged, as I did install the nerdpack thing.
- Logs may show that there is some corruption in the XFS configuration.
- I was doing some cleaning of the server, and upon reinserting my two NVIDIA GPUs, one no longer shows up in Unraid. (I cannot confirm that they are back in their original slots; they are in the same two slots I was using, but may be swapped. Both cards turn on with fans and light indicators.)
I've attached the diagnostics in case that helps. I'm hoping to install my new cache drive, which is a bit larger, but cannot guarantee the server will stay online/stable during implementation. Otherwise, when the server is working, everything works great.
EDIT - I should mention that I had prior crashing issues, but they were mostly resolved after 1) switching to a recommended USB flash drive and 2) disabling my NIC teaming.
grandline-diagnostics-20220216-2048.zip
Edited February 17, 2022 by sirace100
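For reviewing the MCE events mentioned above, one low-effort option is to filter the syslog from the diagnostics for machine check lines. The following is only a minimal sketch: the function name `find_mce_lines` and the match pattern are assumptions made for illustration, and the exact wording of machine check messages can vary by kernel version and hardware.

```python
import re

# Assumed pattern: kernel lines tagged "mce:" or mentioning "machine check".
# Adjust it if your kernel words these messages differently.
MCE_PATTERN = re.compile(r"\bmce:|machine check", re.IGNORECASE)


def find_mce_lines(lines):
    """Return the log lines that look like machine check reports."""
    return [line.rstrip("\n") for line in lines if MCE_PATTERN.search(line)]
```

To use it, open the syslog file extracted from the diagnostics zip, pass the file object to `find_mce_lines`, and print the result. The timestamps can then be compared against the times of the crashes.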
February 17, 2022
Hello @sirace100, I'm not sure if this is the cause or the consequence of your crashes, but your syslog is spammed with filesystem errors on your disk5:
Feb 6 05:17:44 GRANDLINE kernel: XFS (md5): Metadata corruption detected at xfs_dinode_verify+0xa3/0x581 [xfs], inode 0x30438b8f6 dinode
Feb 6 05:17:44 GRANDLINE kernel: XFS (md5): Unmount and run xfs_repair
Feb 6 05:17:44 GRANDLINE kernel: XFS (md5): First 128 bytes of corrupted metadata buffer:
Feb 6 05:17:44 GRANDLINE kernel: 00000000: 49 4e 81 b0 03 02 00 00 00 00 00 63 00 00 00 64 IN.........c...d
Feb 6 05:17:44 GRANDLINE kernel: 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
Feb 6 05:17:44 GRANDLINE kernel: 00000020: 5f 0c c6 17 39 fb 2a af 5a 6c a9 43 26 d6 0f 94 _...9.*.Zl.C&...
Feb 6 05:17:44 GRANDLINE kernel: 00000030: 61 e6 4a 14 1c 04 78 b3 00 00 00 00 04 20 72 cc a.J...x...... r.
Feb 6 05:17:44 GRANDLINE kernel: 00000040: 00 00 00 00 00 00 42 08 00 00 00 00 00 00 00 01 ......B.........
Feb 6 05:17:44 GRANDLINE kernel: 00000050: 00 00 24 01 00 00 00 00 00 00 00 00 cb 48 c3 09 ..$..........H..
Feb 6 05:17:44 GRANDLINE kernel: 00000060: ff ff ff ff 9f 66 b8 11 00 00 00 00 00 00 00 55 .....f.........U
Feb 6 05:17:44 GRANDLINE kernel: 00000070: 00 00 00 01 00 04 df 16 00 00 00 00 00 00 00 00 ................
You should fix the filesystem.
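To check whether disk5 is the only affected device, the corruption messages in the syslog can be tallied per device. This is a minimal sketch, not a diagnostic tool: the function name `count_xfs_corruption` is hypothetical, and the regular expression is modelled only on the "Metadata corruption detected" line quoted above, so other kinds of XFS errors would not be counted.

```python
import re
from collections import Counter

# Modelled on the quoted syslog line:
#   "XFS (md5): Metadata corruption detected at ..."
XFS_CORRUPTION = re.compile(r"XFS \((?P<dev>[^)]+)\): Metadata corruption detected")


def count_xfs_corruption(lines):
    """Count 'Metadata corruption detected' messages per XFS device."""
    counts = Counter()
    for line in lines:
        match = XFS_CORRUPTION.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts
```

Passing the open syslog file to `count_xfs_corruption` returns a `Counter` keyed by device name (for example `md5`). If only one device appears, that narrows the repair the kernel message asks for (`Unmount and run xfs_repair`) to that disk.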
February 25, 2022 Author
Update on this:
- I have run the repair on the btrfs file system as you indicated.
- The missing NVIDIA GPU is referenced in the log with this: "the nvidia quadro k2000 gpu installed in this system is supported through the nvidia 470.xx legacy drivers. please visit http://www.nvidia.com/object/unix.html for more information. The 510.54 NVIDIA driver will ignore this GPU." So I guess I'll need to replace/upgrade this card.
- Still looking into a potential RAM issue.
February 25, 2022
5 hours ago, sirace100 said: I have run the repair on the btrfs file system as you indicated
On 2/17/2022 at 8:04 AM, ChatNoir said: XFS (md5): Unmount and run xfs_repair
I guess that's a typo on your part?