July 5, 2023

Hey all,

Since updating to 6.12.1 on Monday the 3rd, the server has been crashing: once on Monday night, about 12 hours after the update, again on Tuesday night, and again just a few hours ago on Wednesday. I caught the latest one on a persistent syslog, but nothing is obvious to my eye... Any ideas?

Jul 5 14:10:14 HomeServer kernel: XFS (md2p1): Metadata corruption detected at xfs_dinode_verify+0xa0/0x732 [xfs], inode 0x182390d3e dinode
Jul 5 14:10:14 HomeServer kernel: XFS (md2p1): Unmount and run xfs_repair
Jul 5 14:10:14 HomeServer kernel: XFS (md2p1): First 128 bytes of corrupted metadata buffer:
Jul 5 14:10:14 HomeServer kernel: 00000000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00 IN..............
Jul 5 14:10:14 HomeServer kernel: 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
Jul 5 14:10:14 HomeServer kernel: 00000020: 64 9d bf 76 28 76 78 cd 64 9d bf 76 2d a5 f7 45 d..v(vx.d..v-..E
Jul 5 14:10:14 HomeServer kernel: 00000030: 64 9d bf 76 2d a5 f7 45 00 00 00 00 00 a1 09 7d d..v-..E.......}
Jul 5 14:10:14 HomeServer kernel: 00000040: 00 00 00 00 00 00 0a 11 00 00 00 00 00 00 00 01 ................
Jul 5 14:10:14 HomeServer kernel: 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 e2 4a c9 ef .............J..
Jul 5 14:10:14 HomeServer kernel: 00000060: ff ff ff ff 94 26 61 31 00 00 00 00 00 00 00 0a .....&a1........
Jul 5 14:10:14 HomeServer kernel: 00000070: 00 00 00 1a 00 1d cb b7 00 00 00 00 00 00 00 00 ................
Jul 5 14:10:49 HomeServer kernel: XFS (md2p1): Metadata corruption detected at xfs_dinode_verify+0xa0/0x732 [xfs], inode 0x182390d3e dinode
Jul 5 14:10:49 HomeServer kernel: XFS (md2p1): Unmount and run xfs_repair
Jul 5 14:10:49 HomeServer kernel: XFS (md2p1): First 128 bytes of corrupted metadata buffer:
Jul 5 14:10:49 HomeServer kernel: 00000000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00 IN..............
Jul 5 14:10:49 HomeServer kernel: 00000010: 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
Jul 5 14:10:49 HomeServer kernel: 00000020: 64 9d bf 76 28 76 78 cd 64 9d bf 76 2d a5 f7 45 d..v(vx.d..v-..E
Jul 5 14:10:49 HomeServer kernel: 00000030: 64 9d bf 76 2d a5 f7 45 00 00 00 00 00 a1 09 7d d..v-..E.......}
Jul 5 14:10:49 HomeServer kernel: 00000040: 00 00 00 00 00 00 0a 11 00 00 00 00 00 00 00 01 ................
Jul 5 14:10:49 HomeServer kernel: 00000050: 00 00 00 02 00 00 00 00 00 00 00 00 e2 4a c9 ef .............J..
Jul 5 14:10:49 HomeServer kernel: 00000060: ff ff ff ff 94 26 61 31 00 00 00 00 00 00 00 0a .....&a1........
Jul 5 14:10:49 HomeServer kernel: 00000070: 00 00 00 1a 00 1d cb b7 00 00 00 00 00 00 00 00 ................
Jul 5 15:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 5 15:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 5 15:05:25 HomeServer nginx: 2023/07/05 15:05:25 [error] 10087#10087: *216480 open() "/usr/local/emhttp/_proxy_auth" failed (2: No such file or directory) while sending to client, client: 172.17.0.4, server: , request: "GET /_proxy_auth?acs_token=eyJhbGciOiJkaXIiLCJlbmMiOiJBMTI4Q0JDLUhTMjU2In0..cC7cWBc3B4nEiqoPrbsR9w.rA5wILyQLA7Bu2EN2tKVGCHzbCxnwkQrajB-2Ol83Ao9T3FML9sS2_VNC-Cw-JET5hrZFslPe8ZAo_nhvI7oYG5kFe7pUXXLCNkNyb4O-qCmoDvx9oyt9TB5W7hzMnpGjgrBBWfpGD2bs78qHO_5WM-Sz6vlw7MsSTgpv6lpFrgR5Ql98sZDtjSu9sl4EKJ0.GJjxAfouN62G4Zd79X5bzQ HTTP/1.1", host: "unraid.james.am", referrer: "https://login.microsoftonline.com/"
Jul 5 15:06:20 HomeServer flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Jul 5 16:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 5 16:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 5 16:36:29 HomeServer webGUI: Successful login user root from 172.17.0.4
Jul 5 17:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 5 17:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 5 18:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 5 18:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
---------------------------------------------------------------------------------------------------------------------
Jul 5 19:07:09 HomeServer kernel: Linux version 6.1.34-Unraid (root@Develop) (gcc (GCC) 12.2.0, GNU ld version 2.40-slack151) #1 SMP PREEMPT_DYNAMIC Fri Jun 16 11:48:38 PDT 2023
Jul 5 19:07:09 HomeServer kernel: Command line: BOOT_IMAGE=/bzimage initrd=/bzroot i915.alpha_support=1
Jul 5 19:07:09 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Jul 5 19:07:09 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Jul 5 19:07:09 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Jul 5 19:07:09 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
Jul 5 19:07:09 HomeServer kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Jul 5 19:07:09 HomeServer kernel: x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
Jul 5 19:07:09 HomeServer kernel: x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
Jul 5 19:07:09 HomeServer kernel: signal: max sigframe size: 3376
Jul 5 19:07:09 HomeServer kernel: BIOS-provided physical RAM map:

The dashed line marks where the lockup was noticed. The disk LED and power LED were still on and the hard drives were making activity noises, but otherwise nothing was happening. VMs, Docker and the webUI were all failing to connect.

Edited July 5, 2023 by Jambo
July 6, 2023 Community Expert Check the filesystem on disk2. Also enable the syslog server and post that log after a crash, together with the complete diagnostics.
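Before running the filesystem check, it can help to confirm which device and inode the kernel is complaining about. The following is a minimal sketch, not an Unraid tool: it assumes the syslog has been saved as plain text with one message per line, and the function name `summarize_xfs_corruption` and the regular expression are illustrative choices based only on the message format quoted in the first post.

```python
import re
from collections import Counter

# Pattern modelled on the kernel message quoted in the first post, e.g.
#   XFS (md2p1): Metadata corruption detected at xfs_dinode_verify+0xa0/0x732 [xfs], inode 0x182390d3e dinode
# Other XFS corruption messages may be worded differently and would not match.
XFS_CORRUPTION = re.compile(
    r"XFS \((?P<device>[^)]+)\): Metadata corruption detected at "
    r"(?P<verifier>\S+).*?inode (?P<inode>0x[0-9a-fA-F]+)"
)


def summarize_xfs_corruption(lines):
    """Count corruption reports per (device, inode) pair."""
    counts = Counter()
    for line in lines:
        match = XFS_CORRUPTION.search(line)
        if match:
            counts[(match["device"], match["inode"])] += 1
    return counts


# Three lines copied from the log in the first post.
sample_lines = [
    "Jul 5 14:10:14 HomeServer kernel: XFS (md2p1): Metadata corruption detected at xfs_dinode_verify+0xa0/0x732 [xfs], inode 0x182390d3e dinode",
    "Jul 5 14:10:14 HomeServer kernel: XFS (md2p1): Unmount and run xfs_repair",
    "Jul 5 14:10:49 HomeServer kernel: XFS (md2p1): Metadata corruption detected at xfs_dinode_verify+0xa0/0x732 [xfs], inode 0x182390d3e dinode",
]

for (device, inode), count in summarize_xfs_corruption(sample_lines).items():
    print(f"{device}: inode {inode} reported {count} time(s)")
```

In the excerpt above, both reports name the same inode on md2p1, which suggests a single damaged inode being read repeatedly rather than spreading corruption. The script only reads the log; the repair itself is still done with xfs_repair as the kernel message says.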
July 7, 2023 Author

On 7/6/2023 at 8:54 AM, JorgeB said: Check filesystem on disk2, also enable the syslog server and post that after a crash, together with the complete diagnostics.

Thanks for the response. I just tried changing from macvlan to ipvlan, as macvlan had a mention of instability, but there was no improvement: it froze again last night. The log quoted in the first post is from the preserved syslog; the line before the dashes is the last line before the freeze, and the line after is the first line after starting up again. Diagnostics are attached and I'm starting the filesystem check now.

Log from last night's crash:

Jul 6 14:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 14:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 15:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 15:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 16:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 16:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 17:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 17:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 18:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 18:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 19:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 19:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 20:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 20:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 21:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 21:00:02 HomeServer kernel: BTRFS info (device nvme0n1p1): relocating block group 3588542693376 flags data|raid1
Jul 6 21:00:03 HomeServer kernel: BTRFS info (device nvme0n1p1): found 23 extents, stage: move data extents
Jul 6 21:00:03 HomeServer kernel: BTRFS info (device nvme0n1p1): found 20 extents, stage: update data pointers
Jul 6 21:00:03 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
Jul 6 22:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50
Jul 6 22:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0
---------------------------------------------------------------------------------------------------------------------
Jul 6 22:32:43 HomeServer kernel: Linux version 6.1.34-Unraid (root@Develop) (gcc (GCC) 12.2.0, GNU ld version 2.40-slack151) #1 SMP PREEMPT_DYNAMIC Fri Jun 16 11:48:38 PDT 2023
Jul 6 22:32:43 HomeServer kernel: Command line: BOOT_IMAGE=/bzimage initrd=/bzroot i915.alpha_support=1
Jul 6 22:32:43 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Jul 6 22:32:43 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Jul 6 22:32:43 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Jul 6 22:32:43 HomeServer kernel: x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
Jul 6 22:32:43 HomeServer kernel: x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
Jul 6 22:32:43 HomeServer kernel: x86/fpu: xstate_offset[9]: 832, xstate_sizes[9]: 8
Jul 6 22:32:43 HomeServer kernel: x86/fpu: Enabled xstate features 0x207, context size is 840 bytes, using 'compacted' format.
Jul 6 22:32:43 HomeServer kernel: signal: max sigframe size: 3376
Jul 6 22:32:43 HomeServer kernel: BIOS-provided physical RAM map:
Jul 6 22:32:43 HomeServer kernel: BIOS-e820: [mem 0x0000000000000000-0x000000000009d3ff] usable
Jul 6 22:32:43 HomeServer kernel: BIOS-e820: [mem 0x000000000009d400-0x000000000009ffff] reserved
Jul 6 22:32:43 HomeServer kernel: BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
Jul 6 22:32:43 HomeServer kernel: BIOS-e820: [mem 0x0000000000100000-0x0000000009e01fff] usable
Jul 6 22:32:43 HomeServer kernel: BIOS-e820: [mem 0x0000000009e02000-0x0000000009ffffff] reserved

Again, nothing obvious.

homeserver-diagnostics-20230707-0926.zip
July 7, 2023 Community Expert Without anything logged, it looks more like a hardware issue. Did you take care of this: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173
July 7, 2023 Author 11 minutes ago, JorgeB said: Without anything logged it looks more like a hardware issue, did you take care of this: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=819173 Oh thanks, I had not seen that before. I will certainly disable C-States (although I think they are disabled already) and test lower speeds.
July 8, 2023 Author Still crashing after disabling C-States and running the RAM at the supported speed for its capacity and rank. I also tried swapping out the RAM, with no improvement.
July 9, 2023 Community Expert Post the persistent syslog to see if there's something there, but if it's hardware related there likely won't be.
July 9, 2023 Author It just froze once again. I took out half the RAM and it still froze, then put the other half in and it froze again. Both of those freezes were extremely fast too, happening within 3 hours of turning on. All four sticks of RAM together seem to last the longest, but it is certainly still failing. Syslog attached showing the last two power cycles; the freeze happened at 13:14 local time, line 3551. syslog (1)
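Finding the freeze point by hand in a syslog of several thousand lines is tedious, and the approach described earlier in the thread (the last line before a fresh boot is the last thing logged before the freeze) can be automated. The following is a rough sketch under stated assumptions: the log is plain text with one message per line, a new boot always begins with a `kernel: Linux version` line as in the excerpts above, and the function name `find_boot_boundaries` is hypothetical.

```python
# Marker taken from the first line logged after each restart in the excerpts above.
BOOT_MARKER = "kernel: Linux version"


def find_boot_boundaries(lines):
    """Return ((line_no, last_line_before_boot), (line_no, boot_line)) pairs.

    Line numbers are 1-based. Blank lines and hand-drawn dashed separators
    are skipped, so they are never reported as the last line before a boot.
    """
    boundaries = []
    previous = None  # most recent real log line seen so far
    for number, text in enumerate(lines, start=1):
        stripped = text.strip()
        if not stripped or set(stripped) == {"-"}:
            continue
        if BOOT_MARKER in stripped and previous is not None:
            boundaries.append((previous, (number, stripped)))
        previous = (number, stripped)
    return boundaries


# Lines from the July 6 log (the boot line is shortened here for readability).
sample_lines = [
    "Jul 6 22:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: start -dusage=50",
    "Jul 6 22:00:01 HomeServer kernel: BTRFS info (device nvme0n1p1): balance: ended with status: 0",
    "---------------------------------------------",
    "Jul 6 22:32:43 HomeServer kernel: Linux version 6.1.34-Unraid (root@Develop)",
    "Jul 6 22:32:43 HomeServer kernel: Command line: BOOT_IMAGE=/bzimage initrd=/bzroot i915.alpha_support=1",
]

for (last_no, last_line), (boot_no, boot_line) in find_boot_boundaries(sample_lines):
    print(f"line {last_no}: {last_line}")
    print(f"line {boot_no}: {boot_line}")
```

Two caveats apply. A clean reboot also produces a `Linux version` line, so each boundary still needs to be matched against when the server was actually restarted. And the last logged line only gives the latest time the system was known to be writing logs, not the exact moment of the freeze.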
July 9, 2023 Community Expert Unfortunately there's nothing relevant logged, which usually points to a hardware issue. One thing you can try is to boot the server in safe mode with all Docker containers and VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
July 10, 2023 Author I downgraded back to 6.11.4 and that seems to have solved it. No issues since.