• Server becomes unresponsive


    schale01
    • Urgent

    Server becomes unresponsive after upgrading to 6.12.3.

    The issue was also observed on 6.12.2, 6.12.1, and 6.11.

    Reverting to 6.10 resolves the issue.

     

    The issue occurs under little to no server load, after 12 to 24 hours.

    The issue appears to be reproducible when running the Transmission Docker container with a WireGuard network interface.

    Debug kernel log attached.

     

    2023-07-21T23:25:52-04:00 Carbon14 kernel: mdcmd (31): set md_num_stripes 1280
    2023-07-21T23:25:52-04:00 Carbon14 kernel: mdcmd (32): set md_queue_limit 80
    2023-07-21T23:25:52-04:00 Carbon14 kernel: mdcmd (33): set md_sync_limit 5
    2023-07-21T23:25:52-04:00 Carbon14 kernel: mdcmd (34): set md_write_method
    2023-07-21T23:25:52-04:00 Carbon14 kernel: mdcmd (35): start STOPPED
    2023-07-21T23:25:52-04:00 Carbon14 kernel: unraid: allocating 25990K for 1280 stripes (5 disks)
    2023-07-21T23:25:52-04:00 Carbon14 kernel: md1p1: running, size: 1953514552 blocks
    2023-07-21T23:25:52-04:00 Carbon14 kernel: md2p1: running, size: 1953514552 blocks
    2023-07-21T23:25:52-04:00 Carbon14 kernel: md3p1: running, size: 1465138552 blocks
    2023-07-21T23:25:53-04:00 Carbon14 kernel: SGI XFS with ACLs, security attributes, no debug enabled
    2023-07-21T23:25:53-04:00 Carbon14 kernel: XFS (md1p1): Mounting V5 Filesystem
    2023-07-21T23:25:53-04:00 Carbon14 kernel: XFS (md1p1): Starting recovery (logdev: internal)
    2023-07-21T23:25:53-04:00 Carbon14 kernel: XFS (md1p1): Ending recovery (logdev: internal)
    2023-07-21T23:25:53-04:00 Carbon14 kernel: xfs filesystem being mounted at /mnt/disk1 supports timestamps until 2038 (0x7fffffff)
    2023-07-21T23:25:53-04:00 Carbon14 kernel: XFS (md2p1): Mounting V5 Filesystem
    2023-07-21T23:25:53-04:00 Carbon14 kernel: XFS (md2p1): Starting recovery (logdev: internal)
    2023-07-21T23:25:53-04:00 Carbon14 kernel: XFS (md2p1): Ending recovery (logdev: internal)
    2023-07-21T23:25:53-04:00 Carbon14 kernel: xfs filesystem being mounted at /mnt/disk2 supports timestamps until 2038 (0x7fffffff)
    2023-07-21T23:25:54-04:00 Carbon14 kernel: XFS (md3p1): Mounting V5 Filesystem
    2023-07-21T23:25:54-04:00 Carbon14 kernel: XFS (md3p1): Starting recovery (logdev: internal)
    2023-07-21T23:25:54-04:00 Carbon14 kernel: XFS (md3p1): Ending recovery (logdev: internal)
    2023-07-21T23:25:54-04:00 Carbon14 kernel: xfs filesystem being mounted at /mnt/disk3 supports timestamps until 2038 (0x7fffffff)
    2023-07-21T23:25:55-04:00 Carbon14 kernel: BTRFS info (device nvme0n1p1): using crc32c (crc32c-intel) checksum algorithm
    2023-07-21T23:25:55-04:00 Carbon14 kernel: BTRFS info (device nvme0n1p1): using free space tree
    2023-07-21T23:25:55-04:00 Carbon14 kernel: BTRFS info (device nvme0n1p1): enabling ssd optimizations
    2023-07-21T23:25:55-04:00 Carbon14 kernel: BTRFS info (device nvme0n1p1: state M): turning on async discard
    2023-07-21T23:25:55-04:00 Carbon14 kernel: BTRFS info (device sdf1): using crc32c (crc32c-intel) checksum algorithm
    2023-07-21T23:25:55-04:00 Carbon14 kernel: BTRFS info (device sdf1): using free space tree
    2023-07-21T23:25:55-04:00 Carbon14 kernel: BTRFS info (device sdf1: state M): turning on async discard
    2023-07-21T23:26:02-04:00 Carbon14 kernel: loop2: detected capacity change from 0 to 41943040
    2023-07-21T23:26:02-04:00 Carbon14 kernel: BTRFS: device fsid 8923b5f4-d5f2-4678-b493-0ab90b7ce0be devid 1 transid 9774336 /dev/loop2 scanned by mount (9644)
    2023-07-21T23:26:02-04:00 Carbon14 kernel: BTRFS info (device loop2): using crc32c (crc32c-intel) checksum algorithm
    2023-07-21T23:26:02-04:00 Carbon14 kernel: BTRFS info (device loop2): using free space tree
    2023-07-21T23:26:02-04:00 Carbon14 kernel: BTRFS info (device loop2): start tree-log replay
    2023-07-21T23:26:02-04:00 Carbon14 kernel: BTRFS info (device loop2): checking UUID tree
    2023-07-21T23:26:06-04:00 Carbon14 kernel: Bridge firewalling registered
    2023-07-21T23:26:06-04:00 Carbon14 kernel: Initializing XFRM netlink socket
    2023-07-21T23:26:11-04:00 Carbon14 kernel: loop3: detected capacity change from 0 to 2097152
    2023-07-21T23:26:11-04:00 Carbon14 kernel: BTRFS: device fsid 5c6f5ada-a0aa-4d1b-b311-15fc6b18fb7b devid 1 transid 2064 /dev/loop3 scanned by mount (10305)
    2023-07-21T23:26:11-04:00 Carbon14 kernel: BTRFS info (device loop3): using crc32c (crc32c-intel) checksum algorithm
    2023-07-21T23:26:11-04:00 Carbon14 kernel: BTRFS info (device loop3): using free space tree
    2023-07-21T23:26:12-04:00 Carbon14 kernel: tun: Universal TUN/TAP device driver, 1.6
    2023-07-21T23:26:12-04:00 Carbon14 kernel: mdcmd (36): check correct
    2023-07-21T23:26:12-04:00 Carbon14 kernel: md: recovery thread: check P ...
    2023-07-21T23:26:13-04:00 Carbon14 kernel: L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.
    2023-07-21T23:26:16-04:00 Carbon14 kernel: eth0: renamed from veth6751040
    2023-07-21T23:26:26-04:00 Carbon14 kernel: mdcmd (37): nocheck cancel
    2023-07-21T23:26:26-04:00 Carbon14 kernel: md: recovery thread: exit status: -4
    2023-07-21T23:28:05-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered blocking state
    2023-07-21T23:28:05-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered disabled state
    2023-07-21T23:28:05-04:00 Carbon14 kernel: device veth4aac129 entered promiscuous mode
    2023-07-21T23:28:05-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered blocking state
    2023-07-21T23:28:05-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered forwarding state
    2023-07-21T23:28:05-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered disabled state
    2023-07-21T23:28:07-04:00 Carbon14 kernel: eth0: renamed from vethcf3b69f
    2023-07-21T23:28:07-04:00 Carbon14 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth4aac129: link becomes ready
    2023-07-21T23:28:07-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered blocking state
    2023-07-21T23:28:07-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered forwarding state
    2023-07-21T23:28:07-04:00 Carbon14 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): br-dd02cf62dc1e: link becomes ready
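
    For anyone triaging a log like the one above, here is a rough Python sketch that finds the longest quiet period between consecutive entries, which can help narrow down when a hang started. It assumes every line has the shape `<ISO timestamp> <host> <message>` seen in this excerpt; `parse_line` and `largest_gap` are hypothetical helper names, not part of any Unraid tooling.

```python
from datetime import datetime

def parse_line(line):
    """Split one log line into (timestamp, host, message); None if it does not match."""
    parts = line.strip().split(" ", 2)
    if len(parts) != 3:
        return None
    stamp, host, message = parts
    try:
        when = datetime.fromisoformat(stamp)  # accepts e.g. 2023-07-21T23:25:52-04:00
    except ValueError:
        return None
    return when, host, message

def largest_gap(lines):
    """Return (gap_seconds, entry_before, entry_after) for the longest silence, or None."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    best = None
    for prev, cur in zip(entries, entries[1:]):
        gap = (cur[0] - prev[0]).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, prev, cur)
    return best

# Three lines copied from the log excerpt in this report.
sample = [
    "2023-07-21T23:26:26-04:00 Carbon14 kernel: md: recovery thread: exit status: -4",
    "2023-07-21T23:28:05-04:00 Carbon14 kernel: br-dd02cf62dc1e: port 1(veth4aac129) entered blocking state",
    "2023-07-21T23:28:07-04:00 Carbon14 kernel: eth0: renamed from vethcf3b69f",
]

gap, before, after = largest_gap(sample)
print(f"{gap:.0f}s of silence after: {before[2]}")
```

    A long gap is only a hint: an idle server also logs nothing, so any result would need to be checked against when the server actually stopped responding.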
     

     

    carbon14-diagnostics-20230722-0112.zip





    Update.

    Moved all Docker containers to bridge interface br1. Docker is set to use ipvlan.

    The server still becomes completely unresponsive.

    Enabled a second interface, eth2, which was previously disabled. I confirmed the server is reachable through this interface, but when the server becomes unresponsive this interface stops responding as well.

    Rolled back to 6.10.3 and the system has been stable since. If there are any recommendations, or if further logs are needed, I can reapply the upgrade and gather them.


