konaboy

Posts posted by konaboy

  1. On 1/2/2024 at 4:24 AM, JorgeB said:

    Try this, on the main GUI page click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right) and add this to your default boot option, after "append initrd=/bzroot"

    nvme_core.default_ps_max_latency_us=0 pcie_aspm=off

    e.g.:

    append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off


    Reboot and see if it makes a difference.

    Thanks. Between the above and updating the firmware on the NVMe drive, I'm all set. PS: After updating the firmware, I removed the above parameters and ran for a few days, and the firmware update alone also seemed to fix the problem. I put the parameters back in as a belt-and-suspenders solution, just to be safe.
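After rebooting, it can be worth confirming that the kernel actually picked up the two parameters. The following is a minimal sketch, assuming Python 3 is available on the machine (stock Unraid does not ship it, so this may need a plugin or another host). The helper names `parse_cmdline` and `missing_params` are made up for illustration; `/proc/cmdline` is the standard Linux file holding the boot command line.

```python
from pathlib import Path

# Parameters quoted in the post above; the rest of this script is illustrative.
EXPECTED = {
    "nvme_core.default_ps_max_latency_us": "0",
    "pcie_aspm": "off",
}


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def missing_params(cmdline: str, expected: dict = EXPECTED) -> list:
    """Return the expected parameters that are absent or set to another value."""
    params = parse_cmdline(cmdline)
    return [name for name, value in expected.items() if params.get(name) != value]


if __name__ == "__main__":
    # /proc/cmdline holds the command line the running kernel was booted with.
    path = Path("/proc/cmdline")
    if path.exists():
        missing = missing_params(path.read_text())
        print("all parameters active" if not missing else f"missing: {missing}")
    else:
        print("/proc/cmdline not found (not a Linux host?)")
```

If the script reports a missing parameter, the most likely cause is that the line was added to a boot entry other than the default one in the Syslinux configuration.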

  2. 3 hours ago, Vr2Io said:

    System have suspend / resume ?

    No. In fact, I was in the GUI at the time that was logged. I was able to update the firmware on that NVMe drive. I’m going to put back some of the shares that were on it and see if it occurs again.

  3. I recently added a third NVMe SSD to my machine (Sandisk Corp WD Black SN850X NVMe SSD - 4 TB). I have two existing NVMe SSDs that I’ve been running for more than a year with no issues: WD_BLACK SN750 SE, 1 TB each.


    After a few days with no problems, I’m now getting errors in the syslog (excerpt below). On the Main tab of the Unraid GUI, I also see the drives listed under Unassigned Devices.

     

    I’ve added the following to the syslinux configuration: nvme_core.default_ps_max_latency_us=0. I’ll see if that stops the issue. I am also moving everything off the new NVMe drive and will check whether there’s a firmware update for it. The WD firmware update must be done in Windows, so I will pass the drive through to one of my Windows VMs.

    While I'm working through the firmware update, is there anything else I should be looking at to determine the cause of the problem? My signature has some of my system information.

     

    Dec 31 10:09:14 Tower kernel: nvme 0000:03:00.0: platform quirk: setting simple suspend
    Dec 31 10:09:14 Tower kernel: nvme nvme3: pci function 0000:03:00.0
    Dec 31 10:09:14 Tower kernel: nvme 0000:04:00.0: platform quirk: setting simple suspend
    Dec 31 10:09:14 Tower kernel: nvme nvme4: pci function 0000:04:00.0
    Dec 31 10:09:14 Tower kernel: nvme nvme4: 16/0/0 default/read/poll queues
    Dec 31 10:09:14 Tower kernel: nvme4n2: p1
    Dec 31 10:09:14 Tower kernel: nvme nvme3: 16/0/0 default/read/poll queues
    Dec 31 10:09:14 Tower kernel: nvme3n2: p1
    Dec 31 10:09:16 Tower kernel: XFS (nvme0n1p1): log I/O error -5
    Dec 31 10:09:16 Tower kernel: XFS (nvme0n1p1): Filesystem has been shut down due to log error (0x2).
    Dec 31 10:09:16 Tower kernel: XFS (nvme0n1p1): Please unmount the filesystem and rectify the problem(s).
    Dec 31 10:09:16 Tower rsyslogd: file '/mnt/user/system/syslog-127.0.0.1.log'[13] write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help OS error: Input/output error [v8.2102.0 try https://www.rsyslog.com/e/2027 ]
    Dec 31 10:09:16 Tower rsyslogd: file '/mnt/user/system/syslog-127.0.0.1.log': open error: Input/output error [v8.2102.0 try https://www.rsyslog.com/e/2433 ]
    Dec 31 10:09:16 Tower kernel: docker0: port 11(veth7a24e9e) entered disabled state
    Dec 31 10:09:16 Tower kernel: veth78e2538: renamed from eth0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 305624 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 3, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 4, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 5, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 6, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 7, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 8, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 9, flush 0, corrupt 0, gen 0
    Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:35 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 10, flush 0, corrupt 0, gen 0
    Dec 31 10:09:50 Tower kernel: blk_print_req_error: 1408 callbacks suppressed
    Dec 31 10:09:50 Tower kernel: I/O error, dev loop2, sector 285536 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:50 Tower kernel: btrfs_dev_stat_inc_and_print: 1408 callbacks suppressed
    Dec 31 10:09:50 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 1419, flush 0, corrupt 0, gen 0
    Dec 31 10:09:50 Tower kernel: I/O error, dev loop2, sector 285536 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
    Dec 31 10:09:50 Tower kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 0, rd 1420, flush 0, corrupt 0, gen 0
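An excerpt like the one above is easier to read once the repeated lines are collapsed. The following is a minimal sketch in Python that counts block-layer I/O errors per device and lists the XFS filesystems that reported a log I/O error. The regular expressions are written against the line formats shown in this excerpt only, and the function name `summarize` is an illustrative choice, not part of any Unraid tool.

```python
import re
from collections import Counter

# Matches kernel block-layer lines such as:
#   "... kernel: I/O error, dev loop2, sector 305624 op 0x0:(READ) ..."
IO_ERROR = re.compile(r"I/O error, dev (\S+), sector (\d+) op \S+:\((\w+)\)")
# Matches XFS lines such as:
#   "... kernel: XFS (nvme0n1p1): log I/O error -5"
XFS_ERROR = re.compile(r"XFS \((\S+)\): log I/O error")


def summarize(lines):
    """Count I/O errors per (device, operation) and collect XFS devices with log errors."""
    io_counts = Counter()
    xfs_devices = set()
    for line in lines:
        m = IO_ERROR.search(line)
        if m:
            io_counts[(m.group(1), m.group(3))] += 1
            continue
        m = XFS_ERROR.search(line)
        if m:
            xfs_devices.add(m.group(1))
    return io_counts, xfs_devices


if __name__ == "__main__":
    # A few lines copied from the excerpt above.
    sample = [
        "Dec 31 10:09:16 Tower kernel: XFS (nvme0n1p1): log I/O error -5",
        "Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 305624 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2",
        "Dec 31 10:09:35 Tower kernel: I/O error, dev loop2, sector 284456 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2",
    ]
    io_counts, xfs_devices = summarize(sample)
    for (device, op), count in io_counts.items():
        print(f"{device}: {count} {op} error(s)")
    print("XFS log errors on:", ", ".join(sorted(xfs_devices)))
```

In this excerpt the XFS log error on nvme0n1p1 at 10:09:16 comes first, and the loop2 read errors start at 10:09:35, so a per-device summary helps separate the first failure from the errors that follow it.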

     

     

     

    tower-diagnostics-20231231-1011.zip

  4. Solved. It turns out that using Powertop and enabling IEEE 802.3az (Energy Efficient Ethernet) can cause issues with Intel I225 & I226 (2.5 GbE) controllers. My Gigabyte board has an I225 Ethernet controller.

     

    For now I have turned off all Powertop power savings, and the above problem is gone. I found a reference to this in the first post here: Reduce power consumption with powertop - User Customizations - Unraid

    • Note: EEE can cause problems with 2.5G Intel Ethernet.
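The fix described above was to turn off all Powertop power savings. A narrower alternative, not what was done in this post, would be to leave the other savings on and disable only EEE on the affected port with `ethtool`. The following is a minimal sketch that just builds and prints the commands, assuming the interface is named `eth0` and that the NIC driver supports changing EEE through ethtool. The helper names are illustrative.

```python
def show_eee_command(interface: str) -> list:
    """Build the ethtool argument list that displays the EEE state of an interface."""
    return ["ethtool", "--show-eee", interface]


def set_eee_command(interface: str, enable: bool) -> list:
    """Build the ethtool argument list that turns EEE on or off for an interface."""
    return ["ethtool", "--set-eee", interface, "eee", "on" if enable else "off"]


if __name__ == "__main__":
    iface = "eth0"  # assumption: adjust to the name of the 2.5G port
    # Dry run: print the commands rather than executing them.
    print(" ".join(show_eee_command(iface)))
    print(" ".join(set_eee_command(iface, enable=False)))
```

To apply the change, the printed commands would be run as root (for example via `subprocess.run(..., check=True)`), and they would need to be re-run after each boot because the setting does not persist on its own.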
  5. Thanks JorgeB.

    I'm still having network issues if my router reboots (or when I remove and reinsert the ethernet cable as a test). I've now set the server to always boot into the Unraid GUI, so that when it is in this state I can try various things. I HAVE NOT been able to get networking working again while in this state. Here's what I've done so far:
    a. Downgraded from 6.12.4 to 6.12.1. I no longer have a 2 min delay at boot for nginx server startup.

    b. Removed my ASRock Intel Arc A380 card (I suspected this might be causing issues).

    c. Stopped using the VM that had that card passed through (every time I shut down the Windows VM, the VM would hang, and then eventually the Unraid server would go into this state of no network access).

     

    Below is the current syslog excerpt. Once the ethernet cable had been unplugged and plugged back in, I executed "/etc/rc.d/rc.inet1 restart". This did not complete and produced no output in the syslog. I still had to hard power off to reboot.

    My next steps to try are:
    a. Downgrading to 6.11.4, based on this thread: [6.12] Unraid webui stop responding then Nginx crash - Stable Releases - Unraid. It is similar to, but not exactly the same as, what I'm seeing.

    b. Possibly ordering and installing a PCIe gigabit ethernet card, just in case the onboard 2.5Gb ethernet on my system board is causing the issue.

    c. Trying out a new USB drive from scratch with a fresh Unraid install of 6.12.4, setting everything up manually to match my current drives, dockers, etc., just in case there's something in my current setup/config.
    d. Other?

     

    Are there any other commands I can try to reset/restart my network on the server when this happens?  

    thanks much.

     

    Excerpt of the current syslog during the issue (created via 'script' and "tail -f /var/log/syslog"):

    Script started on 2023-11-17 19:23:22-05:00 [TERM="xterm-256color" TTY="/dev/pts/0" COLUMNS="130" LINES="43"]
    root@Tower:/mnt/user/Pub# tail -f /var/log/syslog
    Nov 17 19:22:01 Tower sSMTP[11600]: SSL connection using TLS_AES_256_GCM_SHA384
    Nov 17 19:22:03 Tower sSMTP[11600]: Sent mail for [email protected] (221 2.0.0 closing connection p1-20020ac84081000000b0041b83654af9sm936875qtl.30 - gsmtp) uid=0 username=root outbytes=788
    Nov 17 19:22:04 Tower rsyslogd: [origin software="rsyslogd" swVersion="8.2102.0" x-pid="11610" x-info="https://www.rsyslog.com"] start
    Nov 17 19:22:04 Tower Parity Check Tuning: No restart information present
    Nov 17 19:22:10 Tower tips.and.tweaks: Tweaks Applied
    Nov 17 19:22:10 Tower unassigned.devices: Mounting 'Auto Mount' Remote Shares...
    Nov 17 19:22:10 Tower unassigned.devices: Using Gateway '192.168.1.1' to Ping Remote Shares.
    Nov 17 19:22:10 Tower unassigned.devices: Waiting 5 secs before mounting Remote Shares...
    
    ** unplugged ethernet cable
    Nov 17 19:23:27 Tower root: ACPI action up is not defined
    Nov 17 19:23:28 Tower root: ACPI action up is not defined
    Nov 17 19:23:45 Tower kernel: igc 0000:06:00.0 eth0: NIC Link is Down
    Nov 17 19:23:45 Tower kernel: br0: port 1(eth0) entered disabled state
    Nov 17 19:23:49 Tower ntpd[1534]: Deleting interface #1 br0, 192.168.1.82#123, interface stats: received=6, sent=6, dropped=0, active_time=184 secs
    Nov 17 19:23:49 Tower ntpd[1534]: 66.85.78.80 local addr 192.168.1.82 -> <null>
    
    ** plugged ethernet cable back in and attempted /etc/rc.d/rc.inet1 restart (also stop and start); none of them finished, so Ctrl-Z to stop
    Nov 17 19:24:35 Tower Parity Check Tuning: Automatic Correcting Parity-Check detected
    Nov 17 19:24:59 Tower emhttpd: cmd: /usr/local/emhttp/plugins/user.scripts/startScript.sh /tmp/user.scripts/tmpScripts/PingUnraidTower/script 
    Nov 17 19:25:06 Tower root: ACPI action up is not defined
    
    ** this started up on its own:
    Nov 17 19:26:23 Tower emhttpd: cmd: /usr/local/emhttp/plugins/user.scripts/startScript.sh /tmp/user.scripts/tmpScripts/PingUnraidTower/script 
    Nov 17 19:26:38 Tower rc.inet1: ip -4 route flush default dev br0
    Nov 17 19:26:38 Tower rc.inet1: ip -4 addr flush dev br0
    Nov 17 19:26:38 Tower rc.inet1: ip -4 route flush dev br0
    Nov 17 19:26:38 Tower rc.inet1: ip -4 addr flush dev eth0
    Nov 17 19:26:38 Tower rc.inet1: ip -4 route flush dev eth0
    Nov 17 19:26:38 Tower rc.inet1: ip link set br0 down
    Nov 17 19:26:38 Tower rc.inet1: ip link set eth0 promisc off nomaster
    Nov 17 19:26:38 Tower kernel: br0: port 1(eth0) entered disabled state
    Nov 17 19:26:38 Tower kernel: device eth0 left promiscuous mode
    Nov 17 19:26:38 Tower rc.inet1: ip link set br0 down
    Nov 17 19:26:38 Tower rc.inet1: ip link del br0
    Nov 17 19:26:38 Tower rc.inet1: ip link set lo down
    Nov 17 19:26:39 Tower rc.inet1: ip -6 addr add ::1/128 dev lo
    Nov 17 19:26:39 Tower rc.inet1: ip link set lo up
    Nov 17 19:26:39 Tower rc.inet1: ip link add name br0 type bridge stp_state 0 forward_delay 0 nf_call_iptables 1 nf_call_ip6tables 1 nf_call_arptables 1
    Nov 17 19:26:39 Tower rc.inet1: ip link set br0 up
    Nov 17 19:26:39 Tower rc.inet1: ip link set eth0 down
    Nov 17 19:26:39 Tower rc.inet1: ip -4 addr flush dev eth0
    Nov 17 19:26:39 Tower rc.inet1: ip link set eth0 promisc on master br0 up
    Nov 17 19:26:39 Tower kernel: br0: port 1(eth0) entered blocking state
    Nov 17 19:26:39 Tower kernel: br0: port 1(eth0) entered disabled state
    Nov 17 19:26:39 Tower kernel: device eth0 entered promiscuous mode
    
    Nov 17 19:26:52 Tower root: Total Spundown: 0
    Nov 17 19:26:58 Tower root: ACPI action up is not defined
    Nov 17 19:27:01 Tower root: ACPI action up is not defined
    Nov 17 19:27:57 Tower root: ACPI action up is not defined
    Nov 17 19:28:36 Tower root: ACPI action up is not defined
    Nov 17 19:28:48 Tower root: ACPI action up is not defined
    Nov 17 19:28:49 Tower root: ACPI action up is not defined
    Nov 17 19:28:49 Tower root: ACPI action up is not defined
    Nov 17 19:28:50 Tower root: ACPI action up is not defined
    Nov 17 19:28:52 Tower root: ACPI action left is not defined
    Nov 17 19:28:52 Tower root: ACPI action left is not defined
    Nov 17 19:28:52 Tower root: ACPI action left is not defined
    Nov 17 19:28:52 Tower root: ACPI action left is not defined
    Nov 17 19:28:53 Tower root: ACPI action left is not defined
    Nov 17 19:28:56 Tower inetd[28338]: telnet/tcp (2): bind: Address already in use
    Nov 17 19:28:58 Tower root: ACPI action up is not defined
    Nov 17 19:28:59 Tower root: ACPI action up is not defined
    Nov 17 19:29:10 Tower root: ACPI action up is not defined
    Nov 17 19:29:13 Tower inetd[28840]: telnet/tcp (2): bind: Address already in use
    Nov 17 19:29:19 Tower root: ACPI action up is not defined
    Nov 17 19:29:20 Tower root: ACPI action up is not defined
    Nov 17 19:29:20 Tower root: ACPI action up is not defined
    Nov 17 19:29:20 Tower root: ACPI action up is not defined
    Nov 17 19:29:21 Tower root: ACPI action up is not defined
    Nov 17 19:29:21 Tower root: ACPI action up is not defined
    Nov 17 19:29:21 Tower root: ACPI action up is not defined
    Nov 17 19:29:22 Tower root: ACPI action up is not defined
    Nov 17 19:29:33 Tower inotifywait[10782]: Watches established.
    Nov 17 19:30:09 Tower nginx: 2023/11/17 19:30:09 [crit] 9730#9730: *4552 connect() to unix:/var/run/syslog.sock failed (2: No such file or directory) while connecting to upstream, client: 127.0.0.1, server: , request: "GET /webterminal/syslog/ HTTP/1.1", upstream: "http://unix:/var/run/syslog.sock:/", host: "localhost", referrer: "http://localhost/Main"
    Nov 17 19:30:23 Tower kernel: mdcmd (38): nocheck cancel
    Nov 17 19:30:23 Tower kernel: md: recovery thread: exit status: -4
    Nov 17 19:30:31 Tower Parity Check Tuning: Automatic Correcting Parity-Check canceled (0 errors)
    Nov 17 19:30:31 Tower Parity Check Tuning: Elapsed Time 8 min, 24 sec, Runtime 8 min, 24 sec, Increments 1, Average Speed 31.7 GB/s
    Nov 17 19:30:31 Tower Parity Check Tuning: Send notification: Automatic Correcting Parity-Check canceled (0 errors): Elapsed Time 8 min, 24 sec, Runtime 8 min, 24 sec, Increments 1, Average Speed 31.7 GB/s (type=alert link=/Settings/Scheduler)
    Nov 17 19:30:31 Tower sSMTP[32534]: Unable to locate smtp.gmail.com
    Nov 17 19:30:31 Tower sSMTP[32534]: Cannot open smtp.gmail.com:465
    Nov 17 19:30:32 Tower kernel: mdcmd (39): nocheck cancel
    Nov 17 19:30:33 Tower emhttpd: Spinning up all drives...
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdj
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdk
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdh
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdg
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdd
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sde
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdb
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdf
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdc
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/nvme1n1
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/nvme0n1
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdl
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sdi
    Nov 17 19:30:33 Tower emhttpd: read SMART /dev/sda
    Nov 17 19:30:33 Tower emhttpd: Stopping services...
    ^Z
    [1]+  Stopped(SIGTSTP)        tail -f /var/log/syslog
    root@Tower:/mnt/user/Pub# exit
    exit
    There are stopped jobs.
    root@Tower:/mnt/user/Pub# exit
    exit
    
    Script done on 2023-11-17 19:31:12-05:00 [COMMAND_EXIT_CODE="1"]

     

  6. Woke up this morning to another instance of this happening. To recap: when it happens, I cannot ssh into Unraid, nor can I access the web GUI via http or https, nor can I access any of the dockers or VMs. I'm attaching the system log (syslog-127.0.0.1.log). It looks like it starts at "Nov 15 04:44:24".

     

    edit: at Nov 15 05:03:52 have this warning (not sure if connected):

    Nov 15 05:03:52 Tower kernel: ------------[ cut here ]------------
    Nov 15 05:03:52 Tower kernel: WARNING: CPU: 1 PID: 14163 at net/netfilter/nf_conntrack_core.c:1210 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Nov 15 05:03:52 Tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_nat vhost_net tun vhost vhost_iotlb tap macvlan xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) tcp_diag inet_diag ipmi_devintf kvmgt mdev i915 drm_buddy i2c_algo_bit ttm drm_display_helper drm_kms_helper drm intel_gtt agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops iptable_mangle xt_addrtype iptable_raw xt_comment xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_mark ip6table_mangle ip6table_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables efivarfs af_packet 8021q garp mrp bridge stp llc bonding tls intel_rapl_msr
    Nov 15 05:03:52 Tower kernel: intel_rapl_common iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd btusb btrtl rapl btbcm mei_hdcp mei_pxp btintel i2c_i801 intel_cstate gigabyte_wmi mxm_wmi intel_wmi_thunderbolt wmi_bmof nvme bluetooth mpt3sas mei_me intel_uncore i2c_smbus igc nvme_core i2c_core raid_class mei ahci scsi_transport_sas intel_pch_thermal libahci ecdh_generic input_leds joydev ecc led_class thermal fan video tpm_crb tpm_tis tpm_tis_core wmi tpm backlight intel_pmc_core button acpi_pad acpi_tad unix
    Nov 15 05:03:52 Tower kernel: CPU: 1 PID: 14163 Comm: kworker/u32:4 Tainted: P     U     O       6.1.49-Unraid #1
    Nov 15 05:03:52 Tower kernel: Hardware name: Gigabyte Technology Co., Ltd. Z490 AORUS ULTRA/Z490 AORUS ULTRA, BIOS F21 11/23/2021
    Nov 15 05:03:52 Tower kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
    Nov 15 05:03:52 Tower kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Nov 15 05:03:52 Tower kernel: Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
    Nov 15 05:03:52 Tower kernel: RSP: 0018:ffffc9000007cd98 EFLAGS: 00010202
    Nov 15 05:03:52 Tower kernel: RAX: 0000000000000001 RBX: ffff8885f3bf5600 RCX: 5740b2312c11ea5c
    Nov 15 05:03:52 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8885f3bf5600
    Nov 15 05:03:52 Tower kernel: RBP: 0000000000000001 R08: f4273c8dbe3488f4 R09: df03e9e00259a57e
    Nov 15 05:03:52 Tower kernel: R10: 520af3f0314d3cd2 R11: ffffc9000007cd60 R12: ffffffff82a11d00
    Nov 15 05:03:52 Tower kernel: R13: 000000000001c742 R14: ffff888102fd7400 R15: 0000000000000000
    Nov 15 05:03:52 Tower kernel: FS:  0000000000000000(0000) GS:ffff88907f240000(0000) knlGS:0000000000000000
    Nov 15 05:03:52 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Nov 15 05:03:52 Tower kernel: CR2: 0000030751e89130 CR3: 000000014ca64003 CR4: 00000000007726e0
    Nov 15 05:03:52 Tower kernel: PKRU: 55555554
    Nov 15 05:03:52 Tower kernel: Call Trace:
    Nov 15 05:03:52 Tower kernel: <IRQ>
    Nov 15 05:03:52 Tower kernel: ? __warn+0xab/0x122
    Nov 15 05:03:52 Tower kernel: ? report_bug+0x109/0x17e
    Nov 15 05:03:52 Tower kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Nov 15 05:03:52 Tower kernel: ? handle_bug+0x41/0x6f
    Nov 15 05:03:52 Tower kernel: ? exc_invalid_op+0x13/0x60
    Nov 15 05:03:52 Tower kernel: ? asm_exc_invalid_op+0x16/0x20
    Nov 15 05:03:52 Tower kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Nov 15 05:03:52 Tower kernel: ? __nf_conntrack_confirm+0x9e/0x2b0 [nf_conntrack]
    Nov 15 05:03:52 Tower kernel: ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
    Nov 15 05:03:52 Tower kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
    Nov 15 05:03:52 Tower kernel: nf_hook_slow+0x3a/0x96
    Nov 15 05:03:52 Tower kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    Nov 15 05:03:52 Tower kernel: NF_HOOK.constprop.0+0x79/0xd9
    Nov 15 05:03:52 Tower kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    Nov 15 05:03:52 Tower kernel: __netif_receive_skb_one_core+0x77/0x9c
    Nov 15 05:03:52 Tower kernel: process_backlog+0x8c/0x116
    Nov 15 05:03:52 Tower kernel: __napi_poll.constprop.0+0x28/0x124
    Nov 15 05:03:52 Tower kernel: net_rx_action+0x159/0x24f
    Nov 15 05:03:52 Tower kernel: __do_softirq+0x126/0x288
    Nov 15 05:03:52 Tower kernel: do_softirq+0x7f/0xab
    Nov 15 05:03:52 Tower kernel: </IRQ>
    Nov 15 05:03:52 Tower kernel: <TASK>
    Nov 15 05:03:52 Tower kernel: __local_bh_enable_ip+0x4c/0x6b
    Nov 15 05:03:52 Tower kernel: netif_rx+0x52/0x5a
    Nov 15 05:03:52 Tower kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
    Nov 15 05:03:52 Tower kernel: ? _raw_spin_unlock+0x14/0x29
    Nov 15 05:03:52 Tower kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
    Nov 15 05:03:52 Tower kernel: process_one_work+0x1a8/0x295
    Nov 15 05:03:52 Tower kernel: worker_thread+0x18b/0x244
    Nov 15 05:03:52 Tower kernel: ? rescuer_thread+0x281/0x281
    Nov 15 05:03:52 Tower kernel: kthread+0xe4/0xef
    Nov 15 05:03:52 Tower kernel: ? kthread_complete_and_exit+0x1b/0x1b
    Nov 15 05:03:52 Tower kernel: ret_from_fork+0x1f/0x30
    Nov 15 05:03:52 Tower kernel: </TASK>
    Nov 15 05:03:52 Tower kernel: ---[ end trace 0000000000000000 ]---

     

    A little after 09:30 I started running some commands on the console command line: looking at the IP configuration and killing nginx. I attempted a shutdown, but it hung while collecting diagnostics, so I had to hard power off. The attached zipped syslog is from after the reboot.

     

    syslog-127.0.0.1.log tower-syslog-20231115-1539.zip

  7. Here are the two tests I performed today:

     

    This does not fail:

    1. Booted in Safe mode (no plugins, no GUI) and started array.
    2. ethernet on br0 at 192.168.1.82
    3. Unplugged ethernet cable at 18:37:09
    4. Plugged ethernet cable back in at 18:38:00

    You can see in the syslog excerpt below that, after the cable is plugged back in, Unraid successfully rebinds to 192.168.1.82:

    Nov 14 18:37:09 Tower kernel: igc 0000:09:00.0 eth0: NIC Link is Down
    Nov 14 18:37:09 Tower kernel: bond0: (slave eth0): link status definitely down, disabling slave
    Nov 14 18:37:09 Tower kernel: device eth0 left promiscuous mode
    Nov 14 18:37:09 Tower kernel: bond0: now running without any active interface!
    Nov 14 18:37:09 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:37:10 Tower dhcpcd[1490]: br0: carrier lost
    Nov 14 18:37:10 Tower avahi-daemon[13662]: Withdrawing address record for 192.168.1.82 on br0.
    Nov 14 18:37:10 Tower avahi-daemon[13662]: Leaving mDNS multicast group on interface br0.IPv4 with address 192.168.1.82.
    Nov 14 18:37:10 Tower avahi-daemon[13662]: Interface br0.IPv4 no longer relevant for mDNS.
    Nov 14 18:37:10 Tower dhcpcd[1490]: br0: deleting route to 192.168.1.0/24
    Nov 14 18:37:10 Tower dhcpcd[1490]: br0: deleting default route via 192.168.1.1
    Nov 14 18:37:12 Tower ntpd[2126]: Deleting interface #2 br0, 192.168.1.82#123, interface stats: received=8, sent=8, dropped=0, active_time=246 secs
    Nov 14 18:37:12 Tower ntpd[2126]: 66.70.172.17 local addr 192.168.1.82 -> <null>
    Nov 14 18:38:00 Tower kernel: igc 0000:09:00.0 eth0: NIC Link is Up 2500 Mbps Full Duplex, Flow Control: RX/TX
    Nov 14 18:38:01 Tower dhcpcd[1490]: br0: carrier acquired
    Nov 14 18:38:01 Tower kernel: bond0: (slave eth0): link status definitely up, 2500 Mbps full duplex
    Nov 14 18:38:01 Tower kernel: bond0: (slave eth0): making interface the new active one
    Nov 14 18:38:01 Tower kernel: device eth0 entered promiscuous mode
    Nov 14 18:38:01 Tower kernel: bond0: active interface up!
    Nov 14 18:38:01 Tower kernel: br0: port 1(bond0) entered blocking state
    Nov 14 18:38:01 Tower kernel: br0: port 1(bond0) entered forwarding state
    Nov 14 18:38:01 Tower dhcpcd[1490]: br0: rebinding lease of 192.168.1.82
    Nov 14 18:38:06 Tower dhcpcd[1490]: br0: probing for an IPv4LL address
    Nov 14 18:38:06 Tower dhcpcd[1490]: br0: DHCP lease expired
    Nov 14 18:38:06 Tower dhcpcd[1490]: br0: soliciting a DHCP lease
    Nov 14 18:38:11 Tower dhcpcd[1490]: br0: using IPv4LL address 169.254.243.138
    Nov 14 18:38:11 Tower avahi-daemon[13662]: Joining mDNS multicast group on interface br0.IPv4 with address 169.254.243.138.
    Nov 14 18:38:11 Tower dhcpcd[1490]: br0: adding route to 169.254.0.0/16
    Nov 14 18:38:11 Tower avahi-daemon[13662]: New relevant interface br0.IPv4 for mDNS.
    Nov 14 18:38:11 Tower avahi-daemon[13662]: Registering new address record for 169.254.243.138 on br0.IPv4.
    Nov 14 18:38:11 Tower dhcpcd[1490]: br0: adding default route
    Nov 14 18:38:11 Tower network: hook services: interface=br0, reason=IPV4LL, protocol=ipv4ll
    Nov 14 18:38:11 Tower network: update services: 30s
    Nov 14 18:38:11 Tower dhcpcd[1490]: br0: offered 192.168.1.82 from 192.168.1.1
    Nov 14 18:38:11 Tower dhcpcd[1490]: br0: probing address 192.168.1.82/24
    Nov 14 18:38:14 Tower emhttpd: spinning down /dev/sdh
    Nov 14 18:38:14 Tower emhttpd: spinning down /dev/sdd
    Nov 14 18:38:14 Tower emhttpd: spinning down /dev/sdf
    Nov 14 18:38:17 Tower dhcpcd[1490]: br0: leased 192.168.1.82 for 86400 seconds
    Nov 14 18:38:17 Tower avahi-daemon[13662]: Registering new address record for 192.168.1.82 on br0.IPv4.
    Nov 14 18:38:17 Tower dhcpcd[1490]: br0: adding route to 192.168.1.0/24
    Nov 14 18:38:17 Tower dhcpcd[1490]: br0: changing default route via 192.168.1.1
    Nov 14 18:38:17 Tower network: hook services: interface=br0, reason=BOUND, protocol=dhcp
    Nov 14 18:38:17 Tower network: update services: 30s
    Nov 14 18:38:17 Tower avahi-daemon[13662]: Withdrawing address record for 169.254.243.138 on br0.
    Nov 14 18:38:17 Tower avahi-daemon[13662]: Leaving mDNS multicast group on interface br0.IPv4 with address 169.254.243.138.
    Nov 14 18:38:17 Tower avahi-daemon[13662]: Joining mDNS multicast group on interface br0.IPv4 with address 192.168.1.82.
    Nov 14 18:38:17 Tower dhcpcd[1490]: br0: deleting route to 169.254.0.0/16
    Nov 14 18:38:17 Tower network: hook services: interface=br0, reason=IPV4LL, protocol=ipv4ll
    Nov 14 18:38:17 Tower network: update services: 30s
    Nov 14 18:38:19 Tower ntpd[2126]: Listen normally on 3 br0 192.168.1.82:123
    Nov 14 18:38:19 Tower ntpd[2126]: new interface(s) found: waking up resolver



    This fails:

    1. Booted into Unraid OS GUI Mode, started array.  Docker and VM Manager set to not start
    2. ethernet on br0 at 192.168.1.82
    3. Unplugged ethernet cable at 18:48:46
    4. Plugged ethernet cable back in at 18:50:02

    You can see in the syslog excerpt below that an entirely different sequence of events happens, with the end result that 192.168.1.82 is not active.

    Nov 14 18:48:46 Tower kernel: igc 0000:09:00.0 eth0: NIC Link is Down
    Nov 14 18:48:46 Tower kernel: bond0: (slave eth0): link status definitely down, disabling slave
    Nov 14 18:48:46 Tower kernel: device eth0 left promiscuous mode
    Nov 14 18:48:46 Tower kernel: bond0: now running without any active interface!
    Nov 14 18:48:46 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:48:47 Tower dhcpcd[1486]: br0: carrier lost
    Nov 14 18:48:47 Tower dhcpcd[1486]: br0: deleting route to 192.168.1.0/24
    Nov 14 18:48:47 Tower dhcpcd[1486]: br0: deleting default route via 192.168.1.1
    Nov 14 18:48:50 Tower ntpd[2122]: Deleting interface #1 br0, 192.168.1.82#123, interface stats: received=10, sent=10, dropped=0, active_time=338 secs
    Nov 14 18:48:50 Tower ntpd[2122]: 199.182.221.110 local addr 192.168.1.82 -> <null>
    Nov 14 18:50:02 Tower rc.inet1: dhcpcd -q -k -4 br0
    Nov 14 18:50:02 Tower dhcpcd[30946]: sending signal ALRM to pid 1485
    Nov 14 18:50:02 Tower dhcpcd[30946]: waiting for pid 1485 to exit
    Nov 14 18:50:02 Tower dhcpcd[1486]: received SIGALRM, releasing
    Nov 14 18:50:02 Tower dhcpcd[1486]: br0: removing interface
    Nov 14 18:50:03 Tower rc.inet1: ip -4 addr flush dev br0
    Nov 14 18:50:03 Tower rc.inet1: ip -4 route flush dev br0
    Nov 14 18:50:03 Tower rc.inet1: ip -4 addr flush dev bond0
    Nov 14 18:50:03 Tower rc.inet1: ip -4 route flush dev bond0
    Nov 14 18:50:03 Tower rc.inet1: ip -4 addr flush dev eth0
    Nov 14 18:50:03 Tower rc.inet1: ip -4 route flush dev eth0
    Nov 14 18:50:03 Tower rc.inet1: ip link set br0 down
    Nov 14 18:50:03 Tower rc.inet1: ip link set bond0 nomaster
    Nov 14 18:50:03 Tower kernel: device bond0 left promiscuous mode
    Nov 14 18:50:03 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:50:03 Tower rc.inet1: ip link set br0 down
    Nov 14 18:50:03 Tower rc.inet1: ip link del br0
    Nov 14 18:50:03 Tower rc.inet1: ip link set eth0 nomaster
    Nov 14 18:50:03 Tower kernel: bond0: (slave eth0): Releasing backup interface
    Nov 14 18:50:03 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:50:03 Tower rc.inet1: ip link del bond0
    Nov 14 18:50:03 Tower kernel: bond0 (unregistering): Released all slaves
    Nov 14 18:50:03 Tower rc.inet1: ip link set lo down
    Nov 14 18:50:04 Tower rc.inet1: ip -6 addr add ::1/128 dev lo
    Nov 14 18:50:04 Tower rc.inet1: ip link set lo up
    Nov 14 18:50:04 Tower rc.inet1: ip link add name bond0 type bond mode 1 miimon 100
    Nov 14 18:50:04 Tower rc.inet1: ip link set bond0 up
    Nov 14 18:50:04 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:50:04 Tower rc.inet1: ip link set eth0 down
    Nov 14 18:50:04 Tower rc.inet1: ip link set eth0 master bond0 up type bond_slave
    Nov 14 18:50:04 Tower kernel: bond0: (slave eth0): Error -19 calling set_mac_address
    Nov 14 18:50:04 Tower rc.inet1: ip link set name bond0 type bond primary eth0
    Nov 14 18:50:05 Tower vnstatd[6425]: Interface "br0" disabled.
    Nov 14 18:50:07 Tower rc.inet1: ip link add name br0 type bridge stp_state 0 forward_delay 0
    Nov 14 18:50:07 Tower rc.inet1: ip link set br0 up
    Nov 14 18:50:07 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:50:07 Tower rc.inet1: ip -4 addr flush dev bond0
    Nov 14 18:50:07 Tower rc.inet1: ip link set bond0 master br0 up
    Nov 14 18:50:07 Tower kernel: br0: port 1(bond0) entered blocking state
    Nov 14 18:50:07 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:50:07 Tower kernel: device bond0 entered promiscuous mode
    Nov 14 18:50:07 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:50:08 Tower rc.inet1: polling up to 60 sec for DHCP server on interface br0
    Nov 14 18:50:08 Tower rc.inet1: timeout 60 dhcpcd -w -q -n -p -t 10 -h Tower -4 br0
    Nov 14 18:50:08 Tower dhcpcd[31402]: dhcpcd-9.4.1 starting
    Nov 14 18:50:08 Tower dhcpcd[31405]: DUID 00:04:03:c0:02:18:04:4d:05:29:33:06:cd:07:00:08:00:09
    Nov 14 18:50:08 Tower dhcpcd[31405]: br0: waiting for carrier
    Nov 14 18:50:10 Tower vnstatd[6425]: Interface "br0" enabled.
    Nov 14 18:50:17 Tower inotifywait[16421]: Watches established.
    Nov 14 18:51:08 Tower dhcpcd[31405]: received SIGTERM, stopping
    Nov 14 18:51:08 Tower dhcpcd[31405]: br0: removing interface
    Nov 14 18:51:08 Tower rc.inet1: can't obtain IP address, continue polling in background on interface br0
    Nov 14 18:51:08 Tower rc.inet1: dhcpcd -b -q -n -p -t 10 -h Tower -4 br0
    Nov 14 18:51:08 Tower dhcpcd[980]: dhcpcd-9.4.1 starting
    Nov 14 18:51:08 Tower dhcpcd[983]: DUID 00:04:03:c0:02:18:04:4d:05:29:33:06:cd:07:00:08:00:09
    Nov 14 18:51:08 Tower rc.inet1: ip link set br0 up
    Nov 14 18:51:08 Tower dhcpcd[983]: br0: waiting for carrier
    Nov 14 18:51:08 Tower root: Total Spundown: 0
    Nov 14 18:52:29 Tower root: ACPI action down is not defined
    Nov 14 18:52:29 Tower root: ACPI action up is not defined
    Nov 14 18:52:29 Tower root: ACPI action down is not defined
    Nov 14 18:52:29 Tower root: ACPI action up is not defined
    Nov 14 18:52:29 Tower root: ACPI action down is not defined
    Nov 14 18:53:55 Tower kernel: usb 1-11.3: USB disconnect, device number 7
    Nov 14 18:53:55 Tower acpid: input device has been disconnected, fd 5
    Nov 14 18:53:55 Tower acpid: input device has been disconnected, fd 6
    Nov 14 18:53:55 Tower acpid: input device has been disconnected, fd 7
    Nov 14 18:54:16 Tower kernel: usb 1-11.3: new full-speed USB device number 9 using xhci_hcd
    Nov 14 18:54:16 Tower kernel: input: Logitech USB Receiver as /devices/pci0000:00/0000:00:14.0/usb1/1-11/1-11.3/1-11.3:1.0/0003:046D:C52B.0006/input/input9
    Nov 14 18:54:16 Tower kernel: hid-generic 0003:046D:C52B.0006: input,hidraw2: USB HID v1.11 Keyboard [Logitech USB Receiver] on usb-0000:00:14.0-11.3/input0
    Nov 14 18:54:16 Tower kernel: input: Logitech USB Receiver Mouse as /devices/pci0000:00/0000:00:14.0/usb1/1-11/1-11.3/1-11.3:1.1/0003:046D:C52B.0007/input/input10
    Nov 14 18:54:16 Tower kernel: input: Logitech USB Receiver Consumer Control as /devices/pci0000:00/0000:00:14.0/usb1/1-11/1-11.3/1-11.3:1.1/0003:046D:C52B.0007/input/input11
    Nov 14 18:54:16 Tower kernel: input: Logitech USB Receiver System Control as /devices/pci0000:00/0000:00:14.0/usb1/1-11/1-11.3/1-11.3:1.1/0003:046D:C52B.0007/input/input12
    Nov 14 18:54:16 Tower kernel: hid-generic 0003:046D:C52B.0007: input,hiddev98,hidraw3: USB HID v1.11 Mouse [Logitech USB Receiver] on usb-0000:00:14.0-11.3/input1
    Nov 14 18:54:16 Tower kernel: hid-generic 0003:046D:C52B.0008: hiddev99,hidraw4: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:00:14.0-11.3/input2
    Nov 14 18:54:59 Tower rc.inet1: dhcpcd -q -k -4 br0
    Nov 14 18:54:59 Tower dhcpcd[7944]: sending signal ALRM to pid 982
    Nov 14 18:54:59 Tower dhcpcd[7944]: waiting for pid 982 to exit
    Nov 14 18:54:59 Tower dhcpcd[983]: received SIGALRM, releasing
    Nov 14 18:54:59 Tower dhcpcd[983]: br0: removing interface
    Nov 14 18:55:00 Tower rc.inet1: ip -4 addr flush dev br0
    Nov 14 18:55:00 Tower rc.inet1: ip -4 route flush dev br0
    Nov 14 18:55:00 Tower rc.inet1: ip -4 addr flush dev bond0
    Nov 14 18:55:00 Tower rc.inet1: ip -4 route flush dev bond0
    Nov 14 18:55:00 Tower rc.inet1: ip -4 addr flush dev eth0
    Nov 14 18:55:00 Tower rc.inet1: ip -4 route flush dev eth0
    Nov 14 18:55:00 Tower rc.inet1: ip link set br0 down
    Nov 14 18:55:00 Tower rc.inet1: ip link set bond0 nomaster
    Nov 14 18:55:00 Tower kernel: device bond0 left promiscuous mode
    Nov 14 18:55:00 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:55:00 Tower rc.inet1: ip link set br0 down
    Nov 14 18:55:00 Tower rc.inet1: ip link del br0
    Nov 14 18:55:00 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:55:00 Tower rc.inet1: ip link del bond0
    Nov 14 18:55:00 Tower kernel: bond0 (unregistering): Released all slaves
    Nov 14 18:55:00 Tower rc.inet1: ip link set lo down
    Nov 14 18:55:00 Tower vnstatd[6425]: Interface "br0" disabled.
    Nov 14 18:55:00 Tower vnstatd[6425]: Interface "bond0" disabled.
    Nov 14 18:55:01 Tower rc.inet1: ip -6 addr add ::1/128 dev lo
    Nov 14 18:55:01 Tower rc.inet1: ip link set lo up
    Nov 14 18:55:01 Tower rc.inet1: ip link add name bond0 type bond mode 1 miimon 100
    Nov 14 18:55:01 Tower rc.inet1: ip link set bond0 up
    Nov 14 18:55:01 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:55:01 Tower rc.inet1: ip link set eth0 down
    Nov 14 18:55:01 Tower rc.inet1: ip link set eth0 master bond0 up type bond_slave
    Nov 14 18:55:01 Tower kernel: bond0: (slave eth0): Error -19 calling set_mac_address
    Nov 14 18:55:01 Tower rc.inet1: ip link set name bond0 type bond primary eth0
    Nov 14 18:55:01 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:55:01 Tower rc.inet1: ip link del bond0
    Nov 14 18:55:01 Tower kernel: bond0 (unregistering): Released all slaves
    Nov 14 18:55:01 Tower rc.inet1: ip link set lo down
    Nov 14 18:55:02 Tower rc.inet1: ip -6 addr add ::1/128 dev lo
    Nov 14 18:55:02 Tower rc.inet1: ip link set lo up
    Nov 14 18:55:02 Tower rc.inet1: ip link add name bond0 type bond mode 1 miimon 100
    Nov 14 18:55:02 Tower rc.inet1: ip link set bond0 up
    Nov 14 18:55:02 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:55:02 Tower rc.inet1: ip link set eth0 down
    Nov 14 18:55:02 Tower rc.inet1: ip link set eth0 master bond0 up type bond_slave
    Nov 14 18:55:02 Tower kernel: bond0: (slave eth0): Error -19 calling set_mac_address
    Nov 14 18:55:02 Tower rc.inet1: ip link set name bond0 type bond primary eth0
    Nov 14 18:55:04 Tower rc.inet1: ip link add name br0 type bridge stp_state 0 forward_delay 0
    Nov 14 18:55:04 Tower rc.inet1: ip link set br0 up
    Nov 14 18:55:04 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:55:04 Tower rc.inet1: ip -4 addr flush dev bond0
    Nov 14 18:55:04 Tower rc.inet1: ip link set bond0 master br0 up
    Nov 14 18:55:04 Tower kernel: br0: port 1(bond0) entered blocking state
    Nov 14 18:55:04 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:55:04 Tower kernel: device bond0 entered promiscuous mode
    Nov 14 18:55:04 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:55:05 Tower rc.inet1: polling up to 60 sec for DHCP server on interface br0
    Nov 14 18:55:05 Tower rc.inet1: timeout 60 dhcpcd -w -q -n -p -t 10 -h Tower -4 br0
    Nov 14 18:55:05 Tower dhcpcd[8198]: dhcpcd-9.4.1 starting
    Nov 14 18:55:05 Tower dhcpcd[8201]: DUID 00:04:03:c0:02:18:04:4d:05:29:33:06:cd:07:00:08:00:09
    Nov 14 18:55:05 Tower dhcpcd[8201]: br0: waiting for carrier
    Nov 14 18:55:05 Tower rc.inet1: ip link add name br0 type bridge stp_state 0 forward_delay 0
    Nov 14 18:55:05 Tower rc.inet1: ip link set br0 up
    Nov 14 18:55:05 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:55:05 Tower rc.inet1: ip -4 addr flush dev bond0
    Nov 14 18:55:05 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:55:05 Tower rc.inet1: ip link set bond0 master br0 up
    Nov 14 18:55:05 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:55:05 Tower vnstatd[6425]: Interface "br0" enabled.
    Nov 14 18:55:05 Tower vnstatd[6425]: Interface "bond0" enabled.
    Nov 14 18:55:06 Tower rc.inet1: polling up to 60 sec for DHCP server on interface br0
    Nov 14 18:55:06 Tower rc.inet1: timeout 60 dhcpcd -w -q -n -p -t 10 -h Tower -4 br0
    Nov 14 18:55:06 Tower dhcpcd[8414]: sending signal HUP to pid 8200
    Nov 14 18:55:06 Tower dhcpcd[8201]: received SIGHUP, rebinding
    Nov 14 18:55:06 Tower rc.inet1: ip link set br0 up
    Nov 14 18:55:06 Tower dhcpcd[8201]: br0: waiting for carrier
    Nov 14 18:56:00 Tower root: Fix Common Problems Version 2023.10.08a
    Nov 14 18:56:00 Tower root: Fix Common Problems: Error: Unable to communicate with GitHub.com
    Nov 14 18:56:00 Tower root: Fix Common Problems: Other Warning: Could not check for blacklisted plugins
    Nov 14 18:56:03 Tower root: Fix Common Problems: Warning: Syslog mirrored to flash
    Nov 14 18:56:03 Tower root: Fix Common Problems: Other Warning: Could not perform unknown plugins installed checks
    Nov 14 18:56:03 Tower sSMTP[9988]: Unable to locate smtp.gmail.com
    Nov 14 18:56:03 Tower sSMTP[9988]: Cannot open smtp.gmail.com:465
    Nov 14 18:56:05 Tower dhcpcd[8201]: received SIGTERM, stopping
    Nov 14 18:56:05 Tower dhcpcd[8201]: br0: removing interface
    Nov 14 18:56:08 Tower root: Total Spundown: 0
    Nov 14 18:56:36 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:80 failed (98: Address already in use)
    Nov 14 18:56:36 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:443 failed (98: Address already in use)
    Nov 14 18:56:37 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:80 failed (98: Address already in use)
    Nov 14 18:56:37 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:443 failed (98: Address already in use)
    Nov 14 18:56:37 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:80 failed (98: Address already in use)
    Nov 14 18:56:37 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:443 failed (98: Address already in use)
    Nov 14 18:56:38 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:80 failed (98: Address already in use)
    Nov 14 18:56:38 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:443 failed (98: Address already in use)
    Nov 14 18:56:38 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:80 failed (98: Address already in use)
    Nov 14 18:56:38 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: bind() to 0.0.0.0:443 failed (98: Address already in use)
    Nov 14 18:56:39 Tower nginx: 2023/11/14 18:56:36 [emerg] 15159#15159: still could not bind()
    Nov 14 18:57:17 Tower rc.inet1: dhcpcd -q -k -4 br0
    Nov 14 18:57:17 Tower dhcpcd[12597]: dhcpcd not running
    Nov 14 18:57:18 Tower rc.inet1: ip -4 addr flush dev br0
    Nov 14 18:57:18 Tower rc.inet1: ip -4 route flush dev br0
    Nov 14 18:57:18 Tower rc.inet1: ip -4 addr flush dev bond0
    Nov 14 18:57:18 Tower rc.inet1: ip -4 route flush dev bond0
    Nov 14 18:57:18 Tower rc.inet1: ip -4 addr flush dev eth0
    Nov 14 18:57:18 Tower rc.inet1: ip -4 route flush dev eth0
    Nov 14 18:57:18 Tower rc.inet1: ip link set br0 down
    Nov 14 18:57:18 Tower rc.inet1: ip link set bond0 nomaster
    Nov 14 18:57:18 Tower kernel: device bond0 left promiscuous mode
    Nov 14 18:57:18 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:57:18 Tower rc.inet1: ip link set br0 down
    Nov 14 18:57:18 Tower rc.inet1: ip link del br0
    Nov 14 18:57:18 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:57:18 Tower rc.inet1: ip link del bond0
    Nov 14 18:57:18 Tower kernel: bond0 (unregistering): Released all slaves
    Nov 14 18:57:18 Tower rc.inet1: ip link set lo down
    Nov 14 18:57:20 Tower root: ACPI action up is not defined
    Nov 14 18:57:20 Tower vnstatd[6425]: Interface "br0" disabled.
    Nov 14 18:57:20 Tower vnstatd[6425]: Interface "bond0" disabled.
    Nov 14 18:57:21 Tower kernel: ntpd[2122]: segfault at 28 ip 0000564cb3ed2e34 sp 00007fffc7947ea0 error 4 in ntpd[564cb3ec4000+86000] likely on CPU 1 (core 1, socket 0)
    Nov 14 18:57:21 Tower kernel: Code: 85 d2 75 e2 83 3b 03 0f 8f fd 04 00 00 66 0f 1f 84 00 00 00 00 00 48 8d 2d 91 28 0b 00 66 0f 1f 84 00 00 00 00 00 4c 8b 75 00 <4d> 3b 7e 28 75 10 e9 da 02 00 00 90 4d 3b 7e 28 0f 84 d6 02 00 00
    Nov 14 18:57:22 Tower rc.inet1: ip -6 addr add ::1/128 dev lo
    Nov 14 18:57:22 Tower rc.inet1: ip link set lo up
    Nov 14 18:57:22 Tower rc.inet1: ip link add name bond0 type bond mode 1 miimon 100
    Nov 14 18:57:22 Tower rc.inet1: ip link set bond0 up
    Nov 14 18:57:22 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:57:22 Tower rc.inet1: ip link set eth0 down
    Nov 14 18:57:22 Tower rc.inet1: ip link set eth0 master bond0 up type bond_slave
    Nov 14 18:57:23 Tower kernel: bond0: (slave eth0): Error -19 calling set_mac_address
    Nov 14 18:57:23 Tower rc.inet1: ip link set name bond0 type bond primary eth0
    Nov 14 18:57:25 Tower vnstatd[6425]: Interface "bond0" enabled.
    Nov 14 18:57:26 Tower rc.inet1: ip link add name br0 type bridge stp_state 0 forward_delay 0
    Nov 14 18:57:26 Tower rc.inet1: ip link set br0 up
    Nov 14 18:57:26 Tower rc.inet1: ip link set bond0 down
    Nov 14 18:57:26 Tower rc.inet1: ip -4 addr flush dev bond0
    Nov 14 18:57:26 Tower rc.inet1: ip link set bond0 master br0 up
    Nov 14 18:57:26 Tower kernel: br0: port 1(bond0) entered blocking state
    Nov 14 18:57:26 Tower kernel: br0: port 1(bond0) entered disabled state
    Nov 14 18:57:26 Tower kernel: device bond0 entered promiscuous mode
    Nov 14 18:57:26 Tower kernel: 8021q: adding VLAN 0 to HW filter on device bond0
    Nov 14 18:57:27 Tower rc.inet1: polling up to 60 sec for DHCP server on interface br0
    Nov 14 18:57:27 Tower rc.inet1: timeout 60 dhcpcd -w -q -n -p -t 10 -h Tower -4 br0
    Nov 14 18:57:27 Tower dhcpcd[12844]: dhcpcd-9.4.1 starting
    Nov 14 18:57:27 Tower dhcpcd[12847]: DUID 00:04:03:c0:02:18:04:4d:05:29:33:06:cd:07:00:08:00:09
    Nov 14 18:57:27 Tower dhcpcd[12847]: br0: waiting for carrier
    Nov 14 18:57:30 Tower vnstatd[6425]: Interface "br0" enabled.

     

     

     

    Some differences that 'may' be important:
    - In normal boot mode, I'm running with: pcie_acs_override=downstream,multifunction.  I do this to pass a GPU through to a VM, although I'm not sure I need it.  I see some odd entries for the Logitech USB receiver during the failure (18:54:16); I don't see those at all in safe mode.
    - In safe boot, nginx is not running at all.
    - In normal boot mode, even though I didn't start the docker engine on this boot, I do see nginx syslog entries.  See the attached syslog file at 18:43:48.  The command "/etc/rc.d/rc.nginx start" takes a little over two minutes to run.  I'm wondering if having nginx involved is causing this problem somehow?
    - I was able to get diagnostics during safe mode and am attaching them here, although I assume they aren't relevant.
    - I was not able to get diagnostics when the failure occurred in normal mode.  I attempted it on the server console and it never completed, even after I left it for 30 minutes.
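
    To compare the safe mode and normal mode syslogs side by side, a rough count of the markers that show up in the excerpt above can help.  This is only a sketch: the function name is made up, and the patterns are simply copied from the log lines above.

```shell
#!/bin/bash
# Count a few network-related markers in a syslog file.
# summarize_net_events is a hypothetical helper name; the patterns are
# taken verbatim from the syslog excerpt in this post.
summarize_net_events() {
    local log="$1"
    local pattern
    for pattern in \
        "waiting for carrier" \
        "Error -19 calling set_mac_address" \
        "Address already in use" \
        "can't obtain IP address"
    do
        # grep -c prints the number of matching lines (0 if none);
        # -F treats the pattern as a fixed string, not a regex.
        printf '%s: %s\n' "$pattern" "$(grep -cF -- "$pattern" "$log")"
    done
}
```

    Running it once against each file (for example "summarize_net_events /var/log/syslog" on the live system) gives two short lists that can be compared line by line.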

     

    I'm attaching the syslog file, as well as a diagnostic file from the safe mode test (which doesn't fail)

    syslog tower-diagnostics-20231114-1840 safe mode.zip

  8. My apologies, the last diagnostics I posted were from when everything was working OK.  (I had removed the network.cfg file and rebooted, then posted the diagnostics.  However, I hadn't made the issue happen.)

    I just finished doing some tests, and will pull together the logs and diagnostics either later today or tomorrow.   Note: the problem does not occur when I test this in safe mode.  I also see a distinct difference in the syslog between safe mode (where the network comes back after unplugging/re-plugging the Ethernet cable) and normal boot mode (where the network does not come back).

    I'll post the information as soon as I can.  thanks for your help JorgeB

  9. 2 hours ago, JorgeB said:

     

     

    You cannot access the GUI in general or just those pages? If the former try booting in safe mode.

     

     

    I can’t access the GUI in general.   Nor can I access anything else.  It’s like it has lost all access to the network.   Oh, and I was incorrect in my original post when I stated that all prior occurrences were random.   I’m not sure if it is related, however I can also make the same thing occur by disconnecting the Ethernet cable and reconnecting it, or by rebooting my router.   I believe I can access the command line directly if I have originally booted with the monitor attached to the internal graphics.

     

    I can try doing that, and see if I can force a fail.   I will grab diagnostics at that point.  Is there anything else I should do if I can make it fail and have access to command line?
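
    For reference, here is a minimal sketch of what could be saved from the console while the network is down, without depending on the diagnostics tool.  The function name is made up; the interface names (eth0, bond0, br0) are the ones from my syslog, and writing to /boot/logs assumes the flash drive is mounted at /boot as usual.

```shell
#!/bin/bash
# Save a snapshot of link and address state to a timestamped file.
# capture_net_state is a hypothetical helper; eth0/bond0/br0 are the
# interface names seen in the syslog and may differ on other systems.
capture_net_state() {
    local outdir="$1"
    local outfile iface
    mkdir -p "$outdir" || return 1
    outfile="$outdir/net-state-$(date +%Y%m%d-%H%M%S).txt"
    {
        echo "== $(date) =="
        for iface in eth0 bond0 br0; do
            # carrier reads 1 when a link is detected; the read fails if the
            # interface is missing or administratively down.
            printf '%s carrier: ' "$iface"
            cat "/sys/class/net/$iface/carrier" 2>/dev/null || echo "n/a"
        done
        if command -v ip >/dev/null 2>&1; then
            ip -br link || true
            ip -br addr || true
            ip route || true
        fi
    } > "$outfile" 2>&1
    # Print the path so it can be found later.
    echo "$outfile"
}
```

    Calling "capture_net_state /boot/logs" right after the failure, and again after a good boot, would give two files to compare.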

  10. This problem has been happening for a few weeks now.  It started when I removed an NVidia card and installed an ASRock Intel Arc A380.   I am passing the GPU through to a Windows VM.   I had changed the system BIOS (Gigabyte Z490 AORUS ULTRA) to turn on Resizable BAR (in an attempt to use it on the VM).    After I started getting these 'hangs', I swapped back the NVidia card, however I continued getting the hangs.  So I decided to put back the Arc card.

     

    All previous occurrences of this problem were random as far as I could tell.  This morning, I was attempting to access a file on the Unraid flash drive (accessed via SMB) from my Windows VM, and Windows Explorer hung.  Explorer died and restarted, so I restarted the Windows VM to attempt to clean things up.    After that I was unable to access the Unraid dashboard, nor could I reconnect (via RDP) to the Windows VM.    All Dockers were also not accessible.

     

    This occurred at: 11/13/2023 10:25 AM.    I attempted to access the command line from the internal graphics to get diagnostics by switching the display cable, however nothing came up on the monitor.    I powered off the machine shortly thereafter.   When I rebooted the system, it went into the BIOS immediately, rather than booting from the USB.  I had to use the power switch on the PSU, keeping it off for a minute or so, for the system to boot via the USB rather than directly into the BIOS.   (Putting this info here in case it is relevant.)

     

    I restarted at 11:09 AM.  And ran the Diagnostic tool immediately after boot.

    I did a quick look at the diagnostics, and noticed that the SMART data for the flash drive had the following error(?):

     

    Quote

    /dev/sda: Unknown USB bridge [0x0781:0x5571 (0x100)]
    Please specify device type with the -d option.

     

    Thanks for any help!!

    tower-diagnostics-20231113-1111.zip

  11. When I stop the array, almost all the time I'll end up in a loop trying to unmount /mnt/user.     I've looked through various posts where people have the same problem, and haven't been able to figure this out.    Running  'lsof /mnt/user' in the console outputs nothing.

     

    I do use rclone and mergerfs - which I thought might be the culprit.  However, even when I reboot, and don't run the rclone/mergerfs, I still get the problem.    I'm attaching the diagnostic file.   Thanks.
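
    One thing lsof will not show is a filesystem that is still mounted underneath /mnt/user (an rclone or mergerfs mount point, for example), which is enough on its own to make umount report "target is busy".  Here is a small sketch for listing those; the function name is made up, and it only reads the mount table.

```shell
#!/bin/bash
# List mount points that sit strictly underneath a given directory.
# nested_mounts is a hypothetical helper name.
#   $1 = directory to check (e.g. /mnt/user)
#   $2 = mount table to read (defaults to /proc/mounts)
nested_mounts() {
    local dir="${1%/}"
    local table="${2:-/proc/mounts}"
    # Field 2 of the mount table is the mount point. Matching on "$dir/"
    # as a prefix excludes $dir itself and siblings such as /mnt/user0.
    awk -v d="$dir/" 'index($2, d) == 1 { print $2 }' "$table"
}
```

    If "nested_mounts /mnt/user" prints anything just before stopping the array, those mounts would need to be unmounted first.  Loop devices backed by files on the share can have the same effect; "losetup -a" lists them.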

     

    Quote

    Mar 26 13:49:01 Tower emhttpd: Stopping File Activity...
    Mar 26 13:49:03 Tower Recycle Bin: Stopping Recycle Bin
    Mar 26 13:49:03 Tower emhttpd: Stopping Recycle Bin...
    Mar 26 13:49:03 Tower unassigned.devices: Unmounting All Devices...
    Mar 26 13:49:04 Tower emhttpd: /usr/local/emhttp/plugins/user.scripts/backgroundScript.sh "/tmp/user.scripts/tmpScripts/Rclone Unmount Shutdown/script" >/dev/null 2>&1
    Mar 26 13:49:04 Tower emhttpd: shcmd (137): /etc/rc.d/rc.samba stop
    Mar 26 13:49:04 Tower emhttpd: shcmd (138): rm -f /etc/avahi/services/smb.service
    Mar 26 13:49:04 Tower emhttpd: shcmd (140): /etc/rc.d/rc.nfsd stop
    Mar 26 13:49:04 Tower rpc.mountd[12805]: Caught signal 15, un-registering and exiting.
    Mar 26 13:49:05 Tower emhttpd: Stopping mover...
    Mar 26 13:49:05 Tower emhttpd: shcmd (141): /usr/local/sbin/mover stop
    Mar 26 13:49:05 Tower kernel: nfsd: last server has exited, flushing export cache
    Mar 26 13:49:05 Tower root: mover: not running
    Mar 26 13:49:05 Tower emhttpd: Sync filesystems...
    Mar 26 13:49:05 Tower emhttpd: shcmd (142): sync
    Mar 26 13:49:06 Tower emhttpd: shcmd (143): umount /mnt/user0
    Mar 26 13:49:06 Tower emhttpd: shcmd (144): rmdir /mnt/user0
    Mar 26 13:49:06 Tower emhttpd: shcmd (145): umount /mnt/user
    Mar 26 13:49:06 Tower root: umount: /mnt/user: target is busy.
    Mar 26 13:49:06 Tower emhttpd: shcmd (145): exit status: 32
    Mar 26 13:49:06 Tower emhttpd: shcmd (146): rmdir /mnt/user
    Mar 26 13:49:06 Tower root: rmdir: failed to remove '/mnt/user': Device or resource busy
    Mar 26 13:49:06 Tower emhttpd: shcmd (146): exit status: 1
    Mar 26 13:49:06 Tower emhttpd: shcmd (148): /usr/local/sbin/update_cron
    Mar 26 13:49:06 Tower emhttpd: Retry unmounting user share(s)...
    Mar 26 13:49:07 Tower rsyslogd: [origin software="rsyslogd" swVersion="8.2002.0" x-pid="30025" x-info="https://www.rsyslog.com"] start
    Mar 26 13:49:11 Tower emhttpd: shcmd (149): umount /mnt/user
    Mar 26 13:49:11 Tower root: umount: /mnt/user: target is busy.
    Mar 26 13:49:11 Tower emhttpd: shcmd (149): exit status: 32
    Mar 26 13:49:11 Tower emhttpd: shcmd (150): rmdir /mnt/user
    Mar 26 13:49:11 Tower root: rmdir: failed to remove '/mnt/user': Device or resource busy

     

    tower-diagnostics-20210326-1352.zip

  12. 20 minutes ago, dglb99 said:

    Would you mind posting what the suggested fix was?  I don't believe we have access to that channel.

     

    In my case, what was installed in the docker container was the 'libva-intel-driver' (i965).   For 10th gen processors, what is needed is the 'intel-media-driver' (iHD), which covers Broadwell and newer Intel cores.     See this page: https://wiki.archlinux.org/index.php/Hardware_video_acceleration#Configuring_VA-API  for reference to the two Intel graphics drivers.  The suggested fix was to use the intel-media-driver.   

     

    This is the driver needed: https://github.com/intel/media-driver .     I had a tough time figuring out how to get this into a docker container, so I ended up moving over to doing what I wanted on a Windows VM, and built my own ffmpeg.exe with media-autobuild_suite on github.        

     

    Here's one more link I just found https://www.reddit.com/r/Tdarr/comments/hy7slr/use_i965_hw_encodingdecoding_driver_instead_of_ihd/   It's sort of the opposite of what we need, but it may help in figuring out what to do.

     

  13. 18 minutes ago, tazire said:

    cheers for the response. I have just started from scratch. Stopped using it completely for now and im looking into alternatives at this point. 

    I'm interested in an alternative too.  Nextcloud (which I use for files) has a chat facility, but they don't have a Windows app and I don't like chats in a browser.  Let me know if you find a good alternative; I'd appreciate it!

  14. On 12/6/2020 at 7:58 AM, tazire said:

    yea if im honest i havent liked mongo at all. i use maria with my nextcloud and have found it rock solid. but ive been using this with rocketchat and its very flakey I had it corrupt on me in my early use. Then it seemed more stable so i deployed if for my use case and now this. im on the most recent build for it. i have left it stopped until the most recent build just to see if that solved my problem but it hasnt. thanks for getting back to me but looks like its a fresh install and hope for the best.  

    I'm getting the same error you are - started this morning for me.   I'm going to try going back a build and see if that helps.   I've tried restoring the appdata mongodb folder from a day ago, and that didn't help.

     

    Found this over in the RocketChat thread - It worked for me.  

     

  15. On 11/17/2020 at 1:30 PM, Jurykov said:

    Having similar issues.  ...

     

    Not a 10th gen CPU issue like OP, but the delineation seems to be between 8th generation and 9th/10th. There is one successful report above of it working.  Anyone have suggestions of what else I could try?

     

    Run at root console:

    
    lspci -kvnn
    00:02.0 VGA compatible controller [0300]: Intel Corporation UHD Graphics 630 (Desktop 9 Series) [8086:3e98] (rev 02) (prog-if 00 [VGA controller])
            DeviceName: Onboard - Video
            Subsystem: Gigabyte Technology Co., Ltd Device [1458:d000]
            Flags: bus master, fast devsel, latency 0, IRQ 146, IOMMU group 2
            Memory at a4000000 (64-bit, non-prefetchable) [size=16M]
            Memory at 80000000 (64-bit, prefetchable) [size=256M]
            I/O ports at 4000 [size=64]
            Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
            Capabilities: [40] Vendor Specific Information: Len=0c <?>
            Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
            Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
            Capabilities: [d0] Power Management version 2
            Capabilities: [100] Process Address Space ID (PASID)
            Capabilities: [200] Address Translation Service (ATS)
            Capabilities: [300] Page Request Interface (PRI)
            Kernel driver in use: i915
            Kernel modules: i915

    Run in docker:

    
    # ls -al /dev/dri
    total 0
    drwxr-xr-x  2 root root       80 Nov 17 22:27 .
    drwxr-xr-x 13 root root     3320 Nov 17 22:27 ..
    crwxrwxrwx  1 root   18 226,   0 Nov 17 22:27 card0
    crwxrwxrwx  1 root   18 226, 128 Nov 17 22:27 renderD128
    # vainfo
    error: XDG_RUNTIME_DIR not set in the environment.
    error: can't connect to X server!
    libva info: VA-API version 1.1.0
    libva info: va_getDriverName() returns 0
    libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so
    libva info: Found init function __vaDriverInit_1_1
    libva error: /usr/lib/x86_64-linux-gnu/dri/i965_drv_video.so init failed
    libva info: va_openDriver() returns -1
    vaInitialize failed with error code -1 (unknown libva error),exit
    #

     

    I suspect you are running into a problem that I saw when trying to use Tdarr with these newer CPUs.       Here are two links that show my issue, and what needs to be updated in the docker. 

     

    Shows same vainfo output you are getting:

    https://discordapp.com/channels/623392507828371476/623799920574595092/748901930444783706

     

    And what the fix [probably] is - both my update, and a follow-on from user ProZach in Discord:

    https://discordapp.com/channels/623392507828371476/623799920574595092/748901930444783706

     

    This link https://wiki.archlinux.org/index.php/Hardware_video_acceleration describes the different VA-API driver needed (intel-media-driver, vs libva-intel-driver) in the linux docker....

     

    Ahhh - and maybe this will help you:  https://forums.plex.tv/t/pms-ignores-libva-driver-name-environment-variable/498999

     

    I run Plex in a Windows 10 VM on my Unraid - and use an Nvidia card for acceleration, so I haven't run into this...   Let us know if this helps you.
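
    To confirm which VA-API driver the container is actually trying to load, the "Trying to open" line in the vainfo output can be pulled out with a one-liner.  This is just a sketch with a made-up function name; it only parses text in the format shown in the output above.

```shell
#!/bin/bash
# Print the VA-API driver name(s) libva tried to load, reading vainfo
# output on stdin. vainfo_driver is a hypothetical helper name.
vainfo_driver() {
    # Matches lines such as:
    #   libva info: Trying to open /usr/lib/.../dri/i965_drv_video.so
    # and prints the part before "_drv_video.so" (i965, iHD, ...).
    sed -n 's|.*Trying to open .*/\([A-Za-z0-9]*\)_drv_video\.so.*|\1|p'
}
```

    Inside the container, "vainfo 2>&1 | vainfo_driver" would print i965 for the output quoted above.  Running "LIBVA_DRIVER_NAME=iHD vainfo" asks libva to try the iHD driver instead, which should only succeed if the intel-media-driver is installed in the container.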

  16. @johnny2678,  I was able to get it working.   You don't mention whether you are on the Unraid beta.   You need to be on the 6.9.0 beta to have /dev/dri show up with the 10th gen CPUs.   Assuming you are, here's what I'm running with, which is just a little different from what you have.

     

    Go file:
    #!/bin/bash
    #enable module for iGPU and perms for the render device
    modprobe i915
    sleep 4
    chown -R nobody:users /dev/dri
    chmod -R 777 /dev/dri
    
    # Start the Management Utility
    /usr/local/sbin/emhttp &
    
    
    
    Syslinux pertinent line (I only have this set up for regular boot, non-GUI mode, but that shouldn't matter)
    append kvm-intel.nested=1 pcie_acs_override=downstream,multifunction initrd=/bzroot

    You don't need the "i915.alpha_support=1" - although it shouldn't make a difference either way.

    For the BIOS - also try plugging in a monitor if you haven't already, and turning it on prior to boot.  Just another thing you can try.

     

    Edit: added full Go file, to avoid confusion
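
    The "sleep 4" in my Go file is only there to give the i915 module time to create the device nodes before chown/chmod run.  A possible alternative, shown here as a sketch with a made-up function name, is to wait until the render node exists (up to a timeout) instead of sleeping for a fixed time.

```shell
#!/bin/bash
# Wait until a path exists, polling once per second.
# wait_for_path is a hypothetical helper name.
#   $1 = path to wait for, $2 = timeout in seconds (default 10)
# Returns 0 once the path exists, 1 if the timeout is reached first.
wait_for_path() {
    local path="$1" timeout="${2:-10}" waited=0
    while [ ! -e "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in the Go file, in place of "sleep 4":
#   modprobe i915
#   wait_for_path /dev/dri/renderD128 10 && chmod -R 777 /dev/dri
```

    The node name renderD128 is the one from the ls output elsewhere in this thread; it may differ on a system with more than one GPU.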

     

  17. Trying to get this to work also.   I just received and upgraded my build with an i7-10700K and a Gigabyte Z490 AORUS ULTRA.

     

    Going to Unraid's 6.9 beta 1 and setting up go file with:

    #!/bin/bash
    #enable module for iGPU and perms for the render device
    modprobe i915
    chown -R nobody:users /dev/dri
    chmod -R 777 /dev/dri
    
    # Start the Management Utility
    /usr/local/sbin/emhttp &

    Note: doesn't seem to matter whether syslinux.cfg is updated with "i915.alpha_support=1".  From what I've found on this forum, that is no longer needed.    

     

    The 6.9 beta does enable support for both the iGPU (/dev/dri is now available in docker containers) and the onboard NIC (2.5 Gb Intel NIC), although the NIC only shows up as 1 Gb.

     

     

    However, I've tried this on two dockers and still can not get HW acceleration in transcoding.   Tried on Handbrake, and tdarr.  Going into the console for each, they both show:

     

    $ ls -l /dev/dri
    total 0
    drwxr-xr-x 2 root root       80 Jun  7 11:33 by-path
    crw-rw---- 1 root   18 226,   0 Jun  7 11:33 card0
    crw-rw---- 1 root   18 226, 128 Jun  7 11:33 renderD128
    
    
    $ vainfo
    error: XDG_RUNTIME_DIR not set in the environment.
    error: can't connect to X server!
    error: failed to initialize display

    I'm still a little fuzzy on the following, but I believe the docker container has to be at a certain level of the Linux kernel.  And from the two links below, there appear to be other things the distro has to support...

     

    https://bugs.launchpad.net/ubuntu/+source/intel-vaapi-driver/+bug/1873262

    https://discourse.ubuntu.com/t/18-04-intel-va-api-support-for-intel-comet-lake/15132

     

    At this point I'm not sure if this is just a 'wait it out' situation and it will start working, or if the dockers will need to be updated with the underlying Linux. 
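
    One thing that can be ruled out quickly is plain permissions: the ls output above shows the nodes as crw-rw---- owned by root and group 18, so a non-root user inside the container may not be able to open them at all.  A minimal check, with a made-up function name, could look like this.

```shell
#!/bin/bash
# Report whether the current user can read and write a device node.
# check_node_access is a hypothetical helper name.
# Prints "missing", "ok" or "no-access".
check_node_access() {
    local node="$1"
    if [ ! -e "$node" ]; then
        echo "missing"
    elif [ -r "$node" ] && [ -w "$node" ]; then
        echo "ok"
    else
        echo "no-access"
    fi
}
```

    Running "check_node_access /dev/dri/renderD128" in the container console as the user the application runs under would show whether vainfo even has a chance of opening the device.  An "ok" result does not prove the driver is right; it only removes permissions from the list of suspects.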

     

     

     

     
