Stubbs Posted May 21, 2023
I recently updated from 6.11.5 stable to 6.12 RC6, and since I did, Unraid performance has been shocking. The WebUI hangs, SSH connections hang, Docker containers randomly flicker between working and not responding, and the "Main" page often takes forever to load all my array information. It gets especially bad when I stop the array; every single action seems to take forever while the array is stopped.
To recount my actions before and during the update:
I unstubbed my HBA card. I was originally passing it through to a TrueNAS VM for a test zpool, but decided to let Unraid use it because of 6.12's ZFS support.
I moved the HBA card from the second to the third (bottom) x16 PCIe slot. Since Unraid will be using this card, I don't have to worry about IOMMU groups anymore, and the bottom slot was always hard to separate from other interfaces.
I installed a new Intel Optane P1600X M.2 SSD in my motherboard's M.2 slot.
In the system log I see a lot of this.
I don't know if it's relevant, but there's a lot of it:
May 21 21:49:11 Tower kernel: device veth7ada777 left promiscuous mode
May 21 21:49:11 Tower kernel: docker0: port 10(veth7ada777) entered disabled state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered blocking state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered disabled state
May 21 21:49:12 Tower kernel: device vethb79be41 entered promiscuous mode
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered blocking state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered forwarding state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered disabled state
May 21 21:49:12 Tower kernel: eth0: renamed from vethecd0195
May 21 21:49:12 Tower kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethb79be41: link becomes ready
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered blocking state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered forwarding state
May 21 21:50:35 Tower kernel: docker0: port 1(vethbcd1caf) entered disabled state
May 21 21:50:35 Tower kernel: vetha908dc1: renamed from eth0
May 21 21:50:35 Tower kernel: docker0: port 1(vethbcd1caf) entered disabled state
May 21 21:50:35 Tower kernel: device vethbcd1caf left promiscuous mode
May 21 21:50:35 Tower kernel: docker0: port 1(vethbcd1caf) entered disabled state
May 21 21:50:45 Tower kernel: docker0: port 10(vethb79be41) entered disabled state
May 21 21:50:45 Tower kernel: vethecd0195: renamed from eth0
May 21 21:50:45 Tower kernel: docker0: port 10(vethb79be41) entered disabled state
May 21 21:50:45 Tower kernel: device vethb79be41 left promiscuous mode
May 21 21:50:45 Tower kernel: docker0: port 10(vethb79be41) entered disabled state
May 21 21:51:44 Tower kernel: docker0: port 1(vethe1280f3) entered blocking state
May 21 21:51:44 Tower kernel: docker0: port 1(vethe1280f3) entered disabled state
I attached two diagnostics.
The one marked "initial" was right after the update when I booted the server back up. The one marked "21-05-2023" is one I initiated just now, with the whole system running horribly.
(21-05-2023)tower-diagnostics-20230521-1505.zip
initial-diagnostics-tower-diagnostics-20230520-0359.zip
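Messages like the docker0 ones above are normally logged whenever a container starts or stops, so one rough way to judge whether they matter is to count how often each veth interface changes state. The following is only a sketch under that assumption: the function name `count_port_events` and the sample text are made up for illustration, and the pattern only covers the message shape seen in this log.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "docker0: port 10(veth7ada777) entered disabled state"
PORT_EVENT = re.compile(
    r"docker0: port (\d+)\((veth[0-9a-f]+)\) entered (\w+) state"
)


def count_port_events(log_text):
    """Return a Counter keyed by (veth name, state) for docker0 port events."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PORT_EVENT.search(line)
        if match:
            _port, veth, state = match.groups()
            counts[(veth, state)] += 1
    return counts


# A few lines copied from the log in the post above.
SAMPLE = """\
May 21 21:49:11 Tower kernel: docker0: port 10(veth7ada777) entered disabled state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered blocking state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered disabled state
May 21 21:49:12 Tower kernel: docker0: port 10(vethb79be41) entered forwarding state
"""

if __name__ == "__main__":
    for (veth, state), n in sorted(count_port_events(SAMPLE).items()):
        print(f"{veth} {state}: {n}")
```

If the same container's interface keeps cycling every few minutes, that would point at a container restarting in a loop rather than at the host itself.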
JorgeB Posted May 21, 2023
Try booting in safe mode first to rule out any plugin issues, also whatever is causing these might be a problem:
May 20 14:38:53 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -an 'JOHN-PC.local' 2>&1) took longer than 10s!
### [PREVIOUS LINE REPEATED 2 TIMES] ###
May 20 14:38:54 Tower unassigned.devices: Warning: shell_exec(/usr/bin/nmblookup 'TRUENAS' | /bin/head -n1 | /bin/awk '{print $1}' 2>/dev/null) took longer than 5s!
May 20 14:38:54 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -a 'TRUENAS' 2>&1) took longer than 15s!
May 20 14:38:57 Tower inotifywait[7344]: Watches established.
May 20 14:39:02 Tower unassigned.devices: Warning: shell_exec(/usr/bin/nmblookup 'JOHN-PC' | /bin/head -n1 | /bin/awk '{print $1}' 2>/dev/null) took longer than 5s!
### [PREVIOUS LINE REPEATED 2 TIMES] ###
May 20 14:39:04 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -an 'TRUENAS.local' 2>&1) took longer than 10s!
May 20 14:39:11 Tower unassigned.devices: Warning: shell_exec(/usr/bin/nmblookup 'TRUENAS' | /bin/head -n1 | /bin/awk '{print $1}' 2>/dev/null) took longer than 5s!
May 20 14:39:17 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -a 'TRUENAS' 2>&1) took longer than 15s!
### [PREVIOUS LINE REPEATED 2 TIMES] ###
May 20 14:39:18 Tower unassigned.devices: Warning: shell_exec(/usr/bin/nmblookup 'JOHN-PC' | /bin/head -n1 | /bin/awk '{print $1}' 2>/dev/null) took longer than 5s!
### [PREVIOUS LINE REPEATED 2 TIMES] ###
May 20 14:39:27 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -an 'TRUENAS.local' 2>&1) took longer than 10s!
### [PREVIOUS LINE REPEATED 2 TIMES] ###
May 20 14:39:29 Tower unassigned.devices: Warning: shell_exec(/usr/bin/nmblookup 'TRUENAS' | /bin/head -n1 | /bin/awk '{print $1}' 2>/dev/null) took longer than 5s!
May 20 14:39:30 Tower unassigned.devices: Remote Share '//JOHN-PC/Camera Roll' is not set to auto mount.
May 20 14:39:30 Tower unassigned.devices: Remote Share '//TRUENAS/Photos' is not set to auto mount.
May 20 14:39:33 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -a 'TRUENAS' 2>&1) took longer than 15s!
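To see at a glance which hosts these timeouts are stacking up on, the warnings can be tallied per host name. This is a rough sketch, not anything shipped with Unassigned Devices: `summarize_timeouts` is a hypothetical helper, the pattern is written only against the warning format shown above, and the "PREVIOUS LINE REPEATED" markers are ignored, so the real totals would be higher.

```python
import re
from collections import Counter

# Matches warnings such as:
#   shell_exec(/sbin/arp -a 'TRUENAS' 2>&1) took longer than 15s!
# The first quoted string after the command is taken as the host name.
TIMEOUT_WARNING = re.compile(
    r"shell_exec\((\S+)[^']*'([^']+)'.*\) took longer than (\d+)s!"
)


def summarize_timeouts(log_text):
    """Return (warnings per host, minimum seconds waited per host)."""
    warnings = Counter()
    seconds = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_WARNING.search(line)
        if not match:
            continue
        _command, host, limit = match.groups()
        # Treat 'TRUENAS.local' and 'TRUENAS' as the same machine.
        if host.endswith(".local"):
            host = host[: -len(".local")]
        warnings[host] += 1
        # Each warning means the command ran for at least `limit` seconds.
        seconds[host] += int(limit)
    return warnings, seconds


# Lines copied from the log in the post above.
SAMPLE = "\n".join([
    "May 20 14:38:53 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -an 'JOHN-PC.local' 2>&1) took longer than 10s!",
    "May 20 14:38:54 Tower unassigned.devices: Warning: shell_exec(/usr/bin/nmblookup 'TRUENAS' | /bin/head -n1 | /bin/awk '{print $1}' 2>/dev/null) took longer than 5s!",
    "May 20 14:38:54 Tower unassigned.devices: Warning: shell_exec(/sbin/arp -a 'TRUENAS' 2>&1) took longer than 15s!",
    "May 20 14:38:57 Tower inotifywait[7344]: Watches established.",
])

if __name__ == "__main__":
    warnings, seconds = summarize_timeouts(SAMPLE)
    for host in sorted(warnings):
        print(f"{host}: {warnings[host]} warnings, at least {seconds[host]}s waited")
```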
dlandon Posted May 21, 2023
You have an issue with the remote server or the network.
Fiservedpi Posted May 22, 2023
I'm seeing the same thing out of nowhere after RC6: overall lag, and I can't really pinpoint it to one thing. Diag attached.
tower-diagnostics-20230522-1806.zip
Stubbs (Author) Posted May 22, 2023
On 5/21/2023 at 7:20 PM, JorgeB said: Try booting in safe mode first to rule out any plugin issues
In safe mode, it took a long time to boot up and become accessible, but by the time it was, it seemed fine. It's hard to tell though, because even without safe mode it can perform fine, but then randomly things will start going wrong. Attached two diagnostics, both during and after safe mode.
On 5/21/2023 at 9:02 PM, dlandon said: You have an issue with the remote server or the network.
I don't understand. What remote server? Network? I didn't have any of these problems while I was on 6.11.5 stable. I virtualize my router through a pfSense VM with a quad NIC passed through. Again, worked fine on the last update.
(22-05-2023) (reboot post-safemode) tower-diagnostics-20230522-1604.zip
(22-05-2023) (safe mode) tower-diagnostics-20230522-1537(non-anon).zip
dlandon Posted May 22, 2023
52 minutes ago, Stubbs said: I don't understand. What remote server? Network? I didn't have any of these problems while I was on 6.11.5 stable.
Unassigned Devices is timing out when trying to perform operations on your remote server (TRUENAS).
Stubbs (Author) Posted May 23, 2023
2 hours ago, dlandon said: Unassigned Devices is timing out when trying to perform operations on your remote server (TRUENAS).
This was a test TrueNAS VM that I had created a mount point for on Unraid. The VM has been off for weeks now, and I deleted that mount point as soon as I started having this issue. I currently have zero unassigned devices, but I'm still having issues with the UI hanging randomly.
Even if Unassigned Devices was trying to perform operations on an offline remote, I don't see why that would cause the whole server to have issues. Nothing I run depended on those unassigned remotes.
dlandon Posted May 23, 2023
5 minutes ago, Stubbs said: Even if Unassigned Devices was trying to perform operations on an offline remote, I don't see why that would cause the whole server to have issues. Nothing I run depended on those unassigned remotes.
UD is trying to connect to the TRUENAS server on your network. There seems to be an issue with JOHN-PC also. Even if UD has nothing mounted, it is still trying to poll your remote servers to get an online status. Remove the remote shares from UD that are no longer being used. That will stop the logging of the UD messages, but probably not solve the server issues. Then reboot and post new diagnostics.
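The idea of polling a remote server for an online status can be sketched roughly as below. This is not UD's actual code (judging by the log earlier in the thread, UD shells out to arp and nmblookup); it is a generic TCP check written for illustration. The helper name `is_reachable`, the choice of port 445 (SMB), and the 2-second timeout are all assumptions.

```python
import socket


def is_reachable(host, port=445, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    The timeout bounds the connection attempt itself. Resolving the host
    name happens first and is not covered by it, so an unresolvable name
    can still take longer than `timeout` to report a failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `is_reachable("TRUENAS")` for a machine that is powered off would return False only after the lookup and connection attempt give up, which is the kind of delay that adds up when every configured remote share is polled repeatedly.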
dlandon Posted May 23, 2023
4 hours ago, Fiservedpi said: Im seeing the same thing out of nowhere after RC6 overall lag cant really pinpoint it to one thing. Diag attached tower-diagnostics-20230522-1806.zip
Boot in safe mode and see if the issue persists.
Stubbs (Author) Posted May 23, 2023
5 minutes ago, dlandon said: UD is trying to connect to the TRUENAS server on your network. There seems to be an issue with JOHN-PC also. Even if UD has nothing mounted, it is still trying to poll your remote servers to get an online status. Remove the remote shares from UD that are no longer being used. That will stop the logging of the UD messages, but probably not solve the server issues. Then reboot and post new diagnostics.
As I said, I deleted both the TrueNAS and JOHN-PC unassigned disks (mount points) as soon as I started having issues. Am I missing something with the deletion process? I don't see anything under /mnt/disks or /mnt/remotes either. I am not seeing any mentions of TRUENAS or JOHN-PC in my system log, which should be reflected in the last two diagnostics files I attached earlier:
(22-05-2023) (reboot post-safemode) tower-diagnostics-20230522-1604.zip
(22-05-2023) (safe mode) tower-diagnostics-20230522-1537(non-anon).zip
dlandon Posted May 23, 2023
Your screen shot is for UD disks. It looks like you have some remote shares assigned to TRUENAS and JOHN-PC. You need to remove those if they are no longer being used. Show a full screen shot of UD.
dlandon Posted May 23, 2023
Ok. Looking at your logs again, I see you took care of the TRUENAS and JOHN-PC issues. I see this in your logs that shows a network issue of some sort:
May 23 02:40:01 Tower root: Fix Common Problems: Warning: Share wikijs database set to cache-only, but files / folders exist on the array
May 23 02:40:01 Tower root: Fix Common Problems: Error: Unable to communicate with GitHub.com ** Ignored
May 23 02:40:02 Tower root: Fix Common Problems: Warning: unRaids built in FTP server is currently disabled, but users are defined
May 23 02:40:02 Tower root: Fix Common Problems: Other Warning: Could not check for blacklisted plugins
May 23 02:40:05 Tower root: Fix Common Problems: Other Warning: Background notifications not enabled
May 23 02:40:09 Tower kernel: igb 0000:09:00.0 eth0: igb: eth0 NIC Link is Down
May 23 02:40:09 Tower kernel: bond0: (slave eth0): link status definitely down, disabling slave
May 23 02:40:09 Tower kernel: device eth0 left promiscuous mode
May 23 02:40:09 Tower kernel: bond0: now running without any active interface!
May 23 02:40:09 Tower kernel: br0: port 1(bond0) entered disabled state
May 23 02:40:11 Tower ntpd[1656]: Deleting interface #4 br0, 10.10.20.8#123, interface stats: received=0, sent=0, dropped=0, active_time=804 secs
May 23 02:40:21 Tower kernel: igb 0000:09:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
May 23 02:40:21 Tower kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
May 23 02:40:21 Tower kernel: bond0: (slave eth0): making interface the new active one
May 23 02:40:21 Tower kernel: device eth0 entered promiscuous mode
May 23 02:40:21 Tower kernel: bond0: active interface up!
May 23 02:40:21 Tower kernel: br0: port 1(bond0) entered blocking state
May 23 02:40:21 Tower kernel: br0: port 1(bond0) entered forwarding state
May 23 02:40:22 Tower ntpd[1656]: Listen normally on 5 br0 10.10.20.8:123
May 23 02:40:22 Tower ntpd[1656]: new interface(s) found: waking up resolver
May 23 02:40:23 Tower root: Fix Common Problems: Other Warning: Could not perform unknown plugins installed checks
May 23 02:40:23 Tower root: Fix Common Problems: Warning: Share system set to use pool optane, but files / folders exist on the cache pool
At one point you had no active interface for at least twelve seconds. FCP is also struggling to connect to the internet.
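The twelve-second figure comes from the gap between the "NIC Link is Down" and "NIC Link is Up" lines. A small sketch like the one below can pull every such gap out of a full syslog. The function name `link_outages` is hypothetical, and the year has to be supplied by hand because these syslog timestamps do not carry one; the pattern only targets the igb message format seen here.

```python
import re
from datetime import datetime

# Matches e.g. "May 23 02:40:09 Tower kernel: igb ... eth0 NIC Link is Down"
LINK_EVENT = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: .*NIC Link is (Up|Down)"
)


def link_outages(log_text, year=2023):
    """Return a list of (time the link went down, seconds until it came back)."""
    outages = []
    down_since = None
    for line in log_text.splitlines():
        match = LINK_EVENT.match(line)
        if not match:
            continue
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        if match.group(2) == "Down":
            down_since = stamp
        elif down_since is not None:
            outages.append((down_since, (stamp - down_since).total_seconds()))
            down_since = None
    return outages


# Lines copied from the log in the post above.
SAMPLE = """\
May 23 02:40:09 Tower kernel: igb 0000:09:00.0 eth0: igb: eth0 NIC Link is Down
May 23 02:40:09 Tower kernel: bond0: now running without any active interface!
May 23 02:40:21 Tower kernel: igb 0000:09:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
"""

if __name__ == "__main__":
    for started, seconds in link_outages(SAMPLE):
        print(f"link down at {started:%H:%M:%S} for {seconds:.0f}s")
```

Run against a whole day's log, this would show whether the link drop was a one-off or something that recurs, which matters when deciding whether to suspect the cable, switch, or NIC.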
Stubbs (Author) Posted May 23, 2023
1 hour ago, dlandon said: Your screen shot is for UD disks. It looks like you have some remote shares assigned to TRUENAS and JOHN-PC. You need to remove those if they are no longer being used. Show a full screen shot of UD.
My mistake, I forgot the disk and remote share boxes were separate. Nevertheless, as you found out, the remote shares box was empty too.
38 minutes ago, dlandon said: At one point you had no active interface for at least twelve seconds. FCP is also struggling to connect to the internet.
Hopefully I'll figure it out one day.
dlandon Posted May 23, 2023
On 5/22/2023 at 10:44 PM, Stubbs said: My mistake, I forgot the disk and remote share boxes were separate. Nevertheless, as you found out, the remote shares box was empty too. Hopefully I'll figure it out one day.
See if your router or switch has any diagnostic tools, such as cable tests. Try rebooting all your network equipment.