-
Posts
80 -
Joined
-
Last visited
Content Type
Profiles
Forums
Downloads
Store
Gallery
Bug Reports
Documentation
Landing
Posts posted by SimpleDino
-
-
I have the same issue since the upgrade, all of my VM's with GPU passthrough don't work anymore.
I can access them through RDP, anydesk etc. but not with sunshine/moonlight without proper encode/decode function.
I can see the GPU's in all device managers and ...(kinda working?!) but none outputs any display to monitors.
Have any of you found any solution to this?
Br, -
SOLVED https://github.com/ausbitbank/stable-diffusion-discord-bot/issues/41
Volume mapping in docker-compose.yaml did it:
volumes: - /mnt/user/appdata/stable-diffusion/outputs/03-InvokeAI:/app/output - ./db:/app/db
and setting basepath in .env:
basePath="/app/output/"
-
1 hour ago, ghost82 said:
There are a lots of AER errors, I think the crash happens because of the log filling.
The source of the issue could be hardware or software.
If I were you, first of all, I would clean the slots and check the cables.
If it's software related it could be fixed with a new kernel update.
In the meantime to not show these, you can try pcie_aspm=off in your syslinux config.
Otherwise you can try pci=noaer
Thanks for the quick response.
I'm curious about whether the pcie_aspm=off command will affect all PCIe slots on a global level. Specifically, I'm wondering if it will have an impact on the 4xM.2 PCIe adapter.
Given the circumstances, pci=noaer might be the preferable solution for the time being. The peculiar thing is, I hadn't experienced any issues until recently.
The problem likely originates from PCIe slot 3, which houses the RTX 3060. As mentioned in my post, Unraid and the VM only freeze when the GPU is passed through to the Windows VM.
The GPU in PCIe slot 3 operates fine outside of the VM when used for stable diffusion or other tasks. However, the AER error persists, and this error first appeared after the initial system freeze.
-
@Squid @ghost82 Do you guys have any clues?syslog.txt
-
Yesterday, I remotely accessed one of my VMs with Nvidia GPU passthrough, and suddenly the whole VM froze and disconnected. Subsequently, the entire UNRAID server became unresponsive, and unfortunately, the only solution was a hard reset. This has occurred twice, and I'm now concerned that if I tempt fate and start the VM a third time, it could cause irreparable damage to the server, potentially leading to data loss.
memtest is OK!
Below is a snippet of what the logs show. Note that the Docker log is also 100% full:
May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: device [10b5:8714] error status/mask=00000080/0000a000 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: [ 7] BadDLLP May 12 14:54:03 Tower kernel: pcieport 0000:40:01.3: AER: Multiple Corrected error received: 0000:46:02.0 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: device [10b5:8714] error status/mask=00000080/0000a000 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: [ 7] BadDLLP May 12 14:54:03 Tower kernel: pcieport 0000:40:01.3: AER: Corrected error received: 0000:46:02.0 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: device [10b5:8714] error status/mask=00000080/0000a000 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: [ 7] BadDLLP May 12 14:54:03 Tower kernel: pcieport 0000:40:01.3: AER: Multiple Corrected error received: 0000:46:02.0 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: device [10b5:8714] error status/mask=00000080/0000a000 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: [ 7] BadDLLP May 12 14:54:03 Tower kernel: pcieport 0000:40:01.3: AER: Corrected error received: 0000:46:02.0 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: device [10b5:8714] error status/mask=00000080/0000a000 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: [ 7] BadDLLP May 12 14:54:03 Tower kernel: pcieport 0000:40:01.3: AER: Corrected error received: 0000:46:02.0 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID) May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: device [10b5:8714] error status/mask=00000080/0000a000 May 12 14:54:03 Tower kernel: pcieport 0000:46:02.0: [ 7] BadDLLP
The log entries show recurring PCIe interface issues. "BadDLLP" is possibly because of corrupted data??
The device [10b5:8714] see IOMMU GROUP below is probably the because it's connected to the PCIe port.
The repeating "severity=Corrected" error suggests an underlying issue that, while currently correctable, persists. The 'AER: Multiple Corrected error received' messages from the PCIe port indicate multiple corrected errors from the device at 0000:46:02.0
QuoteIOMMU group 53:[10b5:8714] 45:00.0 PCI bridge: PLX Technology, Inc. Device 8714 (rev ab)
IOMMU group 54:[10b5:8714] 46:01.0 PCI bridge: PLX Technology, Inc. Device 8714 (rev ab)
IOMMU group 55:[10b5:8714] 46:02.0 PCI bridge: PLX Technology, Inc. Device 8714 (rev ab)
PCIe ACS override is set to = both
Attached, you will find the diagnostics.zip folder.
OS: unRAID 6.12.0-rc5
Hardware:
Threadripper 1950xASUS Zenith Extreme
PCIE 1: PNY NVIDIA Quadro P2000 5GB
PCIE 2: M.2 X16 Adapter GEN 3
PCIE 3: Nvidia RTX 3060 12GB
PCIE 4: HBA Card LSI SAS 9207-8i
-
Hello Unraid community,
I've been working on integrating the Stable Diffusion Discord Bot (ausbitbank/stable-diffusion-discord-bot) with the superboki or Sygils container on Unraid, and I've encountered some issues along the way. Despite several attempts to resolve these errors/issues, I've been unsuccessful so far. I've also posted about the issue on the bot's GitHub repo (ausbitbank/stable-diffusion-discord-bot/issues/41).
The problem I'm experiencing is that while the Discord bot successfully generates an image upon receiving a prompt, the image doesn't appear in the Discord channel. I can find the generated image in the Invoke Docker container output folder, but the path logic seems to be causing issues. I receive the following error from the Discord bot container:
logs:
QuoteError: ENOENT: no such file or directory, open '/mnt/user/SD-Outputs/InvokeAI/03-InvokeAI/output:/app/output000011.ac064d6c.2916674852.png'] { errno: -2, code: 'ENOENT', syscall: 'open', path: '/mnt/user/SD-Outputs/InvokeAI/03-InvokeAI/output:/app/output000011.ac064d6c.2916674852.png' }
I've made adjustments to the .env file and docker-compose.yaml file, as shown below:
.env file:
QuotebasePath="/mnt/user/SD-Outputs/InvokeAI/03-InvokeAI"
dbPath="${basePath}/db"
outputPath="${basePath}"
docker-compose.yaml:
Quoteversion: '3'
services:
discord-bot:
build: .
container_name: stable-diffusion-discord-bot
volumes: - /mnt/user/SD-Outputs/InvokeAI/03-InvokeAI/db:/app/db - /mnt/user/SD-Outputs/InvokeAI/03-InvokeAI/output:/app/output
environment: - channelID="xxxxx"
- adminID="xxxxx"
- apiUrl="http://xxxxxxx:9000/"
- discordBotKey="xxxxx"
# ... rest of the environment variables
Dockerfile:
QuoteFROM node:14
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
CMD ["npm", "start"]
I would greatly appreciate it if someone could help me create a working Docker container for the Stable Diffusion Discord Bot that can be integrated with the superboki or Sygils container on the Unraid CA App Store.
Any assistance or guidance would be immensely helpful.
Thank you in advance!
-
Should be one! I have an issue where InvokeAI (option: 03) works but when using A1111 (02) it crashed when pressing generate and Easy Diffusion (1) the webui doesn't even start. No clue why because it doesn't give any error logs either...really weird!!
Anyone experiencing something like this? -
On 12/8/2022 at 3:39 PM, ich777 said:
This is caused by the GPU Statistics plugin.
Please turn of the server completely, pull the power cord from the wall, press the power and reset button a few times to empty the caps, wait 30 seconds and then power up the server and see if this makes any difference.
If not the only thing that I would recommend next is pulling the GPU from the server and try it on a Desktop PC, also install the drivers and put a load on it and see if everything works.
nvidia-persistenced reports that it can't open the device.
[SOLVED]
Thanks for all the input, unfortunately it did not help!
Figured out the issue... Apparently if you populate the last PCIe GPU slot on the X570 Taichi Motherboard then the first one becomes inactive somehow. Second one is also populated, used in VM's etc.I populated the last gpu slot as mentioned with a PCIE M.2 Adapter with four 970 Plus 1TB nvme's but I did not use or activate in Unraid until the other day and that is when nvidia-plugin gave out a warning and first gpu slot (P2000) got inactive but recognized in devices.
I've removed the M.2 PCIE card until I get hold of an Threadripper system.
- 2
-
Hi,
Today I got a notification saying nvidia-plugin had crashed (don't remember the detail...and yes I am ashamed) and when I tried to go to the Nvidia-plugin under settings the whole screen/UnRaid froze! After rebooting I deleted the plugin and re-installed to the letter according to your instructions etc. No more freezing issues and the plugin is OK, but now I can not find the Nvidia P2000 gpu card under the info section in the plugin (see picture). Has anyone encountered this issue? Or have I just missed a step during uninstall & install part... Any idea?! The card is no connected to any VMs or bound to VFIO.
Also at htop am getting high cpu usage for this: nvidia-smi -q -x -g GPU-3181...........this is the gpu ID I guess. Could this be the p-state script for power consumptions or me trying nvidia-smi command in console and it's not finding anything?
Steps taken:
1. Uninstall plugin
2. Reboot
3. Install according to instructions
4. Reboot
5. Downgrade to previous driver...no luck
5.1 Reboot
6.0 Upgrade to current version
6.1 Reboot.
-
Solved!
My bad, I've pointed the source file to opencore image instead of MacOS disk image in the same folder!
Great when you discover your own mistakes...but you learn from them I guess! -
<devices> <emulator>/usr/local/sbin/qemu</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='writeback'/> <source file='/mnt/user/domains/Ventura/Monterey GPU/Monterey GPU/BigSur-opencore.img'/> <target dev='hdc' bus='sata'/> <boot order='1'/> <address type='drive' controller='0' bus='0' target='0' unit='2'/> </disk>
Here is the format of vdisk
-
Hi!
Just want to say that I've done this many times and created several VMS of the original VMs (copy or cloned) without issues before.
Steps done to Clone/copy VM:
1. Copy Full folder containing VM from domains
2. Paste in a new folder with new name in domains.
3. Start new Custom VM, copy and paste the XML from original VM
3.1 Give it a new name, delete uuid (new will be generated)
3.2 Delete previous source file and point to the new one (the new folder with copied VM in it)
4. Edit helper script and replace the name of vm with the new and run the script.
5. Voila! as I said, this method has worked until now.
But now after updating to the latest UnRaid OS, I am getting this error when trying create & fire up a NEW VM. Any clues how to fix the image file or some other related issue?
Br,
-
Hardware recommendations for unRAID servers with VM's
https://r.tapatalk.com/shareLink/topic?share_fid=18593&share_tid=54248&url=https://forums.unraid.net/index.php?/topic/54248-Hardware-recommendations-for-unRAID-servers-with-VM%27s&share_type=t&link_source=app
Skickat från min iPhone med Tapatalk -
On 6/17/2022 at 1:17 PM, Squid said:
Probably a coincidence, but you want to recreate the image file https://forums.unraid.net/topic/57181-docker-faq/#comment-564309
Also, I'd hazard a guess that you've been having problem with the image filling up (this may have caused the problem), as your current set value for the size is 100G, but it used to be 200G (and is still mounting that size).
If you've haven't managed to figure out why the image is filling up, I'd switch from using an image to instead use a folder for docker. More space efficient and you never need to worry about the sizing again
Thanks, recreating the image helped.
I have not figured it unfortunately, the only big change I've made is updating the OS. Is there any good guide for frp, image --> folder?
Thanks in advance!
On 6/17/2022 at 1:51 PM, JorgeB said:Btrfs is detecting data corruption in all your pools, you should run memtest.
I run memtest and got zero errors this time...but I did this after I solved the issue above with @Squid's help and a restart.
-
I backed up the flash drive before updating from 6.10.2 to 6.10.3.
After updating Docker Services refuse to start, please see warning texts below. How can those warnings be fixed?
I've attached diagnostics as well.
-
I do not know if this questions has been answered, at least I couldn't find any.
On Plex (linuxserver.io) can I upgrade the libraries within plex as seen in the picture or will this mess everything up? (see attached screenshot)
Bump! Any ideas?
Skickat från min iPhone med Tapatalk -
I do not know if this questions has been answered, at least I couldn't find any.
On Plex (linuxserver.io) can I upgrade the libraries within plex as seen in the picture or will this mess everything up? (see attached screenshot)
-
On 3/30/2022 at 12:00 AM, AnnabellaRenee87 said:
For anyone looking for this in the future, I had to modify the formatting of the script to be the following.
#!/bin/bash nvidia-smi --persistence-mode=1 nvidia-persistenced fuser -v /dev/nvidia*
@SimpleDino and @SpaceInvaderOne Can you all confirm on a system on 6.10.0 RC4?
I have not upgraded yet to RC4, still on RC2 Maybe @SpaceInvaderOne can confirm?
Br,
-
19 hours ago, Jorgen said:
Remote access to lan works, maybe some of the other options too.
The trick is to allow access from the wireguard network to your delugevpn docker (I’m assuming you’re using binhex’s delugevpn).
Add the wireguard network range to the LAN_Network variable of the vpn docker, comma separated from your normal lan range. If you are using the default wireguard settings the range to add is: 10.253.0.0/24
Sent from my iPhone using TapatalkOhh Thanks, it worked perfectly!
- 1
-
2 hours ago, Ernie11 said:
Hello Everyone,
Do es anyone know if the rx 6700 XT is supported yet? Thanks.
Check for supported GPUs here: AMD GPUs
-
@SpaceInvaderOne Thanks a lot for the container update and your guides per usual! Appreciated!
@ghost82 Thanks for all of your good and explanatory inputs! You have helped a lot of rookies like me!
Just want to report a success for once and not an issue/error.Steps I took when updating to new macinabox and using the new scripts on current VMs with complete success:
- Remove Macinabox plus the scripts and also rm -r /mnt/user/appdata/macinabox.
- Download Macinabox with the desired new settings, wait for scripts to load.
- Copy the name of whatever macOS VM you want to update with and input it into the new helper script.
- Run script twice, now XML of that VM is updated.
- Now I can make whatever changes (remove/attach PCI & USB devices, HDDs/SSDs) I want in the VM and then just run the script once or twice without further manual XML editing.
All inputs are welcome if I have forgotten or misunderstood something in the update
Goodnights!
-
I need help with containers that are routed through NordVPN & DelugeVPN container. On the local network I can access all of the routed containers WebUI but remotely (outside of lan network) when accessing through the wireguard vpn I can not access the same containers WebUI. Do I make any sense?
I've tried these peers without any luck:
- Remote access to server
- Remote access to lan
- Remote tunneled access
I have not tried lan to lan access, because honestly I do not know how to set this up if this is the solution.
- 1
-
3 hours ago, delgatto said:
root@a9dc4d0a983b:/# sudo apt update
Get:1 http://archive.ubuntu.com/ubuntu focal InRelease [265 kB]
Get:2 http://archive.ubuntu.com/ubuntu focal-updates InRelease [114 kB]
Get:3 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
Get:4 http://archive.ubuntu.com/ubuntu focal/restricted amd64 Packages [33.4 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB]
Get:6 https://repo.nordvpn.com/deb/nordvpn/debian stable InRelease [6174 B]
Get:7 https://repo.nordvpn.com/deb/nordvpn/debian stable/main amd64 Packages [6280 B]
Get:8 http://security.ubuntu.com/ubuntu focal-security InRelease [114 kB]
Get:9 http://security.ubuntu.com/ubuntu focal-security/universe amd64 Packages [839 kB]
Get:10 http://security.ubuntu.com/ubuntu focal-security/main amd64 Packages [1470 kB]
Get:11 http://security.ubuntu.com/ubuntu focal-security/restricted amd64 Packages [889 kB]
Get:12 http://security.ubuntu.com/ubuntu focal-security/multiverse amd64 Packages [30.1 kB]
Get:5 http://archive.ubuntu.com/ubuntu focal/universe amd64 Packages [11.3 MB]
Get:13 http://archive.ubuntu.com/ubuntu focal/multiverse amd64 Packages [177 kB]
Get:14 http://archive.ubuntu.com/ubuntu focal/main amd64 Packages [1275 kB]
Get:15 http://archive.ubuntu.com/ubuntu focal-updates/universe amd64 Packages [1121 kB]
Get:16 http://archive.ubuntu.com/ubuntu focal-updates/multiverse amd64 Packages [33.7 kB]
Get:17 http://archive.ubuntu.com/ubuntu focal-updates/main amd64 Packages [1940 kB]
Get:18 http://archive.ubuntu.com/ubuntu focal-updates/restricted amd64 Packages [1003 kB]
Get:19 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [23.8 kB]
Get:20 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [50.8 kB]
Fetched 9879 kB in 9min 42s (17.0 kB/s)
Reading package lists... Done
Building dependency tree
Reading state information... Done
6 packages can be upgraded. Run 'apt list --upgradable' to see them.
root@a9dc4d0a983b:/# sudo apt speedtest-cli
E: Invalid operation speedtest-cliYeah sorry, I see others has already commented. I forgot the install part in sudo apt install speedtest-cli.
I have edited the post now also.
-
9 minutes ago, delgatto said:
Hi! Thanx for posting this!
But for me it doesn't work. After sudo apt speedtest-cli it's reporting
E: Invalid operation speedtest-cli.
Don't know why it doesn't work for you. could post your input commands and the following texts?
ASUS ROG Zenith Extreme Alpha X399
in VM Engine (KVM)
Posted
@jbartlett, have your VM's been effected or anything by unraid version 6.12.4?
My ones have stopped outputting display picture to monitor but are accessible with RDP, anydesk etc...but not with sunshine/moonlight because of encoder/decoder (RTX3060) is not working properly.