almulder


Posts posted by almulder

  1. I currently have two VMs running Windows 11 on separate pass-through hardware configurations. One of these VMs, dedicated for personal use, operates without any issues. However, the other VM, which I primarily use for work, encounters occasional problems.

     

    Both VMs utilize different NVMe drives as their primary storage (with Windows installed directly rather than on a virtual disk). The hardware pass-through for both VMs includes USB ports, NVMe storage (designated as the C: drives), and dedicated graphics cards.

     

    While the personal VM functions smoothly, the work VM occasionally experiences lock-ups over the weekend, requiring a forced stop instead of a simple reset. Upon attempting to restart the work VM, I encounter an error message.

    (screenshot: Unraid error message)

     

    (screenshot: IOMMU groups)

     

    The only effective solution I've found is to shut down the entire server, as rebooting alone leaves an IOMMU error persisting on the device. After a complete server shutdown and subsequent restart, the work VM functions again.

     

    I've attempted troubleshooting by swapping the NVMe drives' locations between the personal and work VMs, as well as realigning the IOMMU allocation in the VM settings. However, these efforts haven't resolved the issue, indicating it's not related to the NVMe port allocation.

    Even after cloning the NVMe drive and installing a new one, the problem persists, ruling out a fault with the NVMe drive itself. Both VMs share identical power settings within Windows, with all peripherals configured to remain active and prevent sleep mode.

     

    Despite these measures, the issue persists, particularly noticeable during periods of VM inactivity over weekends. I suspect the problem lies either with Unraid or within the Windows environment, but I've been unable to pinpoint the exact cause.

     

    I'm reaching out here in hopes that someone might offer insights or suggestions toward resolving this perplexing issue.
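
For anyone chasing a similar IOMMU error, a quick way to see how the host has grouped the pass-through devices is to walk sysfs. This is a generic sketch using standard Linux paths, nothing Unraid-specific assumed:

```shell
#!/bin/bash
# Print each IOMMU group and the PCI devices inside it.
list_iommu_groups() {
  shopt -s nullglob
  local found=0 g d out
  for g in /sys/kernel/iommu_groups/*; do
    found=1
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
      # Fall back to the bare PCI address if lspci is unavailable
      out=$(lspci -nns "${d##*/}" 2>/dev/null)
      echo "  ${out:-${d##*/}}"
    done
  done
  [ "$found" -eq 0 ] && echo "No IOMMU groups found (IOMMU disabled or unsupported)"
  return 0
}
list_iommu_groups
```

Comparing this output before and after a lockup can show whether the device the VM passes through has moved or dropped out of its group.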

     

  2. @SimonF

    So I have my QNAP TVS-871 running the plugin, and it seems to work. My question is this: is there a way to completely customize the screen? For example, I want to enable it during boot and have it say something like "Loading Unraid...", then once loaded change to "Array Stopped", then once the array is starting say "Array Loading", then once loaded say "Array Loaded", then wait 10 seconds and show the IP address, then after 10 minutes turn the screen off. If a button is pressed (the 2 buttons on the QNAP), turn the display on and show x; if the button is pressed again, show y; if pressed again, show z, etc.

     

    I am thinking something along the lines of User Scripts to control the screen once it's loaded, if that's possible?

     

    Is this even possible?
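
If the plugin ends up exposing a command to write to the panel, the sequence above is basically a small state machine. Here is a sketch of what such a user script could look like; `lcd_write` is a HYPOTHETICAL stand-in for whatever command the plugin actually provides, stubbed with echo so the flow is runnable:

```shell
#!/bin/bash
# Sketch of the boot/array state machine described above.
# 'lcd_write' is a HYPOTHETICAL stand-in for the LCD plugin's real
# write command -- stubbed here with echo so the logic can be tested.
lcd_write() { echo "LCD: $*"; }

show_state() {
  case "$1" in
    booting)       lcd_write "Loading Unraid..." ;;
    array_stopped) lcd_write "Array Stopped" ;;
    array_loading) lcd_write "Array Loading" ;;
    array_loaded)  lcd_write "Array Loaded" ;;
    show_ip)       lcd_write "IP: $(hostname -I 2>/dev/null | awk '{print $1}')" ;;
    *)             lcd_write "Unknown state" ;;
  esac
}

# Example sequence (timings trimmed; a real script would sleep between
# states and poll the array status between transitions):
for s in booting array_stopped array_loading array_loaded; do
  show_state "$s"
done
```

The button-cycling part (x, y, z) would hang off whatever event the plugin raises for a button press, if it exposes one at all.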

     

    Also, thanks for getting this far. It was annoying to see the screen never update to anything until now.

     

    Note: On the settings page in Unraid, the "Online Manual" link still points back to the APC UPS manual. Maybe update it to bring you to this thread?

     

    I changed line 66 of 'LCDSettings.page' to this:

    <span style="float:right;margin-right:10px"><a href="https://forums.unraid.net/topic/136952-plugin-lcd_manager/" target="_blank" title="_(UNRAID Forums: [PLUGIN] LCD_MANAGER)_"><i class="fa fa-file-text-o"></i> <u>_(Unraid Plugin Forum)_</u></a></span>


     

    and it now looks like this:

    (screenshot: updated settings page link)

     

     

    I also updated the icon to be more in line with the Unraid icons (attached if you want to use it).

    (screenshot: updated icon preview; lcd.png attached)

     

    So I have my trusty QNAP TS-871 running Unraid (current version). I never get CRC errors on spinning drives, but I have five SSDs, two of them brand new, and they all throw CRC errors. I did a preclear and still got CRC errors, but if I replace them with a spinning disk I get no issues.

     

    My other server also hates SSDs. So my question is this: does Unraid hate SSDs? (My NVMe drives are fine.)

     

    Seems like almost everyone I know who uses SSDs gets CRC errors (at least 10 people). It just seems odd.
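
For what it's worth, CRC errors are counted by SMART attribute 199 (UDMA_CRC_Error_Count) and usually indicate a bad link (cable, backplane, or port) rather than the drive or OS. A quick way to check the counter is to parse `smartctl -A`; this sketch demonstrates the parsing on a captured sample line so it's self-contained:

```shell
#!/bin/bash
# Extract the UDMA CRC error count (SMART attribute 199) from
# 'smartctl -A' output. CRC errors are a link/cabling symptom.
crc_count() {
  awk '$1 == 199 || $2 == "UDMA_CRC_Error_Count" { print $NF; found=1 }
       END { if (!found) print "n/a" }'
}

# Normally you would run:  smartctl -A /dev/sdX | crc_count
# Demo with a captured sample line so the sketch runs anywhere:
sample='199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 7'
echo "$sample" | crc_count
```

If the counter rises only on the SSD bays, that points at the QNAP backplane/link negotiation with those drives rather than at Unraid itself.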

  4. On 6/24/2023 at 1:24 PM, ziopimpi said:

    I solved the problem, the docker configuration was wrong. Here there's the right one. The configuration file is not necessary anymore with this simple configuration.

    docker run
      -d
      --name='Rec'
      --net='bridge'
      --privileged=true
      -e TZ="Europe/Berlin"
      -e HOST_OS="Unraid"
      -e HOST_CONTAINERNAME="Rec"
      -e 'OTR_PORT'='0'
      -l net.unraid.docker.managed=dockerman
      -p '8083:8083/tcp'
      -v '/mnt/user/software/OwnTracks/Storage/':'/store':'rw'
      -v '/mnt/user/software/OwnTracks/Logs/':'/log':'rw' 'owntracks/recorder'

     

    Also, on the mobile device all the HTTP requests must be submitted to the /pub endpoint.

    So I just downloaded and installed the Docker container. Can you elaborate on the settings you show and where I enter the info? And are you just using your external IP address or a CNAME?
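
To make the /pub part concrete: the recorder accepts the OwnTracks JSON location payload over HTTP. Below is a sketch of a minimal payload and the curl call the mobile app effectively makes; the host, user, and device names are placeholders, and the curl line is shown as a comment rather than executed:

```shell
#!/bin/bash
# Minimal OwnTracks location payload for the recorder's /pub endpoint.
# Coordinates, timestamp, and tracker ID here are example values.
payload='{"_type":"location","lat":52.52,"lon":13.40,"tst":1696867200,"tid":"AB"}'
echo "$payload"
# Real publish (replace host/user/device with your own):
#   curl -X POST -H 'Content-Type: application/json' \
#        -d "$payload" 'http://YOUR-SERVER:8083/pub?u=user&d=device'
```

Whether you point the app at an external IP or a CNAME only changes the hostname in that URL; the payload and endpoint stay the same.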

     

    So I have my UPS do a weekly test for 10 seconds. When it does, it triggers an email that it's on battery, but it doesn't send one that it's back on mains. The test is so short that, looking at the logs, it only sends the one email. Is there a way to set a time that it must be on battery before sending a notification, something like 15-30 seconds?

     

    Oct  9 11:05:08 HuskyServer apcupsd[12213]: Power failure.
    Oct  9 11:05:14 HuskyServer apcupsd[12213]: Running on UPS batteries.
    Oct  9 11:05:14 HuskyServer apcupsd[12213]: Mains returned. No longer on UPS batteries.
    Oct  9 11:05:14 HuskyServer apcupsd[12213]: Power is back. UPS running on mains.
    Oct  9 11:05:14 HuskyServer sSMTP[23651]: Creating SSL connection to host
    Oct  9 11:05:14 HuskyServer sSMTP[23651]: SSL connection using TLS_AES_256_GCM_SHA384
    Oct  9 11:05:17 HuskyServer sSMTP[23651]: Sent mail for ***********@gmail.com (221 2.0.0 closing connection x14-20020aa784ce000000b0068fb5e44827sm6553524pfn.67 - gsmtp) uid=0 username=****** outbytes=778
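
apcupsd actually has a knob for exactly this: the `ONBATTERYDELAY` directive sets how many seconds a power failure must persist before the onbattery event (and its email) fires. Wherever your apcupsd.conf lives (typically /etc/apcupsd/apcupsd.conf; the Unraid UPS settings page may not expose it directly), something like this should suppress a 10-second self-test (the 30-second value is just an example):

```
# apcupsd.conf
# Seconds after a power failure is detected before the "on battery"
# event (and its notification) fires; a 10-second self-test then
# never triggers an email.
ONBATTERYDELAY 30
```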

     

     

    Also, what is a free Android push notification service that works best with the Unraid agents? I see several options but wonder which one is free.

    Got my new server up and running and reinstalled; however, I am unable to access the GUI. Looking at the log I see this error (token removed from my post):

     

    2023-10-02 16:49:15,300 DEBG 'start-script' stdout output:
    [warn] Unable to successfully download PIA json payload from URL 'https://10.10.112.1:19999/getSignature' using token 

     

    It tries this 10 times, then stops and never connects. I never had an issue on the old server.

    So I have set up another Unraid machine with two VMs, each with Windows 11 Pro installed.

     

    Both VMs have dedicated video cards/sound, NVMe, and USB ports passed through. Randomly, one CPU core on a VM will get stuck at 100%.

     

    These are the cores assigned to one VM, and as you can see, one is now stuck at 100%:

    (screenshot: pinned CPU cores with one at 100%)

     

    When looking at Task Manager in the VM, no processes are really hitting the CPU, and looking at Performance / CPU / Logical Processors, they are not being used much; the core in question does not show 100% usage.

     

    Is this just an issue with the Unraid GUI?

     

    It has happened on my other VM also, but on a different core.

     

    Things I should look at?
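
One way to tell whether the pegged core is real load or just a dashboard artifact is to bypass the GUI and read the host's own counters. This sketch samples /proc/stat twice, one second apart, and reports per-core busy percentage (standard Linux /proc format assumed):

```shell
#!/bin/bash
# Cross-check the Unraid dashboard: compute per-core busy% straight
# from the host's /proc/stat over a 1-second window.
cpu_busy() {
  local s1 s2
  s1=$(grep '^cpu[0-9]' /proc/stat); sleep 1
  s2=$(grep '^cpu[0-9]' /proc/stat)
  paste <(printf '%s\n' "$s1") <(printf '%s\n' "$s2") | awk '{
    t1 = 0; for (i = 2;  i <= 11; i++) t1 += $i   # first sample, all jiffies
    t2 = 0; for (i = 13; i <= 22; i++) t2 += $i   # second sample
    idle1 = $5 + $6; idle2 = $16 + $17            # idle + iowait
    dt = t2 - t1
    if (dt > 0) printf "%s: %.0f%% busy\n", $1, 100 * (dt - (idle2 - idle1)) / dt
  }'
}
cpu_busy
```

If /proc/stat agrees with Task Manager (the core is mostly idle) while the dashboard shows 100%, that points at a display issue rather than a stuck vCPU.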

  8. 35 minutes ago, Mainfrezzer said:

    But i did find this post on level1tech with a "hacky way", in the bottom of the comment chain, to disabling the logging for that issue https://forum.level1techs.com/t/asus-pro-ws-wrx80e-sage-dmesg-is-full-of-corrected-pcie-and-or-aer-errors/178004
     

     

    Thanks for that link. I think I have tracked down my issue to my 980 Pro NVMe; it's the only one throwing the error. All my other NVMe drives in my cards are 990 Pro 2TBs (except for a 980 Pro 1TB).

     

    IOMMU group 13:			 	[144d:a80a] 62:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9A1/PM9A3/980PRO
    [N:0:6:1]    disk    Samsung SSD 980 PRO 2TB__1                 /dev/nvme0n1  2.00TB

     

    Seems others have the same issue if they are not 990 Pros. I have ordered a replacement 990 (I can use the 980 elsewhere), and I believe this will solve my issue (hopefully). Thanks again so much for helping. (Google was not my friend this time; I did not think to search for my motherboard as the issue.)

  9. 23 minutes ago, Mainfrezzer said:

    mhmm thats unfortunate.

    overall i would say its save to ignore ("It has been corrected by h/w and requires no further action")

    But i did find this post on level1tech with a "hacky way", in the bottom of the comment chain, to disabling the logging for that issue https://forum.level1techs.com/t/asus-pro-ws-wrx80e-sage-dmesg-is-full-of-corrected-pcie-and-or-aer-errors/178004


    i would just leave it as is, as long its just throwing a fuzz from time to time in the logs.

    Ya, I did notice I missed the 'e', so I guess I deleted my post as you replied, but I did fix that and the issue is still there.

     

    Guess I will leave it as is until I notice an issue. (I did notice a bit of a speed increase when booting after I updated it.)

     

    For reference for others, this is what my line looks like now:

    label Unraid OS GUI Mode
      menu default
      kernel /bzimage
      append pci=realloc=off pcie_aspm=off isolcpus=18-31,50-63 initrd=/bzroot,/bzroot-gui

     

  10. 24 minutes ago, almulder said:

    So I have moved to a new Unraid system and needed to restore my appdata (fresh install of everything). I have my backups from just before I decommissioned my old server; restore sees them and tries to restore them, but I get an error when I do.

    I have tried one at a time also and get the same errors. Here is the log from when I tried to restore SWAG.

     

    So I figured out my issue. My new setup is ZFS with datasets, not folders. I renamed the dataset appdata to appdata1, then ran the restore, and it worked great.

     

    So in case people run into this issue: you can't restore into the folders if they were converted to datasets (and I'm sure vice versa).
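
For anyone hitting the same wall, the fix above amounts to renaming the dataset out of the way so the restore tool can recreate a plain folder. The pool/dataset names here are examples; adjust them to your cache pool:

```
# Rename the dataset so the restore can create a plain 'appdata' folder
zfs rename cache/appdata cache/appdata1
# ...run the appdata restore, then migrate or clean up appdata1 as needed
```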

  11. I am seeing the following error. What could be causing it?

     

    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 514
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]: It has been corrected by h/w and requires no further action
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]: event severity: corrected
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:  Error 0, type: corrected
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   section_type: PCIe error
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   port_type: 0, PCIe end point
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   version: 0.2
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   command: 0x0406, status: 0x0010
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   device_id: 0000:62:00.0
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   slot: 0
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   secondary_bus: 0x00
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   vendor_id: 0x144d, device_id: 0xa80a
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   class_code: 010802
    Sep 23 14:18:28 HuskyServer kernel: {9}[Hardware Error]:   bridge: secondary_status: 0x0000, control: 0x0000
    Sep 23 14:18:28 HuskyServer kernel: nvme 0000:62:00.0: AER: aer_status: 0x00000001, aer_mask: 0x00000000
    Sep 23 14:18:28 HuskyServer kernel: nvme 0000:62:00.0:    [ 0] RxErr                  (First)
    Sep 23 14:18:28 HuskyServer kernel: nvme 0000:62:00.0: AER: aer_layer=Physical Layer, aer_agent=Receiver ID

     

    So I have moved to a new Unraid system and needed to restore my appdata (fresh install of everything). I have my backups from just before I decommissioned my old server; restore sees them and tries to restore them, but I get an error when I do.

    I have tried one at a time also and get the same errors. Here is the log from when I tried to restore SWAG.

     

    (It just repeats the same message for all the files.)

    (screenshot: restore error log)

     

    Debug log ID: d4fa841f-c17f-4dd1-b7bc-d435c0372cb4

    Ok, I got help from the Discord, and the issue was that I had things in my queue before I enabled "allow GPU workers to do CPU tasks". I restarted from scratch, made sure it was checked before adding libraries, and started a fresh scan; now the issue is gone. Hope this helps others in the future.

     

    Note: It was in the staging area that said CPU required, and only the CPU option was working then; now the GPU does it all.

    Anyone else using the /tmp RAM folder as the transcoding temp location? I am, but it just fills up the /tmp folder (RAM) and eventually locks me out of the Unraid GUI as there is no RAM left.

     

    I keep getting these errors; does anyone know how to fix them? (After the / is the file name.)

    (screenshot: Tdarr cache-file errors)

     

    It's not removing the cache files (from MC):

    (screenshot: leftover cache files in /tmp)

     

    Note: Both Tdarr and Tdarr-Node have the same paths set up.

    (screenshot: Tdarr path mappings)
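
One way to keep a RAM-backed transcode scratch from starving the server is to give Tdarr its own size-capped tmpfs instead of sharing /tmp, and periodically sweep stale cache files. A sketch; the mount point, 8G size, age threshold, and cache path are all assumptions to adapt:

```shell
#!/bin/bash
# Purge cache files older than 60 minutes from a transcode scratch dir.
clean_stale() {
  find "$1" -type f -mmin +60 -delete 2>/dev/null || true
}

# Instead of sharing /tmp, a dedicated size-capped tmpfs keeps a runaway
# transcode from eating all RAM (run once, as root):
#   mkdir -p /mnt/transcode
#   mount -t tmpfs -o size=8G tmpfs /mnt/transcode
clean_stale /tmp/tdarr-cache   # hypothetical cache path
```

With a capped tmpfs, a runaway job fills its own mount and fails, rather than exhausting system RAM and locking up the GUI; the sweep function could run from User Scripts on a cron schedule.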

  15. On 9/5/2023 at 5:50 PM, Mainfrezzer said:

    Just as a one off random chance shot in the dark.... You're not trying to boot in uefi mode are you? The 730 only works in legacy/bios mode. 

    Yes, UEFI mode, and I did get it to work. I had the most current Nvidia driver installed, but that kept giving me issues, so I saw that version 470.199.02 was listed and thought to give that a try. I installed that, and the GT 730 booted into GUI mode without issue. Took me forever to figure out, but it's resolved now.

     

    (screenshot: GUI mode booting successfully)

    So I have a new server that I am getting ready: a fresh new thumb drive with 6.12.4 installed. I try to boot into GUI and non-GUI mode, and the screen stops responding at the line "ACPI: BUS TYPE DRM_CONNECTOR REGISTERED".

     

    I read up on it; it's because I have an Nvidia card installed without the drivers. So I put in a Radeon card, and that works fine. I installed the Nvidia driver, rebooted, and once back in the GUI I shut down, swapped cards, and it's stuck again. I can access the GUI via IP address, but it still locks up at "ACPI: BUS TYPE DRM_CONNECTOR REGISTERED" regardless of whether I select GUI boot or not.

     

    Plus, the Nvidia driver page shows this:

    (screenshot: Nvidia driver plugin page)

     

    Yet I have the latest driver installed.

    (screenshot: installed driver version)

     

    How can I get an Nvidia card to work?

     

    Note:

    Motherboard = ASUS Pro WS WRX80E-SAGE SE WIFI

    Video Card: Nvidia GT 730

  17. So I am posting this for a friend because they are busy at work.

     

    He has Unraid 6.12.3 and has been running for several weeks on the upgrade without issue. Then he had a drive fail, and while rebuilding, another drive in the array failed; but that is due to the age of the drives and not the reason for the post.

     

    He has removed the bad drives and started a new config, and now when he starts Docker it locks up and nothing will respond. Everything worked fine prior to the drive crash; the Docker image was on cache, so it was not affected by the drives that crashed. He even went as far as removing docker.img and starting fresh: same issue. He has been pulling his hair out over the weekend trying to get it back up and running, but at every turn it keeps locking up.

     

    Can you take a look at the two different diagnostics he was able to pull and see if you see anything that would cause issues other than the drives failing? He also has to force-shutdown his system each time.

     

    Thanks

    tower-diagnostics-20230828-1954.zip tower-diagnostics-20230828-2013.zip

    Trying to install from Community Apps, but it errors out, saying the package does not exist; same with the developer version.

     

    plugin: installing: disklocation-master.plg
    Executing hook script: pre_plugin_checks
    plugin: downloading: disklocation-master.plg ... done
    
    Executing hook script: pre_plugin_checks
    
    Removing old plugin data before installing, if they exists...
    
    Installing plugin...
    
    Plugin folder /boot/config/plugins/disklocation already exists
    
    Checking existing package /boot/config/plugins/disklocation/disklocation.2023.08.21.zip...
    
    Latest package does not exist /boot/config/plugins/disklocation/disklocation.2023.08.21.zip
    
    Saving any previous packages from /boot/config/plugins/disklocation
    
    Attempting to download plugin package https://github.com/olehj/disklocation/archive/master.zip...
    
    Package server down https://github.com/olehj/disklocation/archive/master.zip - Plugin cannot install
    
    Reverting back to previously saved packages...
    
    No previous packages to restored
    
    Plugin install failed
    plugin: run failed: '/bin/bash' returned 1
    Executing hook script: post_plugin_checks

     

    I manually downloaded "master.zip", renamed it to "disklocation.2023.08.21.zip", copied it to "/boot/config/plugins/disklocation/", and then ran Community Apps, and it installed since I manually put it there; otherwise it would not download for some reason. Just wanted to make you aware.
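
For anyone else hitting this, the manual workaround boils down to staging the archive under the filename the .plg expects before re-running the install from Community Apps:

```
# Fetch the package the plugin installer fails to download, and stage it
# under the name the .plg file expects, then re-run the install from CA
wget -O /boot/config/plugins/disklocation/disklocation.2023.08.21.zip \
     https://github.com/olehj/disklocation/archive/master.zip
```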