
Glasti

Members
  • Posts

    40
  • Joined

  • Last visited

Posts posted by Glasti

  1. 58 minutes ago, Glasti said:

    Love the dashboard! I have been trying to set it up and 90% works. The only thing I am not able to figure out is how to get the SSD section to work. I think it is because, for some reason, I am not able to get disks to show here:
    [screenshot attached]

    They all show as in the screenshot.

    I went through my telegraf.conf again with Gilbn's guide, but everything is set up according to the guide.

    Any ideas?

    Found the issue!

    I didn't have `device_tags = ["ID_SERIAL"]` set in `[[inputs.diskio]]`.
    All working now!
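For anyone else hitting this, here is a minimal sketch of the relevant telegraf.conf section; `device_tags = ["ID_SERIAL"]` is the actual fix from above, while the rest of your diskio settings may differ:

```toml
# Collect per-disk I/O metrics and tag each series with the disk's
# serial number, which the dashboard uses to identify the drives.
[[inputs.diskio]]
  device_tags = ["ID_SERIAL"]
```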

  2. Love the dashboard! I have been trying to set it up and 90% works. The only thing I am not able to figure out is how to get the SSD section to work. I think it is because, for some reason, I am not able to get disks to show here:
    [screenshot attached]

    They all show as in the screenshot.

    I went through my telegraf.conf again with Gilbn's guide, but everything is set up according to the guide.

    Any ideas?

  3. 8 minutes ago, hugenbdd said:

    Should not be an issue. Adding to cron is a bit different in unRaid; I can't remember the way they recommend. However, you would want to add something like `/usr/local/sbin/mover.old start`.

    Maybe someone else on the thread can give you more specific cron instructions.

    I will poke around and see if it is possible to either set up multiple cron jobs OR add `/usr/local/sbin/mover.old start` to the mover button. I will post here if I find a solution! Thank you for your work!
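Until a button exists, a cron entry along these lines could do it. This is a sketch: the `mover.old` path comes from hugenbdd's note above, and how you persist custom cron entries across Unraid reboots (e.g. via the User Scripts plugin) depends on your setup.

```shell
# Force a run of the original (pre-plugin) mover twice a day,
# at 03:40 and 15:40; adjust the schedule to taste.
40 3,15 * * * /usr/local/sbin/mover.old start
```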

  4. 3 minutes ago, hugenbdd said:

    Correct.

    I'm looking at adding a button on the page to call the original mover. 

     

    You can still call the original mover from the command line.

    /usr/local/sbin/mover.old

    Thank you for the info; this makes the Mover button in the `Main` tab obsolete. No big deal for me.
    Would it be possible to add multiple cron jobs, so you can force a move multiple times a day if necessary?

  5. On 3/10/2021 at 5:03 PM, hugenbdd said:

    This is expected behavior.

     

    Just for clarity.  When hitting the button, mover will run, however, the cache threshold may not be met, so it doesn't move anything and "skips" the custom mover command. (Check the logs to see the statements)

     

    I will look at putting a new button in, that calls the original mover.  (This would not look at any of the config settings.)

    Does this mean that when the plugin is enabled it will only move once usage is above the set threshold, and only until the threshold is met? That is what it looks to be doing here.

    At the moment I cannot invoke mover manually if usage is below 75%, and it will not move anything below my threshold.

    Edit: I do have a cron schedule set.
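The behavior described above would match a wrapper like this. This is purely my guess at the plugin's logic, not its actual code; `should_move` and the 75% figure are illustrative:

```shell
# Hypothetical sketch: only hand off to the real mover when cache
# usage has reached the configured threshold.
should_move() {
    usage="$1"      # current cache usage, percent
    threshold="$2"  # configured move threshold, percent
    [ "$usage" -ge "$threshold" ]
}

# Example: 60% usage with a 75% threshold means mover is "skipped".
if should_move 60 75; then
    echo "would run /usr/local/sbin/mover.old start"
else
    echo "below threshold, skipping move"
fi
```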

  6. 7 hours ago, quizzy99 said:

    Hi, I'm getting the "No devices were found" error in the plugin and when trying "nvidia-smi", even though it detects my Quadro K4000 in System Devices and with "lspci" in the terminal. I'm on stable Unraid 6.9.0 and Nvidia driver version 2021.03.07. I am using a Gigabyte Z77X-UP4 TH motherboard and have VT-x and VT-d enabled in the BIOS. I tried it with and without the card plugged into a display. I've also tried BIOS resets, many restarts, re-seating the card, legacy boot mode, and checking and unchecking VFIO bind as advised by previous posts. I've also confirmed that the card can be detected in a Windows VM. Any advice would be greatly appreciated. Thanks!

     

    Edit: I'm also getting a weird error in the logs, "RmInitAdapter failed" and "failed to copy vbios to system memory", whenever I try to run nvidia-smi or open the Nvidia driver settings page, as per the image.

    Also, the card outputs properly to the external monitor and functions normally in a Windows computer.

     

    systemDevices.PNG

    nvidiaPluginNoDevices.PNG

    weirdLog.PNG

    I had a similar issue and fixed it by doing the following:

    - Delete the Nvidia plugin

    - Grab a fresh Unraid kernel

    - Reinstall the plugin and the latest driver

    Maybe it helps.

    I have not changed any hardware settings.

    Ryzen 3700X, Strix B450-F and a GTX 1660

  7. On 3/4/2021 at 1:51 PM, Glasti said:

    The issue/error has returned again.

    Testing with transcoding not to RAM but to one of my cache pools. If that doesn't help I am going to run a MemTest.


     

    Update: Hopefully the last one.
    I haven't run MemTest yet, but I tried going back to RC2 and that didn't solve the issue.
    So I decided to uninstall the plugin and upgrade to Stable again. After updating the OS I grabbed the plugin and saw that newer drivers had been added; I was using 455.** before.

    Now I am running driver 460.56 and the issue hasn't occurred for 28 hours, which is great.

    Of course I wasn't smart enough to grab the driver I was using before after reinstalling the plugin, to see if it was an issue with upgrading from RC2 to Stable without reinstalling the plugin and/or driver.

    But I am happy it is (hopefully) solved, and hopefully this can help anyone else who sees this issue.
     

  8. On 2/21/2021 at 2:28 PM, Glasti said:

    Hello,

    Since a couple of days ago my GPU will stop showing in the plugin and with nvidia-smi, and I am seeing these errors in the logs. It will come back, and then disappear again.

    NVRM: GPU 0000:08:00.0: Failed to copy vbios to system memory.
    NVRM: GPU 0000:08:00.0: RmInitAdapter failed! (0x30:0xffff:802)
    NVRM: GPU 0000:08:00.0: rm_init_adapter failed, device minor number 0

    It had worked fine for about 8 months. I am not able to find much information about these messages, but it sounds like it could be a driver issue.

    Any hints on where to start troubleshooting?

    Thanks in advance

    Edit: When it is available and I stress test it, it looks to work fine, so I do not think it is a hardware failure. But I have not tested that yet.

    tower-diagnostics-20210221-1423.zip 155.68 kB · 2 downloads

    nvidia-bug-report.log.gz 1 MB · 1 download

    The issue/error has returned again.

    Testing with transcoding not to RAM but to one of my cache pools. If that doesn't help I am going to run a MemTest.


     

  9. 19 hours ago, ich777 said:

    Have you changed anything lately in the config of your system, or have you installed any new hardware?

    Have you installed a VM or a new container, or bound a device to VFIO?

    What kind of server do you have, a custom one or a prebuilt?

    Thank you for your reply. I should have given a bit more detail.

    It is a custom-built machine:
    - ROG Strix B450-F Gaming board
    - Ryzen 3700X

    I have not changed anything hardware-wise in the last few months, besides replacing some HDDs in December.

    There are no VMs running or any devices bound to VFIO. Also, Plex is the only container using the GPU.

    What I did realize, and forgot to mention here: I recently unplugged the HDMI cable from the monitor but left it plugged into the GPU. I have since removed it.
    The GPU disappeared once after that, but it has been available now for the last ~20 hours.

    Pretty sure the problems started after I left the cable plugged in; it feels like that was causing the issue.

     

  10. Hello,

    Since a couple of days ago my GPU will stop showing in the plugin and with nvidia-smi, and I am seeing these errors in the logs. It will come back, and then disappear again.

    NVRM: GPU 0000:08:00.0: Failed to copy vbios to system memory.
    NVRM: GPU 0000:08:00.0: RmInitAdapter failed! (0x30:0xffff:802)
    NVRM: GPU 0000:08:00.0: rm_init_adapter failed, device minor number 0

    It had worked fine for about 8 months. I am not able to find much information about these messages, but it sounds like it could be a driver issue.

    Any hints on where to start troubleshooting?

    Thanks in advance

    Edit: When it is available and I stress test it, it looks to work fine, so I do not think it is a hardware failure. But I have not tested that yet.

    tower-diagnostics-20210221-1423.zip

    nvidia-bug-report.log.gz

  11. Hey guys,

    Running the Hotio Plex container.
    Unraid 6.9 RC2

    I have some custom mappings to move some folders outside the appdata folder for more efficient backups.

    These are my mappings:

    /config <---> /mnt/user/appdata/plex
    /transcode <---> /dev/shm
    /data <---> /mnt/user/data
    /config/app/Plex Media Server/Media <---> /mnt/user/plex_data/Media
    /config/app/Plex Media Server/Cache/Transcode/Sync+ <---> /mnt/user/plex_data/Cache/Transcode/Sync+
    /config/app/Plex Media Server/Cache/PhotoTranscoder <---> /mnt/user/plex_data/Cache/PhotoTranscoder
    /config/app/Plex Media Server/Metadata <---> /mnt/user/plex_data/Metadata

    Now I have noticed that sometimes metadata is still being written inside the appdata Metadata folder. Also, metadata is not showing properly inside Plex, meaning it is reading from the wrong folder. This seems to happen after a script restart, Docker restart and/or server reboot.
    If I then restart the container manually it looks at the correct folders again.
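For comparison, the mappings above translate to docker run volume flags roughly like this. The image name and this exact command are illustrative, not my real template; note that the nested `/config/app/...` binds sit inside the `/config` bind, so they depend on the parent mount being in place:

```shell
# Illustrative docker run for the mappings listed above (-v host:container).
# Paths with spaces must be quoted; the image/tag is a placeholder.
docker run -d --name plex \
  -v /mnt/user/appdata/plex:/config \
  -v /dev/shm:/transcode \
  -v /mnt/user/data:/data \
  -v "/mnt/user/plex_data/Media:/config/app/Plex Media Server/Media" \
  -v "/mnt/user/plex_data/Cache/Transcode/Sync+:/config/app/Plex Media Server/Cache/Transcode/Sync+" \
  -v "/mnt/user/plex_data/Cache/PhotoTranscoder:/config/app/Plex Media Server/Cache/PhotoTranscoder" \
  -v "/mnt/user/plex_data/Metadata:/config/app/Plex Media Server/Metadata" \
  hotio/plex
```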

     

    I have no idea what is causing this or how to fix it. I know other people with the same container and same setup who are not seeing this issue.

     

    Hope that someone can give me some suggestions on how to possibly fix this!

     

    Thanks

    Edit: I checked my backups from before I upgraded to 6.9 RC1 and RC2, and the mappings worked there. Getting a bit closer to the issue.

  12. 23 hours ago, trurl said:

    I don't use that plugin since many of these buttons are already on the Dashboard page. Are you sure you don't have a problem with flash being read-only or disconnected? You get unclean shutdown if flash can't be written to update the started/stopped status of the array before shutting down.

    Shutting down or restarting from the dashboard works fine. Permissions are something I checked and confirmed after posting here.

    I changed my USB stick to a Cruzer recently, so I cannot disconnect it as easily by bumping against it while moving my case a little. That is something I have accidentally done in the past with other USB sticks.

     

    23 hours ago, Frank1940 said:

    I have not had the same problem with the System Buttons as you BUT I have observed that the Folding@Home docker does not shutdown (within the period of time that I waited).  In the case of this Docker (since it does not write to the array), there was a clean shutdown of the array (hence no parity check on restart) but it did require that I reboot the server manually to complete the reboot.  You might try stopping any Dockers before trying the button and see if that 'fixes' the problem.  (Of course, that would still leave a 'bug' to be resolved but the cause is now more isolated...)

    I deleted the plugin for now; I don't have the time or energy to troubleshoot right now. When I do, it will be a fun little project to see if it's a Docker issue. It's a good point, and it could very well be.

  13. Hello!

    I have been trying to get WireGuard to work in my container; Halianelf gave me some tips on how to get it working, but now my container hangs while starting the WireGuard client.

     

    `2020-12-06 21:06:26,512 DEBG 'start-script' stderr output:
    [#] ip link add wg0 type wireguard
    
    2020-12-06 21:06:26,513 DEBG 'start-script' stderr output:
    [#] wg setconf wg0 /dev/fd/63
    
    2020-12-06 21:06:26,520 DEBG 'start-script' stderr output:
    [#] ip -4 address add 10.13.124.161/24 dev wg0
    
    2020-12-06 21:06:26,523 DEBG 'start-script' stderr output:
    [#] ip link set mtu 1420 up dev wg0
    
    2020-12-06 21:06:26,531 DEBG 'start-script' stderr output:
    [#] resolvconf -a wg0 -m 0 -x
    
    2020-12-06 21:06:26,538 DEBG 'start-script' stderr output:
    could not detect a useable init system
    
    2020-12-06 21:06:26,561 DEBG 'start-script' stderr output:
    [#] wg set wg0 fwmark 51820
    
    2020-12-06 21:06:26,562 DEBG 'start-script' stderr output:
    [#] ip -4 route add 0.0.0.0/0 dev wg0 table 51820
    
    2020-12-06 21:06:26,563 DEBG 'start-script' stderr output:
    [#] ip -4 rule add not fwmark 51820 table 51820
    
    2020-12-06 21:06:26,563 DEBG 'start-script' stderr output:
    [#] ip -4 rule add table main suppress_prefixlength 0
    
    2020-12-06 21:06:26,565 DEBG 'start-script' stderr output:
    [#] sysctl -q net.ipv4.conf.all.src_valid_mark=1
    
    2020-12-06 21:06:26,566 DEBG 'start-script' stderr output:
    [#] iptables-restore -n
    
    2020-12-06 21:06:26,568 DEBG 'start-script' stderr output:
    [#] '/root/wireguardup.sh'`



    Any ideas?

    Thanks!

    Edit: it seems to happen with all VPN containers. I recreated the Docker image and that didn't help either; I am a bit lost.

  14. Hello,

     

    I am seeing this interesting issue where I get the FCP message saying: "Your server has run out of memory."

    This started happening after configuring these awesome scripts:

    https://github.com/SpartacusIam/unraid-scripts

    The scripts don't seem to be the problem; I have checked and confirmed that. But the error comes after the 'backup_all_appdata' script ends.

    The script ends at 04:26 AM and the FCP error comes at 04:40 AM. I normally check the system around 9 AM, and by then there are no RAM issues.

    All my appdata runs off Unassigned Devices drives:

    - The main appdata folder is on an NVMe UA drive

    - The Plex appdata is on an SSD UA drive

    I had set Lidarr to use 4 GB of memory, which it does not like according to the logs. Maybe that was causing the issue; I have now set it to 8 GB.

    Before I start hardware testing I wanted to check in here to see if someone can spot the problem.

    Thanks in advance!!

    Glasti
    tower-diagnostics-20201123-0954.zip

  15. OK. So when I added the new drives I removed a cache drive. For this I moved all shares off the cache drive with the mover function. I didn't rebuild the Docker image. This is what I did last night; I will monitor whether it still happens.

  16. After adding some drives to my array I have been getting the following when my VM (W10) is running. From what I could find on Google it is not anything to worry about, but when the VM is running my unpinned CPUs are at max load.

     

    Quote

    May  6 21:55:34 Tower kernel: CPU 0/KVM: page allocation failure: order:4, mode:0x6080c0(GFP_KERNEL|__GFP_ZERO), nodemask=(null)
    May  6 21:55:34 Tower kernel: CPU 0/KVM cpuset=vcpu0 mems_allowed=0
    May  6 21:55:34 Tower kernel: CPU: 3 PID: 3577 Comm: CPU 0/KVM Tainted: G        W         4.19.107-Unraid #1
    May  6 21:55:34 Tower kernel: Hardware name: Gigabyte Technology Co., Ltd. Z370 AORUS Gaming 7/Z370 AORUS Gaming 7, BIOS F15a 11/28/2019
    May  6 21:55:34 Tower kernel: Call Trace:
    May  6 21:55:34 Tower kernel: dump_stack+0x67/0x83
    May  6 21:55:34 Tower kernel: warn_alloc+0xd6/0x16c
    May  6 21:55:34 Tower kernel: __alloc_pages_nodemask+0xa81/0xae1
    May  6 21:55:34 Tower kernel: ? flush_tlb_kernel_range+0x5e/0x78
    May  6 21:55:34 Tower kernel: dsalloc_pages+0x38/0x5e
    May  6 21:55:34 Tower kernel: reserve_ds_buffers+0x19e/0x382
    May  6 21:55:34 Tower kernel: ? kvm_dev_ioctl_get_cpuid+0x1d3/0x1d3 [kvm]
    May  6 21:55:34 Tower kernel: x86_reserve_hardware+0x134/0x14f
    May  6 21:55:34 Tower kernel: x86_pmu_event_init+0x3a/0x1d5
    May  6 21:55:34 Tower kernel: ? kvm_dev_ioctl_get_cpuid+0x1d3/0x1d3 [kvm]
    May  6 21:55:34 Tower kernel: perf_try_init_event+0x4f/0x7d
    May  6 21:55:34 Tower kernel: perf_event_alloc+0x46e/0x821
    May  6 21:55:34 Tower kernel: perf_event_create_kernel_counter+0x1a/0xff
    May  6 21:55:34 Tower kernel: pmc_reprogram_counter+0xd9/0x111 [kvm]
    May  6 21:55:34 Tower kernel: reprogram_fixed_counter+0xd8/0xfc [kvm]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6b8/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6ac/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: intel_pmu_set_msr+0xf4/0x2e4 [kvm_intel]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6ac/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: kvm_set_msr_common+0xc6e/0xd24 [kvm]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6b8/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6ac/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6b8/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6ac/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: ? vmx_vcpu_run+0x6b8/0xa97 [kvm_intel]
    May  6 21:55:34 Tower kernel: handle_wrmsr+0x4b/0x85 [kvm_intel]
    May  6 21:55:34 Tower kernel: kvm_arch_vcpu_ioctl_run+0x10d0/0x1367 [kvm]
    May  6 21:55:34 Tower kernel: ? futex_wake+0x120/0x147
    May  6 21:55:34 Tower kernel: kvm_vcpu_ioctl+0x17b/0x4b1 [kvm]
    May  6 21:55:34 Tower kernel: ? __seccomp_filter+0x39/0x1ed
    May  6 21:55:34 Tower kernel: vfs_ioctl+0x19/0x26
    May  6 21:55:34 Tower kernel: do_vfs_ioctl+0x533/0x55d
    May  6 21:55:34 Tower kernel: ksys_ioctl+0x37/0x56
    May  6 21:55:34 Tower kernel: __x64_sys_ioctl+0x11/0x14
    May  6 21:55:34 Tower kernel: do_syscall_64+0x57/0xf2
    May  6 21:55:34 Tower kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
    May  6 21:55:34 Tower kernel: RIP: 0033:0x145effd1a4b7
    May  6 21:55:34 Tower kernel: Code: 00 00 90 48 8b 05 d9 29 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 29 0d 00 f7 d8 64 89 01 48
    May  6 21:55:34 Tower kernel: RSP: 002b:0000145efd3fe678 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    May  6 21:55:34 Tower kernel: RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 0000145effd1a4b7
    May  6 21:55:34 Tower kernel: RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000017
    May  6 21:55:34 Tower kernel: RBP: 0000145efda5e400 R08: 0000562643da6770 R09: 00000000ffffffff
    May  6 21:55:34 Tower kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
    May  6 21:55:34 Tower kernel: R13: 0000000000000006 R14: 0000145efd3ff700 R15: 0000000000000000

    tower-syslog-20200506-2157.zip

  17. 1 hour ago, bastl said:

    Worth a try.

     

    Is your Intel GPU enabled and set as the primary GPU? If not, enable it and set it as primary in the BIOS so Unraid itself uses it. In case Unraid picks up your 1080 as primary all the time, put it in another slot and use another GPU in the first slot. In my case the 1080 Ti in the first slot and the 1050 Ti in the third slot are both able to pass through; if I switch them, only the 1080 Ti in the third slot is working. Remember, I'm on AMD TR4 and have no GPU onboard or in the CPU itself.

    Using another slot may change your IOMMU groupings, and the IDs might change as well. Check whether using another slot throws the card into a group with other devices. Also keep in mind that the lower slots are often shared with other devices like USB or SATA controllers.

    Got it fixed! It was a fairly simple fix: I found a working vbios and unplugged the screen from power. I'll definitely keep these tips for possible future troubleshooting if needed!
