jonp

Members
  • Posts

    6,443
  • Joined

  • Last visited

  • Days Won

    24

Posts posted by jonp

  1. FINALLY FIGURED IT OUT!!!  Can't believe how long this took me.  So sorry for the delayed response.  I was overthinking the problem and after reviewing your logs, the issue is that your GPU has a built-in USB controller.  Unraid is binding its own driver to that controller, so the controller isn't free to go to the VM along with the rest of the GPU.  You need to bind the USB controller to the vfio-pci driver to prevent it from being claimed by the full driver.  You can do this from the Tools > System Devices page.  Scroll down to find device 01:00.2 and stub that device.  A reboot will be required thereafter.  Let me know if you need further assistance.
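
    If you want to double-check the result after the reboot, here is a minimal sketch (my own helper, not something that ships with Unraid) that lists every function in a PCI slot along with the kernel driver currently bound to it.  The function name slot_drivers, its optional second argument, and the 0000: domain prefix on the address are assumptions for illustration.

```shell
#!/bin/sh
# Print each PCI function in a slot and the kernel driver bound to it.
# Usage: slot_drivers SLOT [SYSFS_ROOT]   e.g. slot_drivers 0000:01:00
# SYSFS_ROOT is a hypothetical override that only exists to make testing easier.
slot_drivers() {
    slot=$1
    root=${2:-/sys/bus/pci/devices}
    for dev in "$root/$slot".*; do
        [ -e "$dev" ] || continue          # the glob matched nothing
        if [ -L "$dev/driver" ]; then
            # "driver" is a symlink whose target ends in the driver name
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"                   # no driver bound to this function
        fi
        printf '%s %s\n' "$(basename "$dev")" "$drv"
    done
}
```

    Calling slot_drivers 0000:01:00 should show vfio-pci next to 0000:01:00.2 if the stub took effect.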

  2. Hi Lance,

     

    With everything going on here, you may need to work with an expert to review the system and figure out the problem.  All of your posts here contain lots of information, but nothing really crystal clear as to what is causing the issue.  We do offer professional services for a fee if you're interested.  You can schedule a session with us via https://unraid.net/services.  If you'd rather keep working through here, what I would do is completely start from scratch.  Also I wouldn't use an external hard drive the way you are.  I'm also confused as to how libvirt can't find the storage for your VMs unless you're doing something like passing through the storage device that also contains the vdisk images themselves.

  3. Wow that is odd.  The next thing I'm going to want to check is the permissions on the files inside the folders that are showing up empty.  But before we go down that path, how did the data end up on the drives to begin with?  Were the files just copied over the network to the share?  From what machine and using what protocol (SMB, AFP, etc.)?

  4. The first thing I would do is revert to stock Unraid to see if the issue persists.  I would imagine so, but it's good to verify just in case.  As far as the safety of upgrading to a beta, it should be perfectly fine given how late in the beta process we are, but if you'd rather hold off until we reach RC, I'm fairly confident we are going to see that very soon.

  5. What kind of server is this?  I've never seen so many NVIDIA components in an lspci.txt file before ;-).  I'm also noticing two Marvell controllers in there, which are notorious for causing problems if IOMMU is enabled on the motherboard.  What I would suggest is to stay on the version that's working for now or try upgrading to 6.9-beta to see if the upgraded Linux kernel yields better results.  It could be that there is an issue with a device driver that you're using in 6.8.3.  Lastly, if the issue persists, you can upgrade and then attach a monitor and keyboard to your server.  When it boots, log in at the command line and enter the following:

     

    tail /var/log/syslog -f

     

    This will begin printing the system log out to the screen and when it crashes, it should show the last message before the crash which could be a key indicator as to what's going on.

  6. Hi there and thanks for sharing this with us.  So to recap the issue to confirm I'm understanding this correctly, when you add a user share to Unraid and navigate to your server using Finder on Mac, the new share shows up correctly, but if you add new folders inside that share, you're stating that those don't show up within Finder?  What happens if you reboot your Mac?  Any change?  A short video demonstrating exactly what's happening would be ideal (you can use Loom to create it pretty easily and share it here).

  7. I set up a Ubuntu VM pretty quick and getting similar results. I could try reinstalling unraid back to the stable version to verify that everything was fast af. I was just having some issues with creating VMs with my hardware setup and the upgrade seemed to make things much smoother.
    Is there a way to verify on your MikroTik router that traffic between the VM and the host is not actually traversing the physical network?  A 10 gig interface isn't even required to do what you're trying to do.

    Sent from my Pixel 3 XL using Tapatalk

  8. Hi again Matt,

     

    No worries on the delayed reply!  This definitely explains a bit more for me.  I've tried recreating this issue in our lab setup, but I don't have the same ASUS NIC as you.  So to confirm, a direct transfer from Mac to server is attaining pretty high speed, but when the VM is the source instead of the Mac, you see the speeds drop.  Is that correct?  Beyond iperf, have you tested just copying files over?  I'm assuming you're copying them to a share located on SSD(s).

  9. Hi there,

     

    A few things if you want some help with this issue.  First and foremost, please provide a detailed accounting of exactly what you have and have not tried at this point.  The thread you were linked to earlier in this topic was specific to Ryzen issues and there was a lot of detailed feedback in there.  We need to know exactly what you've tried thus far (C-State switches, BIOS updates, disabling overclocking on CPU/RAM, etc.).  In addition, the logs you initially provided are useful only insofar as they provide us with your configuration, but they won't reveal what is causing the crash because they were collected before a crash occurred.  To collect critical diagnostic data from a crash, you should connect a monitor and keyboard to your server and boot it up.  Once at the command line, log in with your 'root' account and from the command prompt, type the following:

     

    tail /var/log/syslog -f

     

    This will begin printing log messages directly to the monitor and when the system crashes, take a picture of what you see on the screen before restarting the server.  This will at least let us know where the crash is stemming from and what the root cause may be.

     

    At the end of the day, this is likely a hardware issue, as if it was software, it would be much more wide-spread.  There are plenty of folks in the community here that are utilizing Unraid with Ryzen and have managed to get past all the quirks that it can bring.  We just need to narrow down what component in your setup is amiss so you can either fix or replace it.

     

    All the best,


    Jon

  10. Have you guys had any progress on this issue? Thanks
    Yes and no. We have determined some of the excess writes on SSDs that some users were seeing may have been due to the boundary size we set which we have resolved. That will be in an upcoming update, but will require a process to convert which we are still sorting out.

    Separately, we are testing SMB performance in a 10 Gbps environment to see how we hold up in that setting. That testing is still underway and should be completed soon.

    However, we haven't been able to reproduce the CPU usage issue reported here just yet, so we are still trying.

  11. Hi there,

     

    I saw your message to support.  First and foremost, it sounds like you desperately need to get yourself a UPS if you're having that many power failures.  Having a power failure while data is being written to the system can cause filesystem corruption and data loss.  From reviewing your logs, you do have a few devices showing issues.  The best advice I can give you at this point would be to start the array in maintenance mode and attempt to do file system checks / repairs.

     

    Start here in the wiki:  https://wiki.unraid.net/UnRAID_6/Storage_Management#Drive_shows_as_unmountable

     

    All the best,

     

    Jon

  12. We may need to consider closing this subforum and directing all support inquiries directly to general support.  Sorry for the lack of attention this thread received.  Had I seen it sooner, I would have told you that your issue really comes down to your hardware.  There is one key problem with your build:  AMD.

     

    AMD hardware just doesn't perform as well for this use-case as does Intel/NVIDIA.  We've seen many AMD-based systems just fail to support IOMMU properly.  We've also seen plenty of AMD GPUs that just don't support function-level resets which are required to pass through a GPU to a VM.  And you've already seen the third problem which is that even if you can get the GPU to pass through, rebooting the guest VM can cause the whole system to crash.

     

    It is for these reasons we never recommend the use of AMD hardware for any setups that involve GPU pass through.  There are some users that have managed to make this work, but for most folks, these kinds of problems are common.  We've contacted AMD about this in the past but unfortunately our notices to them have fallen on deaf ears.

     

    The best advice I can give you would be to A) contact AMD about your issue or B) switch to an Intel + NVIDIA setup.

  13. Hi there,

     

    If everything on this system was working fine, then you had an unexpected power outage and the behavior has only happened since then, it's pretty clear that something went awry with the hardware and it needs to be replaced.  Now we begin the journey of identifying what's broken.  First step is to come up with steps to reproduce.  Does it crash after boot even if you never start the array?  If so, try removing all components that you can (graphics cards, hard drives, SSDs, USB devices, etc.).  Boot up the system and see if you can get it to crash with none of those devices installed.  If the array has to be started for the crash to occur, then leave your storage devices attached but remove everything else.  If it still crashes, you can try replacing the memory (or at least reducing the number of RAM sticks you have installed).  If it still crashes, it is likely an issue with the motherboard and it needs to be replaced.

  14. Hi there,

     

    What can you tell us about this network controller?

     

    04:00.0 Ethernet controller [0200]: Broadcom Inc. and subsidiaries NetXtreme BCM5723 Gigabit Ethernet PCIe [14e4:165b] (rev 10)
        Subsystem: Hewlett-Packard Company NC107i Integrated PCI Express Gigabit Server Adapter [103c:705d]
        Kernel driver in use: tg3
        Kernel modules: tg3
     

    It appears the adapter itself is changing the MAC, and I don't know much about that specific model.  This is definitely not an Unraid-specific issue as far as I can tell.  Also, in the logs you provided, your MAC hasn't changed during that boot sequence.  Can you send us logs that you capture after the MAC address has changed in a single boot?
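
    One way to catch the change within a single boot is to save the MAC right after boot and compare against it later.  Here is a rough sketch, assuming the interface is named eth0 and that its address can be read from /sys/class/net/<iface>/address.  The function names, the saved-file path, and the optional sysfs override argument are made up for illustration.

```shell
#!/bin/sh
# Print the current MAC address of an interface, read from sysfs.
# Usage: current_mac IFACE [SYSFS_ROOT]
current_mac() {
    iface=$1
    root=${2:-/sys/class/net}      # hypothetical override, handy for testing
    cat "$root/$iface/address"
}

# Compare the current MAC with one saved earlier in the same boot.
# First call saves the address; later calls report "same" or "changed".
# Usage: check_mac IFACE SAVED_FILE [SYSFS_ROOT]
check_mac() {
    iface=$1
    saved_file=$2
    root=${3:-/sys/class/net}
    now=$(current_mac "$iface" "$root")
    if [ ! -f "$saved_file" ]; then
        printf '%s\n' "$now" > "$saved_file"
        echo "saved $now"
    elif [ "$now" = "$(cat "$saved_file")" ]; then
        echo "same $now"
    else
        echo "changed $(cat "$saved_file") -> $now"
    fi
}
```

    Run check_mac eth0 /tmp/eth0.mac once after boot and again when the problem shows up.  If it reports "changed", that is the moment to capture the logs.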

  15. If you do this test and the errors go away, then it looks like you'll need a new motherboard if you want to use IOMMU.  Unfortunately lots of mobo manufacturers don't thoroughly test this feature and sometimes it can cause problems.

  16. Are you using any VMs and if so, are you using them with any type of PCI device pass through?  If so, try stopping their usage and in fact, go into the motherboard BIOS and disable IOMMU.  You may have faulty hardware that doesn't like IOMMU.  That was the top search result on Google when I searched for those error messages.
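
    After changing the BIOS setting, you can confirm from the command line whether the kernel still sees an active IOMMU.  This is a small sketch, assuming the usual /sys/kernel/iommu_groups layout on Linux.  The function name iommu_group_count and its optional path argument are hypothetical.

```shell
#!/bin/sh
# Count the IOMMU groups the kernel has created.
# Usage: iommu_group_count [GROUPS_DIR]
iommu_group_count() {
    root=${1:-/sys/kernel/iommu_groups}   # override exists only for testing
    count=0
    for g in "$root"/*; do
        # Each group is a numbered directory; an unmatched glob is skipped here
        if [ -d "$g" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

    A count of 0 suggests IOMMU is off.  Anything higher means devices are still being placed into IOMMU groups, so the setting may not have taken effect.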
