jonp

Administrators
  • Content Count

    6143
  • Days Won

    17

jonp last won the day on July 15 2019

jonp had the most liked content!

Community Reputation

388 Very Good

About jonp

  • Rank
    Advanced Member

Converted

  • Gender
    Male
  • URL
    http://lime-technology.com
  • Location
    Chicago, IL

  1. A few things I'd like you to try: 1) Check for a BIOS update. 2) Create a new VM using SeaBIOS instead of OVMF. 3) Create a new VM using Q35 instead of i440fx. Try the combinations SeaBIOS + Q35, SeaBIOS + i440fx, and OVMF + Q35, and see if any of them has an impact on the crashing. If not, this may be a hardware / BIOS issue that we can't resolve from the software side. I know that AMD offers a good price-to-performance product, but unfortunately their testing in the VM department leaves a lot to be desired. There is only so much we can do from the software side to quirk the kernel/QEMU. Your last resort would be to contact the motherboard manufacturer and describe the problems you are having to see if they have a beta BIOS that may resolve it for you.
  2. We're not opposed to it. The big thing is that the user shouldn't have to download and install any additional "components" in order to make it work. So it has to be a self-contained executable on all platforms.
  3. There are additional things that our creator does that a simple IMG file does not. Our tool validates the USB flash GUID as usable (most of the time ;-), allows the user to toggle EFI boot mode, and lets them customize hostname and networking options. It even lets the user select which release to install (from a backup, the current available release, or from our Next branch). These features are important to ease of use for new users, and while we can appreciate that not everyone needs them, those that do really appreciate them. It is out of scope for the purpose of this RFQ, but know that we are investigating other licensing methods for future inclusion. Changing licensing is always a real iceberg of a problem. It seems small and simple from above the water line, but below it is a gigantic thing just waiting to sink your ship ;-). That is going to have to be another battle for another day. While we definitely appreciate what certain users want, we have to address the wider market of users that aren't as savvy. I definitely agree that if you're savvy enough to build a computer, you're probably savvy enough to figure out how to image a USB flash using some generic tool, but we're not just targeting that kind of customer, and perhaps longer term users won't be building their own servers at all. The point is, our flash creator tool should work fine, but it's been a bit more of a bear to maintain than we'd like, so we're looking for offers from developers that want to earn a little extra cash to help build this thing. And yeah, we probably will have to fix it again after the next Mac release comes out, but that's fine and something we're also willing to accept.
  4. Hi Unraid Community! We have a special request for anyone who is familiar with the work required and wants to make a little $cash! A few years back we released our own USB flash creator tool for Unraid OS. For those of you who remember, installation of Unraid used to require a manual process (documented here), but we wanted this new tool to be a far easier way to get up and running. Here we are a few years later and the tool desperately needs an update, especially the macOS version. Our problem is that the development team is heads down focused right now on getting 6.9 and 6.10 out the door. As such, we wanted to throw out a request to our Community to see if anyone has the tools and talent to help us with this. This is a formal RFQ (Request for Quote) to correct issues in the current USB flash creator for Unraid OS, for both Windows and Mac platforms. We're not necessarily looking for any increased functionality at this time, though creative ideas on how to make it better will be considered. To respond to this RFQ, please email jonp@lime-technology.com with your bid and time estimate for the work. We will update this post once a bid has been accepted. If you have questions regarding the RFQ, please post them here so our responses can be made in the post publicly for all to see. Thanks everyone!! All the best, Team Lime Tech
  5. Hi there, Saw your email into support and wanted to chime in on your thread here. Unfortunately johnnie.black is right in that you're going to need to take the "one at a time" approach to figure out the root cause. The main problem here is that there wasn't some "event" that occurred prior to these issues that we can point to. Everything was fine until it wasn't. When issues like that happen, 99 times out of 100 it's because of something amiss with the hardware or a plugin/container update that broke something. Do you have your containers set to auto-update or do you manually update them? You can absolutely check out htop through a command line (just type htop from a terminal session) and see more detailed process reporting, but even then, you will likely still have to resort to shutting down all your containers and letting the system run for a while to see if the CPU usage still spikes at random. If it doesn't, start slowly turning on containers one by one until you find the culprit. I wish I had better advice for you, but again, when the issues just come out of nowhere like this and there wasn't some event that occurred right before the issues manifested, there is just no other way to narrow it down.
  6. What would be helpful is to know how long the server was operational before these issues came up and if anything changed in the week or so prior to the issues occurring. If nothing changed and the server all of a sudden started having this behavior, it really has to be something amiss with the hardware. Maybe dust buildup shorted something out or is causing heat or other issues. The PSU might be failing (a failing PSU could cause issues if it can't supply enough power or supplies too much power under certain situations). You can try running a memtest, though that is only one possible culprit. I wish I had more advice for you in this scenario, but unless you can point to a trigger that is causing the issue, it's really like looking for a specific needle in a pile full of needles ;-).
  7. Also, if anyone has seen this issue affect their cache pool using HDDs, can you please reply in this thread and let us know? I'm fairly certain this is an SSD-only issue, but better to ask than assume.
  8. Hi everyone and thank you all for your continued patience on this issue. I'm sure it can be frustrating that this has been going on for as long as it has for some of you and yet this one has been a bit elusive for us to track down as we haven't been able to replicate the issue, but we just ordered some more testing gear to see if we can and I will be dedicating some serious time to this in the weeks ahead. Gear should arrive this weekend so I'll have some fun testing to do during the 4th of July holiday (and my birthday ;-).
  9. Hi there, Saw your email into support and have read through this thread. You're definitely getting some good advice from folks in here. Johnnie's last post is especially important, although you can also use the iDRAC connection to print the log to the screen in real time with this command: "tail /var/log/syslog -f". Then when the server hard crashes, you can take a screenshot of what you see and post it back here. The trouble with crashes is that oftentimes the crash occurs before the events can be written to the log file, so having it printed to the screen can be just fast enough to capture the crash event in the log. It would be helpful to know when these issues started occurring. If your hardware has been fine for months and months and all of a sudden these issues come out of nowhere, it is likely the result of a problem in the hardware. It might need better cooling, cabling, etc. Also keep in mind that the temperature warnings in Unraid are generic. You have to look up the actual hardware you're using to find the temp ratings and then manually set those temps in the disk settings page for those devices. Otherwise you may be giving yourself a false impression that heat isn't the issue when perhaps the drives are being forced to operate at a much higher temp than they should be. Lastly, if you are banging your head against the wall on this, the best thing to do is to try reducing the number of components in your system setup (apps, containers, VMs, cache drive) until you isolate the issue. Try running the array without the cache assigned at all. Turn off all Docker containers. Let it just run for a while and see if it crashes. Then slowly start turning services on one by one until you recreate the issue. Once you've found the key variable that causes the crash, we can try to replicate it to determine whether it's a bug, or give you advice on how to solve it within the software via a configuration tweak.
  10. This is a tricky one. I think we need to go step by step in testing to verify the source of the issue. First, disable your plugins, your VMs, and your Docker Containers and reboot. Let the system just idle with the array started for a while and see if the server remains responsive. Then turn on your Docker containers. If you run into issues, reboot with docker disabled, and start turning containers on one by one. If you can get all containers running, next we move on to VMs. There is just nothing glaring in the logs immediately before all the error messages start showing up, so we need to resort to isolating the Apps and VMs to figure it out.
  11. Hi there, Sorry to hear you're having issues. I'm curious if any of your VMs are taking advantage of passing through a PCI device. If so, I'd like to ask that you disable IOMMU on your motherboard, reboot, and try again with your VMs off. From doing some research on some items in the log, it appears that in some cases, certain hardware has been known to cause issues when IOMMU is enabled. It would also help to get the full hardware specs from you (motherboard, etc.) to know what kind of gear we are working with. It's also odd to hear that things were slow, then reboots stopped being effective at resolving the issue, then after a while everything was normal again. That is really odd and more indicative of something else on the network than the server itself (otherwise you'd expect the behavior to remain consistent). Let us know if the behavior happens again and if so, please try the test I am proposing and let us know if that has any impact.
  12. Hi there, On the edit VM page, try changing the USB controller type for what's being emulated. USB device assignment can be tricky and as some upgrades occur to QEMU, you may see some devices not work anymore with the default virtual controller.
  13. Hi Landon and so sorry to hear you are running into trouble! Let's see if we can sort this out. So just to make sure we're on the same page, your first indication of the issue was when you had the VM running (you were in Windows and everything was fine), you left it for a while, and when you came back, the monitor itself was displaying the TianoCore logo and would never proceed to boot after that, correct? Then you tried updating the OS, which required a reboot of the physical hardware. After rebooting, do you A) get the exact same behavior with a TianoCore logo on the monitor or B) does the monitor just stay black now? Next question would be what happens if you remove the USB controller but not the GPU? What about removing the GPU but not the USB controller? I would like to narrow this down to a specific device causing the problem if possible. Lastly, I see your system has an integrated graphics controller on the CPU. Do you have a monitor attached to that? If not, can you? Also check the BIOS settings again, as some motherboards will randomly flip which graphics device is the default for use with the VM. Another thing you can try is to recreate the VM using SeaBIOS instead of OVMF and see if that works. Let us know and I'll keep an eye on this thread to support you.
  14. Hmm, I don't see any network config files in your diagnostics. Try copying the network.cfg file I've attached here into the config folder of your flash and reboot your server. (Attachment: network.cfg)
  15. Hi again, The best advice I can give to try and restore Unraid's networking to a default state would be to delete the network.cfg and network-rules.cfg files from the 'config' folder on the flash device and try rebooting. This will reset the network settings to default. You may also need to delete your libvirt.img file and recreate it as well, but you can test without doing that first and see if just the network files will be enough. A link to what you were reading could be helpful in us determining if this was really a bug upstream or not.
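
For anyone who would rather script the network reset described in the last post, here is a minimal sketch. It is not an official Lime Technology tool. It assumes the flash device is mounted at /boot, so that the 'config' folder is /boot/config, and instead of deleting network.cfg and network-rules.cfg outright it renames them so they can be restored later. The function name, the backup suffix, and the assumption that leftover backup files in the config folder are harmless are all illustrative and not from the original post.

```python
from pathlib import Path
import shutil
import time

# Assumption: the Unraid flash device is mounted at /boot.
DEFAULT_CONFIG_DIR = Path("/boot/config")

# The two files named in the post above.
NETWORK_FILES = ("network.cfg", "network-rules.cfg")


def reset_network_config(config_dir=DEFAULT_CONFIG_DIR, backup_suffix=None):
    """Move the network config files aside so defaults apply after a reboot.

    Hypothetical helper: renames each file instead of deleting it.
    Returns the list of backup paths that were created.
    """
    config_dir = Path(config_dir)
    if backup_suffix is None:
        # Timestamped suffix so repeated runs do not overwrite old backups.
        backup_suffix = time.strftime(".bak-%Y%m%d-%H%M%S")

    moved = []
    for name in NETWORK_FILES:
        src = config_dir / name
        if src.is_file():
            dst = src.with_name(src.name + backup_suffix)
            shutil.move(str(src), str(dst))
            moved.append(dst)
    return moved
```

Calling `reset_network_config()` and then rebooting should have the same effect as deleting the two files by hand. Files that are not present are simply skipped, so it is safe to run when only one of them exists. As in the post, libvirt.img is left alone, since the advice is to test with just the network files first.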