jonp

July 30, 2020

58 minutes ago, pelux said:

I've done this and can now access the GUI but cannot start the array due to invalid key. I'm assuming the next step is to replace the key with a new one bound to the new flash GUID but this note has me a bit worried:

Note: Replacing a Registration key results in permanently blacklisting the previous USB Flash GUID.

Does this mean that if things don't work out with this new flash drive, I won't have the option to fall back on my current, semi working, drive. Even by replacing the key again?

That's correct. Once a flash device is blacklisted, it can't be used with Unraid anymore. There is definitely something amiss with the config/plugins on your old flash drive. You can try going back to that device and redoing it the same way I had you do the new device and see if that works.

July 28, 2020

Ok, redo your new flash drive so it's fresh and clean (no old config folder copied in there yet). Next, copy over the following files from your old config folder to the new one:

super.dat

disk.cfg

ident.cfg

passwd

Plus.key

smbpasswd

You should also copy over the "shares" subfolder, but do not copy over any of your plugins. Then start up the flash and see if that works.
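If you'd rather script that copy from another computer, here is a minimal Python sketch of the step above. The function name and any paths you pass in are hypothetical; the file list is taken straight from this post, and Plus.key is simply the key file in this thread, so yours may be named differently.

```python
import shutil
from pathlib import Path

# Files named in the post above; Plus.key is this user's key file.
CONFIG_FILES = ["super.dat", "disk.cfg", "ident.cfg", "passwd", "Plus.key", "smbpasswd"]


def copy_minimal_config(old_config, new_config):
    """Copy only the listed files plus the "shares" subfolder.

    Everything else in the old config folder (plugins included) is skipped.
    Returns the names that were actually copied.
    """
    old_config, new_config = Path(old_config), Path(new_config)
    new_config.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in CONFIG_FILES:
        src = old_config / name
        if src.is_file():  # tolerate files that are missing on the old flash
            shutil.copy2(src, new_config / name)
            copied.append(name)
    shares = old_config / "shares"
    if shares.is_dir():
        # dirs_exist_ok needs Python 3.8 or newer
        shutil.copytree(shares, new_config / "shares", dirs_exist_ok=True)
        copied.append("shares/")
    return copied
```

You would call it with the config folder of the old flash (or a backup of it) as the first argument and the config folder on the freshly prepared flash as the second. Check the returned list to confirm nothing you expected is missing.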

July 28, 2020

On 7/27/2020 at 6:23 AM, [email protected] said:

Good Day All

I am new to the Unraid world, though I know my way around NAS and servers, but I keep getting freeze-ups when doing random tasks on my server; some days I let it run and do nothing, but as soon as I copy data over to a share or run a VM like MacinaBox, the whole system freezes up.

The web UI does not respond, network drives go offline, and the server freezes. I have also run it in GUI mode on the server and it does the same, no matter which mode I launch it in.

Please help

Attachment: hydrogen-diagnostics-20200727-1256.zip

Okay, update on 28 July 2020:

I ran a memory test and everything passed.

Also, I get this error on boot-up:

Attachment: hydrogen-diagnostics-20200728-2024.zip

Hi there,

What I think you'll need to do is either set up a syslog server or, with a monitor and keyboard connected, log in to the console and type the following command right after it boots:

tail /var/log/syslog -f

This will begin printing the log out to the screen. When the crash occurs, capture what you can off the monitor and post it back. I'd also check for any BIOS updates to see if something may be amiss in that configuration.
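For anyone curious what "following" the log actually does, here is a rough Python sketch of the same idea. It is only an illustration, not a replacement for the command above: the function name and its parameters are hypothetical, and it assumes Python 3 is available wherever you run it, which may not be the case on a stock Unraid install.

```python
import time


def follow(path, poll_seconds=0.5, max_idle_polls=None, from_end=True):
    """Yield lines appended to `path`, roughly like `tail -f`.

    max_idle_polls and from_end are hypothetical knobs for this sketch:
    max_idle_polls lets the loop stop after that many empty reads
    (None means follow forever); from_end=False reads the file from the top.
    """
    idle = 0
    with open(path, "r", errors="replace") as log:
        if from_end:
            log.seek(0, 2)  # jump to the end so only new lines are shown
        while max_idle_polls is None or idle < max_idle_polls:
            line = log.readline()
            if line:
                idle = 0
                yield line.rstrip("\n")
            else:
                idle += 1
                time.sleep(poll_seconds)  # nothing new yet; wait and retry
```

Looping over follow("/var/log/syslog") and printing each line would keep the most recent entries on screen until the crash. Note that a line still being written when it is read may show up in two pieces.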

July 28, 2020

Hi there,

One major question before we can start diagnosing the issue is in regards to this:

On 7/26/2020 at 5:25 PM, Derrikdj said:

but in the last few days the network connection to my server suddenly disconnects

What happened in the last few days? I'm asking because it's very unusual for a working system to all of a sudden start exhibiting this kind of behavior without any changes to the software or hardware. Did you recently update Unraid and this started occurring after that? What about updates to your router or switch? Any hardware changes on the server or network recently? Also, from looking at the logs, it appears the behavior doesn't start until a little over 7 hours after you've booted it up. Does that sound right?

Another thing you can try is to disable the use of eth0 and eth1 in the bonding group. Stop the array and navigate to the Network Settings page and try taking those unused devices out of the bond configuration.

July 24, 2020

Ok, I took another look at your system diagnostics. First let's try this. If you have another USB flash lying around, please download and install the latest Unraid release onto it (not NVIDIA build) and try and boot your server using that (remove your primary USB and set aside during this test). The sole purpose of this is to determine if a stock configuration of Unraid loads correctly or not. You have a lot of plugins installed that could be causing issues, so this is a way to verify whether or not that is the case here. Alternatively you can try booting Unraid in safe mode.

July 22, 2020

On 7/13/2020 at 4:04 PM, ThePockets said:
@jonp Sorry for the late update, thank you all for the suggestions! Using OVMF + Q35 did solve the problem of it crashing the entire server. If I continue using VNC for graphics it looks like I can pass anything else through fine. I passed a USB controller and onboard audio through and both worked great. If I try to pass through my GPU though, all I get is a black screen. On the VM logs I get this:
2020-07-09 02:31:15.643+0000: Domain id=1 is tainted: high-privileges
2020-07-09 02:31:15.643+0000: Domain id=1 is tainted: custom-argv  				<---- THIS LINE IS MARKED RED
2020-07-09 02:31:15.643+0000: Domain id=1 is tainted: host-cpu
char device redirected to /dev/pts/0 (label charserial0)
2020-07-09T02:31:17.580603Z qemu-system-x86_64: -device vfio-pci,host=0000:08:00.0,id=hostdev0,bus=pci.2,addr=0x0,romfile=/mnt/cache/domains/vbios/TU116_edited.rom: Failed to mmap 0000:08:00.0 BAR 3. Performance may be slow		<----- THIS LINE IS MARKED YELLOW
2020-07-09T02:48:15.583801Z qemu-system-x86_64: terminating on signal 15 from pid 6361 (/usr/sbin/libvirtd)
I dumped the GPU's vBIOS using GPU-Z on another computer, and I tried using an unedited vBIOS as well as the changes that SpaceInvader One explained in his NVIDIA GPU passthrough video. Neither version seems to work. I have been trying to troubleshoot these problems (hence why I haven't replied in a while), but I haven't found anything that has helped. Thanks again for your help!

Ok, so is it fair to say that in your current state, GPU passthrough with OVMF + i440fx "works" but crashes the server, while OVMF + Q35 "works" for everything BUT GPU passthrough? What about Q35 + SeaBIOS? Did you try that, and did GPU passthrough work there?

July 22, 2020

Hi again,

I'm a little confused on one of your updates from earlier. You say you created a new docker image called docker2.img and then you say this:

On 7/21/2020 at 3:01 PM, JPDom1 said:

I have access to my dockers again, but when trying to load a docker from the template I get a failure every time (see photo below).

But then later you say:

On 7/21/2020 at 3:16 PM, JPDom1 said:

I got pi**ed off and walked away....Came back and

Was happy to see my containers and VM again.

So I'm confused. Did you redownload all these into the new docker2.img or is this once again trying to use the original docker.img? If the new docker2.img, how did you get past the error you previously reported?

Another thing to try: disable the docker service from the Docker Settings page and see if the Community Apps plugin works again. Maybe one of your containers is causing a weird network conflict? If so, you should try turning your containers on one at a time until you can recreate the problem.

The key for us to help solve this is to figure out what has tripped the system into this state. None of this is normal behavior and I am desperate to try and find a way to recreate your issue on my end so we can debug.

July 22, 2020

Hi there,

The challenge might be the graphics card you are using. But if you had this working previously on your homebrew setup, it should work on Unraid. Can you try changing the machine type of the VM to i440fx? If that doesn't work, try changing the BIOS type to SeaBIOS and see if that works.

July 19, 2020

Hi there,

If this issue has been occurring since you originally went to configure Unraid, then it is likely there is something amiss with the hardware or BIOS that is causing these issues. If you had a working setup originally and then one day this started happening, we have to figure out what changed that caused this to start happening.

Perhaps try formatting the flash to default settings and see if you can get Unraid to boot there. If so, then you know something is amiss in the configuration of your previous flash.

July 16, 2020

Under SMB settings there is an "Enhanced macOS" setting you can try turning on. I think that will mainly affect the speed at which files are loaded in "Finder", but give it a try!

July 16, 2020

While I can't really comment on the Community Apps issue ( @Squid is the expert there), I can tell you that the warning on the Docker settings page is valid and you need to perform the action requested (delete your Docker.img file and recreate it).

July 16, 2020

Hmm, I'm just not seeing anything jumping out at us as to what the issue could be. First and foremost, I would disable the use of all plugins and try booting. If that doesn't work, let's try a different flash drive.

July 16, 2020

On 7/11/2020 at 11:19 PM, siva_haran said:

...I wish it would load folders and files at a faster speed for me to browse... or is this a pipe dream? Should I look for other solutions?

Try this: navigate to the Settings > SMB page and turn on "Enhanced macOS interoperability" and see if that improves performance for you.

July 13, 2020

This is definitely a hardware-specific issue. I don't know why your particular gear is showing these kinds of problems, but I would expect to be hearing from a lot more people if this was a more generic issue. The best we can do with this is try and report the issue upstream to the QEMU/KVM developers in hopes that they know what is going on.

July 9, 2020

Ok, to be fair, Hyper-V and KVM are not anywhere close on the spectrum of hypervisors and if other underlying gear changed (including the HBA and storage), that obviously could have an impact. What about BIOS updates? Any available? Another thing you could try would be to disable IOMMU in the BIOS to see if that has any impact.

July 9, 2020

Wow, that's pretty concerning. If there is no hardware pass-through happening and you're getting these kinds of crashes, it leads me to suspect a buggy BIOS on your hardware. What is the underlying hardware on this system?

July 9, 2020

Ok, what happens if you point the VM's storage at something other than that PCIe NVMe Unassigned Device? Again, the goal here is to narrow down the root cause or what combination is causing it.

Another thing you could try would be changing the Machine Type or the BIOS type to see if that has an effect.

July 9, 2020

Hi there,

Are you trying to pass through the NVMe drive to the VM directly? If so, try not doing that and see if you can still reproduce the lockup. If you can, then the issue stems from the underlying hardware/VM configuration. If the issue goes away, then you know it's isolated to that PCIe device.

July 8, 2020

A few things I'd like you to try:

1) Check for a BIOS update.

2) Create a new VM using SeaBIOS instead of OVMF.

3) Create a new VM using Q35 instead of i440fx.

Try the combinations of SeaBIOS + Q35, SeaBIOS + i440fx, and OVMF + Q35 and see if any of them has an impact on the crashing. If not, this may be a hardware / BIOS issue that we can't resolve from the software side. I know that AMD offers a good price-for-performance product, but unfortunately their testing in the VM department leaves a lot to be desired. There is only so much we can do from the software side to quirk the kernel/QEMU. Your last resort would be to contact the motherboard manufacturer and describe the problems you are having to see if they have a beta BIOS that may resolve it for you.
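To keep track of which pairings are left, the test matrix can be sketched in a few lines of Python. This is only a bookkeeping aid; the names are hypothetical, and it assumes the setup that is currently crashing is OVMF + i440fx, which is what the three suggested combinations imply.

```python
from itertools import product

BIOS_TYPES = ["OVMF", "SeaBIOS"]
MACHINE_TYPES = ["i440fx", "Q35"]


def untested_combos(already_tried):
    """Return every (BIOS, machine type) pair that has not been tried yet."""
    tried = set(already_tried)
    return [combo for combo in product(BIOS_TYPES, MACHINE_TYPES) if combo not in tried]


# Assumption: the crashing VM uses OVMF + i440fx, so that pair counts as tried.
for bios, machine in untested_combos([("OVMF", "i440fx")]):
    print(f"Create a new VM with {bios} + {machine} and note whether it crashes")
```

Changing one variable per test VM makes it easier to tell whether the machine type or the BIOS type is the one that matters.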

July 1, 2020

2 hours ago, bamhm182 said:

Would you be open to it being made with Python? Seems to me that python would be a good choice since it is easily extensible, cross platform, and easier to maintain. Last I looked into it, you could easily build for Linux, Windows, and OS X. The only stipulation is that the OS X executable needs to be made on OS X.

We're not opposed to it. The big thing is the user can't have to download and install any additional "components" in order to make it work. So it has to be a self-contained executable on all platforms.

July 1, 2020

On 6/29/2020 at 6:25 PM, Fizzyade said:

Without reinventing the wheel, why not just supply a compatible img and then tell people to use balena etcher, which is pretty much the go-to image writer? You just have to supply an img in a suitable format.

There are additional things that our creator does that a simple IMG file does not. Our tool validates the USB flash GUID as usable (most of the time ;-), it allows the user to toggle EFI boot mode, as well as customize hostname and networking options. It even lets the user select which release to install (from a backup, the current available release, or from our Next branch). These are features that are important to ease of use for new users and while we can appreciate that not everyone needs this, those that do really appreciate it.

13 hours ago, jammin said:

Is it out of scope to discuss a different method of license enforcement than the USB key serial number? It's a scary single point of failure and kinda restricting to require a USB stick at all.

If it has to be hardware, maybe the check could be when starting the array instead of on boot, and base it on one or more array member serial numbers? Or the MAC address of the NIC?

It is out of scope for the purpose of this RFQ, but know that we are investigating other licensing methods for future inclusion. Changing licensing is always a real iceberg of a problem. Seems small and simple from above the water line, but below it is a gigantic thing just waiting to sink your ship ;-). That is going to have to be another battle for another day.

11 hours ago, Fizzyade said:

Maybe I'm massively missing something, but I've hated the fact that I've had to use the Unraid tool and have longed to have an image that I could just flash with my favourite tool.

We definitely appreciate what certain users want, but we have to address the wider market of users that aren't as savvy. I agree that if you're savvy enough to build a computer, you're probably savvy enough to figure out how to image a USB flash using some generic tool, but we're not just targeting that kind of customer, and perhaps longer term users won't be building their own servers at all. The point is, our flash creator tool should work fine, but it's been a bit more of a bear to maintain than we'd like, so we're looking for offers from developers that want to earn a little extra cash to help build this thing. And yeah, we probably will have to fix it again after the next Mac release comes out, but that's fine and something we're also willing to accept.

June 29, 2020

Hi Unraid Community!

We have a special request for anyone who is familiar with the work required and wants to make a little $cash!

A few years back we released our own USB flash creator tool for Unraid OS. For those of you who remember, installation of Unraid used to require a manual process (documented here), but we wanted this new tool to be a far easier way to get up and running.

Here we are a few years later and the tool desperately needs an update, especially the macOS version. Our problem is that the development team is heads down focused right now on getting 6.9 and 6.10 out the door. As such, we wanted to throw out a request to our Community to see if anyone has the tools and talent to help us with this.

This is a formal RFQ (Request for Quote) to correct issues in the current USB flash creator for Unraid OS, for both Windows and Mac platforms. We're not necessarily looking for any increased functionality at this time, though creative ideas on how to make it better will be considered.

To respond to this RFQ, please email [email protected] with your bid and time estimate for the work. We will update this post once a bid has been accepted. If you have questions regarding the RFQ, please post them here so our responses can be made in the post publicly for all to see. Thanks everyone!!

All the best,

Team Lime Tech

June 26, 2020

Hi there,

Saw your email into support and wanted to chime in on your thread here. Unfortunately johnnie.black is right in that you're going to need to take the "one at a time" approach to figure out the root cause. The main problem here is that there wasn't some "event" that occurred prior to these issues that we can point to. Everything was fine until it wasn't. When issues like that happen, 99 times out of 100 it's because of something amiss with the hardware or a plugin/container update that broke something. Do you have your containers set to auto-update or do you manually update them?

You can absolutely check out HTOP through a command line (just type htop from a terminal session) and see a more detailed process reporting, but even then, you will likely still have to resort to shutting down all your containers, letting the system run for a while to see if the CPU usage spikes just randomly and if not, start slowly turning on containers one by one until you find the culprit. I wish I had better advice for you, but again, when the issues just come out of nowhere like this and there wasn't some event that occurred right before the issues manifested, there is just no other way to narrow it down.

June 25, 2020

What would be helpful is to know how long the server was operational before these issues came up and if anything changed in the week or so prior to the issues occurring. If nothing changed and the server all of a sudden started having this behavior, it really has to be something amiss with the hardware. Maybe dust buildup shorted something out or is causing heat or other issues. The PSU might be failing (a failing PSU could cause issues if it can't supply enough power or supplies too much power under certain situations). You can try running a memtest, though that is only one possible culprit.

I wish I had more advice for you in this scenario, but unless you can point to a trigger that is causing the issue, it's really like looking for a specific needle in a pile full of needles ;-).

June 24, 2020

Hi there,

Saw your email into support and have read through this thread. You're definitely getting some good advice from folks in here. Johnnie's last post is especially important, although you can also use the iDRAC connection to print the log to the screen in real time with this command:

tail /var/log/syslog -f

Then when the server hard crashes, you can take a screenshot of what you see and post it back here. The trouble with crashes is that often times the crash occurs before the events can be written to the log file, so having it printed to the screen can be just fast enough to capture the crash event in the log.

It would be helpful to know when these issues started occurring. If your hardware has been fine for months and months and all of a sudden these issues come out of nowhere, it is likely the result of a problem in the hardware. It might need better cooling, cabling, etc., and keep in mind that the temperature warnings in Unraid are generic. You have to look up the actual hardware you're using to find the temp ratings and then manually set those temps in the disk settings page for those devices. Otherwise you may be giving yourself a false impression that heat isn't the issue when perhaps the drives are being forced to operate at a much higher temp than they should be.

Lastly, if you are banging your head against the wall on this, the best thing to do is to try reducing the number of components in your system setup (apps, containers, VMs, cache drive) until you isolate the issue. Try running the array without the cache assigned at all. Turn off all docker containers. Let it just run for a while and see if it crashes. Then slowly start turning services on one by one until you recreate the issue. Once you've found the key variable that causes the crash, we can try to reproduce it to determine whether it's a bug, or give you advice on how to solve it within the software via a configuration tweak.
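The one-at-a-time approach is really just a small loop, sketched here in Python to make the logic explicit. Everything in it is hypothetical: crashes_with stands in for "run the server with these services enabled for a while and report whether it crashed", which in practice is you watching the box, not code.

```python
BASELINE = "baseline"  # marker: the problem shows up even with everything off


def find_culprit(services, crashes_with):
    """Enable services one at a time and return the first that reproduces the crash.

    crashes_with(enabled) is a hypothetical callback that returns True if the
    server crashed while running with exactly the services in `enabled`.
    Returns BASELINE if it crashes with nothing enabled, or None if the
    crash never reproduces.
    """
    enabled = []
    if crashes_with(list(enabled)):
        return BASELINE  # nothing was enabled, so look at hardware/BIOS instead
    for service in services:
        enabled.append(service)
        if crashes_with(list(enabled)):
            return service  # the crash came back right after adding this one
    return None
```

Services stay enabled as the loop advances, so the one it returns is the one whose addition brought the crash back. The real cause could still be an interaction with something enabled earlier, which is worth keeping in mind when reporting what you find.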


Posts posted by jonp

Can't access web GUI after unclean shutdown

Can't access web GUI after unclean shutdown

Freezing Issues

Network Connection Keeps Disconnecting

Docker containers not appearing in Dashboard or Docker tabs & Community apps also broken.

Windows 10 VM crashes entire server if PCIe device is passed through

Docker containers not appearing in Dashboard or Docker tabs & Community apps also broken.

Intel NUC8i7HNB GPU pass through

[Solved] Stuck on Kernel Panic after reboot

Inconstant write speeds.

Docker containers not appearing in Dashboard or Docker tabs & Community apps also broken.

Can't access web GUI after unclean shutdown

Slow File Read from Unraid to my Mac - Please help!

System lockups

System lockups

System lockups

System lockups

System lockups

Windows 10 VM crashes entire server if PCIe device is passed through

RFQ: USB Flash Creator Rework

RFQ: USB Flash Creator Rework

RFQ: USB Flash Creator Rework

Unraid 6.8.3 Random bouts of 100% CPU useage. Making server useless.

Trouble in Paradise

Experiencing unresponsive array among other issues