unRAID Server Release 6.2.0-beta19 Available



So I have had yet another system hang.

 

I am now posting my entire system diagnostics, as I was not able to get them before the system crashed. I have attached my syslog, which was being tailed.
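For anyone else trying to catch one of these hangs: since the in-RAM syslog is lost on a hard reset, a rough sketch of keeping a copy on the flash drive (assuming the stock /var/log/syslog and /boot paths) is something like:

```
# Copy the live syslog to the flash every minute so the last entries
# survive a hard lock-up (the copy in RAM is lost when the box resets).
mkdir -p /boot/logs
while true; do
    cp /var/log/syslog /boot/logs/syslog-live.txt
    sleep 60
done
```

Run it in the background (or from the go file) and pull syslog-live.txt off the flash after the next crash.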

 

The system crashed at around 12pm, it seems, so yet again there is not a single thing in the syslog to tell me what went wrong. I'm losing faith in this now, as I still have yet to get more than 4 days of uptime.

 

I have posts about this issue going back 4 months now, on every single version. Multiple diagnostics, multiple syslogs. I was told this should be fixed by switching my VMs to OVMF, which is the only one I had started this time, and it is still the same thing.

 

I know it's a bank holiday, but I seriously need some help on this. I have done everything I have been recommended and asked to do, and none of it has helped.

Sorry bigjme. Your issue is difficult to solve because we can't recreate it. We are planning another release shortly that may help with some of this.

 

I am more than happy to run anything at all on this server if it gives you a better idea of the issue; even if I have to give you remote access to the server, I honestly don't mind.

 

Any idea when this release could be out? This issue has been with me since 6.1.3 last year, so I'm slightly anxious to get it resolved.

 

As I say, it is Easter weekend, so thank you for responding so quickly, jonp.

 

Edit:

So just an update: the crashes seem to be coming more regularly now. It has been less than 3 hours and I have just had another occur.

 

To give you some insight into what the VM was doing on the last crash:

 

I am running Windows 10 with all power saving turned off (all power saving is turned off in my BIOS too).

I was running Blue Iris recording one IP camera to a network share, so it is constantly using CPU, memory, the network, and the network shares.

I was playing a movie via Amazon Prime but had left it paused full-screen on my display.

 

The recorder had been running since my last post, so about 3 hours. The video had been paused for around 10 minutes on my screen. I came back to my system with no access: the video was still paused on the screen, but I was unable to do anything.

No SSH, Telnet, or anything like that. The log I had left running on another screen did not pick up anything at all in the last 30 minutes, so there is literally nothing to go on as to what causes this issue.

 

The unRAID USB is still powered but with no usage light flashing. I have had trouble getting my 750 Ti to be picked up by any OVMF VMs (someone found an unrecognized ROM error in my logs), so I am more than happy to remove that GPU to see if it is causing any low-level system issues (it is not being used currently).
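On the unrecognized ROM point, the workaround usually suggested is to dump the card's video BIOS on the host and hand the VM that file. This is only a sketch; the PCI address and output path below are examples (check yours with lspci -D), and dumping reportedly works best when the card is not the one unRAID itself booted on:

```
# Dump the 750 Ti's vBIOS so the VM can be pointed at a romfile.
# 0000:02:00.0 is a placeholder address - find the real one with lspci -D.
cd /sys/bus/pci/devices/0000:02:00.0
echo 1 > rom                              # make the ROM readable
cat rom > /mnt/user/isos/gtx750ti.rom
echo 0 > rom                              # lock it again
```

The VM's XML can then reference it with <rom file='/mnt/user/isos/gtx750ti.rom'/> inside the card's hostdev block.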

 

The only other thing I can think of is that there may be a fault with my unRAID USB stick? Just throwing a load of ideas out there for Limetech to look over (it may rule some things out, perhaps?).

 

How old is the HW you are running on?

Maybe it's a power problem?

A few years back, I had problems with my server. It was a Windows server. It worked fine, but suddenly I started getting very slow network transfer speeds.

Took a look at the motherboard, found some bad caps, replaced them, and all worked fine for some months again.

Then it would not boot. I tried re-installing the OS but did not manage to get that far; I don't remember if it bluescreened while loading files, or if I got far enough to select a disk for installation.

 

A friend also had a desktop that would not boot. It turned out that the graphics card did not work, but the built-in one worked just fine. He got into Windows only to find that the sound card did not work either, but the built-in one did.

Took a look at his motherboard, found one bad cap, replaced it, and both the graphics and sound cards worked fine again.

 

 

So the conclusion is that a bad cap might cause a lot of different problems. ;)


The hardware is about 4 months old, all top-of-the-line X99, 5960X, etc.

It was a full replacement of everything in the system. It doesn't happen when I have no VMs running.

(The hardware also has no issues with just Windows installed and running.)


I have some issues with the beta and a VM.

In 6.1.9 I had a Windows 2012 R2 VM (running in BIOS mode, since I could not manage to create partitions in UEFI mode); it was running on one disk image, placed on Disk 1.

 

[...]

 

When I tried to access the data drive in the VM now, the VM hung.

I could not shut it down, and "Force Stop" did not work either. I just got this error:

"Execution error ... Failed to terminate process 16228 with SIGKILL: Device or resource busy"

Is this a bug in the 6.2 beta, or is it because I have one VM running on two different disk images, placed on different disks?

I will try to re-install it, using just one disk image placed on the cache drive.
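(For anyone else who hits this: "Force Stop" is essentially a virsh destroy, so when it fails with "resource busy" something on the host usually still has the vdisk open. A quick way to check from the console; the domain name and vdisk path below are only examples:

```
virsh list --all                                 # what libvirt thinks is running
lsof /mnt/disk1/domains/Win2012/vdisk1.img       # what still holds the vdisk open
fuser -v /mnt/disk1/domains/Win2012/vdisk1.img   # same question, listing the PIDs
```

That at least shows whether it is the qemu process itself that is stuck or something else pinning the image.)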

Sounds exactly like THIS.

No solution or reason why, but it seems there are issues while running VMs with disks on the array.

 

More like a "bug" with vdisks placed on the array, that crashes VMs when the start writing something on it.

If you place your OS vdisk on the array, you probably wont get far after the login, if you even make it that far.

 

Cache-Only VMs do not show these issues. Workaraund would be to copy everything to the cache or run it outside of the array or a rollback.
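If you want to try the cache route without rebuilding the VM, it is roughly the following (the VM must be shut down first, and the share and file names here are just examples):

```
# Move the vdisk from the array to the cache, then repoint the VM at it.
mkdir -p /mnt/cache/domains/MyVM
rsync -avP /mnt/disk1/domains/MyVM/vdisk1.img /mnt/cache/domains/MyVM/
```

Then change the VM's disk location (the <source file='...'> line in its XML, or the vdisk path on the edit page) to the new /mnt/cache/... path before starting it.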


The hardware is about 4 months old, all top-of-the-line X99, 5960X, etc.

It was a full replacement of everything in the system. It doesn't happen when I have no VMs running.

(The hardware also has no issues with just Windows installed and running.)

Have you tried asking the motherboard manufacturer about this issue as well? It might be a bad implementation of virtualization and something they have to fix in a BIOS update.


The hardware is about 4 months old, all top-of-the-line X99, 5960X, etc.

It was a full replacement of everything in the system. It doesn't happen when I have no VMs running.

(The hardware also has no issues with just Windows installed and running.)

Have you tried asking the motherboard manufacturer about this issue as well? It might be a bad implementation of virtualization and something they have to fix in a BIOS update.

Actually I haven't, no. Would VMware work under Windows if it was a VM issue?


The hardware is about 4 months old, all top-of-the-line X99, 5960X, etc.

It was a full replacement of everything in the system. It doesn't happen when I have no VMs running.

(The hardware also has no issues with just Windows installed and running.)

Have you tried asking the motherboard manufacturer about this issue as well? It might be a bad implementation of virtualization and something they have to fix in a BIOS update.

Actually I haven't, no. Would VMware work under Windows if it was a VM issue?

I have no idea. It might be that VMware doesn't trigger the problem, but Linux and KVM do.

It might also be an idea to post in the vfio mailing list.

Have you tried running a Linux distro and KVM to see if the same thing happens or is it too much trouble to do it?


The hardware is about 4 months old, all top-of-the-line X99, 5960X, etc.

It was a full replacement of everything in the system. It doesn't happen when I have no VMs running.

(The hardware also has no issues with just Windows installed and running.)

Have you tried running each VM alone to see if it still crashes then? If you have time you can test that.


The hardware is about 4 months old, all top-of-the-line X99, 5960X, etc.

It was a full replacement of everything in the system. It doesn't happen when I have no VMs running.

(The hardware also has no issues with just Windows installed and running.)

Have you tried running each VM alone to see if it still crashes then? If you have time you can test that.

 

I have tried each VM by itself. It happens more with the SeaBIOS VM with the 750 Ti, but a lot less with my 780 OVMF VM running by itself.

 

The hardware is about 4 months old, all top-of-the-line X99, 5960X, etc.

It was a full replacement of everything in the system. It doesn't happen when I have no VMs running.

(The hardware also has no issues with just Windows installed and running.)

Have you tried asking the motherboard manufacturer about this issue as well? It might be a bad implementation of virtualization and something they have to fix in a BIOS update.

Actually I haven't, no. Would VMware work under Windows if it was a VM issue?

I have no idea. It might be that VMware doesn't trigger the problem, but Linux and KVM do.

It might also be an idea to post in the vfio mailing list.

Have you tried running a Linux distro and KVM to see if the same thing happens or is it too much trouble to do it?

 

It would be rather a lot of work to get that up and running just to test it specifically, I'm afraid.


A couple of long shots -

- remove any non-working hardware, like the 750ti

- remove the Marvell controller, you aren't using it, and Marvell stuff has been notorious for VM-related bugs

 

Suggestion -

- Turn spin-down off for Disk 2. I noticed that it is constantly spinning up and down. I think all industry people agree that could be slightly life-shortening; better to keep it spinning always. (Probably no connection with your other issues.)
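A quick way to confirm which drives, if any, actually hang off the Marvell controller before pulling it; nothing unRAID-specific here, just standard tools from the console:

```
# List the SATA controllers present (the Marvell one shows up by name).
lspci -nn | grep -i sata
# Each disk's /sys/block entry is a symlink through its PCI path, so the
# controller's PCI address appears in the target of the link.
ls -l /sys/block/sd*
```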


A couple of long shots -

- remove any non-working hardware, like the 750ti

- remove the Marvell controller, you aren't using it, and Marvell stuff has been notorious for VM-related bugs

 

Suggestion -

- Turn spin-down off for Disk 2. I noticed that it is constantly spinning up and down. I think all industry people agree that could be slightly life-shortening; better to keep it spinning always. (Probably no connection with your other issues.)

 

You mean disable the Marvell SATA controller? I don't think I am using that controller, so that is possible.

I will give those recommendations a try. Thank you.


Hello all,

 

I decided to give the unRAID 6.2 beta a run since I just built the server and have no data on it. Everything seems to work fine and dandy most of the time, and I am impressed with the new features.

 

However, one issue is making me tear my hair out. I have a Win 10 VM with a passed-through Nvidia graphics card. I can game on the VM with no issue; I also passed through a USB expansion card to the VM and have all peripherals connected to it, etc.

 

What happens is that in the VM I set Windows to turn off the screen after 15 minutes. Not sleep or hibernate, just turn off the screen. If I press a key on the keyboard or move my mouse, the monitor springs back to life, but no picture is shown. At that stage, if I go to the unRAID web UI from another PC and try to stop/force stop/restart the VM, the whole unRAID server crashes.

 

Should I just set the VM OS to never turn off the monitor, or is there something fishy going on?

 

Thank you.

 

Quick update: even without the screen turning off, the whole VM freezes after 30 minutes, and so does the unRAID server in general. I can still access the shares, though.


I have some issues with the beta and a VM.

In 6.1.9 I had a Windows 2012 R2 VM (running in BIOS mode, since I could not manage to create partitions in UEFI mode); it was running on one disk image, placed on Disk 1.

 

[...]

 

When I tried to access the data drive in the VM now, the VM hung.

I could not shut it down, and "Force Stop" did not work either. I just got this error:

"Execution error ... Failed to terminate process 16228 with SIGKILL: Device or resource busy"

Is this a bug in the 6.2 beta, or is it because I have one VM running on two different disk images, placed on different disks?

I will try to re-install it, using just one disk image placed on the cache drive.

Sounds exactly like THIS.

No solution or reason why, but it seems there are issues while running VMs with disks on the array.

 

More like a "bug" with vdisks placed on the array, that crashes VMs when the start writing something on it.

If you place your OS vdisk on the array, you probably wont get far after the login, if you even make it that far.

 

Cache-Only VMs do not show these issues. Workaraund would be to copy everything to the cache or run it outside of the array or a rollback.

 

I think so. I just finished re-installing the VM, and this time I made just one vdisk image and left it at "auto" for where to store it.

It worked fine copying OS images to it now; I copied 32GB without problems. :)

But I experienced something weird on the first logon from an RDP connection instead of VNC: it just crashed with a bluescreen (I saw that in the VNC window), but after it rebooted, it worked fine.


A couple of long shots -

- remove any non-working hardware, like the 750ti

- remove the Marvell controller, you aren't using it, and Marvell stuff has been notorious for VM-related bugs

 

Suggestion -

- Turn spin-down off for Disk 2. I noticed that it is constantly spinning up and down. I think all industry people agree that could be slightly life-shortening; better to keep it spinning always. (Probably no connection with your other issues.)

 

OK, so I have turned off the Marvell controller and it is no longer shown under device manager. I have removed the 750 Ti and disabled spin-down (Disk 2 is the drive that data is currently being written to).

I also had a new BIOS update for support of new Intel CPUs (new Xeons), but you never know what comes down with these updates.

Time to wait a while now and see how the system holds up.


So I tried to do a downgrade using just the bzimage and bzroot files for 6.1.0, but that caused more problems than it solved. Docker would not work at all, and VMs would not function. The array is available, though, so I can access my data. Is there a better way to do a downgrade? I'm also troubleshooting issues with a new StarTech PEXSAT34RH SATA card on beta 19. It seems that my disks are recognized, but after some time they stop showing SMART stats and temps. When I take the array offline, the drives are no longer seen by unRAID. I'm hoping this is just a problem with the beta and not an overall incompatibility between unRAID and the card.


So I tried to do a downgrade using just the bzimage and bzroot files for 6.1.0, but that caused more problems than it solved. Docker would not work at all, and VMs would not function. The array is available, though, so I can access my data. Is there a better way to do a downgrade? I'm also troubleshooting issues with a new StarTech PEXSAT34RH SATA card on beta 19. It seems that my disks are recognized, but after some time they stop showing SMART stats and temps. When I take the array offline, the drives are no longer seen by unRAID. I'm hoping this is just a problem with the beta and not an overall incompatibility between unRAID and the card.

 

If that "6.1.0" is not a typo, try downgrading to 6.1.9 instead.

 

That StarTech card looks like a four-port RAID card. Does it offer a JBOD mode?

 


I'm also troubleshooting issues with a new StarTech PEXSAT34RH SATA card on beta 19. It seems that my disks are recognized, but after some time they stop showing SMART stats and temps. When I take the array offline, the drives are no longer seen by unRAID. I'm hoping this is just a problem with the beta and not an overall incompatibility between unRAID and the card.

The chipset is a Marvell 88SE9230. Try checking for updated firmware for it. Be aware that Marvell chipsets on disk controllers have lately been known to be buggy, especially when VMs are involved.

Edit: to be fair though, many users *are* successfully using Marvell-based cards.


Sorry about the typo. I did mean 6.1.9. Is there a proper way to downgrade from 6.2B19? Also I have checked for a firmware update for the card but there doesn't appear to be one. I did read about the problem with Marvell chips though. If anything I'll just exchange this card and try to find another one that works.


Sorry about the typo. I did mean 6.1.9. Is there a proper way to downgrade from 6.2B19? Also I have checked for a firmware update for the card but there doesn't appear to be one. I did read about the problem with Marvell chips though. If anything I'll just exchange this card and try to find another one that works.

Copy bzimage and bzroot from the previous folder on your flash drive to the root of the flash, and delete bzroot-gui from the root of the flash. Ideally I guess you should restore the backup of the syslinux.cfg file, but that's not *strictly* required (you'll just wind up with an extra boot option which won't work).

Also, in my testing, because Docker in 6.2 updates a whack of stuff in the docker.img file, you'll probably notice that you have no Docker apps running once you downgrade. The easiest solution there is to delete your docker.img file, recreate it, and re-add your apps.
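In shell terms that is roughly the following; a sketch only, assuming the pre-upgrade files are sitting in /boot/previous (the folder the updater normally leaves behind) and the stock /boot layout is in use:

```
# Put the old kernel and root filesystem back and drop the 6.2-only GUI image.
cp /boot/previous/bzimage /boot/previous/bzroot /boot/
rm /boot/bzroot-gui
# Optional, if a syslinux.cfg backup was kept there as well:
cp /boot/previous/syslinux.cfg /boot/syslinux/syslinux.cfg
```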


Sorry about the typo. I did mean 6.1.9. Is there a proper way to downgrade from 6.2B19? Also I have checked for a firmware update for the card but there doesn't appear to be one. I did read about the problem with Marvell chips though. If anything I'll just exchange this card and try to find another one that works.

Copy bzimage and bzroot from the previous folder on your flash drive to the root of the flash, and delete bzroot-gui from the root of the flash. Ideally I guess you should restore the backup of the syslinux.cfg file, but that's not *strictly* required (you'll just wind up with an extra boot option which won't work).

Also, in my testing, because Docker in 6.2 updates a whack of stuff in the docker.img file, you'll probably notice that you have no Docker apps running once you downgrade. The easiest solution there is to delete your docker.img file, recreate it, and re-add your apps.

 

I did try this yesterday, actually. The results were not good. My array came up fine, but Docker was screwed up either way. The first thing I did was delete the image before spinning up any Dockers. I kept running into errors when I went to start the Dockers on a new image.


 

 

I'm also troubleshooting issues with a new StarTech PEXSAT34RH SATA card on beta 19. It seems that my disks are recognized, but after some time they stop showing SMART stats and temps. When I take the array offline, the drives are no longer seen by unRAID. I'm hoping this is just a problem with the beta and not an overall incompatibility between unRAID and the card.

The chipset is a Marvell 88SE9230. Try checking for updated firmware for it. Be aware that Marvell chipsets on disk controllers have lately been known to be buggy, especially when VMs are involved.

Edit: to be fair though, many users *are* successfully using Marvell-based cards.

 

I started getting occasional SATA link resets and "frozen" errors on a couple of drives on my 9230, which I hadn't seen since the 6.0 betas. It may have only started after I added a second parity drive (one of the drives getting the errors) to that controller. I updated the firmware, which I should have done a long time ago. No errors since.

This topic is now closed to further replies.