Cannot reboot VM without server freeze and hard reboot

December 23, 2017

Trying to get all these items into one place so hopefully someone can help me figure out what has been going on for the past several weeks, first with a Win10 VM, now with a fresh Win7 one. Whenever a reboot of the VM is required, I lose access to everything (GUI, shares, SSH). The only solution is a hard reboot. I just went through it again with FCP Troubleshooting. The FCP syslog and the last diagnostics made before the crash are attached. A photo of the monitor output is also attached. Let me know if I missed something, as I'm sure there will be another opportunity.

Along the way I have seen an error in the SuperMicro log, which only occurred this one time:

I also have a disk in the array showing 31 uncorrectable offline errors. This one was fine before all of these issues. unRAID always tells me it's fine, no red balls, so I have let it go, but there is also now an unassigned drive that will someday become a part of the array. But I need to resolve the crashes once and for all.

Thanks and happy holidays!

unraid-diagnostics-20171223-0929.zip

FCPsyslog_tail.txt

December 23, 2017

Community Expert

Disk1 needs to be replaced. It's not disabled yet because it has only failed on reads so far, but if another disk fails you're stuck with 2 bad disks:

Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033280
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033288
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033296
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033304
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033312
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033320
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033328
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033336
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033344
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033352
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033360
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033368
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033376
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033384
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033392
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033400
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033408
Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033416
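If you want to check a long syslog for this pattern yourself, here is a minimal Python sketch that counts the md read errors per disk and reports the first and last sector hit. It assumes the lines look exactly like the ones above; the function name `summarize_read_errors` and the regular expression are just for illustration and are not part of any unRAID tool.

```python
import re
from collections import defaultdict

# Matches lines like: "... kernel: md: disk1 read error, sector=6503033280"
# (pattern is an assumption based on the log lines quoted in this post)
READ_ERROR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")


def summarize_read_errors(lines):
    """Return {disk: (count, first_sector, last_sector)} for md read errors."""
    sectors = defaultdict(list)
    for line in lines:
        match = READ_ERROR.search(line)
        if match:
            sectors[match.group(1)].append(int(match.group(2)))
    return {disk: (len(s), min(s), max(s)) for disk, s in sectors.items()}


# A few of the lines from the syslog excerpt above
sample = [
    "Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033280",
    "Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033288",
    "Dec 22 06:28:57 UNRAID kernel: md: disk1 read error, sector=6503033416",
]
print(summarize_read_errors(sample))
# prints {'disk1': (3, 6503033280, 6503033416)}
```

To run it against a real log you could pass an open file instead of `sample`, since the function only iterates over lines. A tight cluster of consecutive sectors on one disk, like the one here, suggests a single bad region on that disk.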

The screenshot points to hardware issues with your SSD; check the cables.

December 23, 2017

Author

Ok I will look into those. For my education why would it only be an issue on vm reboots? VMs run fine otherwise, and all Dockers are on SSD too.

December 23, 2017

Community Expert

3 minutes ago, btrcp2000 said:

For my education why would it only be an issue on vm reboots?

It is strange, and with just the screenshot it's more of a guess and difficult to say for sure. Unfortunately there's nothing in the syslog about that. Are you passing any PCI devices to that VM?

December 24, 2017

Author

Yes, in fact that's the entire reason for the VM's existence. It has a USB card, two Hauppauge Colossus TV tuners, and a video card. All but the video card are stubbed.

December 29, 2017

Author

Removed the faulty 4TB hard drive and replaced it with a larger 8TB one which I had been meaning to add anyway. Also swapped out the SATA cables for the kind with clips. It took a few days to rebuild, then everything seemed stable and the VM was able to reboot successfully several times. Today I rebooted the VM and got this:

image.png.7788d1473e616a99b833892a493c4ed4.png

Now what?

December 29, 2017

Community Expert

Those are hardware errors, so bad device, cable, controller or power.

January 4, 2018

Author

I have swapped out cables for new ones with latching mechanisms and changed SATA ports on the motherboard and still see issues with VM shutdown. Those cache drives are both brand new, and the server seems to run fine outside of VM reboots. It's tough to still think it's a hardware issue, but I can't rule it out. I have now added a SAS8087 into the mix to use further different SCU ports, so we'll see how that goes.

I also saw today that there was apparently a known VM shutdown problem that was addressed with a kernel update in a recent beta release of 6.4 (see here). It was the first I had heard of it (not that I was paying close attention), so that will be the next thing I try if I continue to see issues. It sounds like a stable release is forthcoming, so I'm hoping things can stay stable long enough that I don't have to install a beta.

January 4, 2018

Community Expert

1 hour ago, btrcp2000 said:

I also saw today that there was apparently a known VM shutdown problem

There is one affecting some users on a couple of rc releases, but not if you're using v6.3.5.

In any case posting the full diagnostics instead of screenshots might help better see what the problem is.

January 8, 2018

Author

That's what I thought I did in the first post with the FCP, diag, and terminal screenshots but it was inconclusive? I will have reason to reboot it later this week, so is there a step I am missing to produce something helpful in addition to those 3 things?

Thanks!
