Updated to 6.2, server is now in a constant reboot loop


Herdo


I just updated to 6.2 stable, and upon restarting I noticed my server's web GUI didn't reload.  I went to the server and found it constantly rebooting itself.

 

It starts up fine, gets to the boot loader, counts down and auto selects the main unRAID option, then prints out:

 

"loading bzimage" then "loading bzroot"

 

but then it immediately reboots and starts the whole process over again.  I left it a while and it just keeps doing it.

 

I tried the safe mode option and the same thing happened.

 

 

I'm going to try and repair the USB device using chkdsk, but I'm wondering if it would be safe to try and replace bzroot.  Maybe it got corrupted somehow?
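One way to test the "bzroot got corrupted" theory before replacing anything is to compare checksums of the files on the flash drive against the same files from a freshly extracted 6.2 zip. The following is only a minimal sketch: the directory locations in the usage comment are hypothetical, and the helper names are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_boot_files(flash_dir, reference_dir, names=("bzimage", "bzroot")):
    """Map each name to True (identical), False (differs), or None (missing)."""
    results = {}
    for name in names:
        flash_file = Path(flash_dir) / name
        reference_file = Path(reference_dir) / name
        if not flash_file.is_file() or not reference_file.is_file():
            results[name] = None
        else:
            results[name] = sha256_of(flash_file) == sha256_of(reference_file)
    return results


# Usage (hypothetical paths: flash drive mounted as E:/, zip extracted to ./unraid-6.2):
# print(compare_boot_files("E:/", "./unraid-6.2"))
```

A `False` for either file would point at a bad copy on the flash drive; `True` for both suggests the files themselves are intact and the problem lies elsewhere.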

 

There is unfortunately no way for me to get a diagnostics file.

 

 

Any other ideas?

Link to comment

Thanks Squid.

 

I replaced those two files and I still have the same issue.

 

However there was one thing I didn't try, the "GUI" version.  Just tried selecting it from the boot menu and it works fine.

 

I can easily change the boot loader settings to automatically launch the GUI version, but I still am wondering why the non GUI version is failing.
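For reference, switching the boot loader default could look roughly like the sketch below. It assumes the boot menu lives in a syslinux-style config where the default entry is marked with a `menu default` line under its `label`; the label names and the sample config text are assumptions for illustration, not copied from a real install, so check the actual file (and keep a backup) before changing anything.

```python
# Assumed layout of the relevant part of the boot menu config (illustrative only).
SAMPLE_CONFIG = """\
label unRAID OS
  menu default
  kernel /bzimage
  append initrd=/bzroot
label unRAID OS GUI Mode
  kernel /bzimage
  append initrd=/bzroot,/bzroot-gui
"""


def set_default_label(config_text, label_name):
    """Return config_text with 'menu default' placed under label_name only."""
    output = []
    found = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.lower() == "menu default":
            continue  # drop the existing default marker
        output.append(line)
        parts = stripped.split(None, 1)
        if len(parts) == 2 and parts[0].lower() == "label" and parts[1] == label_name:
            output.append("  menu default")  # mark this entry as the default
            found = True
    if not found:
        raise ValueError(f"label not found: {label_name!r}")
    return "\n".join(output) + "\n"


# Usage: print(set_default_label(SAMPLE_CONFIG, "unRAID OS GUI Mode"))
```

The function only rewrites text; reading and writing the real config file on the flash drive is left out on purpose.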

Link to comment

REALLY strange =>  Am I correct in assuming that you can boot the GUI version and that it then works fine, i.e. you can access the web GUI from your normal clients, etc.?

 

Is there anything "strange" in your configuration?  [i.e. any unusual disk controller; non-standard disks; etc.]

 

... and are you seeing any error messages "flash by" on the console?  If so, can you tell whether they're from Linux (i.e. after it's been trying to boot for a while) or from your BIOS (which would likely be nearly instantaneous after the bzroot/bzimage loading completed)?

 

Did you disable all VMs and Dockers before doing the upgrade?

 

 

 

 

 

Link to comment

Yes, I can access everything fine from the web UI, all disks and shares are present, etc.  Nothing seems to be wrong with it now.

 

The only thing odd was that I got a Fix Common Problems warning telling me I shouldn't be placing my Dockers' "appdata" in a user share.  I set the default to /mnt/cache/appdata/ (which is where it was before, and where all my Dockers' appdata is currently stored) and the warning went away.

 

I must have read the 6.2 announcement post three times, but I did forget to disable the Dockers before updating.  I just checked them all and they all seem to be functioning correctly now though.

 

Maybe that was the problem?

 

I will run memtest overnight as well.

 

Thanks for the replies everyone!

Link to comment

If the Dockers are all functioning okay now, you might try a "normal" boot and see if it now works.  That could have been the issue, and now that you're running and have updated the Dockers it may not occur anymore.

 

That's what I was hoping for too, but I just tried it and it's still not working.  The GUI mode still works fine though.

Link to comment

The only thing odd was that I got a Fix Common Problems warning telling me I shouldn't be placing my Dockers' "appdata" in a user share.  I set the default to /mnt/cache/appdata/ (which is where it was before, and where all my Dockers' appdata is currently stored) and the warning went away.

With 6.2 final, I'll be removing that test shortly as it's no longer applicable.
Link to comment

Good to know, thank you.

 

 

As for the possible memory issues, I actually did have some initially.  Apparently the Dell H310 in a PCIe slot and memory sticks in slots A1 and B1 don't play well together.  I actually had to place a small piece of tape over some of the pins.

 

You can see more about this issue here:  http://yannickdekoeijer.blogspot.com/2012/04/modding-dell-perc-6-sas-raidcontroller.html

 

Not sure if it's related at all, but thought I'd mention it.

 

 

I've rebooted the server several times now in both the GUI and non-GUI versions.  Non-GUI version still does not work.  I get:

 

loading bzimage........... ok

 

loading bzroot.............

 

and then the screen goes black, but for a split second it flashes back to the boot menu and I believe I can see it say

 

loading bzimage........... ok

 

loading bzroot............. ok

 

 

Maybe it's getting hung up AFTER bzroot?

 

The GUI version continues to boot fine and I've watched closely and can't see any errors while it's loading.

Link to comment

I'm not sure how much this will help now, but I generated a diagnostics zip.

 

It's odd because the GUI version still has to run bzroot before it runs bzroot-gui, yet it works.

 

I noticed I am getting an ACPI error:

 

Sep 15 19:42:13 unRAID kernel: ACPI: Core revision 20150930
Sep 15 19:42:13 unRAID kernel: ACPI Error: [\_SB_.PCI0.XHC_.RHUB.HS11] Namespace lookup failure, AE_NOT_FOUND (20150930/dswload-210)
Sep 15 19:42:13 unRAID kernel: ACPI Exception: AE_NOT_FOUND, During name lookup/catalog (20150930/psobject-227)
Sep 15 19:42:13 unRAID kernel: ACPI Exception: AE_NOT_FOUND, (SSDT:xh_Zumba) while loading table (20150930/tbxfload-193)
Sep 15 19:42:13 unRAID kernel: ACPI Error: 1 table load failures, 7 successful (20150930/tbxfload-214)

unraid-diagnostics-20160915-1943.zip
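For anyone digging through a syslog for lines like the ones above, a small filter can pull out the kernel messages that mention an error or exception. This is just a rough sketch: the pattern is a simple heuristic, and the file name in the usage comment is hypothetical.

```python
import re

# Kernel lines that mention an error or exception, e.g. the ACPI messages above.
KERNEL_PROBLEM = re.compile(r"kernel: .*\b(error|exception)\b", re.IGNORECASE)


def problem_lines(lines):
    """Yield stripped syslog lines whose kernel message mentions an error or exception."""
    for line in lines:
        if KERNEL_PROBLEM.search(line):
            yield line.strip()


# Usage (hypothetical file name):
# with open("syslog.txt", encoding="utf-8", errors="replace") as handle:
#     for hit in problem_lines(handle):
#         print(hit)
```

A match only flags a line for a closer look; it says nothing about whether the message is actually related to the boot failure.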

Link to comment

I posted a note in the 6.2 announcement thread to be sure Tom is aware of this issue => hopefully he'll take a look at it and might have some insight about what might be going on.

 

It's interesting that nobody else has had this problem => hopefully there's something in the diagnostic log that will jump out at a knowledgeable Linux guy (e.g. Tom or JonP) as the reason for it.

 

Link to comment

Hey thanks garycase, that is a good idea and it's something I should have thought of.

Link to comment

The ACPI error is expected and can be ignored; it is not related to your issue.

 

Have you checked your flash device for corruption?

 

My experience with this kind of loading problem is that the flash has some file system corruption, and fixing it using Windows check disk usually works.  You may want to try that.

 

Link to comment

Thanks for the suggestion bonienl.  Chkdsk  was one of the first things I did.  I also tried reloading 6.2 and replacing the bzimage, bzroot, and bzroot-gui files.  Same issue though.

Link to comment

I noticed in your original post that you'd tried chkdsk ... and was going to suggest trying a different flash drive altogether, but when you indicated it works perfectly if you simply do a GUI boot I discounted the flash drive, since it is clearly loading just fine for the GUI boot.

 

I suppose trying a different flash drive would still be a good idea => you could do it without a license just to see if it gets to the Web GUI okay ... and if it does you may want to transfer your license key to the new flash.  [You may want to see if Tom has any suggestions before doing this.]

 

 

 

Link to comment

Noticed another thread where the user also had an issue getting 6.2 to work (not the same issue you're having, but nevertheless upgrade-related).    Reverted to 6.1.9 and all worked well.  Tried upgrading again and it didn't work.  Loaded 6.2 on a different flash drive and it worked !!

 

So ... there may be something REALLY strange at play here that's related to your flash drive (hard to imagine what, since it works with the GUI boot => but I've definitely seen some very strange issues over the years).

 

... just to be sure, I'd definitely try installing 6.2 on another flash drive.  Just do a clean install; and confirm it boots okay.  If so, then you can add your key and move it to the new flash drive following the online key replacement process.

 

Link to comment

Thanks garycase.

 

OK, so I ran memtest for about 42 hours and 38 full passes with 0 errors.

 

I then installed unRAID 6.2 on a new flash drive, and I had the same issue.  Then I tried 6.1.9 on that new flash drive, and it booted just fine.

 

So apparently it isn't my flash drive.

 

 

Does anyone have any other ideas?

 

I'm not sure if it's getting hung up on "loading bzroot....", because the screen goes black but it flashes back for a split second and I think I can see it says "loading bzroot......  ok".

 

It may be directly after that, which I think (at least in 6.1.9) says:

 

"early console in decompress_kernel

 

Decompressing Linux..."

 

 

The 6.2 GUI version still works completely fine.

Link to comment

Did you copy the syslinux folder that comes with 6.2 to your flash device?  It might be worth redoing that part.

 

 

 

I used the web ui to update my unRAID install.

 

As for the new/test flash drive, I downloaded the 6.2 zip from the website, and yes, I copied everything over including the syslinux folder.

Link to comment

You have quite a mysterious case ::)

 

Have you tried the USB stick in a different USB port?  (There are known issues with USB3.)

 

 

Link to comment

Oh yea, I should have mentioned that.  I tried 2 USB 2.0 ports and a USB 3.0 port and it still wouldn't work.  I also tried unplugging my other USB devices (UPS and keyboard) and just having the boot flash drive plugged in, but got the same results.

Link to comment

I may have had a similar issue -

 

1) 6.1.9 ran fine.

2) Upgraded. 6.2 took forever to load bzroot and then started giving me "device descriptor read/64, error -71" errors for the usb drive and then hung.

3) Downgraded. 6.1.9 ran fine.

4) Used another usb stick. 6.2 gave me the same errors. 6.1.9 ran fine.

5) Finally (knowing that -71 errors could be indicative of usb hardware issues), I disabled the USB 2.0 controller and BINGO - it booted fine
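As a side note on the "-71" in those messages: the number is a negated Linux errno value, and 71 is EPROTO (protocol error). A tiny decoder like the sketch below can make such messages easier to read; the table only covers a few values (32, 71, 110) with short descriptions, and the function name is made up for illustration.

```python
import re

# A few Linux errno values that the USB core commonly reports (negated in log messages).
USB_ERRNO_NAMES = {
    32: "EPIPE (endpoint stalled)",
    71: "EPROTO (protocol error)",
    110: "ETIMEDOUT (timed out)",
}

ERROR_CODE = re.compile(r"error (-\d+)")


def decode_usb_error(message):
    """Return a readable name for the 'error -N' code in a kernel message, or None."""
    match = ERROR_CODE.search(message)
    if match is None:
        return None
    code = -int(match.group(1))  # "-71" becomes 71
    return USB_ERRNO_NAMES.get(code, f"errno {code}")


# Usage: decode_usb_error("device descriptor read/64, error -71")
```

Knowing the code only narrows things down to the USB link between controller and device; it does not say which of the two is at fault.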

 

I'll keep playing with it when I have more time but I'm fairly confident it's a hardware issue. No clue as to why 6.2 suddenly puts it in the spotlight. I also haven't tried the gui version.

 

As I prefaced, I can't tell if your issue is same as mine but try disabling the USB2 controller and see if it boots.

 

 

cheers

Link to comment

Great, I'll try this out.

 

Quick question though, what exactly does disabling the USB 2.0 controller do?  Will this limit the speed of the port?

Link to comment
