WebGui and Disk Inaccessible - Not Consistent - Unraid 6.1.3


titon

Hey Folks,

 

I need a little help.  I have gone through the forums, and can't seem to find out what is causing my issue. 

 

Background:

 

I have been running Unraid servers for years.  Earlier this year, I upgraded to Unraid 6 Beta to run xenserver, which has been working great.  I saw that the released version of Unraid had its own VM Manager, so I decided to upgrade to 6.1.3 a few weeks back.

 

During the upgrade, I removed all plugins.  The only feature I am using is VM Manager (no Docker).  My VM image resides on the cache drive.

 

Issue:

 

Ever since I upgraded, I have been noticing system stability issues. Out of the blue, the Unraid web GUI stops responding. At the same time, I am unable to access any of my shares. I am still able to telnet into the server, and my VM is still running.

 

If I telnet into the server, I can cd into, say, /mnt/disk1. If I run ls or dir, my telnet session just freezes. I can't cancel or interrupt the command, but I can open a new telnet session and start over.
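
One way to check whether a disk path responds without tying up the telnet session is to run the listing in a child process with a time limit. The following is a minimal Python sketch, not something from the thread: it assumes a Python 3 interpreter is available on the server (it may need to be installed separately), and `probe_command` and its 5-second default are hypothetical names and values chosen for illustration.

```python
import subprocess


def probe_command(argv, timeout_s=5.0):
    """Run argv and report 'ok', 'error', or 'hung' (hypothetical helper)."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        # Wait only up to timeout_s so the caller never blocks indefinitely.
        return_code = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort: a process stuck in uninterruptible I/O may ignore this.
        # We deliberately do not wait for it again, to avoid hanging ourselves.
        proc.kill()
        return "hung"
    return "ok" if return_code == 0 else "error"
```

Calling `probe_command(["ls", "/mnt/disk1"])` would return "hung" if the listing does not finish within the timeout. This only detects the hang; it does not fix it, and the stuck child process may remain until the server is restarted.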

 

If I run reboot or powerdown from the command line, the system looks like it is trying to reboot, but it never does. It takes a hard power-off to bring the server down.

 

When the system comes back online, the drives are fine, a parity check starts up again, and the web GUI is back online.

 

No hardware has changed.  The only thing that has changed is the unraid version.  I currently have (2) 5TB drives pre-clearing on another PC.  Before I add them, I hope to get this issue resolved.

 

Syslog -

 

Here are the last lines of my syslog. Everything was working fine; the only thing happening to the drives is that they are spinning down. After I noticed I couldn't access the web GUI or shares, I telnetted into the box. Is it possible that my drives are not spinning back up? Is there a command to spin the drives back up?

 

Nov 18 03:40:01 SERVER logger: mover finished

Nov 18 06:01:04 SERVER kernel: mdcmd (44): spindown 6

Nov 18 08:09:08 SERVER kernel: mdcmd (45): spindown 1

Nov 18 08:09:09 SERVER kernel: mdcmd (46): spindown 2

Nov 18 08:09:09 SERVER kernel: mdcmd (47): spindown 3

Nov 18 08:09:10 SERVER kernel: mdcmd (48): spindown 4

Nov 18 08:09:10 SERVER kernel: mdcmd (49): spindown 5

Nov 18 10:26:39 SERVER kernel: md: sync done. time=35561sec

Nov 18 10:26:39 SERVER kernel: md: recovery thread sync completion status: 0

Nov 18 10:41:40 SERVER kernel: mdcmd (50): spindown 0

Nov 18 12:48:28 SERVER kernel: mdcmd (51): spindown 1

Nov 18 12:48:33 SERVER kernel: mdcmd (52): spindown 3

Nov 18 15:02:08 SERVER kernel: mdcmd (53): spindown 5

Nov 18 15:33:38 SERVER kernel: mdcmd (54): spindown 5

Nov 18 15:36:20 SERVER kernel: mdcmd (55): spindown 0

Nov 18 15:36:20 SERVER kernel: mdcmd (56): spindown 6

Nov 18 15:39:10 SERVER kernel: mdcmd (57): spindown 1

Nov 18 15:57:48 SERVER kernel: mdcmd (58): spindown 3

Nov 18 16:13:11 SERVER kernel: mdcmd (59): spindown 2

Nov 18 16:15:31 SERVER kernel: mdcmd (60): spindown 4

Nov 18 16:44:01 SERVER kernel: mdcmd (61): spindown 2

Nov 18 16:54:31 SERVER kernel: mdcmd (62): spindown 4

Nov 18 17:28:50 SERVER kernel: mdcmd (63): spindown 3

Nov 18 17:31:14 SERVER kernel: mdcmd (64): spindown 2

Nov 18 17:53:34 SERVER kernel: mdcmd (65): spindown 3

Nov 18 18:19:32 SERVER kernel: mdcmd (66): spindown 3

Nov 18 18:19:33 SERVER kernel: mdcmd (67): spindown 4

Nov 18 18:24:06 SERVER kernel: mdcmd (68): spindown 2

Nov 18 18:50:42 SERVER kernel: mdcmd (69): spindown 2

Nov 18 18:50:47 SERVER kernel: mdcmd (70): spindown 3

Nov 19 00:48:53 SERVER in.telnetd[20597]: connect from 192.168.0.151 (192.168.0.151)
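
To see at a glance how the spin-down commands are distributed, the excerpt above can be tallied per disk. This is a minimal Python sketch, assuming the log lines keep the `mdcmd (N): spindown D` shape shown above; `count_spindowns` is a hypothetical helper, and the sample list holds only three lines copied from the excerpt.

```python
import re
from collections import Counter

# Matches kernel lines such as "mdcmd (69): spindown 2" and captures the disk number.
SPINDOWN = re.compile(r"mdcmd \(\d+\): spindown (\d+)")


def count_spindowns(lines):
    """Count spin-down commands per disk number (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = SPINDOWN.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Three lines copied from the excerpt; a full syslog could be read from a file instead.
sample = [
    "Nov 18 18:50:42 SERVER kernel: mdcmd (69): spindown 2",
    "Nov 18 18:50:47 SERVER kernel: mdcmd (70): spindown 3",
    "Nov 19 00:48:53 SERVER in.telnetd[20597]: connect from 192.168.0.151 (192.168.0.151)",
]

print(count_spindowns(sample))
```

A tally like this only summarizes what was logged. It cannot show by itself whether a disk failed to spin back up.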

 

Any help or suggestions would be greatly appreciated.

 

- Ton

 

Link to comment

It would be worth posting a diagnostics.zip file (from Tools->diagnostics or via a 'diagnostics' command from a console/telnet session) so that we can see the full details of your configuration.

 

I would recommend installing the powerdown plugin. The built-in powerdown will fail if for any reason the GUI becomes unresponsive. The enhanced version installed by the plugin can succeed in closing down tidily even when the GUI is unresponsive.

 

Just a check - you mention VM Manager.  I assume that you mean the version that is built into unRAID - not the version that used to be available as a plugin?  Just checking as I am not sure what would be the effect of having the old plugin version installed.

Link to comment

itimpi,

 

Thank you so much for the reply.  Let me see if I can help address each of your questions:

 

VM Manager - Yes, I used the version built into Unraid. I made a new VM from the tab in the web GUI.

 

Powerdown - I downloaded powerdown-x86_64.plg. Even when I run powerdown, it doesn't power down the box. If you can recommend a different version, please let me know.

 

Diagnostics - I am currently running the command. It's still not done yet. Once it's done, I will post it. Thanks again.

 

- Ton

Link to comment

OK, I let diagnostics run for about 20 minutes and still got no response. I ran powerdown, and it didn't shut down. I am going to hard reboot it.

 

I was reading another thread where a member was having problems with her Unraid server while running a VM. After she stopped the VM, the problem seems to have gone away.

 

I am going to leave my VM off and see if the issue gets resolved. It typically takes a day or two for the symptom to reappear. I will post back on how things go.

 

- Ton

 

Link to comment

Hey trul,

 

When I upgraded to 6.1.3 from beta, I did a fresh install. I copied off the backup files, then reformatted the drive and all that good stuff. I didn't bring over any plugins, just the bare essentials to maintain the disks and shares.

 

As for XEN, I just got rid of it.  I built a fresh VM, using VM Manager.  I was actually surprised how easy it was. 

 

Right now the system seems to be running fine. I am going to let it run over the weekend. I am still pre-clearing a 5TB HD, which will take a few days. Once that is done, I am going to upgrade the parity size.

 

I do appreciate everyone's input. Thank you very much.

 

-Ton

Link to comment

Ton, how long is Unraid up before it starts to lose the web interface? I am also experiencing freeze-ups. The VMs work fine, but all Docker containers stop and Unraid will not power off. I have the powerdown script installed and can still SSH into the box.

 

I am currently waiting on new drives to replace a failing drive so that could be my issue.

 

My unit stays up for about 4 days no problem and then BAM it stops allowing me to get to the web interface and user shares are down.

 

Link to comment

There are others experiencing similar issues... curious if this is the same.

 

Other thread with similar issues

 

- What combination of RFS/XFS/BTRFS are you running on your data disks?

- Do you run Sonarr? (if so... locally on unRAID or on another machine?)

- Are your drives plugged into your motherboard or do you have SATA/SAS card(s)? Which?

 

My symptoms are the same as you describe. If I revert to 5.0.5, the problem goes away completely.

 

Thanks

Link to comment

I am still running all my data disks in RFS. I have a 3-drive cache pool that seems to be running great. I do not run Sonarr at all.

 

I run all my data disks off an IBM M15** (can't remember the exact one, but it's the one everyone runs). That goes into a SAS expander and out to the drives.

I have one drive dying, so I am pre-clearing its replacement on another system. Not sure if that is my issue, but the dying drive is throwing up a lot of failures.

Link to comment
  • 1 month later...

Hey Guys..

 

Sorry for the slow reply. I didn't realize I wasn't getting notifications on my thread. This issue may have been solved, but I wanted to share some additional information.

 

For me.. the issue is directly tied to VM manager and AMD.

 

As I mentioned, when I went from unraid 5 to unraid 6, I completely built a new VM from scratch using VM Manager (very cool and easy process).

 

At that time, I was running Unraid on an AMD Kabini on the AM1 platform, because I wanted lower power consumption. When the VM was running, SMB and the web interface would just die after several hours. The VM would still be accessible via RDP, and SSH was still up on Unraid.

 

I then wanted to build a Daphile VM and use VT-d / IOMMU to connect to my USB DAC. I went ahead and swapped the motherboard for an AMD A10-5800K (Trinity) platform to get the VT-d feature. After swapping the motherboard, I just restarted my VM. The same thing happened: after several hours, SMB and the web interface would just die.

 

I gave up on the VM and just used plugins. Everything remained stable for months.

 

The Trinity was drawing too many watts for me, so I decided to move to an Intel L5630 (40 W). Guess what? I turned the same VM back on with the Intel platform, and there have been no more issues. The VM has been running strong for days, and everything is up and running.

 

Now time for me to read the changelog between 6.1.3 and the current version, and see if this issue was detected and addressed.

 

- Ton

Link to comment
