Unable to load web interface or Hit Shares [SOLVED]



As you said, nothing in the syslog.  Mover starts normally, transfers some files without issues, then nothing, no errors, and the drives spin down.

 

There was a case not too long ago, where the memory tested fine, on a long test with many passes, but someone (Tom I think) said that Memtest doesn't catch everything, so the user replaced their memory sticks - and had no more problems!  Obviously, this is a shot in the dark, an expensive one too.

 

htop says the CPU is stuck in the User Shares file system.  I'm not sure, but there seem to be quite a few threads working on the User Share file system!  More than I would expect, but I don't have your Dockers, so can't say if that's abnormal.

 

I am happy to spend money to resolve this. I still just find it too coincidental that I could have a hardware issue the day I upgrade to 6.X.  Also, hard to ignore other people having very similar issues:

https://lime-technology.com/forum/index.php?topic=42858.0

https://lime-technology.com/forum/index.php?topic=42900.0

 

Before dropping a lot of cash and rebuilding everything (which I am on the verge of doing), I am going to see if I can get my hands on some loaner hardware and move all of my drives and OS over.  Hopefully that will show whether hardware is indeed the issue.


 

I'm having similar issues, but only when I'm interacting with file shares on my Windows VM. Before doing all that, why not copy your config files back to v5 and see if it still works there?

 

I suppose I could do that. I am almost certain everything would work as normal since I've been running v5 for years with this hardware without issue.

 

Problem is I have converted to docker and don't feel like reverting back to plugins. Would rather try to fix this issue and move forward.


 

I've been looking at your syslog, ps report, and htop pic, and I've found some similar reports elsewhere.  Please check my summary here.  Sorry, no solutions yet, just data gathering.  But at least, it seems to point away from hardware issues.

 

Thank you sir for your continued support. I really appreciate it.

 

I don't think Plex is related to this issue. I don't run it on my unraid and plexpy is just a tool that monitors my Plex usage and doesn't interact with the array at all. I think the bug is simply in the mover/rsync. Hoping someone can help sort it out.


I've gone 7.5 days now with my mover turned off and everything has been running smoothly.  I decided to manually invoke the mover and after about 5 minutes my syslog was flooded with this: 

 

REISERFS error (device md2): vs-4010 is_reusable: block number is out of range 1867554259 (732566633)

 

As usual, I was unable to shut down cleanly.  Shares locked up and I couldn't hit the webui after I tried to kill some dockers.

 

I have run reiserfsck on md2 many times, and it found no errors.  The last time I had this in my error log, it was on md5.

 

Full diagnostics here, since it's too large to upload: https://www.dropbox.com/s/lpue7fhw6dpp43z/diagnostics.zip?dl=0

 

After a reboot, the move completes successfully.
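
For anyone trying to track which devices these errors show up on (md2 this time, md5 earlier), here is a minimal Python sketch that tallies REISERFS errors per device from syslog lines. The function name `count_reiserfs_errors` is hypothetical, and the pattern is based only on the single vs-4010 line quoted above, so other REISERFS message formats may not match.

```python
import re
from collections import Counter

# Pattern derived from the one error line quoted in this thread (an assumption:
# other REISERFS messages may be laid out differently).
REISERFS_ERROR = re.compile(
    r"REISERFS error \(device (?P<device>\w+)\): "
    r"(?P<code>[\w-]+) (?P<func>\w+): (?P<message>.*)"
)


def count_reiserfs_errors(lines):
    """Return a Counter of (device, error code) pairs found in the log lines."""
    counts = Counter()
    for line in lines:
        match = REISERFS_ERROR.search(line)
        if match:
            counts[(match["device"], match["code"])] += 1
    return counts
```

It accepts any iterable of lines, so an open syslog file can be passed in directly, for example `count_reiserfs_errors(open("/var/log/syslog", errors="replace"))` (the path is an assumption and may differ on your system).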


 

Personally, I would recommend converting all your data drives to XFS. In my experience it is more stable on unRAID v6 than RFS.

 

I appreciate the suggestion; however, I don't feel that is a suitable solution for my issue. Surely the majority of longtime unraid users still have their file system set to reiserfs. If Limetech made an official statement that XFS is the only supported fs, I would spend the time converting.

 

It seems like the easiest option is to stop using a cache drive for my shares but I'd really prefer not to do that.

 

Does anyone have any other suggestions?


Did you ever run beta 7 or 8 of unRAID 6?


 

Hard to say for certain but I don't believe so. I read in another thread about the reiserfs bug but never saw a way to verify/resolve if that is indeed the case.

Well I can tell you this: I recently reconfigured a test system to contain all reiserfs disks (array + single cache) and tested the mover with no issues.  I've seen a few people complain about mover issues and rfs, but not too many. If you had run the 7 or 8 beta, this may be an issue, but if not, it could also be bad hardware / memory.  Can you share your system diagnostics?  They can be downloaded from the Tools -> Diagnostics page in the webgui.

 


 

One observation I have is that RFS may sometimes hang when extended attributes are being written or accessed. This might come into play when the mover is used, since it copies the extended attributes when moving files.
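
To check whether the files waiting on the cache drive actually carry extended attributes, something like the following Python sketch could be used. It is only an illustration: `files_with_xattrs` is a hypothetical helper, it relies on `os.listxattr` (available on Linux only), and the `/mnt/cache` path in the usage note is an assumption about where the cache drive is mounted.

```python
import os


def files_with_xattrs(root, list_attrs=None):
    """Yield (path, attribute_names) for files under root that have xattrs.

    list_attrs defaults to os.listxattr, which exists only on Linux; a
    replacement callable can be passed in for testing.
    """
    if list_attrs is None:
        list_attrs = os.listxattr
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                attrs = list_attrs(path, follow_symlinks=False)
            except OSError:
                # Unreadable file, or a filesystem without xattr support.
                continue
            if attrs:
                yield path, attrs
```

Usage would look like `for path, attrs in files_with_xattrs("/mnt/cache"): print(path, attrs)`. If nothing is printed, the files being moved have no extended attributes, which would make this explanation less likely for that particular move.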

 

  • 2 weeks later...

I got my hands on some much beefier hardware and moved over all of my drives.  I'm hoping a better processor and more RAM will alleviate the mover script lockup, like it did for the other guy having the same issue.

 

After a few days of uptime my unraid locked up hard.  I couldn't log in via telnet or hit the web interface, so this is a new issue that I am not sure is related to my original mover issue.  Attached is a photo of the console.  Bad RAM on my "new" (5 year old) hardware maybe?  I did run memtest for a few hours before switching over...

 

[Attached image: lP58soh.jpg]


Boot in SAFE mode

 

Thanks, will do.  Then what?

See if it continues to happen

 

Well, it took about a week for this lockup to happen.  I'm not sure I can go that long without my dockers up and running :)  This box is loaded up with 16GB of memory, and I was running just fine on 4GB, hence my thought that it could be bad RAM, since the error message was about running out of RAM?


The reason I said SAFE mode was because nzbget was in the oom message. Are you running it as a plugin or a docker?

Something is causing you to run out of memory according to the screenshot. Since you have tried different hardware, and most others are not experiencing this issue with their software configuration, seems like it must be related to the plugins or dockers you are running.

 

To be clear, this particular memory loss issue crept up when I moved to different hardware (including different RAM).  My previous 20+ crashes were from the mover hanging, and one theory was not enough resources.  Moving to different hardware was my attempt to test whether hardware was related to the mover crashing.  After all previous crashes I could still get in via telnet, whereas this time it was locked up completely.


Maybe when you had less memory it crashed quicker, such as when mover rsync started needing memory. Now it takes longer, maybe something is leaking memory but it just takes longer to use it up. Different processes can get killed due to out-of-memory, maybe emhttp, maybe smb, maybe telnetd, so the symptoms could vary.

 

OOM killing emhttp used to be a problem on 32bit v5. There were some workarounds for it back then. I think with 64bit v6 and enough RAM it should be pretty hard to use up memory unless some app is broken.
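
Since different processes can be killed each time, it may help to tally the OOM victims recorded in the syslog. The sketch below is an illustration only: `oom_victims` is a hypothetical helper, and it assumes the kernel logs a line containing `Killed process <pid> (<name>)` for each victim, which is the usual wording but may vary between kernel versions.

```python
import re
from collections import Counter

# Assumed kernel wording for an OOM kill: "Killed process <pid> (<name>) ...".
# The earlier "Kill process ... or sacrifice child" line is deliberately not
# matched, so each kill is counted once.
KILLED = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def oom_victims(lines):
    """Return a Counter mapping process name to number of OOM kills logged."""
    counts = Counter()
    for line in lines:
        match = KILLED.search(line)
        if match:
            counts[match["name"]] += 1
    return counts
```

If the same process shows up again and again (nzbget, in the screenshot discussed above), that points at one app using up memory; a spread across emhttp, smb, and telnetd would fit the varying symptoms described here.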
