[Request] Elastic Search


sse450

Recommended Posts

Nextcloud needs Elasticsearch for full text indexing. Elasticsearch, in turn, needs Java.
 
I would appreciate it if you could kindly prepare this container for unRAID.
 
Thanks.
Exactly who are you addressing this to?


Link to comment

Well, I got it working, but it's... very messy right now. 

 

The main thing to get ironed out is that the plugin Nextcloud requires isn't installed by default, and I'm still working out how to make the plugin persistent across container upgrades. I've got a few thoughts, but I need to simplify it.
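
For anyone who wants to tinker before I've got something cleaner, the sort of thing that has to happen after every image update is roughly this (the container name "elasticsearch" is just an example):

docker exec elasticsearch bin/elasticsearch-plugin list
# if ingest-attachment is missing, reinstall it and restart so the node picks it up
docker exec elasticsearch bin/elasticsearch-plugin install --batch ingest-attachment
docker restart elasticsearch

The plugin lives inside the container's filesystem, so it disappears every time the container is recreated from the image, which is exactly the persistence problem.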

 

Already decided that if you use this, I'm not supporting it :D

 

Link to comment

Dear @CHBMB, thank you very much for your efforts on this.

 

I too spent all of yesterday trying to get it running, and I got it working with an old version of Elasticsearch. I couldn't get it running with the latest version, though. Perhaps my findings/research/trials below will give you a hint or two in tackling this issue. But first, for users who don't know much about Elasticsearch: this service gives Nextcloud full text indexing capability, which turns Nextcloud into a powerful document management system. The documents saved on Nextcloud (plain text, RTF, PDF, JPEG, TIFF, HTML, LibreOffice and Microsoft Office file formats) are indexed by their full text rather than just their file names.

 

Now on to the info I gathered (a rough command sketch pulling these pieces together follows the list):

1. The images on Docker Hub are all old versions; they are obsolete. The new version is in the Elasticsearch repository and is pulled as docker.elastic.co/elasticsearch/elasticsearch:6.2.2.

2. Commands related to indexing can be found here: https://github.com/nextcloud/fulltextsearch/wiki/Commands

3. I managed to install one of the images from Docker Hub. When I tried to run "sudo -u abc php7 occ fulltextsearch:index", it complained about a missing plugin: ingest-attachment. The plugin can be installed with "/usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-attachment" inside the container. From that point on, Elasticsearch (the old version) works: it indexes all the documents in Nextcloud.

4. I don't know how relevant this is, but I defined two variables: ES_JAVA_OPTS=-Xms512m -Xmx512m and bootstrap.memory_lock=true.

5. I think the data, config and plugins folders in /usr/share/elasticsearch should be mounted externally.

6. I guess all three plugins (Full text search - Elasticsearch Platform, Full text search - Files, and Full text search) should be installed in Nextcloud.

7. The latest version (6.2.2) is a different story. I tried to install it with a manual pull, but it complains about two things:

    - vm.max_map_count is too low. This can be fixed with "sysctl -w vm.max_map_count=262144" in the go file.

     - max file descriptors [40960] for the elasticsearch process is too low. This is probably related to the "ulimit -n" command, but I couldn't find a persistent way to increase it. The go file and user scripts didn't work.

8. A comprehensive write-up about Elasticsearch can be found here: https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started.html
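
To put points 3, 4, 5 and 7 together, here is a rough sketch of what the container run looks like (the paths, container name and heap size are just my examples, not a polished template; the config folder is left alone here because it would need to be seeded with the image defaults before it could be mounted out):

# host setting from point 7, e.g. in the go file or a user script
sysctl -w vm.max_map_count=262144

docker run -d --name elasticsearch \
  -p 9200:9200 \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -e "bootstrap.memory_lock=true" \
  --ulimit memlock=-1:-1 \
  --ulimit nofile=65536:65536 \
  -v /mnt/user/appdata/elasticsearch/data:/usr/share/elasticsearch/data \
  -v /mnt/user/appdata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
  docker.elastic.co/elasticsearch/elasticsearch:6.2.2

# point 3: install the missing plugin and restart so it loads
docker exec elasticsearch bin/elasticsearch-plugin install --batch ingest-attachment
docker restart elasticsearch

# then, from inside the Nextcloud container, build the initial index
sudo -u abc php7 occ fulltextsearch:index

The two --ulimit switches are how docker run lets you raise the memlock and open-files limits for the container process, which should address the "max file descriptors" complaint from point 7; vm.max_map_count still has to be raised on the host itself.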

 

All the best.

 

Link to comment

@sse450  Thanks

 

1.  I got it running on the new ElasticSearch repo

2.  Yep, got that.

3.  I installed the plugin fine.

4.  Think I got the variables sorted.

5.  It's not as simple as that; you can't always just choose which folders to "externalise", and this is the main sticking point so far.  The way Elasticsearch recommends is to modify the Dockerfile and then build the container locally (see the sketch after this list), but whilst that works, it doesn't translate well to use on Unraid.  I have a solution, but it's not very elegant.

6.  Still looking at this bit; it doesn't seem that the container reflects changes made on the host.

7.  Pretty much read all that. 
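
For reference, the local build Elasticsearch suggests really is only a couple of lines; a minimal sketch (the image tag and name are just examples) would be:

cat > Dockerfile <<'EOF'
FROM docker.elastic.co/elasticsearch/elasticsearch:6.2.2
RUN bin/elasticsearch-plugin install --batch ingest-attachment
EOF
docker build -t elasticsearch-ingest-attachment:6.2.2 .

That bakes the plugin into the image so it survives the container being recreated, but it means everyone has to build locally rather than pull a ready-made template, which is why it doesn't fit the usual Unraid workflow.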

 

:)

 

Edited by CHBMB
Link to comment

I ended up with a CentOS VM just for Elasticsearch. It works out of the box (well, almost). Surely a container would be much better for resources. Have you made any progress on this?

Link to comment
  • 2 weeks later...

Just to add to this as well, I got Elasticsearch working with Nextcloud in an Ubuntu VM. 

Here are a few issues relating to the Nextcloud plugin that I encountered and that you should be aware of (a rough sketch of the corresponding commands follows the list):

1. File locking in Nextcloud - I had to turn this off in my config.php file in Nextcloud while doing the scan, as it kept giving errors (I re-enabled it after the scan was complete). The scan = the initial index with occ and the plugin. 

2. Memory limit issues in PHP. The default is 128M, which wasn't enough to index all of my Nextcloud files; I had to change it (512M was enough) or else it would keep giving errors like the ones below. 

PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 18874368 bytes) in /config/www/nextcloud/lib/private/Lock/DBLockingProvider.php on line 77
PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 8388616 bytes) in /config/www/nextcloud/lib/private/Lock/DBLockingProvider.php on line 265

3. As the live scan command needs to be embedded into the Docker container somehow, I haven't been able to use that. 
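
For what it's worth, here is a rough sketch of the two workarounds as commands, assuming the linuxserver Nextcloud container and its abc user as earlier in the thread (adjust the user and PHP binary to your setup); instead of hand-editing config.php as I did, the occ equivalent can be used for the file locking switch:

# 1. temporarily disable file locking before the initial scan
sudo -u abc php7 occ config:system:set filelocking.enabled --value=false --type=boolean

# 2. run the initial index with a higher memory limit for just this command
sudo -u abc php7 -d memory_limit=512M occ fulltextsearch:index

# re-enable file locking once the scan is complete
sudo -u abc php7 occ config:system:set filelocking.enabled --value=true --type=boolean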

 

I also managed to get Kibana installed, but I haven't figured out Logstash yet. The full ELK stack as Docker containers would be great!

 

(Also, speaking of Collabora, I have that running and working as well, in case anyone needs help.)

 

-Brett

Link to comment
  • 4 months later...

I'd like this as well; this is a great Docker app, but I get the standard "vm.max_map_count is too low" and "max file descriptors for elasticsearch process is too low" errors. I know how to fix them, but it requires persistent boot settings, which we do not have.  I am going to try to do it with the CA User Scripts plugin, but we will see.  It seems to me we should have the option to change variables such as those :)

 

ELK Docker Image

Edited by Jclendineng
Link to comment

I could not... I added both manually, but the container does not see them.  I was really hoping this was going to work; something as simple as modifying a startup file will not work because it isn't persistent.  Maybe I need to try Debian + Rancher and see if I can make it work. 

 

Edit: or just set up a VM and edit the host... but that seems like overkill for something as simple as two lines of code.

Edited by Jclendineng
Link to comment

This is just to close this and add a little fix for other people searching this topic.

 

Install the ELK Docker image (linked above).

 

Go through the wiki linked on the docker page, and make sure the variables are correct.

 

Add a variable: MAX_OPEN_FILES set to 65536

 

To get this to stick, you need to set the ELK container to privileged (toggle the advanced view to see the option).

 

Download the User Scripts plugin from Community Applications.

 

Add the script below to run at start of array:

 

sysctl -w vm.max_map_count=262144

 

After this, the ELK stack is fully running; you will still need to set it up with indexes and all that to parse data.
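
For anyone not using the Unraid template UI, a rough docker run equivalent of the settings above (sebp/elk is just an example image; ports and names will vary with whichever ELK image the template wraps):

# host setting, run at array start via the User Scripts plugin (or the go file)
sysctl -w vm.max_map_count=262144

# privileged container with the open-files variable
docker run -d --name elk --privileged \
  -p 5601:5601 -p 9200:9200 -p 5044:5044 \
  -e MAX_OPEN_FILES=65536 \
  sebp/elk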

Edited by Jclendineng
Link to comment
