Full Text Search

Hello,

 

I have several scanned PDFs on my Unraid server. Until recently I used desktop search tools (Copernic and dtSearch) to find the documents I need.

I have tried all the related Unraid Docker containers, like Paperless, but I want to keep the PDFs in my own directory structure. I don't want them in a database, and I don't want to be forced to upload the PDFs through a web interface to make them searchable.

So far I haven't found a full-text search engine with a decent web UI where I can search and then click a link to open my document.

I tried Elasticsearch with Kibana but got an "operation too complex" error on layer 8.

Does anyone have a tip on what I could try or where to look? It would be great if it were free or open source.

 

 

Thanks

Chris

  • 1 month later...
  • Author

Does no one have a tip?

I tried Solr, Elasticsearch and the like, but I can't get any of them running in a way that indexes my files so I can search them in a web UI.

 

  • 10 months later...
  • Author

I still have no nice solution for that.

 

If this is a stupid question, or trivial to solve, you could at least post some flames and hate here ;-) please

  • 4 months later...

I have this question too... I've looked at

  • SOLR
  • ElasticSearch

So yes, I too would like to hear of an alternative, simpler solution.

In the meantime, my current approach follows (I welcome suggestions, or others joining me on this journey).

 

I think I got closer with SOLR, and in any case I found the elastic search site unclear about whether I would need to buy a license to run ElasticSearch on a server.

 

So currently I'm pursuing SOLR.

 

The current issue is:

solr 19:28:30.53 INFO  ==> ** Starting solr setup **
solr 19:28:30.55 INFO  ==> Validating settings in SOLR_* env vars...
solr 19:28:30.55 INFO  ==> Initializing Solr ...
realpath: /bitnami/solr/data: No such file or directory
solr 19:28:30.56 INFO  ==> Configuring file permissions for Solr
mkdir: cannot create directory '/bitnami/solr': Permission denied

 

Searching through the various links, I found:

https://hub.docker.com/r/bitnami/solr/

TL;DR: I think I need to either tinker with the Docker image's compose file https://github.com/bitnami/containers/blob/main/bitnami/solr/docker-compose.yml or find out how to

"mount a volume in the desired location and setting the environment variable with the customized value (as it is pointed above, the default value is data_driven_schema_configs)"

 

So I'm going to investigate "data_driven_schema_configs", as I think this would persist even if the container image were modified by the maintainer.
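
For anyone following along, the "mount a volume" part of that quote might look roughly like the sketch below. It is a guess, not a tested recipe: the host path is hypothetical, the container path /bitnami/solr is taken from the error log above, 8983 is Solr's default port, and the function only prints the docker command so it can be reviewed before anyone runs it.

```shell
#!/bin/sh
# Sketch only: print (do not run) a docker command that bind-mounts a host
# directory at /bitnami/solr, the path the container complains about above.
# The helper name, the container name "solr" and the host path are assumptions.
solr_run_cmd() {
    host_dir="$1"   # hypothetical host path, e.g. /mnt/user/appdata/solr/solr
    printf 'docker run -d --name solr -p 8983:8983 -v %s:/bitnami/solr bitnami/solr\n' "$host_dir"
}
```

Calling `solr_run_cmd /mnt/user/appdata/solr/solr` prints the command with that path filled in. The host directory still has to exist and be writable by the user the container runs as, otherwise the same "Permission denied" is likely to come back.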

  • Author

It seems we are the only ones trying to do a server-side full-text search of existing PDF files.

I am not sure that SOLR is what *I* am looking for though.

What I am thinking of is something like dtSearch or Copernic in a browser window, so that I can use it from more than one PC.

Having dug deeper into SOLR and its schemas, it looks more powerful than I need, and it requires much more work to set up.

Yes, Copernic has proved to be an absolute godsend a few times. I'd even pay good money for a copy to run on Unraid.

  • 6 months later...
  • Author

Is there any solution in the meantime?

I still have no way to search my files (full text, PDF, ...) on the server other than with Copernic or Total Commander.

I still can't get Elasticsearch running.

And I don't want to import my files into a DMS (although I am worn down at this point and might do so, just for the sake of searching).

I couldn't get YaCy to crawl my documents locally.

I too have this need. Five years ago I could find anything with Google, but that is no longer the case, so I am back to archiving PDFs. I use dtSearch, but admittedly it's suboptimal. I hope this thread can get some momentum.

  • Author

What I have tried now:
* Installed the rdesktop container with the Ubuntu MATE tag: lscr.io/linuxserver/rdesktop:ubuntu-mate
* Gave it a path to my PDFs on the server
* Then installed Java and DocFetcher inside it

Now I can remote-desktop into DocFetcher.

That works, but I don't know what happens after an update or reboot.
 

  • 3 weeks later...
On 2/28/2023 at 6:56 AM, JohnGAG said:

I think I got closer with SOLR, and in any case I found the elastic search site unclear about whether I would need to buy a license to run ElasticSearch on a server.

 

So currently I'm pursuing SOLR.

 

The current issue is:

solr 19:28:30.53 INFO  ==> ** Starting solr setup **
solr 19:28:30.55 INFO  ==> Validating settings in SOLR_* env vars...
solr 19:28:30.55 INFO  ==> Initializing Solr ...
realpath: /bitnami/solr/data: No such file or directory
solr 19:28:30.56 INFO  ==> Configuring file permissions for Solr
mkdir: cannot create directory '/bitnami/solr': Permission denied

 

In case someone else comes across this thread looking for help with this error: I was able to get past it by manually creating that folder and setting permissions.

 

Roughly, from the Unraid console:

cd /mnt/user/appdata/solr
mkdir solr
chown nobody:users solr

 

 

After this, the container starts up and the WebGUI comes up.
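
The same steps could be wrapped in a small function that is safe to re-run, for example after the appdata folder has been wiped. This is only a sketch: prepare_data_dir is a made-up helper name, and the path and owner in the usage note are simply the ones from the commands above.

```shell
#!/bin/sh
# Sketch: create a host directory for a container volume and set its owner.
# prepare_data_dir is a hypothetical helper name, not part of Unraid or Solr.
prepare_data_dir() {
    dir="$1"     # directory to create, e.g. /mnt/user/appdata/solr/solr
    owner="$2"   # user:group, e.g. nobody:users; an empty string skips chown
    # -p creates missing parents and does not fail if the directory exists
    mkdir -p "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner" "$dir" || return 1
    fi
    # Show the resulting owner and mode for a quick visual check
    ls -ld "$dir"
}
```

On the Unraid console this would be called as `prepare_data_dir /mnt/user/appdata/solr/solr nobody:users`.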

Now to work out how to use this thing...

  • 6 months later...
  • Author

I am no closer to a solution than when I started this thread.

I have worked intensively with Paperless, but since I have tons of existing PDFs in a neat directory structure with consistent naming, Paperless is of no big help.

Are there new options, like AI training tools or such? I'm still wondering why I am (almost) the only one with this problem.

  • 2 months later...

Thx for this. I too am looking for a way to index my documents. Although Diskover indexes filesystems, it doesn't do any breaking or stemming of docs. I'm thinking I may have to put up my SharePoint search engine on a VMS. Kinda overkill but at least it'll work. 

  • 2 weeks later...

I am also looking for something like this - not just to search but to help categorize my files. Paperless seems very rigid in its needs.

  • Author
1 hour ago, MrCrispy said:

I am also looking for something like this - not just to search but to help categorize my files. Paperless seems very rigid in its needs.

 

Well, Paperless-ng does work, but with several tens of thousands of existing documents it's a PITA to get it running.
I still use Copernic Desktop and am looking for a decent server-side solution (Copernic Server is not affordable for me).

The virtual client install with noVNC access to DocFetcher does work sometimes, but it seems fragile: it stops working and needs attention now and then, so it's no real alternative.

Does no one else do full-text search on their PDFs? I can't believe it's such a niche question.

Chris

On 6/26/2024 at 11:16 PM, ChrisW1337 said:

 

Well, Paperless-ng does work, but with several tens of thousands of existing documents it's a PITA to get it running.
I still use Copernic Desktop and am looking for a decent server-side solution (Copernic Server is not affordable for me).

The virtual client install with noVNC access to DocFetcher does work sometimes, but it seems fragile: it stops working and needs attention now and then, so it's no real alternative.

Does no one else do full-text search on their PDFs? I can't believe it's such a niche question.

Chris

Has anyone tried this: https://github.com/sunde41/recoll

I've used Recoll on my desktop and like it.

There are also other tools, such as Solr, that supposedly do a very good job.

Some other projects I intend to try:

 

https://github.com/eikek/docspell (probably more for scanned files)

https://github.com/simon987/sist2/ (uses ES internally)

 

  • Author

Thank you,

 

However, these are desktop solutions put on a server, so this might not be an ideal starting point.

I still hope that someone with more knowledge than me comes up with a described solution for indexing and searching on the server side.

 

Chris

 

 

  • 1 year later...
  • Author

It's still very quiet on this topic. Does anybody have a satisfying solution?
I really wanted to like Paperless, but I can't fit my documents into it (they are ordered and stored in Unraid folders).

Any new full-text search tools?

Hi

I have used Nextcloud for many years solely for its full-text and file search. I mounted the Unraid folders inside the Docker container, so if something happens to the container the files are still accessible as a regular NAS share. On my phone, FE File Explorer accesses the files through WiFiman Teleport on my UniFi gateway.

I'm working on trying to get Recoll to work with the web frontend. I've had great luck with it on the desktop, so I'm trying to reproduce that usability in UnRAID. I've got equal or better functionality on my new NAS versus my old Synology, except for full text search. Unfortunately Recoll isn't super Docker friendly as it's got a lot of dependencies and requires a cronjob to update its indexes. None of that is insurmountable but it certainly makes it a lot more difficult than a "plug and play" type solution.
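
The cron part could be as small as the sketch below. recollindex is Recoll's command-line indexer and -c selects its configuration directory; the wrapper name, the config path and the schedule are made up for illustration, and the RECOLLINDEX override exists only so the function can be tried without Recoll installed.

```shell
#!/bin/sh
# Sketch: wrapper to call from cron so the Recoll index is refreshed regularly.
# update_recoll_index and the paths mentioned here are hypothetical.
update_recoll_index() {
    confdir="$1"                            # Recoll config dir (holds recoll.conf)
    indexer="${RECOLLINDEX:-recollindex}"   # override with a stub for a dry run
    "$indexer" -c "$confdir"
}

# Example crontab line doing the same thing nightly at 03:00 (assumed path):
# 0 3 * * * recollindex -c /path/to/recoll-config
```

Inside a container the cron daemon itself also has to be running, which is part of what makes this less "plug and play" than it sounds.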

  • Author

I managed to get Recoll to work, but the UI is a bit basic.

I now have Nextcloud running, linked the document folders into it and activated full-text search. It kind of works.

=> If only I could find a workflow that would keep my existing documents and scanning workflow BUT also make use of Paperless. Right now I end up with two separate data stores that are not synced.
