ChrisW1337 Posted November 2, 2021

Hello,

I have several scanned PDFs on my Unraid server. Until recently I used desktop search tools (Copernic and dtSearch) to find the documents I need. I have tried all the related Unraid Docker containers, like Paperless, ... But I want to keep the PDFs in my directory structure and don't want them in a database, or to be forced to upload the PDFs through a web interface to make them searchable.

So far I haven't found a full-text search engine with a decent web UI where I can search and then click a link to open my document. I tried Elasticsearch with Kibana but got an "operation too complex" error on layer 8.

Does anyone have a tip on what I can try or where to look? It would be great if it were free or open source.

Thanks
Chris
ChrisW1337 (Author) Posted December 7, 2021

Does no one have a tip? I tried Solr, Elasticsearch and such, but I can't get any of them running so that they index my files and let me search them in a web UI.
ChrisW1337 (Author) Posted October 25, 2022

I still have no nice solution for this. If it is a stupid question, or trivial to solve, you could at least post some flames and hate here, please.
JohnGAG Posted February 28, 2023

I have this question too... I've looked at Solr and Elasticsearch, so yes, I too would like to hear of an alternative / simpler solution.

In the meantime, my current approach follows (I welcome suggestions, or others joining me on this journey).

I think I got closer with Solr, and in any case I found the Elasticsearch site unclear about whether I would need to buy a license to run Elasticsearch on a server. So currently I'm pursuing Solr.

The current issue is:

    solr 19:28:30.53 INFO ==> ** Starting solr setup **
    solr 19:28:30.55 INFO ==> Validating settings in SOLR_* env vars...
    solr 19:28:30.55 INFO ==> Initializing Solr ...
    realpath: /bitnami/solr/data: No such file or directory
    solr 19:28:30.56 INFO ==> Configuring file permissions for Solr
    mkdir: cannot create directory '/bitnami/solr': Permission denied

Searching through the various links I found https://hub.docker.com/r/bitnami/solr/

TL;DR: I think I need to either tinker with the Docker image https://github.com/bitnami/containers/blob/main/bitnami/solr/docker-compose.yml or find out how to "mount a volume in the desired location and setting the environment variable with the customized value (as it is pointed above, the default value is data_driven_schema_configs)".

So I'm going to investigate "data_driven_schema_configs", as I think this would persist even if the container were modified by the maintainer.
ChrisW1337 (Author) Posted March 2, 2023

It seems we are the only ones trying to do a server-side full-text search of existing PDF files. I am not sure that Solr is what *I* am looking for, though. What I am thinking of is something like dtSearch or Copernic, but in a browser window, so that I can use it from more than one PC.
JohnGAG Posted March 2, 2023

Having dug deeper into Solr and its schemas, it looks more powerful than I need, and requires much more work to set up. Yes, Copernic has proved to be an absolute godsend a few times. I'd even pay good money for a copy that runs on Unraid.
ChrisW1337 (Author) Posted September 7, 2023

Is there any solution in the meantime? I still have no way to search my files (full text, PDF, ...) on the server other than with Copernic or Total Commander.

I can't get Elasticsearch running, though, and I don't want to import my files into a DMS (although I am worn down at this point and might do so, just for the sake of searching).

I also didn't get YaCy to crawl my documents locally.
gkoul Posted September 11, 2023

I have this need too. Five years ago I was able to find anything with Google, but that is no longer the case, so I am back to archiving PDFs. I use dtSearch, but admittedly it's suboptimal. I hope this thread can get some momentum.
ChrisW1337 (Author) Posted September 11, 2023

What I have tried now:

* Installed the rdesktop container with the Ubuntu tag: lscr.io/linuxserver/rdesktop:ubuntu-mate
* Gave it a path to my PDFs on the server
* Then installed Java and DocFetcher inside it

Now I can remote-desktop into DocFetcher ... That works, but I don't know what happens after an update or reboot.
external-palate3321 Posted October 2, 2023

On 2/28/2023 at 6:56 AM, JohnGAG said:

    I think I got closer with SOLR, and in any case I found the elastic search site unclear about whether I would need to buy a license to run ElasticSearch on a server. So currently I'm pursuing SOLR. The current issue is:

    solr 19:28:30.53 INFO ==> ** Starting solr setup **
    solr 19:28:30.55 INFO ==> Validating settings in SOLR_* env vars...
    solr 19:28:30.55 INFO ==> Initializing Solr ...
    realpath: /bitnami/solr/data: No such file or directory
    solr 19:28:30.56 INFO ==> Configuring file permissions for Solr
    mkdir: cannot create directory '/bitnami/solr': Permission denied

In case someone else comes across this thread looking for help with this error: I was able to get past it by manually creating that folder and setting permissions. Roughly, from the Unraid console:

    cd /mnt/user/appdata/solr
    mkdir solr
    chown nobody:users solr

After this, the container starts and the WebGUI comes up. Now to work out how to use this thing...
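For anyone who wants to repeat those three commands safely, here is a minimal sketch of the same fix wrapped in a small function. The function name `prepare_solr_dir` and its two optional arguments are hypothetical, added only for illustration; the default path and owner are the ones from the post above. It assumes the container's `/bitnami` path is mapped to `/mnt/user/appdata/solr` on the host, which the post implies but does not state.

```shell
# Hypothetical helper (not part of Unraid or the bitnami image).
# Creates the "solr" subfolder the container could not create itself
# and hands it to the given owner. Safe to run more than once.
prepare_solr_dir() {
    # Defaults are taken from the post above; override them if your
    # appdata share or container user differs (assumption).
    psd_base="${1:-/mnt/user/appdata/solr}"
    psd_owner="${2:-nobody:users}"

    # -p creates missing parents and does not fail if the folder exists
    mkdir -p "$psd_base/solr" || return 1

    # Give the folder to the user the container is expected to run as
    chown "$psd_owner" "$psd_base/solr"
}
```

Calling `prepare_solr_dir` with no arguments does the same thing as the commands in the post; passing a different base path or owner is only needed if your setup differs.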
ChrisW1337 (Author) Posted April 16

I am no closer to a solution than when I started this thread. I have worked intensely on Paperless, but since I have tons of existing PDFs in a neat directory structure with consistent naming, Paperless is of no big help.

Are there new options, like AI training tools or such?

I'm still wondering why I am (almost) the only one with this problem.