[Support] Paperless-ng Docker



On 11/19/2021 at 1:00 AM, Profezor said:

Any way to help or donate in order to get machine learning/AI advances and an iOS app developed?

 

Love this docker, just want it to be even better.

There’s not really a need for an iOS app; Paperless-ng can be added to your Home Screen as a PWA. Just select "Add to Home Screen" in the Share menu in Safari.

Link to comment
On 10/16/2021 at 7:37 AM, Greygoose said:

Just set up both Redis and paperless-ng. Seems to be working fine, except Redis shows this error.

 

Is this normal?

 

 

Screenshot 2021-10-16 123455.png

 

I am having the same issue: the Redis container keeps being restarted as part of this. When I try to follow the directions and run the command 'sysctl vm.overcommit_memory=1' in the container terminal, I get the following response: 'sysctl: setting key "vm.overcommit_memory": Read-only file system'.

 

I also can't seem to find the files for Redis within my "AppData" share so I can't modify /etc/sysctl.conf either.

 

Any suggestions?

 

Thanks!
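
A note for anyone hitting the same message: vm.overcommit_memory is a kernel-wide setting rather than a per-container one, and a container normally sees /proc/sys as read-only, which would explain the "Read-only file system" response. The value would have to be changed from the Unraid host's terminal rather than from the Redis container console. As a rough sketch (the helper name overcommit_status is made up for illustration), this Python snippet only reads the current value so you can check whether a change took effect:

```python
from pathlib import Path

# Meanings of the three values the Linux kernel accepts for vm.overcommit_memory.
MODES = {
    "0": "heuristic overcommit (kernel default)",
    "1": "always overcommit (what the Redis warning asks for)",
    "2": "strict accounting, no overcommit",
}


def overcommit_status(path="/proc/sys/vm/overcommit_memory"):
    """Describe the current setting.

    Hypothetical helper for illustration; it reads the value and never writes it.
    """
    value = Path(path).read_text().strip()
    return f"vm.overcommit_memory={value}: {MODES.get(value, 'unknown value')}"


if __name__ == "__main__":
    print(overcommit_status())
```

The value is shared with the host kernel, so the script should report the same number whether it is run on the host or inside a container.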

Link to comment
  • 2 weeks later...

Hi,

Quick question regarding paperless-ng functionality.

I have a docker instance up and running and I have stored about 10 files in it. Works great. Now I want to change the filename format variable as below:

PAPERLESS_FILENAME_FORMAT={created_year}/{correspondent}/{title}

 

Will this change apply to existing files? Does anyone have any experience with that?

Link to comment
8 hours ago, Dhruvin said:

will this new change apply to existing files? Does anyone have any experience with that? 

 

Yes, I went through this a few months back! Here is the relevant doc page and section: https://paperless-ng.readthedocs.io/en/latest/administration.html#managing-filenames

 

Specifically, you need to run the following command in the console of your Paperless container (it takes no arguments and processes all your documents at once):

document_renamer
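
To get a feel for what the new layout will look like before running the renamer, here is a minimal Python sketch of how placeholders like the ones in PAPERLESS_FILENAME_FORMAT expand into a path. It is not Paperless's own code: the render_path helper and the sample values are made up, and it ignores details such as sanitising characters or handling duplicate names.

```python
def render_path(filename_format, created_year, correspondent, title, extension=".pdf"):
    """Expand a PAPERLESS_FILENAME_FORMAT-style template into a relative path.

    Illustrative only: the real renamer does more than plain substitution.
    """
    relative = filename_format.format(
        created_year=created_year,
        correspondent=correspondent,
        title=title,
    )
    return relative + extension


if __name__ == "__main__":
    # Sample values are invented for the example.
    fmt = "{created_year}/{correspondent}/{title}"
    print(render_path(fmt, 2021, "Energy Co", "Invoice December"))
```

Either way, making a backup of the appdata and media folders before the first run seems sensible, since the renamer touches every document at once.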
Link to comment

Hello!

i'm still on the deprecated paperless docker and tbh i'm not sure my setup will survive a migration.

i've read the migration docs but these docs are not unraid related and so it's already a level too much for me. :|

usually i'd try to make it work with trial & error but with all my docs in the background i don't think that's a good idea.

 

my approach would be:

- make backup of /mnt/cache/appdata/paperless

- stop paperless docker

- install "some" redis docker (JJ9987's? there's an "official" badge :) there...)

- install paperless-ng docker from CA and follow point 2 from OP of this thread (same user / password as on my old/existing paperless?)

- have all my docs in the "new" paperless-ng ?

 

can't be that simple, can it?

and what am i to do with the paperless-consumer docker that i have running atm?

 

would somebody be so kind to help me out here?

Link to comment
  • 2 weeks later...

Paperless does not seem to be handling dates in the original file name correctly. It is ignoring the filename date and picking it up from the document instead. The documentation says that the filename is used and the document content is ignored when PAPERLESS_FILENAME_DATE_ORDER is set. However, the consumer is still pulling the date from the OCR contents. My date format in the file is YYYY-MM-DD. As you can see from my date settings, I have a different date format for filenames for sorting purposes. I should note that the date is usually at the end of the filename too: "<document description> YYYY-MM-DD.pdf". Any thoughts? Thanks.

chrome_oyvhsMp4Dz.thumb.png.22f1d224612ca216618c2f04954dc36c.png
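
To rule out the filenames themselves, independent of Paperless, a quick check like the following can help. It is a sketch under assumptions: it only handles a YYYY-MM-DD date at the very end of the name, as described above, and it does not reproduce how Paperless interprets PAPERLESS_FILENAME_DATE_ORDER. The date_from_filename helper and the sample names are made up.

```python
import re
from datetime import date

# Matches an ISO-style date (YYYY-MM-DD) sitting right before the file extension.
TRAILING_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})(?=\.[^.]+$)")


def date_from_filename(filename):
    """Return the trailing date of a filename, or None if there is no valid one."""
    match = TRAILING_DATE.search(filename)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits were present but do not form a real calendar date.
        return None


if __name__ == "__main__":
    for name in ["Water bill 2021-12-03.pdf", "Water bill 2021-12--03.pdf"]:
        print(name, "->", date_from_filename(name))
```

A name with a doubled hyphen comes back as None here, so it is worth checking that the real files use single hyphens throughout.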

Link to comment

Another question. I have hyperthreading enabled, with 8 physical cores and 16 logical threads. Regardless of the settings for workers and threads per worker, the second thread in each core never gets utilized. Right now I have it on 6 workers and 4 threads per worker. I know this is not recommended, but I have been trying multiple configurations to get it to work. This is just the latest.

CPU_Core_Utilization.png.db18690ab5cb6e2a686c6721b262dd6b.png
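
For reference, the arithmetic for the configuration above: 6 workers with 4 threads each asks for 24 threads on a CPU with 16 logical threads. The sketch below only does that multiplication. It assumes the usual advice that PAPERLESS_TASK_WORKERS times PAPERLESS_THREADS_PER_WORKER should stay at or below the available core count, and it says nothing about why the second thread of each core stays idle. The thread_budget helper is made up for illustration.

```python
import os


def thread_budget(task_workers, threads_per_worker, logical_cpus=None):
    """Return (requested threads, logical CPUs, whether the request exceeds the CPUs)."""
    if logical_cpus is None:
        # Fall back to what the operating system reports for this machine.
        logical_cpus = os.cpu_count() or 1
    requested = task_workers * threads_per_worker
    return requested, logical_cpus, requested > logical_cpus


if __name__ == "__main__":
    requested, available, over = thread_budget(6, 4, logical_cpus=16)
    print(f"{requested} threads requested for {available} logical CPUs; oversubscribed: {over}")
```

Under that assumption, a combination such as 4 workers with 4 threads each would land exactly on 16.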

Link to comment

Hi,

 

I'm looking for some help to secure my installation of paperless-ng.

 

I have seen mentions of setting up Nginx or Swag as a reverse proxy to serve up the app over SSL, but I'm a little bit stumped on how to do this.

 

I do not intend to expose my docker instance of paperless-ng to the internet (although it might be something I tinker with later). However, all the requests to the server are plain HTTP, which means my user name and any other information can be intercepted and read using something like Wireshark. I know this would require someone already snooping with access to my network, so it is an unlikely event, but I feel all comms of sensitive data should really be secured.

 

Secondly, if I were to set up a reverse proxy, my understanding is that the encryption only reaches as far as the proxy, and the comms from the proxy to the docker container would still be unencrypted and susceptible to snooping?

 

Can anyone provide some steps / hints on how to configure the docker container to serve all requests over HTTPS?

 

Thanks in advance

 

Minimos

 

Link to comment

Yes, you can use a reverse proxy such as Nginx (or the NginxProxyManager docker container). However, if you want an SSL certificate you will need to at least expose your proxy to the internet (for letsencrypt to work; self-signed or other certs can work without internet access). You can, however, add an ACL so only certain IPs can access your container.

 

The communication between your proxy and paperless is indeed unencrypted. You can't really avoid that: there must be some place that decrypts the traffic. However, in order to MITM traffic from container A to container B, attackers need to have access to your server. And to be honest, if they have server access, why would they bother to snoop your data? They can just download the database or something like that.

 

So: I would suggest trying out the NginxProxyManager container; it is listed in the CA app store.

Link to comment
On 1/5/2022 at 2:13 PM, mattie112 said:

So: I would suggest to try out the NginxProxyManager container, it is listed in the CA appstore

Hi,

 

Thank you so much for coming back with info.

 

I've done a bit more digging and have decided to use swag (which includes nginx) as my reverse proxy, as it has a few other nice features. I have also moved the swag container and paperless-ng to the same docker network (to stop network leakage).

 

I have got swag up and running with a cert being pulled for my domain, however I can't for the life of me get it to work.

 

I keep getting a 502 error (Bad Gateway) when using my subdomain to access the server.

 

SWAG works fine with the subdomain connected (i.e. I get the default SWAG page) if I don't have a subdomain configured in Nginx.

 

I also included the suggested Nginx configuration from the original paperless project but I still get the same error.

 

EDIT:  Found an error in the Paperless Log '[ERROR] [django.security.DisallowedHost] Invalid HTTP_HOST header: paperless.example.com, paperless.example.com The domain name provided is not valid according to RFC 1034/1035.'  It seems to indicate that the domain name is being passed twice in the header (host example changed from my own domain for security reasons)

 

Does anyone potentially have any ideas on what I could check?

 

 

 

Thanks

 

Minimos

Edited by minimos
Link to comment
5 hours ago, minimos said:

EDIT:  Found an error in the Paperless Log '[ERROR] [django.security.DisallowedHost] Invalid HTTP_HOST header: paperless.example.com, paperless.example.com The domain name provided is not valid according to RFC 1034/1035.'  It seems to indicate that the domain name is being passed twice in the header (host example changed from my own domain for security reasons)

 


Finally got it resolved, thanks to a solution earlier in the thread from @muppie & @Lumpy_BD.

 

I used the config on page 4, substituting my own Unraid host IP for the IP address in the $upstream_app variable.

 

Before this, I had already cleared up the django error above by removing the Nginx settings suggested in the paperless-ng documentation, but I still got a bad gateway message until I made the IP change.
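
The DisallowedHost message quoted earlier is consistent with the Host header being sent twice, which a proxy can cause when the header is set in two places; the two values then arrive joined by a comma and no longer look like a hostname. The sketch below illustrates that with a deliberately simplified check. The regular expression and the check_host helper are assumptions for illustration, not Django's actual code.

```python
import re

# Simplified hostname pattern: letters, digits, hyphens, dots and an optional port.
# Assumption: this only approximates the validation Django performs.
VALID_HOST = re.compile(r"^[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:\d+)?$", re.IGNORECASE)


def check_host(header_value, allowed_hosts):
    """Classify a Host header value as 'invalid', 'not allowed' or 'ok'."""
    if not VALID_HOST.match(header_value):
        return "invalid"
    host = header_value.split(":")[0].lower()
    return "ok" if host in allowed_hosts else "not allowed"


if __name__ == "__main__":
    allowed = {"paperless.example.com"}
    for value in ["paperless.example.com",
                  "paperless.example.com, paperless.example.com"]:
        print(repr(value), "->", check_host(value, allowed))
```

The doubled value is rejected before the allowed-hosts list is even consulted, which fits the error disappearing once the extra proxy settings were removed.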

 

Hope this might help someone else

Link to comment
21 hours ago, minimos said:

I used the config on page 4 substituting the ip address on the $upstreamapp variable for my own Unraid host IP.

 


 

So I have managed to improve on the solution above and get the docker container name working in the Nginx config file, which further increases security by keeping the HTTP request internal to the docker network instead of leaking onto my wider LAN.

 

The trick is to specify a variable PAPERLESS_ALLOWED_HOSTS, in the docker setup.  Add it as a new environment variable.

 

Then, you have to specify both your custom domain name and the docker container name, separated by a comma.

 

You have to put your custom domain first or you will get HTTP 400 codes in paperless for some content, so the new variable value would be 'paperless.example.com,paperless-ng', substituting whatever your custom domain is for example.com.

 

Update the $upstream_app variable in your Nginx subdomain config to 'paperless-ng' (or whatever your container is called), restart Swag, and it should all be working.
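
To illustrate why both names need to be listed: depending on how the proxy is set up, a request can carry either the public domain or the container name as its host, and anything not on the list is refused. This is only a sketch, assuming the variable is treated as a plain comma-separated list; the helper names and the sample IP address are made up.

```python
def parse_allowed_hosts(value):
    """Split a comma-separated PAPERLESS_ALLOWED_HOSTS-style string into a list."""
    return [host.strip() for host in value.split(",") if host.strip()]


def is_allowed(host, allowed_hosts):
    """Case-insensitive membership test against the parsed list."""
    return host.lower() in (entry.lower() for entry in allowed_hosts)


if __name__ == "__main__":
    hosts = parse_allowed_hosts("paperless.example.com,paperless-ng")
    # 192.168.1.10 stands in for a LAN address that is not on the list.
    for candidate in ["paperless.example.com", "paperless-ng", "192.168.1.10"]:
        print(candidate, "->", is_allowed(candidate, hosts))
```

Note that with a list like this, reaching the container directly by IP would be refused unless that address is added as well.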

 

Hope others find this useful

Link to comment
  • 2 weeks later...

Hi all,

i want to install paperless-ng using postgresql.

On page 1 of this thread there are some variables i had to add to the docker.

Question: Do i have to set up the database / schema first or is this done during the installation of paperless?

Meaning: Are the needed tables created automatically?

 

Also, i run unraid on a HP microserver Gen8 with a 2-core CPU.

I read here https://paperless-ng.readthedocs.io/en/latest/configuration.html, that one should set PAPERLESS_TASK_WORKERS and  PAPERLESS_THREADS_PER_WORKER to values corresponding with CPU cores.

Where is this done?

 

Thanks

 

Juergen

Link to comment
  • 2 weeks later...
On 1/17/2022 at 7:31 PM, Spline said:

run unraid on a HP microserver Gen8 with a 2-core CPU.

I read here https://paperless-ng.readthedocs.io/en/latest/configuration.html, that one should set PAPERLESS_TASK_WORKERS and  PAPERLESS_THREADS_PER_WORKER to values corresponding with CPU cores.

Where is this done?

 

Just click on edit on the container in the docker tab then add new variables with those names and the values you need and restart. 

 

 

I'm trying to get Paperless-ng running behind a reverse proxy using Swag and I'm having a weird issue that I honestly have _no_ idea how to troubleshoot. 

 

The container is reachable and I can log in, browse, upload documents and everything, but as soon as I try to make any changes to the docs (adding tags or correspondents, or editing anything else) it either fails silently or I get a popup that says:

 

Error executing bulk operation: {"detail":"CSRF Failed: CSRF token missing or incorrect."}

 

If I open the developer tools I see the POST request returned a 403 and that error.

 

I tried setting up the proxy to use a subfolder and setting the base URL to match (using docker variables). I also tried as a subdomain which would be my preferred method. In both cases I also set up PAPERLESS_ALLOWED_HOSTS and PAPERLESS_CORS_ALLOWED_HOSTS to the corresponding values. In both cases I get the same behavior. 

 

The weird thing is that the error only happens when I try to modify things from the Paperless 'normal' UI. When I go to the admin interface I can modify whatever I want and it works. 

 

Any ideas?

Link to comment
  • 3 weeks later...

@T0a just for your info, it seems that the Paperless-ng project has come to a halt.

The maintainer has been unreachable since September 2021, although he is active on other platforms.

It seems that he has abandoned the project.

 

Nevertheless, there are some people discussing a new fork of Paperless-ng which will be run by more than one maintainer.

Maybe at some point it would be a good idea to rebase the docker on this project:

https://github.com/jonaswinkler/paperless-ng/issues/1632

Link to comment
1 hour ago, darkside40 said:

Would be great to hear what @T0a thinks of it.

I am aware of the fork and involved as well. Once the dust has settled and the fork is moving forward in a healthy way, I will offer it to the Unraid community. It may be as a separate container though. Right now, they are organizing the project and are still in the migration phase. Let's give them time and let the fork mature.

 

Until then, I would recommend not exposing paperless-ng to the Internet directly, because security-related fixes in third-party dependencies are not being merged anymore. You should not expose it anyway, since it is not hardened in any way!
 

 

Edited by T0a
Link to comment
  • 2 weeks later...
