[Support] Paperless-ngx Docker



On 1/30/2024 at 12:17 AM, MaRob said:

Hi, I installed Unraid a few days ago, so I am fairly new to the system. Today I installed Paperless-ngx.
In general it is working.
What I am wondering is where I can find the Docker configuration file of Paperless-ngx.
I also tried using "find" in the Unraid console to locate docker-compose.env,
but I wasn't successful.

Before installing the Paperless-ngx Docker I set up a share called documents and, inside it, the folders consume, data, export and media. I can't find any config file there either.

 

Docker containers don't always expose the config file.

 

From the manual:  "If you run paperless on docker, paperless.conf is not used. Rather, configure paperless by copying necessary options to docker-compose.env".

 

Unraid uses a slightly modified docker system. The docker-compose.env file is not adjustable by a user.

 

This means that in Unraid you configure the application through environment variables (set in the container's template)...

https://docs.paperless-ngx.com/configuration/

 

An example:
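A minimal sketch of what this looks like in practice, assuming the Unraid template ends up passing each variable you add to the container the same way `docker run -e KEY=value` would. The two variable names are taken from the configuration page linked above; the values are placeholders, and the helper name `to_docker_flags` is made up for illustration.

```python
# Hypothetical illustration: how template variables map to container
# environment variables. Values are placeholders, not recommendations.
variables = {
    "PAPERLESS_TIME_ZONE": "Europe/Berlin",
    "PAPERLESS_OCR_LANGUAGE": "deu+eng",
}


def to_docker_flags(env):
    """Render each key/value pair as a `-e KEY=value` flag pair."""
    flags = []
    for key, value in sorted(env.items()):
        flags += ["-e", f"{key}={value}"]
    return flags


print(" ".join(to_docker_flags(variables)))
```

In the Unraid GUI this should correspond to adding a Variable on the container's edit page, with Key set to the variable name and Value set to the setting.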

 

 

Edited by Wimpie
Link to comment
2 hours ago, Wimpie said:

 

Did you read the manual? https://docs.paperless-ngx.com/

 

Is this the first container (docker) you have installed?

 

look at:

 

 

 

Hi, all I can find there is that the consume folder is where I put documents for Paperless to pick up. The template also asks me to create the folders media, data and consume. It's not my first Docker container, but it would be nice to see some description of what those folders are for.

Regards.

 

Link to comment
1 hour ago, MarianKoniuszko said:

Hi, all I can find there is that the consume folder is where I put documents for Paperless to pick up. The template also asks me to create the folders media, data and consume. It's not my first Docker container, but it would be nice to see some description of what those folders are for.

Regards.

 

 

Please read the manual, it's all there. 

 

Eg:

 

Paths and folders

 

PAPERLESS_CONSUMPTION_DIR=<path>

This is where your documents should go to be consumed. Make sure that it exists and that the user running the paperless service can read/write its contents before you start Paperless.

Don't change this when using docker, as it only changes the path within the container. Change the local consumption directory in the docker-compose.yml file instead.

Defaults to "../consume/", relative to the "src" directory.

 

PAPERLESS_DATA_DIR=<path>

This is where paperless stores all its data (search index, SQLite database, classification model, etc).

Defaults to "../data/", relative to the "src" directory.

 

PAPERLESS_TRASH_DIR=<path>

Instead of removing deleted documents, they are moved to this directory.

This must be writeable by the user running paperless. When running inside docker, ensure that this path is within a permanent volume (such as "../media/trash") so it won't get lost on upgrades.

Note that the directory must exist prior to using this setting.

Defaults to empty (i.e. really delete documents).

 

PAPERLESS_MEDIA_ROOT=<path>

This is where your documents and thumbnails are stored.

You can set this and PAPERLESS_DATA_DIR to the same folder to have paperless store all its data within the same volume.

Defaults to "../media/", relative to the "src" directory.

 

PAPERLESS_STATICDIR=<path>

Override the default STATIC_ROOT here. This is where all static files created using "collectstatic" manager command are stored.

Unless you're doing something fancy, there is no need to override this. If this is changed, you may need to run collectstatic again.

Defaults to "../static/", relative to the "src" directory.

 

PAPERLESS_LOGGING_DIR=<path>

This is where paperless will store log files.

Defaults to PAPERLESS_DATA_DIR/log/.

 

PAPERLESS_NLTK_DIR=<path>

This is where paperless will search for the data required for NLTK processing, if you are using it. If you are using the Docker image, this should not be changed, as the data is included in the image already.

Previously, the location defaulted to PAPERLESS_DATA_DIR/nltk. Unless you are using this in a bare metal install or other setup, this folder is no longer needed and can be removed manually.

Defaults to /usr/share/nltk_data
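
To make the "relative to the src directory" defaults concrete, here is a small sketch that resolves them. It assumes the src directory is /usr/src/paperless/src inside the container (the shell prompt and the consumer log elsewhere in this thread show paths under /usr/src/paperless). The helper name is made up.

```python
import posixpath

# Assumption: location of the "src" directory inside the container.
SRC_DIR = "/usr/src/paperless/src"

# Defaults quoted from the manual excerpt above.
DEFAULTS = {
    "PAPERLESS_CONSUMPTION_DIR": "../consume/",
    "PAPERLESS_DATA_DIR": "../data/",
    "PAPERLESS_MEDIA_ROOT": "../media/",
    "PAPERLESS_STATICDIR": "../static/",
}


def resolve_default(relative_path, src_dir=SRC_DIR):
    """Turn a default like "../consume/" into an absolute container path."""
    return posixpath.normpath(posixpath.join(src_dir, relative_path))


for name, relative in DEFAULTS.items():
    print(f"{name} -> {resolve_default(relative)}")
```

These are paths inside the container. On Unraid you map your share folders (consume, data, media) onto them in the template rather than changing the variables.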

Link to comment
On 2/3/2024 at 2:15 PM, Wimpie said:

 

Please read the manual, it's all there. 

 


Thank you very much. I could not find that in the docs. There is also an export dir in the Unraid template, but I will leave it like the media folder.
Regards.

Link to comment

Hello, 

 

hope you're fine.

 

Just today I noticed that I am no longer able to log in to Paperless-ngx (I had not used it in a while).

At first I thought it could be the issue described in https://github.com/paperless-ngx/paperless-ngx/discussions/4755 but I double-checked the mount points and they are correct. I also double-checked the user and password, no luck.

 

Any other ideas what I can try?

Setting up the container again will not help, I guess, as the data folder would be the same (if I choose to keep it); otherwise I would lose all my tagging etc. Is that right?

The media folders are saved somewhere else, so the original files will still be there, but all the OCR etc. will be gone.

 

Maybe edit the Redis container? Any idea how?

 

Thanks in advance 

Link to comment

Hi,

 

I am having problems exporting my paperless-ngx data from my old server to Unraid.

 

Both are running now, both on the latest version 2.5.0, and I created the same users on both.

 

I then ran an export on the old one: document_exporter ../export

Next I moved ALL files (including the .json files) to the export folder of the new installation

Then I ran the import on the new one: document_importer ../export

 

I get these errors:

 

root@39a32b43a9fe:/usr/src/paperless/src# document_importer ../export
Found existing user(s), this might indicate a non-empty installation
Checking the manifest
Database import failed
No version information present
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 89, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/sqlite3/base.py", line 328, in execute
    return super().execute(query, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.IntegrityError: UNIQUE constraint failed: auth_user.username

 

My version.json only has this content:

{
  "version": "2.5.0"
}

 

And my manifest.json is full of the tags and user info from my old installation:

 {
    "model": "auth.user",
    "pk": 4,
    "fields": {
      "password": "REMOVED", <<< removed real password
      "last_login": "2024-02-10T18:49:33.728Z",
      "is_superuser": true,
      "username": "user", <<< removed real username
      "first_name": "",
      "last_name": "",
      "email": "REMOVED", <<< removed real email address
      "is_staff": true,
      "is_active": true,
      "date_joined": "2024-01-22T11:02:24Z",
      "groups": [],
      "user_permissions": []
    }
  },

 

Does anyone have an idea what I am doing wrong?

 

Edit: Found the issue and resolved it. Paperless doesn't like it when you import users from another instance into the new one if you have already created the same users there.

 

So for anyone wondering: after installing your new image and starting it, do not log in. Instead, do the document_importer ... routine first.

 

Much appreciated! :)
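
For anyone who wants to check up front whether an import would hit this, a rough sketch. It assumes manifest.json is a JSON list of objects shaped like the excerpt above (with "model" and "fields" keys). The function names are made up, the second sample entry is only illustrative, and the list of existing usernames has to be supplied by you; this does not query Paperless.

```python
import json


def manifest_usernames(manifest):
    """Collect the usernames of all auth.user entries in a manifest list."""
    return [
        entry["fields"]["username"]
        for entry in manifest
        if entry.get("model") == "auth.user"
    ]


def conflicting_usernames(manifest, existing_usernames):
    """Return manifest usernames that already exist on the target instance."""
    existing = set(existing_usernames)
    return sorted(n for n in manifest_usernames(manifest) if n in existing)


# Tiny inline sample instead of reading the real manifest.json.
sample = json.loads(
    '[{"model": "auth.user", "pk": 4, "fields": {"username": "user"}},'
    ' {"model": "documents.tag", "pk": 1, "fields": {"name": "invoice"}}]'
)
print(conflicting_usernames(sample, ["user", "admin"]))
```

An empty result would suggest the auth_user.username constraint is not the problem; a non-empty one lists the accounts that clash.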

Edited by Ynitxap
Link to comment
  • 2 weeks later...

Hi guys, I have set up Paperless on Unraid. I used header auth with Authentik. It is now possible to use proper SSO with Paperless and Authentik. I watched https://youtu.be/xO-EVYNinXA?si=nJbh3qQ338oMFTuP  and noticed I needed 2 environment variables:

 

PAPERLESS_APPS: "allauth.socialaccount.providers.openid_connect"

and

 

PAPERLESS_SOCIALACCOUNT_PROVIDERS: '{"openid_connect": {"APPS": [{"provider_id": "authentik","name": "Authentik SSO","client_id": "XX Client ID from Authentika XX","secret": "XX Client Secret from Authentika XX","settings": { "server_url": "https://auth.xyz.com/application/o/paperless-ngx/.well-known/openid-configuration"}}]}}'

 

so I added the following to the template and removed the previous variables related to headers.

As soon as I start the Paperless container, it stops again. If I remove the variables, it runs again. If I remove the single quotes from PAPERLESS_SOCIALACCOUNT_PROVIDERS, the container starts but SSO does not work. What am I missing here? Can someone with a similar setup help me out?
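
One possible explanation for the container stopping, offered as a guess: if the Unraid template passes the value field to the container literally, the surrounding single quotes become part of the value, and it is then no longer valid JSON. The sketch below only demonstrates that effect with Python's json module. Whether Paperless parses the variable exactly this way is an assumption, and the shortened JSON is a placeholder, not a complete provider configuration.

```python
import json


def parse_providers(raw_value):
    """Return the parsed JSON object, or None if the value is not valid JSON."""
    try:
        return json.loads(raw_value)
    except json.JSONDecodeError:
        return None


# Shortened placeholder value.
plain = '{"openid_connect": {"APPS": [{"provider_id": "authentik"}]}}'
quoted = "'" + plain + "'"  # same value with literal single quotes around it

print(parse_providers(plain) is not None)   # valid JSON
print(parse_providers(quoted) is not None)  # the leading quote breaks parsing
```

By the same reasoning, literal double quotes around the PAPERLESS_APPS value might end up inside the app name, which could be worth checking when the container starts but SSO does not work.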

 

[Screenshot]

 

Edit: It would help if someone with the same setup could share their Unraid template screenshots. I have doubts about whether I have the right environment variables. But looking at the responses, I must be the only one running Paperless on Unraid in combination with Authentik.

Edited by MobileDude
No response yet
Link to comment

n00b to paperless-ngx here 🙋‍♂️ ... I'm hoping someone can point me in the right direction to determine why the docker instance is consuming 10TB of the Unraid docker image and maxing out the image file. I have all user-customizable fields pointed to places outside the docker image, but I must be overlooking something. I don't have Tika or Gotenberg installed; just Redis. The GUI launches and it seems to be working OK, just running out of docker image space. Thank you.

[Screenshot]

Link to comment
On 1/1/2024 at 7:44 PM, kaosdll said:

Hi there,

 

I am facing some issues with Gotenberg & Tika as well.

Docker:

[Screenshot]

 

Paperless:

[Screenshot]

I tried using the IP address, too (http://192.168.0.223:9998/ and :3000/)

 

When I upload a docx I get:

[Screenshot]

documents.parsers.ParseError: Could not parse /tmp/paperless/paperless-ngxhc_f5mdb/demo.docx with tika server at http://localhost:9998: [Errno 111] Connection refused

 

 

It looks like Paperless always tries to open localhost instead of the Docker container name or the IP address...

 

 

What is wrong?

 

Thank you and a happy new year.

Br Chris

 

I tried today to get my Paperless running with Tika/Gotenberg, too...

Every time it is the same: Paperless tries to open http://localhost:9998 instead of http://192.186.0.223:9998 or http://Apache-Tika-Server:9998 (Docker container name)

 

I am running out of ideas, my config looks ok - I set PAPERLESS_TIKA_ENDPOINT and PAPERLESS_TIKA_ENABLED...

 

Suggestions?
Thank you.
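
A quick way to narrow this down, sketched under assumptions: the Paperless-ngx configuration docs give http://localhost:9998 as the default for PAPERLESS_TIKA_ENDPOINT (and http://localhost:3000 for PAPERLESS_TIKA_GOTENBERG_ENDPOINT), so an error that mentions localhost suggests the variable is not reaching the container at all, for example because of a mistyped key in the template. The helper below is hypothetical and just reports what a given environment resolves to.

```python
import os

# Defaults as stated in the Paperless-ngx configuration docs.
TIKA_DEFAULTS = {
    "PAPERLESS_TIKA_ENDPOINT": "http://localhost:9998",
    "PAPERLESS_TIKA_GOTENBERG_ENDPOINT": "http://localhost:3000",
}


def effective_endpoints(env):
    """Map each Tika-related variable to (value, came_from_env)."""
    return {
        key: (env.get(key, default), key in env)
        for key, default in TIKA_DEFAULTS.items()
    }


# Run inside the Paperless container to see what it actually received.
for key, (value, is_set) in effective_endpoints(os.environ).items():
    print(f"{key} = {value} ({'set' if is_set else 'not set, default'})")
```

If a variable shows up as "not set" even though it is in the template, the key name in the template is the first thing to compare against the docs.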

Link to comment
  • 1 month later...

I've installed Paperless and run into a problem: my files get stuck in consume. They are added to the queue, but not all of them are processed when I add them all at once. Setting PAPERLESS_CONSUMER_POLLING and PAPERLESS_CONSUMER_POLLING_DELAY to 10 seconds each seems to help a bit, but not entirely.

 

Some files are stuck forever in the consume folder and nothing happens to them. Has anyone else encountered this behaviour?

 

[2024-04-03 23:14:24,083] [INFO] [paperless.management.consumer] Polling directory for changes: /usr/src/paperless/consume

[2024-04-03 23:14:44,087] [DEBUG] [paperless.management.consumer] Waiting for file /usr/src/paperless/consume/Bestätigung der Vertragsbeendigung.pdf to remain unmodified

[2024-04-03 23:14:49,094] [INFO] [paperless.management.consumer] Adding /usr/src/paperless/consume/Bestätigung der Vertragsbeendigung.pdf to the task queue.

Link to comment

Hello, I installed Paperless NGX under Unraid. Now I get an import error when importing certain PDFs. They are probably signed PDFs. What can I change?

 

Gutschrift-GVG521754.pdf: Error occurred while consuming document Gutschrift-GVG521754.pdf: DigitalSignatureError: Input PDF has a digital signature. OCR would alter the document, invalidating the signature.
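
One thing that may be worth trying, with caveats: OCRmyPDF has an invalidate_digital_signatures option, and Paperless can pass extra OCRmyPDF arguments as JSON through PAPERLESS_OCR_USER_ARGS. Whether the OCRmyPDF version bundled in your image supports the option is an assumption, and using it means the OCRed archive copy no longer carries a valid signature. The snippet just builds the JSON value for the variable.

```python
import json

# Assumption: the bundled OCRmyPDF accepts this keyword argument.
ocr_user_args = {"invalidate_digital_signatures": True}

# Value for the PAPERLESS_OCR_USER_ARGS variable (no surrounding quotes).
value = json.dumps(ocr_user_args)
print(f"PAPERLESS_OCR_USER_ARGS={value}")
```

On Unraid the part after the equals sign would go into the Value field of a new variable with the key PAPERLESS_OCR_USER_ARGS.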

Link to comment
On 7/15/2022 at 11:19 AM, Tredaptive said:

Hello guys,
I hope someone can help me with my problem; I am relatively new to Docker containers.
I could not find anything online that helps me solve it.
Paperless-ngx always fills my docker.img instead of saving to the array. Currently it is 96% full and I don't know why. All files are at most 1.2 GB in total. I have already enlarged the docker.img to 40 GB in the Docker settings.

 

Here you can see the memory usage is ok, but the docker usage is nearly at the max.

[Screenshot: usage]

 

[Screenshot: Docker settings]

 

These are the template settings I'm using for storage (do I have them wrong?)

[Screenshot: Paperless template]

 

Update:

I think I found part of the problem. I removed and reinstalled everything and uploaded individual files as a test. The default allocation of the docker.img was about 8.26 GB. While uploading, it fluctuated between 8.26 and 8.27 GB.
I then uploaded a 481 MB PDF file, and usage slowly but steadily increased until it stopped at 20 GB. That one file is responsible for a little over 11 GB, even though the file itself is only 481 MB.

On the Paperless interface, the file remains in the uploader with "Processing Document" without anything else happening....

Did you solve your problem? I have everything on the array, but my docker.img is totally filled up by Paperless. It is currently at 13 GB, and I just can't figure out why. My setup is like yours. Thoughts?

 

 

 

Link to comment
On 2/23/2024 at 10:16 AM, MobileDude said:

Hi guys, I have set up Paperless on Unraid. I used header auth with Authentik. It is now possible to use proper SSO with Paperless and Authentik.

 

Hey, did you find a solution for Paperless in combination with Authentik?

Link to comment
1 hour ago, Mika said:

 

Hey, did you find a solution for Paperless in combination with Authentik?

@Mika @MobileDude I have a working setup:

 

PAPERLESS_APPS=allauth.socialaccount.providers.openid_connect
PAPERLESS_SOCIALACCOUNT_PROVIDERS='{"openid_connect": {"APPS": [{"provider_id": "authentik", "name": "Authentik", "client_id": "YOUR_CLIENT_ID", "secret": "YOUR_CLIENT_SECRET", "settings":{ "server_url": "https://auth.domain.tld/application/o/paperless/.well-known/openid-configuration"}}], "OAUTH_PKCE_ENABLED": "True"}}'


Be sure to name the Authentik Outpost "paperless". In addition, you can set the following variables, once the basics work:

 

# hide the default login form
PAPERLESS_DISABLE_REGULAR_LOGIN=true

# sign up new users from authentik automatically
PAPERLESS_SOCIAL_AUTO_SIGNUP=true

# redirect to authentik after logout
PAPERLESS_LOGOUT_REDIRECT_URL=https://auth.domain.tld/application/o/paperless/end-session/logout

# trust authentik to provide valid email addresses
PAPERLESS_ACCOUNT_EMAIL_VERIFICATION=none

 

I hope this helps :)

Edited by Tuetenk0pp
Link to comment

I use Paperless's document_exporter command to manually back up everything to a specified folder with no problem. How can I automate this command to run weekly?

My end goal is automated backup starting with Paperless -> Duplicacy -> Cloudstorage

OS: Unraid 6.12.10, Paperless-ngx within a docker container.
Exporter command https://paperless.readthedocs.io/en/latest/utilities.html#utilities-exporter

Reply to my thread if you know how I can do this please! 
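
A rough sketch of one way to do it. It assumes the container is named paperless-ngx (check your Docker tab), that docker is callable from wherever the script runs, and that Python 3 is available there (as far as I know Unraid does not ship it by default, so the same idea, docker exec plus the exporter, could equally go into a shell script). The function names are made up.

```python
import subprocess


def build_export_command(container="paperless-ngx", target="../export"):
    """Build the docker exec call that runs document_exporter in the container."""
    return ["docker", "exec", container, "document_exporter", target]


def run_export(container="paperless-ngx", target="../export"):
    """Run the export and raise CalledProcessError if it fails."""
    subprocess.run(build_export_command(container, target), check=True)


# Show the command that would be executed; call run_export() from the
# script you actually schedule.
print(" ".join(build_export_command()))
```

Scheduling is then a matter of running the script weekly, for example with a cron expression like `0 3 * * 0` (Sundays at 03:00) in whatever scheduler you use, such as the User Scripts plugin. Duplicacy can then pick up the export folder.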

 

Link to comment
