[Support] Paperless-ng Docker



I currently have FILENAME_FORMAT set as: {create_year}/{correspondent}/{title}.  This works well for me but not for my better half ;-)

 

My wife wants all her scans kept in a separate folder (different from the one I'm using), stored under her name.  If all of her PDFs are prefixed with her name (e.g. wife_document1.pdf), how can I extract that filename prefix and use it as the folder her documents are stored under?

 

In summary,

My documents get stored at: /usr/src/paperless/media/{create_year}/{correspondent}/{title}. 

My wife would like to have everything under: /usr/src/paperless/media/wife_name/{create_year}/{correspondent}/{title}.

 

How do I accomplish this?
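To make the intended mapping concrete, here is a minimal Python sketch of the prefix-to-folder logic described above. It is not a paperless-ng feature or setting; the function names `split_owner_prefix` and `storage_path`, the `_` separator, and the set of known owners are all assumptions made up for illustration.

```python
from pathlib import PurePosixPath


def split_owner_prefix(filename, known_owners, separator="_"):
    """Return (owner, remaining_name); owner is None if no known prefix matches.

    Hypothetical helper: only prefixes listed in known_owners count as owners,
    so a file such as "my_file.pdf" is not mistaken for an owner prefix.
    """
    prefix, sep, rest = filename.partition(separator)
    if sep and rest and prefix.lower() in known_owners:
        return prefix.lower(), rest
    return None, filename


def storage_path(filename, year, correspondent, known_owners):
    """Build year/correspondent/title, nested under the owner when one is found."""
    owner, name = split_owner_prefix(filename, known_owners)
    title = PurePosixPath(name).stem  # drop the file extension
    parts = [str(year), correspondent, title]
    if owner is not None:
        parts.insert(0, owner)
    return str(PurePosixPath(*parts))


# Example values are illustrative only.
with_prefix = storage_path("wife_document1.pdf", 2021, "bank", {"wife"})
without_prefix = storage_path("document2.pdf", 2021, "bank", {"wife"})
```

With these inputs, a prefixed file lands under the owner folder and an unprefixed file keeps the existing layout.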

Link to comment
On 4/12/2021 at 1:30 PM, CvT said:

Got paperless-ng up and running with Tika and Gotenberg, really nice!

 

Does anybody know if you can use parts of the document's filename to tag?

 

Would you mind sharing how you got Tika & Gotenberg installed/configured?

 

Also, did you figure out how to use filename for tagging?

 

Thanks.

Link to comment
On 5/20/2021 at 6:03 PM, luk said:

Hi all,

 

How can I do a full backup of Paperless-ng? If I ever have to replace the hardware or hard disk, I would like to avoid losing the data I have stored so far.

 

Thank you,

Luk

Just back up the mounted folder in your appdata or use the CA backup plugin.
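For the "any other tool" route, a minimal Python sketch of archiving a mounted folder might look like the following. The function name `backup_directory` and the example paths in the comment are assumptions, not paths confirmed in this thread; stopping the container first is a cautious assumption so the files are not changing while they are archived.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_directory(source_dir, backup_root):
    """Write a timestamped .tar.gz of source_dir into backup_root and return its path."""
    source = Path(source_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"not a directory: {source}")
    target_root = Path(backup_root).resolve()
    target_root.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base_name = target_root / f"{source.name}-{stamp}"
    # root_dir/base_dir keep the folder name as the top-level entry in the archive.
    archive = shutil.make_archive(
        str(base_name), "gztar", root_dir=str(source.parent), base_dir=source.name
    )
    return Path(archive)


# Hypothetical usage (paths are placeholders, adjust to your own mappings):
# backup_directory("/mnt/user/appdata/paperless-ng", "/mnt/user/backups")
```

The same call would need to be repeated for each mapped folder (for example media and data) if they live in different places.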

Link to comment
On 5/22/2021 at 10:22 AM, mattie112 said:

That's not another question, that is the same question.

 

Either use CA backup or back up the directories listed above with any other tool.

Ok, got it finally, thank you :)

Edited by luk
Link to comment

I set it up this weekend and am having a small issue.  I can manually add files just fine.  However, the consumption watcher never picks up the directory I'm using.

I set the directory to a share on my server so I can export from my scanner, etc.

[Screenshot 2021-05-24 at 2:16:12 PM]

 

Looking at the logs, it seems to be watching a different directory for changes:

[Screenshot 2021-05-24 at 2:17:44 PM]

 

I'm sure I made a simple mistake but I think I've looked at it so long now I'm overlooking it.  
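One thing worth ruling out, as a sketch only since the screenshots are not available here: a logged path that contains `..` can look different from the mapped container path while pointing at the same directory. The example path below is the form that appears in logs later in this thread; the helper name `same_container_path` is made up for illustration.

```python
import posixpath


def same_container_path(logged_path, mapped_path):
    """Compare two container paths after collapsing '.' and '..' segments.

    This is a purely lexical comparison; it does not follow symlinks.
    """
    return posixpath.normpath(logged_path) == posixpath.normpath(mapped_path)


# Path form taken from consumer log lines posted later in this thread.
resolved = posixpath.normpath("/usr/src/paperless/src/../consume")
matches = same_container_path(
    "/usr/src/paperless/src/../consume", "/usr/src/paperless/consume"
)
```

If the normalized path does not match the container path in the docker mapping, the watcher really is looking somewhere else.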

 

 

Link to comment

How do I change the log level to INFO?  Currently, I see DEBUG messages in paperless.log.

I'm trying to minimize disk writes.  In the paperless-ng documentation, I only see two log-related variables (PAPERLESS_LOGROTATE_MAX_SIZE, PAPERLESS_LOGROTATE_MAX_BACKUPS).  There's nothing to change the log level.

Any idea how I can change the log level?

 

Thanks.
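For context on what a log level change would do, here is a small sketch using only Python's standard `logging` module. It is generic Python, not a paperless-ng setting; the logger name and the `build_logger` helper are assumptions for illustration. A handler set to INFO simply drops DEBUG records before they are written.

```python
import io
import logging


def build_logger(stream, level=logging.INFO):
    """Create a demo logger whose handler only emits records at or above `level`."""
    logger = logging.getLogger("demo.paperless_like")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)  # the logger itself accepts everything
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setLevel(level)  # the handler filters what actually gets written
    handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.debug("Detected mime type: application/pdf")  # filtered out at INFO
log.info("Consuming scan.pdf")                    # written
output = buffer.getvalue()
```

Only the INFO line reaches the stream, which is the effect being asked for in the log file.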

Link to comment
  • 2 weeks later...

My Paperless was working well for a while but my mail fetching doesn't seem to work anymore.  When I manually add files by drag and drop or browse files it all works fine.

 

When I run the mail fetcher I get this in the paperless.log:

[screenshot: paperless.log output]

This is also shown in the docker log:

[screenshot: docker log output]

The mail fetching side of things seems to be working fine:

[screenshot]

 

I have 777 permissions on my three mounted Media, Consume and Export directories and as I said - manual uploads work fine. Has anyone had any issues with mail fetching lately?

 

 

 

Edited by nug
redaction of file names, oops.
Link to comment

I have no issues with the latest version (no update available). Does this happen with all mails? Or only a certain document?

Link to comment
4 hours ago, mattie112 said:

 

I have no issues with the latest version (no update available). Does this happen with all mails? Or only a certain document?

 

Every email and on both accounts that I have fetching. 

Edited by nug
Link to comment
  • 1 month later...
On 3/20/2021 at 3:36 PM, Shad0wWulf said:

 

For everyone using Bitnami Redis with a password the answer to this is to use this connection string:
 

redis://default:[PASSWORD]@[IP]:6379

If, like me, anyone else gets an error saying invalid combination or username disabled,
try leaving the username blank; that's what made mine work:

redis://:password@redis:6379
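To see how the two connection strings above differ, this sketch splits them with Python's standard `urllib.parse`. The host, username, and password values are placeholders, and the helper name `describe_redis_url` is an assumption. How a given Redis client treats an empty username is also an assumption here; the expectation is that it falls back to password-only authentication.

```python
from urllib.parse import urlsplit


def describe_redis_url(url):
    """Split a redis:// URL into the parts a client would use to connect."""
    parts = urlsplit(url)
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }


# Placeholder credentials and hosts, for illustration only.
with_user = describe_redis_url("redis://default:secret@192.0.2.10:6379")
blank_user = describe_redis_url("redis://:secret@redis:6379")
```

In the second form the username comes back as an empty string while the password is still present, which matches the "leave the username blank" fix.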

Edited by strongy
Link to comment
  • 2 weeks later...

Hi there folks, here's another user with some configuration problems.

 

Here are the steps I've completed so far:

 

1) Redis docker installation with the default configuration. No password has been configured, just a plain installation:

[screenshot]

2) Paperless-ng installation. The folders have been configured, and here comes the problem: the Redis IP config (and maybe the consumption folder). I'm trying to follow the same workflow explained by the OP, that is, taking documents from the printer's SMB folder and processing them. Here is my docker config. I must say I've tried localhost, 127.0.0.1, the server name and so on, and nothing seems to work.

[screenshot]

 

3) Finally, paperless-ng logs:

 

[2021-08-19 10:44:31,907] [INFO] [paperless.management.consumer] Adding /usr/src/paperless/src/../consume/scan.jpg to the task queue.
[2021-08-19 10:44:31,914] [ERROR] [paperless.management.consumer] Error while consuming document
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 559, in connect
    sock = self._connect()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 615, in _connect
    raise err
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 603, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/paperless/src/documents/management/commands/document_consumer.py", line 76, in _consume
    task_name=os.path.basename(filepath)[:100])
  File "/usr/local/lib/python3.7/site-packages/django_q/tasks.py", line 73, in async_task
    enqueue_id = broker.enqueue(pack)
  File "/usr/local/lib/python3.7/site-packages/django_q/brokers/redis_broker.py", line 18, in enqueue
    return self.connection.rpush(self.list_key, task)
  File "/usr/local/lib/python3.7/site-packages/redis/client.py", line 2016, in rpush
    return self.execute_command('RPUSH', name, *values)
  File "/usr/local/lib/python3.7/site-packages/redis/client.py", line 898, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 1192, in get_connection
    connection.connect()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 563, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 127.0.0.1:6379. Connection refused.
[2021-08-19 10:44:31,919] [INFO] [paperless.management.consumer] Adding /usr/src/paperless/src/../consume/scan.pdf to the task queue.
[2021-08-19 10:44:31,924] [ERROR] [paperless.management.consumer] Error while consuming document
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 559, in connect
    sock = self._connect()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 615, in _connect
    raise err
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 603, in _connect
    sock.connect(socket_address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/src/paperless/src/documents/management/commands/document_consumer.py", line 76, in _consume
    task_name=os.path.basename(filepath)[:100])
  File "/usr/local/lib/python3.7/site-packages/django_q/tasks.py", line 73, in async_task
    enqueue_id = broker.enqueue(pack)
  File "/usr/local/lib/python3.7/site-packages/django_q/brokers/redis_broker.py", line 18, in enqueue
    return self.connection.rpush(self.list_key, task)
  File "/usr/local/lib/python3.7/site-packages/redis/client.py", line 2016, in rpush
    return self.execute_command('RPUSH', name, *values)
  File "/usr/local/lib/python3.7/site-packages/redis/client.py", line 898, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 1192, in get_connection
    connection.connect()
  File "/usr/local/lib/python3.7/site-packages/redis/connection.py", line 563, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to 127.0.0.1:6379. Connection refused.
[2021-08-19 10:44:31,926] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/src/../consume/.HPIMAGE.VFS: Unknown file extension.
[2021-08-19 10:44:31,928] [INFO] [paperless.management.consumer] Using inotify to watch directory for changes: /usr/src/paperless/src/../consume

 

 

Thank you!

Link to comment
39 minutes ago, T0a said:

What does the log tell you when you use your server IP address i.e. `redis://XXX.XXX.XXX.XX:6379`. I think `127.0.0.1` will not work, because it tries to resolve Redis in the paperless-ng container then. Keep me posted.

 

Great! I changed the IP to 192.168.x.x and the network issue seems to be solved.
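For anyone else hitting the same "Connection refused", a quick way to check whether a given host and port are reachable is a plain TCP connect, sketched below with Python's standard `socket` module. The helper name `can_connect` is an assumption, and it would need to be run from inside the paperless-ng container to reflect what the container actually sees.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage from inside the container:
# can_connect("127.0.0.1", 6379)    # the container's own loopback
# can_connect("192.168.x.x", 6379)  # replace with the server's LAN address
```

This only tests reachability; it does not authenticate or speak the Redis protocol.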

 

Now a new error appears:

 

[2021-08-19 11:37:06,938] [INFO] [paperless.management.consumer] Adding /usr/src/paperless/src/../consume/scan.jpg to the task queue.
[2021-08-19 11:37:06,944] [INFO] [paperless.management.consumer] Adding /usr/src/paperless/src/../consume/scan.pdf to the task queue.
[2021-08-19 11:37:06,948] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/src/../consume/.HPIMAGE.VFS: Unknown file extension.
[2021-08-19 11:37:06,950] [INFO] [paperless.management.consumer] Using inotify to watch directory for changes: /usr/src/paperless/src/../consume
[2021-08-19 11:37:10,173] [INFO] [paperless.consumer] Consuming scan.pdf
[2021-08-19 11:37:15,214] [DEBUG] [paperless.consumer] Detected mime type: application/pdf
[2021-08-19 11:37:15,314] [DEBUG] [paperless.consumer] Parser: RasterisedDocumentParser
[2021-08-19 11:37:15,323] [DEBUG] [paperless.consumer] Parsing scan.pdf...
[2021-08-19 11:37:16,592] [INFO] [paperless.consumer] Consuming scan.jpg
[2021-08-19 11:37:20,145] [DEBUG] [paperless.consumer] Detected mime type: image/jpeg
[2021-08-19 11:37:20,162] [DEBUG] [paperless.consumer] Parser: RasterisedDocumentParser
[2021-08-19 11:37:20,170] [DEBUG] [paperless.consumer] Parsing scan.jpg...
[2021-08-19 11:37:20,248] [WARNING] [paperless.parsing.tesseract] Error while getting text from PDF document with pdfminer.six
Traceback (most recent call last):
  File "/usr/src/paperless/src/paperless_tesseract/parsers.py", line 120, in extract_text
    stripped = post_process_text(pdfminer_extract_text(pdf_file))
  File "/usr/local/lib/python3.7/site-packages/pdfminer/high_level.py", line 119, in extract_text
    caching=caching,
  File "/usr/local/lib/python3.7/site-packages/pdfminer/pdfpage.py", line 128, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)
  File "/usr/local/lib/python3.7/site-packages/pdfminer/pdfdocument.py", line 572, in __init__
    self.read_xref_from(parser, pos, self.xrefs)
  File "/usr/local/lib/python3.7/site-packages/pdfminer/pdfdocument.py", line 806, in read_xref_from
    (pos, token) = parser.nexttoken()
  File "/usr/local/lib/python3.7/site-packages/pdfminer/psparser.py", line 493, in nexttoken
    self.fillbuf()
  File "/usr/local/lib/python3.7/site-packages/pdfminer/psparser.py", line 219, in fillbuf
    self.buf = self.fp.read(self.BUFSIZ)
PermissionError: [Errno 13] Permission denied
[2021-08-19 11:37:20,763] [DEBUG] [paperless.parsing.tesseract] Calling OCRmyPDF with args: {'input_file': '/usr/src/paperless/src/../consume/scan.pdf', 'output_file': '/tmp/paperless/paperless-rhgt24hs/archive.pdf', 'use_threads': True, 'jobs': 3, 'language': 'spa+eng', 'output_type': 'pdfa', 'progress_bar': False, 'skip_text': True, 'clean': True, 'deskew': True, 'rotate_pages': True, 'rotate_pages_threshold': 12.0, 'sidecar': '/tmp/paperless/paperless-rhgt24hs/sidecar.txt'}

 

Thank you @T0a!

 

I forgot to say that I'm using the Unassigned Devices plugin to mount the SMB folder.
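Since the `PermissionError: [Errno 13]` in the log is raised while reading the scanned file, one thing to check is whether the user the container runs as can read the files in the consume folder. The sketch below lists files that are not readable by the current user; the helper name `unreadable_files` is an assumption, and the result is only meaningful when run inside the container as the same user as paperless-ng.

```python
import os


def unreadable_files(directory):
    """Return the regular files in `directory` that the current user cannot read."""
    blocked = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.isfile(path) and not os.access(path, os.R_OK):
            blocked.append(path)
    return blocked


# Hypothetical usage inside the container (path taken from the log, normalized):
# print(unreadable_files("/usr/src/paperless/consume"))
```

An empty list would point away from simple file permissions and toward the share mount options instead, which is a guess rather than a confirmed diagnosis.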

Edited by lgb
Adding some info
Link to comment
