
Paperless-ngx - integrating Tika and Gotenberg


tiobane
Solved by alturismo

Hello,

I'm still fairly inexperienced and have set up paperless-ngx.

It runs so far, Redis worked too, and I also have Tika and Gotenberg running, but Paperless does not seem to be using them:

 

[2024-02-25 02:23:51,190] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Kündigung Rentenversicherung.docx: Unknown file extension.

 

This is what I have added to the container so far:

image.thumb.png.76f7da649b4c38dfc99d96b56a8ca779.png

 

The boolean setting looks like this:

image.png.60d96434d251d4363b7cd952021b3ec2.png

 

The containers are running as well:

image.png.4f2ac779bb71ce3952f43b1afbfada69.png

 

 

Where is my mistake here?

I'd appreciate any help 😃

 

 

  • Solution
2 hours ago, tiobane said:

Where is my mistake here?

 

I don't use it myself, but from what I can see, you gave the variable a name but did not fill in the key.

 

image.png.29f45df7eb924997e2419b7657323ab6.png

 

The name does not matter; the key is essential. Simply copy the name into the Key field and take it from there. The same may apply to the endpoints ...?
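One way to confirm that the keys actually reached the container is to check its environment from inside, for example from the container console. The following is only a minimal sketch: it assumes the three variable names from the paperless-ngx configuration documentation (`PAPERLESS_TIKA_ENABLED`, `PAPERLESS_TIKA_ENDPOINT`, `PAPERLESS_TIKA_GOTENBERG_ENDPOINT`); the helper `missing_keys` is a made-up name for illustration.

```python
import os

# Variable names as documented by paperless-ngx (assumed to match your version).
REQUIRED_KEYS = (
    "PAPERLESS_TIKA_ENABLED",
    "PAPERLESS_TIKA_ENDPOINT",
    "PAPERLESS_TIKA_GOTENBERG_ENDPOINT",
)


def missing_keys(env):
    """Return the required keys that are absent or empty in the given mapping."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        # A variable that only has a display name in the template but no key
        # would show up here, because it never becomes an environment variable.
        print("Not set inside the container:", ", ".join(missing))
    else:
        for key in REQUIRED_KEYS:
            print(f"{key}={os.environ[key]}")
```

If a variable is listed as not set even though it appears in the template, the Key field is the first place to look.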

  • 4 weeks later...

In my case, doc and eml files are not recognized.

I installed Paperless, Tika and Gotenberg on Unraid and then created the variables in Paperless. I don't see any error message. It seems to be running. The Paperless logs contain no mention of Tika or Gotenberg; I have no idea whether that is good or bad.

What could be the problem here? Where else should I look?
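A possible next check is whether the Paperless container can reach both services at all. The sketch below is hedged: it assumes Tika Server answers on `/version` and Gotenberg on `/health`, and it falls back to the default endpoints from the paperless-ngx documentation (`http://localhost:9998` and `http://localhost:3000`) when the variables are not set. The function names are hypothetical.

```python
import os
from urllib.error import URLError
from urllib.request import urlopen


def probe_urls(tika_endpoint, gotenberg_endpoint):
    """Build one status URL per service (paths are assumptions, see above)."""
    return {
        "tika": tika_endpoint.rstrip("/") + "/version",
        "gotenberg": gotenberg_endpoint.rstrip("/") + "/health",
    }


def is_reachable(url, timeout=5.0):
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, OSError, ValueError):
        # Covers HTTP errors, refused connections, timeouts and malformed URLs.
        return False


if __name__ == "__main__":
    urls = probe_urls(
        os.environ.get("PAPERLESS_TIKA_ENDPOINT", "http://localhost:9998"),
        os.environ.get("PAPERLESS_TIKA_GOTENBERG_ENDPOINT", "http://localhost:3000"),
    )
    for name, url in urls.items():
        state = "reachable" if is_reachable(url) else "NOT reachable"
        print(f"{name}: {url} -> {state}")
```

Run from inside the Paperless container, this tests the same network path Paperless itself would use.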

 

The Paperless log from startup...

[2024-03-24 21:15:54,918] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Einlieferungsbeleg_OL.msg: Unknown file extension.
[2024-03-24 21:15:54,919] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Anschreiben.odt: Unknown file extension.
[2024-03-24 21:15:54,921] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Einlieferungsbeleg.eml: Unknown file extension.
[2024-03-24 21:15:54,922] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Whg_ÜN.docx: Unknown file extension.
[2024-03-24 21:15:54,923] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Whg_ÜN_bearb.xopp: Unknown file extension.
[2024-03-24 21:15:54,924] [INFO] [paperless.management.consumer] Using inotify to watch directory for changes: /usr/src/paperless/consume
[2024-03-24 21:15:55 +0100] [183] [INFO] Starting gunicorn 21.2.0
[2024-03-24 21:15:55 +0100] [183] [INFO] Listening at: http://[::]:8000 (183)
[2024-03-24 21:15:55 +0100] [183] [INFO] Using worker: paperless.workers.ConfigurableWorker
[2024-03-24 21:15:55 +0100] [183] [INFO] Server is ready. Spawning workers
[2024-03-24 21:15:55,301] [INFO] [celery.beat] beat: Starting...
[2024-03-24 21:15:55,330] [INFO] [celery.worker.consumer.connection] Connected to redis://192.168.21.211:6379//
[2024-03-24 21:15:55,351] [INFO] [celery.apps.worker] celery@8f92659f22fe ready.
[2024-03-24 21:20:00,008] [INFO] [celery.beat] Scheduler: Sending due task Check all e-mail accounts (paperless_mail.tasks.process_mail_accounts)
[2024-03-24 21:20:00,017] [INFO] [celery.worker.strategy] Task paperless_mail.tasks.process_mail_accounts[296783b9-e1ad-44e5-badf-9a32d6d4fdee] received
[2024-03-24 21:20:00,631] [INFO] [celery.app.trace] Task paperless_mail.tasks.process_mail_accounts[296783b9-e1ad-44e5-badf-9a32d6d4fdee] succeeded in 0.6110591730102897s: 'No new documents were added.'
[2024-03-24 21:30:00,000] [INFO] [celery.beat] Scheduler: Sending due task Check all e-mail accounts (paperless_mail.tasks.process_mail_accounts)
[2024-03-24 21:30:00,005] [INFO] [celery.worker.strategy] Task paperless_mail.tasks.process_mail_accounts[49df54d4-8b42-450d-9be9-a3eebbb20cc1] received
[2024-03-24 21:30:00,586] [INFO] [celery.app.trace] Task paperless_mail.tasks.process_mail_accounts[49df54d4-8b42-450d-9be9-a3eebbb20cc1] succeeded in 0.5769838010019157s: 'No new documents were added.'
Installing languages...
Hit:1 http://deb.debian.org/debian bookworm InRelease
Hit:2 http://deb.debian.org/debian bookworm-updates InRelease
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
Fetched 48.0 kB in 0s (144 kB/s)
Reading package lists...
Package tesseract-ocr-deu already installed!
Creating directory scratch directory /tmp/paperless
Adjusting permissions of paperless files. This may take a while.
Waiting for Redis...
Connected to Redis broker.
Apply database migrations...
Operations to perform:
  Apply all migrations: account, admin, auth, authtoken, contenttypes, django_celery_results, documents, guardian, paperless, paperless_mail, sessions, socialaccount
Running migrations:
  No migrations to apply.
Running Django checks
System check identified no issues (0 silenced).
Executing /usr/local/bin/paperless_cmd.sh
2024-03-24 13:15:50,433 INFO Set uid to user 0 succeeded
2024-03-24 13:15:50,434 INFO supervisord started with pid 1
2024-03-24 13:15:51,441 INFO spawned: 'gunicorn' with pid 183
2024-03-24 13:15:51,444 INFO spawned: 'celery' with pid 184
2024-03-24 13:15:51,446 INFO spawned: 'celery-beat' with pid 185
2024-03-24 13:15:51,449 INFO spawned: 'consumer' with pid 186
2024-03-24 13:15:51,451 INFO spawned: 'celery-flower' with pid 187
Checking if we should start flower...
Not starting flower
2024-03-24 13:15:51,474 INFO success: celery-flower entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2024-03-24 13:15:51,475 INFO exited: celery-flower (exit status 0; expected)
2024-03-24 13:15:52,476 INFO success: gunicorn entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-03-24 13:15:52,476 INFO success: celery entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-03-24 13:15:52,476 INFO success: celery-beat entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-03-24 13:15:52,477 INFO success: consumer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
celery beat v5.3.6 (emerald-rush) is starting.
 
 -------------- celery@8f92659f22fe v5.3.6 (emerald-rush)
--- ***** ----- 
-- ******* ---- Linux-6.1.74-Unraid-x86_64-with-glibc2.36 2024-03-24 21:15:55
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         paperless:0x152d4c733250
- ** ---------- .> transport:   redis://192.168.21.211:6379//
- ** ---------- .> results:     
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- .> task events: ON
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery
                

[tasks]
  . documents.tasks.bulk_update_documents
  . documents.tasks.consume_file
  . documents.tasks.index_optimize
  . documents.tasks.sanity_check
  . documents.tasks.train_classifier
  . documents.tasks.update_document_archive_file
  . paperless_mail.mail.apply_mail_action
  . paperless_mail.mail.error_callback
  . paperless_mail.tasks.process_mail_accounts

__    -    ... __   -        _
LocalTime -> 2024-03-24 21:15:55
Configuration ->
    . broker -> redis://192.168.21.211:6379//
    . loader -> celery.loaders.app.AppLoader
    . scheduler -> celery.beat.PersistentScheduler
    . db -> /usr/src/paperless/data/celerybeat-schedule.db
    . logfile -> [stderr]@%INFO
    . maxinterval -> 5.00 minutes (300s)

 

The Paperless log...

Quote

[2024-03-24 21:41:16,259] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Einlieferungsbeleg_OL.msg: Unknown file extension.

[2024-03-24 21:41:16,260] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Anschreiben.odt: Unknown file extension.

[2024-03-24 21:41:16,262] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Einlieferungsbeleg.eml: Unknown file extension.

[2024-03-24 21:41:16,263] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Whg_ÜN.docx: Unknown file extension.

[2024-03-24 21:41:16,265] [WARNING] [paperless.management.consumer] Not consuming file /usr/src/paperless/consume/Whg_ÜN_bearb.xopp: Unknown file extension.

[2024-03-24 21:41:16,267] [INFO] [paperless.management.consumer] Using inotify to watch directory for changes: /usr/src/paperless/consume
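To get a quick overview of which extensions the consumer rejected, the warnings can be tallied with a few lines of Python. This is only a sketch: the pattern is derived from the log lines quoted above, and `skipped_extensions` is a hypothetical helper name.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Matches the consumer warning exactly as it appears in the quoted log.
SKIP_PATTERN = re.compile(
    r"Not consuming file (?P<path>.+): Unknown file extension\."
)

# Two lines copied from the log above, used as sample input.
SAMPLE = [
    "[2024-03-24 21:41:16,259] [WARNING] [paperless.management.consumer] "
    "Not consuming file /usr/src/paperless/consume/Einlieferungsbeleg_OL.msg: "
    "Unknown file extension.",
    "[2024-03-24 21:41:16,263] [WARNING] [paperless.management.consumer] "
    "Not consuming file /usr/src/paperless/consume/Whg_ÜN.docx: "
    "Unknown file extension.",
]


def skipped_extensions(log_lines):
    """Count the file extensions of all files the consumer refused."""
    counts = Counter()
    for line in log_lines:
        match = SKIP_PATTERN.search(line)
        if match:
            suffix = PurePosixPath(match.group("path")).suffix.lower()
            counts[suffix] += 1
    return counts


if __name__ == "__main__":
    for suffix, count in sorted(skipped_extensions(SAMPLE).items()):
        print(f"{suffix}: {count}")
```

If office formats such as `.docx` or `.odt` still show up in this tally after a restart, that suggests the Tika settings are not taking effect in the container.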

 

 
