
[Support] Collectathon - Hoarder


Recommended Posts

I noticed that CA has two of these. One is the actual Hoarder app and the other is a "worker". Is the Hoarder worker required alongside the Hoarder container on a single server? I don't see any mention of the worker on the app's GitHub page, so I'm not sure what it even does.

Link to comment
13 minutes ago, nirav said:

I noticed that CA has two of these. One is the actual Hoarder app and the other is a "worker". Is the Hoarder worker required alongside the Hoarder container on a single server? I don't see any mention of the worker on the app's GitHub page, so I'm not sure what it even does.

The Hoarder app saves and stores your bookmarks. The worker app then fetches additional information: it fetches the web page for archival with Browserless and automatically creates tags with ChatGPT/Ollama.

  • Like 1
Link to comment
1 hour ago, drmetro said:

How does this app work, and what are its use cases?

The main use case for Hoarder is as a "read-it-later" app. You can save interesting articles, tools, or other content you find while browsing on your phone or desktop, then access and read that content later across devices. The developer built Hoarder as a self-hosted alternative to bookmark managers like Pocket, with inspiration from open-source projects like memos and mymind. They wanted a bookmark manager they could host themselves, with features like link previews and automatic AI-based tagging. You can find more info in the GitHub README.

Link to comment

Hi, I installed the "Server" Hoarder app at 192.168.2.207 and the worker at 192.168.2.211; installing both on .207 was not possible.

Browserless is installed and running, and Meilisearch is installed and running.

But:

How does the app know that the worker is at the alternate address .211?

 

It looks so:

image.thumb.png.408c58b041e677d431ef6bd6ca2f7d66.png

 

 

Link to comment
Posted (edited)
17 hours ago, Trustwbc said:

Hi, I installed the "Server" Hoarder app at 192.168.2.207 and the worker at 192.168.2.211; installing both on .207 was not possible.

Browserless is installed and running, and Meilisearch is installed and running.

But:

How does the app know that the worker is at the alternate address .211?

 

It looks so:

image.thumb.png.408c58b041e677d431ef6bd6ca2f7d66.png

 

 

They access the same database and communicate through Redis. The data directory of the worker needs to be the same as the web container's.
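As an illustration, a shared setup might look like the sketch below. The container names, image tags, host path, and environment variable names are examples and may differ from the actual Unraid templates; check the template for the exact values.

```shell
# Minimal sketch — names, paths, and variables are examples,
# not the exact Unraid template values.

# Web container: serves the UI and writes bookmarks to the data dir
docker run -d --name hoarder \
  -v /mnt/user/appdata/hoarder:/data \
  -e DATA_DIR=/data \
  -e REDIS_HOST=redis \
  ghcr.io/hoarder-app/hoarder

# Worker container: must mount the SAME host path at the SAME DATA_DIR
# so both containers see the same database and stored assets.
docker run -d --name hoarder-workers \
  -v /mnt/user/appdata/hoarder:/data \
  -e DATA_DIR=/data \
  -e REDIS_HOST=redis \
  ghcr.io/hoarder-app/hoarder-workers
```

The key point is the shared `-v` mapping: if the worker mounts a different host path, the web container will never see the content the worker fetches.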

Edited by Collectathon
Link to comment

Are there any plans for a single image with the worker and the app, like an all-in-one package?

 

Also, does Hoarder now support importing bookmarks from Chrome etc.?

 

I really like the idea of AI-based tagging, but I'm still not convinced why I should move from Linkwarden if the rest is still under heavy development.

Link to comment
4 hours ago, schubdog said:

Are there any plans for a single image with the worker and the app, like an all-in-one package?

 

Also, does Hoarder now support importing bookmarks from Chrome etc.?

 

I really like the idea of AI-based tagging, but I'm still not convinced why I should move from Linkwarden if the rest is still under heavy development.

There will be an AIO template if the developer creates an AIO Docker image. From what I can tell, this is on the roadmap, but it's not a priority.

 

Yes, Hoarder supports importing bookmarks from Chrome.

https://docs.hoarder.app/import/

 

You can run both simultaneously to try it out. I'm not trying to convince you to switch; I just like the software, so I made a template.

Link to comment
Posted (edited)
17 hours ago, luisalrp said:

Has anyone managed to get it working using LocalAI? What models in LocalAI and variables in Hoarder? Thanks!

I haven't used LocalAI before, but in theory you can add/modify the variables below, as it should be a drop-in replacement for OpenAI. Others who have actually used it may be able to provide more support.

OPENAI_BASE_URL = http://localhost:8080
OPENAI_API_KEY = sk-XXXXXXXXXXXXXXXXXXXX
INFERENCE_TEXT_MODEL = gpt-4
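As a quick sanity check (assuming LocalAI's default port 8080 and its OpenAI-compatible API), you can list the models the endpoint actually serves before picking a value for INFERENCE_TEXT_MODEL:

```shell
# List the models LocalAI exposes via its OpenAI-compatible API.
# Replace localhost:8080 with your LocalAI host/port if different.
curl http://localhost:8080/v1/models

# The value you set in INFERENCE_TEXT_MODEL must match one of the
# "id" fields returned above.
```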

 

Edited by Collectathon
Link to comment
On 5/24/2024 at 7:15 AM, Collectathon said:

I haven't used LocalAI before, but in theory you can add/modify the variables below, as it should be a drop-in replacement for OpenAI. Others who have actually used it may be able to provide more support.

OPENAI_BASE_URL = http://localhost:8080
OPENAI_API_KEY = sk-XXXXXXXXXXXXXXXXXXXX
INFERENCE_TEXT_MODEL = gpt-4

 

 

Hi, Hoarder's developer here.

 

I highly recommend against the `gpt-4` model. `gpt-4` costs $30 / 1M tokens, which is extremely expensive. Hoarder defaults to `gpt-3.5-turbo-0125` for text, which costs $0.5 / 1M tokens (notice the huge difference!). If you want `gpt-4`-level inference, go for `gpt-4o`, which at $5 / 1M tokens is still much cheaper than the `gpt-4` model.

Link to comment
  • 2 weeks later...

I am trying to get Hoarder installed on Unraid, but I keep running into issues. I have Hoarder, Hoarder-workers, Redis, and Browserless installed; I figured I can add the search feature after I get it working. When I try to add a bookmark, nothing seems to happen. If I refresh the page the bookmark shows up, but with no description, image, or tags. I have tried with Ollama and OpenAI; neither seems to work. I am getting an error in the logs that looks like it is unable to connect to Redis:

Error: connect EHOSTUNREACH 192.168.1.10:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1605:16)
    at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
  errno: -113,
  code: 'EHOSTUNREACH',
  syscall: 'connect',
  address: '192.168.1.10',
  port: 6379
}

I am running the Redis container on Unraid as well. It is in Host mode, 192.168.1.10 is the IP of my Unraid box, and it is running on the default port 6379. In Hoarder, I have tried the hostname redis, http://192.168.1.10, and 192.168.1.10. The only thing that changes is that when I use "redis" as the hostname, I get a different error:

Error: getaddrinfo ENOTFOUND redis
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
    at GetAddrInfoReqWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
  errno: -3008,
  code: 'ENOTFOUND',
  syscall: 'getaddrinfo',
  hostname: 'redis'
}

I know Redis is working, because I am running Paperless-ngx which uses Redis, and it works just fine. Anyone have any ideas what my problem could be?

Link to comment
3 hours ago, millercb said:

 

Error: connect EHOSTUNREACH 192.168.1.10:6379
    at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1605:16)
    at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
  errno: -113,
  code: 'EHOSTUNREACH',
  syscall: 'connect',
  address: '192.168.1.10',
  port: 6379
}

 

Error: getaddrinfo ENOTFOUND redis
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (node:dns:107:26)
    at GetAddrInfoReqWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
  errno: -3008,
  code: 'ENOTFOUND',
  syscall: 'getaddrinfo',
  hostname: 'redis'
}

 

Hoarder can't reach Redis. Can you please send a screenshot of your Hoarder config?
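In the meantime, a quick way to test reachability from inside Docker (assuming you can pull the official redis image, which ships redis-cli) is:

```shell
# Reachability test from a throwaway container on the default bridge
# network; 192.168.1.10:6379 is the Redis host/port from the post above.
# A working Redis answers "PONG".
docker run --rm redis redis-cli -h 192.168.1.10 -p 6379 ping

# Note: the bare hostname "redis" only resolves between containers on
# the same user-defined Docker network, not on the default bridge —
# which would explain the ENOTFOUND error.
```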

Link to comment
7 hours ago, Collectathon said:

Thanks for that. Are you able to connect to Redis if it is in Bridge mode instead of Host? Also, change the Redis Host address to just the IP address without the 'http://'.

 

Thanks for the help. I switched Redis to bridge mode and removed the 'http://', but I'm still getting the EHOSTUNREACH error. One other thing I just thought of: I am running AdGuard as a DNS server on my network; would that have an effect on it? Other than that, I'm not sure.

Link to comment
8 hours ago, millercb said:

Thanks for the help. I switched Redis to bridge mode and removed the 'http://', but I'm still getting the EHOSTUNREACH error. One other thing I just thought of: I am running AdGuard as a DNS server on my network; would that have an effect on it? Other than that, I'm not sure.

I believe I have found the issue.

 

Normally Docker does not allow containers to directly access the same subnet as the one used by the host. You can allow this under Settings → Docker by changing "Host access to custom networks" from disabled to enabled.

 

The other option is to move your Redis container to your br0 network as well.
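If you go the br0 route, a sketch might look like the following; the container name, image tag, and the static IP 192.168.1.50 are placeholders for your own values.

```shell
# Run Redis on the custom br0 network with its own LAN IP, so other
# br0 containers can reach it without host access being enabled.
docker run -d --name redis \
  --network br0 \
  --ip 192.168.1.50 \
  redis

# Then point Hoarder's Redis Host setting at 192.168.1.50
# (just the IP, no http:// prefix).
```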

Link to comment
On 6/8/2024 at 5:21 PM, Collectathon said:

I believe I have found the issue.

 

Normally Docker does not allow containers to directly access the same subnet as the one used by the host. You can allow this under Settings → Docker by changing "Host access to custom networks" from disabled to enabled.

 

The other option is to move your Redis container to your br0 network as well.

Thank you!! I finally have everything working.

Edited by millercb
Link to comment
  • 4 weeks later...
Posted (edited)

My worker is always failing and I cannot really figure out why:

 

2024-07-08T13:04:51.625Z info: Workers version: 0.15.0
2024-07-08T13:04:51.627Z info: [Crawler] Browser connect on demand is enabled, won't proactively start the browser instance
2024-07-08T13:04:51.627Z info: Starting crawler worker ...
2024-07-08T13:04:51.628Z info: Starting inference worker ...
2024-07-08T13:04:51.628Z info: Starting search indexing worker ...
2024-07-08T13:05:16.988Z debug: [search][7] Search is not configured, nothing to do now
2024-07-08T13:05:16.990Z info: [search][7] Completed successfully
2024-07-08T13:05:39.245Z info: [Crawler][15] Will crawl "https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P" for link with id "o81u01cxioxflow8fgim977a"
2024-07-08T13:05:39.245Z info: [Crawler][15] Attempting to determine the content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P
2024-07-08T13:05:39.382Z info: [Crawler][15] Content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P is "text/html"
2024-07-08T13:05:39.383Z info: [Crawler] Connecting to existing browser websocket address: ws://172.19.0.19:3000?token=ak1_0b27720fed5e83f0e765_a073391f3ce890473649
2024-07-08T13:05:39.463Z error: [Crawler][15] Crawling job failed: [object Object]
2024-07-08T13:05:40.501Z info: [Crawler][15] Will crawl "https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P" for link with id "o81u01cxioxflow8fgim977a"
2024-07-08T13:05:40.501Z info: [Crawler][15] Attempting to determine the content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P
2024-07-08T13:05:40.609Z info: [Crawler][15] Content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P is "text/html"
2024-07-08T13:05:40.609Z info: [Crawler] Connecting to existing browser websocket address: ws://172.19.0.19:3000?token=ak1_0b27720fed5e83f0e765_a073391f3ce890473649
2024-07-08T13:05:40.612Z error: [Crawler][15] Crawling job failed: [object Object]
2024-07-08T13:05:42.707Z info: [Crawler][15] Will crawl "https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P" for link with id "o81u01cxioxflow8fgim977a"
2024-07-08T13:05:42.707Z info: [Crawler][15] Attempting to determine the content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P
2024-07-08T13:05:42.791Z info: [Crawler][15] Content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P is "text/html"
2024-07-08T13:05:42.791Z info: [Crawler] Connecting to existing browser websocket address: ws://172.19.0.19:3000?token=ak1_0b27720fed5e83f0e765_a073391f3ce890473649
2024-07-08T13:05:42.798Z error: [Crawler][15] Crawling job failed: [object Object]
2024-07-08T13:05:46.822Z info: [Crawler][15] Will crawl "https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P" for link with id "o81u01cxioxflow8fgim977a"
2024-07-08T13:05:46.823Z info: [Crawler][15] Attempting to determine the content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P
2024-07-08T13:05:46.902Z info: [Crawler][15] Content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P is "text/html"
2024-07-08T13:05:46.902Z info: [Crawler] Connecting to existing browser websocket address: ws://172.19.0.19:3000?token=ak1_0b27720fed5e83f0e765_a073391f3ce890473649
2024-07-08T13:05:46.909Z error: [Crawler][15] Crawling job failed: [object Object]
2024-07-08T13:05:54.956Z info: [Crawler][15] Will crawl "https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P" for link with id "o81u01cxioxflow8fgim977a"
2024-07-08T13:05:54.957Z info: [Crawler][15] Attempting to determine the content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P
2024-07-08T13:05:55.106Z info: [Crawler][15] Content-type for the url https://www.amazon.de/Homo-Sapiens-Nikolaus-Geyrhalter/dp/B0765NRP1P is "text/html"
2024-07-08T13:05:55.106Z info: [Crawler] Connecting to existing browser websocket address: ws://172.19.0.19:3000?token=ak1_0b27720fed5e83f0e765_a073391f3ce890473649
2024-07-08T13:05:55.114Z error: [Crawler][15] Crawling job failed: [object Object]

 

The token is supposed to be an API token from the Hoarder web interface, right? That's what I assumed, but otherwise I have no idea what else I could do. I even used the Hoarder container's IP instead of the hostname I usually use for container-to-container communication.

 

Communication is definitely possible between the two containers. These are pings from hoarder-workers to the hoarder container:

/app/apps/workers # ping 172.19.0.19
PING 172.19.0.19 (172.19.0.19): 56 data bytes
64 bytes from 172.19.0.19: seq=0 ttl=64 time=0.278 ms
64 bytes from 172.19.0.19: seq=1 ttl=64 time=0.145 ms
64 bytes from 172.19.0.19: seq=2 ttl=64 time=0.147 ms
64 bytes from 172.19.0.19: seq=3 ttl=64 time=0.151 ms
^C
--- 172.19.0.19 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.145/0.180/0.278 ms
/app/apps/workers # ping hoarder
PING hoarder (172.19.0.19): 56 data bytes
64 bytes from 172.19.0.19: seq=0 ttl=64 time=0.166 ms
64 bytes from 172.19.0.19: seq=1 ttl=64 time=0.078 ms
64 bytes from 172.19.0.19: seq=2 ttl=64 time=0.143 ms
64 bytes from 172.19.0.19: seq=3 ttl=64 time=0.162 ms
^C
--- hoarder ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.078/0.137/0.166 ms
/app/apps/workers # 

 

Edited by Greyberry
added pings
Link to comment
Posted (edited)
9 hours ago, Greyberry said:

My worker is always failing and I cannot really figure out why

The token is supposed to be an API token from the Hoarder web interface, right? That's what I assumed, but otherwise I have no idea what else I could do. I even used the Hoarder container's IP instead of the hostname I usually use for container-to-container communication.

There are two different versions of Browserless. The token you need is the one passed as an env variable to the Browserless container, not a Hoarder API token. If you are using Browserless v1, you may not have/need a token. Both versions are available in CA.

Browserless v1

- Token optional

- Example: ws://browserless:3000

 

Browserless v2

- Token required

- Example: ws://browserless:3000?token=my-token
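For illustration, a v2 setup could look like this; the image tag, container name, and token value are placeholders, and I'm assuming Browserless v2 reads its token from a TOKEN environment variable as its docs describe.

```shell
# Start Browserless v2 with a token of your choosing
docker run -d --name browserless \
  -p 3000:3000 \
  -e TOKEN=my-token \
  ghcr.io/browserless/chromium

# Then give the Hoarder worker a browser address that carries
# the same token, e.g.:
#   ws://browserless:3000?token=my-token
```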

Edited by Collectathon
  • Thanks 1
Link to comment
