[Support] mem-zero: Self-hosted MCP memory server for AI coding assistants

May 3

mem-zero is a self-hosted memory server for AI coding assistants. It stores, searches, and manages persistent context across sessions so your tools remember what happened last week without stuffing everything into the context window. Each project gets its own isolated vector collection with automatic fact extraction and deduplication — all inside a single Docker container with no external dependencies.

GitHub: https://github.com/sworcery/mem-zero

Docker image: ghcr.io/sworcery/mem-zero:latest

How It Works

When you store text, an LLM extracts atomic facts (e.g. "User prefers Python over R"), checks each one against existing memories for duplicates, and embeds novel facts into a vector database for semantic search. When you search, only relevant memories are returned — not entire conversation logs. The project slug in the URL creates isolated collections, so memories from one project never leak into another.
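To make the dedup-and-isolate idea concrete, here is a minimal sketch in plain Python. It is not mem-zero's actual code: the `ProjectMemory` class, the cosine-similarity check, and the 0.95 threshold are all assumptions for illustration, and the vectors are hand-written stand-ins for real embeddings.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ProjectMemory:
    """Toy in-memory stand-in for per-project vector collections (hypothetical)."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # assumed duplicate cutoff, not a mem-zero default
        self.collections = defaultdict(list)  # project slug -> [(fact, vector)]

    def add(self, slug, fact, vector):
        """Store the fact unless a near-identical vector exists in this project.

        Returns True if the fact was stored, False if it was treated as a duplicate.
        """
        for _, existing in self.collections[slug]:
            if cosine(vector, existing) >= self.threshold:
                return False
        self.collections[slug].append((fact, vector))
        return True

    def search(self, slug, query_vector, top_k=3):
        """Return up to top_k facts from one project, most similar first."""
        scored = [(cosine(query_vector, v), f) for f, v in self.collections[slug]]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:top_k]]
```

Because every lookup is keyed by the project slug, a fact stored under one slug is never compared against or returned for another, which is the isolation property described above.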

Connecting Your Tools

mem-zero exposes both an MCP transport and a REST API. Any MCP client (Claude Code, Cursor, Windsurf, Claude Desktop, etc.) can connect by adding the server URL to its configuration:

http://YOUR-UNRAID-IP:8765/mcp/your-project-slug/http/your-user-id

For Claude Code specifically: claude mcp add mem-zero --transport http "http://YOUR-UNRAID-IP:8765/mcp/your-project-slug/http/your-user-id" -s local
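If you manage several projects, a small helper can build the URL from the pattern above. This is only a sketch: `mcp_url` is a hypothetical name, and percent-encoding the slug and user ID is a defensive assumption on my part, so plain alphanumeric-and-hyphen values are the safest choice.

```python
from urllib.parse import quote


def mcp_url(host, project_slug, user_id, port=8765):
    """Build an MCP endpoint URL following the pattern shown in the post."""
    # Encoding is an assumption; the post does not say how the server
    # treats slugs that contain spaces or other special characters.
    slug = quote(project_slug, safe="")
    user = quote(user_id, safe="")
    return f"http://{host}:{port}/mcp/{slug}/http/{user}"
```

For example, `mcp_url("192.168.1.50", "my-project", "alice")` gives a URL you can paste into the `claude mcp add` command above.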

Anything that can make HTTP requests can also use the REST API directly — store memories with POST, search with POST, list/delete with GET/DELETE. Full endpoint reference is in the GitHub README.

LLM Backends

mem-zero supports three backends for fact extraction and deduplication:

Bundled (default) — Ships with a quantized Qwen2.5-3B model that runs on CPU. Zero configuration, no external dependencies. Handles embeddings well and provides basic fact extraction. First startup downloads ~2 GB of models.
Ollama (recommended) — Point it at an existing Ollama instance on your network with the OLLAMA_BASE_URL variable. A 7B+ model on GPU produces significantly better extraction — qwen2.5:14b is the sweet spot. If Ollama becomes unreachable, the bundled model automatically takes over as a fallback.
OpenAI (beta) — Works with any OpenAI-compatible API (OpenAI, Groq, Together, etc.) via OPENAI_API_KEY and optional OPENAI_BASE_URL.

Backend is auto-detected based on which environment variables are set, or you can force it with LLM_BACKEND.
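Roughly, the selection logic could look like the sketch below. The variable names come from this post, but the function name, the accepted `LLM_BACKEND` values ("bundled", "ollama", "openai"), and the order in which Ollama is checked before OpenAI are assumptions, not confirmed behavior.

```python
def pick_backend(env):
    """Guess which backend would be used for a given set of environment variables."""
    # An explicit LLM_BACKEND wins over auto-detection.
    forced = env.get("LLM_BACKEND", "").strip().lower()
    if forced:
        return forced
    # Assumed precedence: Ollama first, then an OpenAI-compatible API.
    if env.get("OLLAMA_BASE_URL"):
        return "ollama"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # Nothing configured: the bundled CPU model is the default.
    return "bundled"
```

Passing `os.environ` as `env` would report what the sketch expects for the current container settings.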

Web Dashboard

A management UI is served at the container's root URL. From the dashboard you can monitor system health and uptime, browse all projects and their memory counts, view/search/delete individual memories, consolidate similar fragments into clean summaries, delete entire projects, and add memories manually. Enable DIAGNOSTICS_ENABLED=true to see performance metrics, accuracy stats, and score distributions. Optionally protect with basic auth via DASHBOARD_USER and DASHBOARD_PASS.

Authentication

API key auth is available for all MCP and REST endpoints. Set the API_KEY variable and requests must include it as a Bearer token. If not set, all endpoints are open — suitable for trusted networks. The dashboard has its own separate basic auth since browsers need a login prompt rather than Bearer tokens.
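As a sketch of what a client has to send, the helper below attaches the key as a standard `Authorization: Bearer ...` header using only the Python standard library. The function name and the example path are placeholders; the real endpoint paths are in the GitHub README.

```python
import urllib.request


def authed_request(url, api_key=None, data=None, method="GET"):
    """Build (but do not send) a request, adding a Bearer token when a key is given."""
    headers = {}
    if api_key:
        # Must match the API_KEY value set on the container.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# "/example" is a placeholder path, not a real mem-zero endpoint.
req = authed_request("http://YOUR-UNRAID-IP:8765/example", api_key="my-secret-key")
```

Sending it is then a matter of `urllib.request.urlopen(req)` once the URL points at a real endpoint.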

Features

Project-isolated memory — each project slug maps to its own Qdrant vector collection
Semantic search with configurable top-k results
Automatic fact extraction and deduplication via LLM
Web dashboard with project browsing, memory search, and health monitoring
Memory consolidation — merge similar fragments into clean summaries
Cleanup tool for garbled text and multi-fact entries
Re-embed tool to regenerate all embeddings after model changes
MCP transport compatible with Claude Code, Cursor, Windsurf, Claude Desktop, and any MCP client
Full REST API with endpoints for store, search, list, delete, reembed, cleanup, and consolidate
Three LLM backends: bundled (zero config), Ollama (GPU-accelerated), OpenAI-compatible
Automatic Ollama-to-bundled fallback when Ollama is unreachable
API key authentication for MCP/REST endpoints (optional)
Dashboard basic auth (optional)
Diagnostics mode with performance and accuracy metrics
Embedded Qdrant vector database — no external database required
s6-overlay process supervision for all internal services
Configurable embedding dimensions, collection prefix, and server bind settings
Dark and light mode dashboard

Requirements

Docker
~2 GB disk for initial model download
~2 GB RAM minimum (bundled backend, more recommended for Ollama)

Post here for support, bug reports, or feature requests.

May 5

Tried installing on Unraid but I'm getting this error:

Unable to find image 'ghcr.io/sworcery/mem-zero:latest' locally
docker: Error response from daemon: Head "https://ghcr.io/v2/sworcery/mem-zero/manifests/latest": unauthorized.

Also, the template option for the dashboard username says "Username for web dashboard login (leave empty to disable auth)", but the field is required, so leaving it empty isn't an option.

May 5

Author

@rgreen83 Thanks for reporting this. I believe I found both issues.

Image pull error: Somehow the GHCR package visibility got set to private — not sure how that happened since the repo itself is public. Should have inherited public visibility. Either way, I've flipped it to public now so ghcr.io/sworcery/mem-zero:latest should pull without authentication going forward.

Dashboard username/password required: You're right, those fields were incorrectly marked as required in the template. I've updated them to optional so you can leave them blank to disable dashboard auth as intended. This fix will be in the next image push — if you want it immediately you can edit the container in Unraid and toggle the fields from "required" to "not required" manually, or just put dummy values in for now and clear them later.

Let me know if you're still hitting issues after re-pulling.

May 5

Working now, thanks!

July 2

Having a weird issue where creating a memory/project works the first time, either via tool call or CLI, but then no more memories can be added to that project. I can make as many projects as I like, but only the first memory can be created in each; any additional memories just come back as "0 memories added".
