[SUPPORT] Nexus Orchestrator – Self-Hosted LLM Router - Docker Containers

March 21

This is the support thread for the Nexus Orchestrator.

🔗 GitHub Repo | 🔗 Docker Hub | 🔗 Unraid Template

What is Nexus Orchestrator?

Nexus Orchestrator is a self-hosted orchestration layer that routes each LLM request to the best local or cloud model automatically.

This started as something I personally wanted for my own setup. I used AI tools along the way to help design, iterate, and refine parts of it — but the goal and overall system came from solving my own use case. It’s not perfect, but it works.

At its core, a lightweight router model classifies each prompt’s intent (CODING, REASONING, CREATIVE, VISION, DOCUMENT, or GENERAL) and dispatches it to whichever model you’ve configured for that category. You can freely mix local Ollama models with cloud providers in any combination.
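
The classify-then-dispatch flow can be sketched roughly as below. This is only an illustration, not the project's actual code: `classify` is a stand-in keyword matcher rather than a real router model, the model names in `CATEGORY_MODELS` are placeholders, and falling back to GENERAL for an unmapped category is an assumption.

```python
from typing import Callable, Dict

# Hypothetical category -> model mapping; in Nexus this is configured in the UI.
CATEGORY_MODELS: Dict[str, str] = {
    "CODING": "local-coder-model",
    "REASONING": "local-reasoning-model",
    "GENERAL": "local-general-model",
}


def classify(prompt: str) -> str:
    """Stand-in for the router model: a crude keyword check (assumption)."""
    text = prompt.lower()
    if "def " in text or "traceback" in text or "compile" in text:
        return "CODING"
    if "prove" in text or "calculate" in text:
        return "REASONING"
    return "GENERAL"


def dispatch(prompt: str,
             classifier: Callable[[str], str] = classify,
             mapping: Dict[str, str] = CATEGORY_MODELS) -> str:
    """Return the model configured for the prompt's category.

    Unknown or unmapped categories fall back to GENERAL (an assumption).
    """
    category = classifier(prompt).strip().upper()
    return mapping.get(category, mapping["GENERAL"])
```

In a real deployment the classifier would be a call to the small router model, and the mapping would come from the Models tab.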

Quick Install (Unraid)

Copy the template to your flash drive: /boot/config/plugins/dockerMan/templates-user/nexus-orchestrator.xml
Go to your Unraid Docker tab → Add Container
Search for Nexus Orchestrator or select the template manually
Set your Admin API Key (required)
Set your Local Provider URL, e.g. http://YOUR_SERVER_IP:11434 for Ollama. Do not use localhost: inside the container it resolves to the container itself, not your server.
Optionally set a cloud provider URL and API key for hybrid routing
Set a Router Model — something small works fine: gemma3:4b, qwen2.5:3b, or nemotron-mini:4b

100% Local Setup

Set LOCAL_URL to your Ollama instance IP
Set ROUTER_MODEL to a small local model
In the UI under the Models tab, assign local models to each category

Nothing will leave your network.

Environment Variables

Variable	Required	Description
`ADMIN_API_KEY`	Yes	Password for the web UI and API
`ENCRYPTION_SECRET`	No	Encrypts stored data — derives from Admin key if blank
`LOCAL_URL`	No	Local provider base URL (default: `http://localhost:11434`)
`LOCAL_KEY`	No	API key for local provider if required
`CLOUD_URL`	No	OpenAI-compatible base URL for cloud provider
`CLOUD_API_KEY`	No	API key for cloud provider
`ROUTER_MODEL`	No	Model used for intent classification
`ROUTER_URL`	No	Custom router URL — defaults to `LOCAL_URL` if blank
`ROUTER_KEY`	No	API key for the router if different from local
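
To make the defaults in the table concrete, here is a small sketch of how they could be resolved. The function name `resolve_config` and the returned keys are hypothetical; only the variable names and the two documented defaults (`LOCAL_URL` and the `ROUTER_URL` fallback) come from the table.

```python
from typing import Dict, Mapping, Optional

DEFAULT_LOCAL_URL = "http://localhost:11434"  # default stated in the table


def resolve_config(env: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Apply the documented defaults to a mapping of environment variables."""
    admin_key = env.get("ADMIN_API_KEY", "").strip()
    if not admin_key:
        # ADMIN_API_KEY is the only variable marked as required.
        raise ValueError("ADMIN_API_KEY is required")

    local_url = env.get("LOCAL_URL", "").strip() or DEFAULT_LOCAL_URL
    # ROUTER_URL falls back to LOCAL_URL when left blank.
    router_url = env.get("ROUTER_URL", "").strip() or local_url

    return {
        "admin_api_key": admin_key,
        "local_url": local_url,
        "router_url": router_url,
        "router_model": env.get("ROUTER_MODEL", "").strip() or None,
        "cloud_url": env.get("CLOUD_URL", "").strip() or None,
    }
```

Calling `resolve_config(os.environ)` would read the values from the running container's environment.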

Completed: 4/6/2026

✅ CORS configuration
✅ Cookie-based auth
✅ SSRF protection
✅ LaTeX/KaTeX rendering
✅ Stop generation button
✅ SQLite migration
✅ Category Mappings cloud filter
✅ Model fallback
✅ Chat input UX improvements
✅ Input validation (Zod schemas)
✅ Rate limiting
✅ Tests (Vitest)
✅ Conversation pagination
✅ Router result caching (opt-in toggle)
✅ FAST category
✅ SECURITY category
✅ Error boundaries
✅ Projects (organize chats into folders)
✅ Multi-user support (per-user accounts, isolated config, conversations & projects)
✅ Request queuing (per-user FIFO queue, up to 5 pending per user)
✅ Web search via tool calling (SearXNG integration, LLM decides when to search)
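
For anyone curious what a per-user FIFO queue with a cap of 5 might look like, here is a rough sketch. The class and method names are made up for illustration, and rejecting the sixth request (rather than, say, dropping the oldest) is an assumption about the overflow behavior.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional

MAX_PENDING_PER_USER = 5  # limit stated in the list above


class PerUserQueue:
    """One FIFO queue per user, each capped at max_pending entries."""

    def __init__(self, max_pending: int = MAX_PENDING_PER_USER) -> None:
        self.max_pending = max_pending
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)

    def enqueue(self, user: str, request_id: str) -> bool:
        """Queue a request; return False if the user's queue is full."""
        queue = self._queues[user]
        if len(queue) >= self.max_pending:
            return False
        queue.append(request_id)
        return True

    def next_for(self, user: str) -> Optional[str]:
        """Pop the oldest pending request for a user, or None if empty."""
        queue = self._queues.get(user)
        return queue.popleft() if queue else None
```

Because each user has a separate queue, one user filling their queue does not block anyone else.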

Planned:
Multiple local providers (Ollama + llama-swap + llama.cpp simultaneously)
URL fetch/browse tool (companion to web search — read a specific page)
Ollama backend abort

Configuration Guide

This guide covers the basic configuration needed to get Nexus Orchestrator working with local and optional cloud models.

LOCAL MODEL / API PROVIDER

This should point to your local LLM backend, usually Ollama.

Set the Provider URL to your machine’s IP address
Example: http://192.168.1.100:11434

Do not use localhost, as that refers to the container itself

Leave the API Key blank for standard Ollama setups

Nexus will automatically detect Ollama and handle the connection
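
If you want to sanity-check a Provider URL before pasting it in, something like the sketch below catches the localhost mistake. The helper name `check_provider_url` and the warning strings are made up for illustration; this is not a check Nexus itself is known to run.

```python
from urllib.parse import urlparse

# Addresses that point back at the container itself when used inside it.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def check_provider_url(url: str) -> str:
    """Return a warning string, or an empty string if the URL looks usable."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return "URL must look like http://HOST:PORT"
    if parsed.hostname.lower() in LOOPBACK_HOSTS:
        return "loopback address points at the container itself; use the host's LAN IP"
    return ""
```

For an Ollama backend, a URL that passes this check should also answer a `GET` on `/api/tags`, which is Ollama's model listing endpoint.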

CLOUD MODEL / API PROVIDER (Optional)

This is for OpenAI-compatible providers such as OpenAI or OpenRouter.

Set the Provider URL
Enter your API key

If left unconfigured, any categories set to Cloud will show a warning
Local-only setups will still function normally

INTENT ROUTER

The router determines which category handles each request.

A small model is sufficient
Recommended: gemma3:4b, qwen2.5:3b

Leave the Router URL blank to reuse your local provider

Only set a custom URL if you want a separate routing endpoint

DISCOVERED MODELS

Nexus will list all models available from the Local Provider.

If models do not appear:
Verify the Provider URL is correct
Ensure the status shows Online

Models can be selected and assigned to categories from this list

CATEGORY MAPPINGS

Categories define how requests are routed.

Each category includes:
A provider (Local or Cloud)
A pool of models
A fallback order (first model is primary, others are used if it fails)
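
The fallback order can be pictured as a simple loop over the pool. This is a sketch under assumptions: `call_with_fallback` and the `call_model` callback are hypothetical names, and treating any exception as a failure is a simplification.

```python
from typing import Callable, List, Sequence


def call_with_fallback(models: Sequence[str],
                       call_model: Callable[[str], str]) -> str:
    """Try each model in order; return the first successful response."""
    errors: List[str] = []
    for model in models:
        try:
            return call_model(model)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{model}: {exc}")
    # Every model in the pool failed; surface all collected errors.
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

The first entry in `models` plays the role of the primary; the rest are only tried if it raises.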

Default categories:

GENERAL — general conversation
CODING — programming and debugging
REASONING — math, logic, and analysis
CREATIVE — writing and brainstorming
VISION — triggered by image input
DOCUMENT — triggered by file input
FAST — simple, low-latency responses
SECURITY — security research and testing

Custom categories can be added from the Models tab

See the changelog for patch notes: https://github.com/FaqFirebase/Nexus-Orchestrator/blob/master/CHANGELOG.md

Bugs are expected.

GitHub: https://github.com/FaqFirebase/Nexus-Orchestrator
Docker Hub: https://hub.docker.com/r/pikkonmg/nexus-orchestrator
Template repo: https://github.com/PikkonMG/unraid-docker-templates


Edited April 6 by PikkonMG


March 21

Author

Release Notes: Nexus v1.1.9

This update covers all major changes since v1.1.3. The biggest highlights are live reasoning display, multiple local provider support, a full security hardening pass, and a much cleaner UI.

Thinking / Reasoning Display (v1.1.9)

Nexus now shows live reasoning for models that support it.

For Ollama models, the server sends think: true through the native API and streams reasoning token-by-token as it generates. Models like DeepSeek R1 and QwQ that natively emit <think> tags are also supported. Reasoning appears in a collapsible purple section above the response and stays visible until you manually close it.

There are two levels of control:

Global default: System → Settings → Show Model Thinking (enabled by default)
Per-chat override: Brain icon in the chat input bar

Models that do not support thinking fall back silently to a normal response. The FAST category always skips thinking.
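
For models that emit `<think>` tags inline, separating reasoning from the answer could look roughly like this. It is a sketch that works on a completed response; the streaming, token-by-token handling described above is more involved, and `split_reasoning` is a hypothetical name.

```python
import re
from typing import Tuple

# Non-greedy match so several <think> blocks are captured separately.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer) from a completed model response."""
    reasoning = "\n".join(m.strip() for m in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer
```

A response with no `<think>` tags comes back with empty reasoning, which matches the silent fallback to a normal response.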

Multiple Local Providers (v1.1.5)

Nexus can now connect to multiple local backends at the same time, including Ollama, llama-swap, llama.cpp, LM Studio, Open WebUI, and other OpenAI-compatible endpoints.

Model discovery aggregates across all configured providers. Category assignments now store the provider URL alongside the model name so routing always hits the correct backend. Fallback chains also work across providers.

Existing single-provider setups migrate automatically with no manual action required.
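
Storing the provider URL alongside the model name matters because the same model name can exist on two backends. A minimal sketch of that idea, with `ModelRef` and `aggregate_models` as hypothetical names and example addresses chosen arbitrarily:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelRef:
    """A model name qualified by the provider that serves it."""
    provider_url: str
    model: str


def aggregate_models(discovered: Dict[str, List[str]]) -> List[ModelRef]:
    """Flatten {provider_url: [model, ...]} into provider-qualified refs."""
    return [ModelRef(url, name)
            for url, names in discovered.items()
            for name in names]
```

With this shape, `gemma3:4b` on one provider and `gemma3:4b` on another are two distinct entries, so a category assignment always points at one specific backend.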

Provider Compatibility Improvements (v1.1.6)

Provider health checks and model discovery now correctly handle endpoints whose base URL ends with /v1, such as llama-swap and LM Studio.
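
The `/v1` issue comes down to not doubling the path segment when building the model listing URL. A sketch of one way to handle it (the function name is hypothetical, and this is not necessarily how Nexus does it):

```python
def models_endpoint(base_url: str) -> str:
    """Build the OpenAI-style model listing URL without doubling /v1."""
    base = base_url.rstrip("/")
    if base.endswith("/v1"):
        # Base URL already includes the version prefix.
        return base + "/models"
    return base + "/v1/models"
```

Both `http://HOST:PORT` and `http://HOST:PORT/v1` end up at the same `/v1/models` path.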

Other compatibility improvements:

llama-swap display names now appear correctly in the UI, while the proper routing key is still used in API requests
per-attempt chat timeout increased from 60s to 300s
timeout remains configurable via CHAT_TIMEOUT_MS
model-loading retries increased to 5 attempts with 30-second intervals to better support slow model swaps
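
The retry and timeout numbers above can be sketched like this. `ModelLoading` is a hypothetical exception standing in for whatever signal the backend gives while a model is still loading, and the function names are made up; only the 300 s default, `CHAT_TIMEOUT_MS`, and the 5 attempts at 30-second intervals come from the notes.

```python
import time
from typing import Callable, Mapping


def chat_timeout_seconds(env: Mapping[str, str]) -> float:
    """Per-attempt timeout: CHAT_TIMEOUT_MS if set, else 300 s."""
    return int(env.get("CHAT_TIMEOUT_MS", "300000")) / 1000


class ModelLoading(Exception):
    """Hypothetical signal that the backend is still loading a model."""


def call_with_load_retries(call: Callable[[], str],
                           attempts: int = 5,
                           interval_s: float = 30.0,
                           sleep: Callable[[float], None] = time.sleep) -> str:
    """Retry while the model is loading, waiting interval_s between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except ModelLoading:
            if attempt == attempts:
                raise  # out of attempts; let the caller fall back
            sleep(interval_s)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the sketch easy to test without actually waiting 30 seconds.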

Web Search Sources (v1.1.4)

After a web search completes, Nexus now shows a collapsible Sources section below the response.

It lists the title, URL, and snippet for each SearXNG result that was used.

UI Improvements (v1.1.7 and v1.1.8)

Several quality-of-life improvements landed across the interface:

Collapsible settings sections: All Models tab sections now collapse and expand
Persistent section state: Collapse state survives page refresh
Persistent active tab: The selected tab (Chat, Models, or System) is remembered across refreshes
Discovered Models redesign: The old dense card grid was replaced with a provider-grouped collapsible list
Better model readability: Models are grouped by source, size pills are color-coded by parameter tier, and the active router model is highlighted
Copy code button: Every code block now gets a hover copy button with 2-second success feedback

Security Hardening (v1.1.8)

A full server-side security audit was performed. Major improvements include:

CORS: Origin is now echoed explicitly, and credentials are only allowed when a matching origin is present
SSRF protection: Cloud metadata endpoints such as AWS IMDS, GCP metadata, and the Kubernetes API are blocked
LAN access preserved: Private LAN IPs remain allowed by design so local providers still work
Security headers: Added CSP, X-Frame-Options, X-Content-Type-Options, HSTS, Referrer-Policy, and Permissions-Policy
Session cleanup: Expired sessions are swept hourly
Session caps: Maximum 10 concurrent sessions per user, with oldest eviction on overflow
Password complexity: New passwords now require at least one uppercase letter, one lowercase letter, and one digit
Body limits: Global body limit reduced to 1 MB; chat and conversation routes keep 20 MB for vision/image use
API key decoupling: Changing the admin login password no longer breaks API clients using x-admin-key
Rate limiting: Password-change endpoint is now rate-limited alongside login protections
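
Two of the items above, the SSRF rule and the password rule, are easy to illustrate. The blocklist entries here are assumptions based on well-known metadata addresses and hostnames, not the project's actual list, and both function names are hypothetical.

```python
import ipaddress
import re
from urllib.parse import urlparse

# Assumed blocklist: the link-local metadata address used by AWS IMDS and
# GCP, plus two internal hostnames. The real list may differ.
BLOCKED_IPS = {ipaddress.ip_address("169.254.169.254")}
BLOCKED_HOSTS = {"metadata.google.internal", "kubernetes.default.svc"}


def is_url_allowed(url: str) -> bool:
    """Block metadata endpoints while leaving private LAN IPs reachable."""
    host = (urlparse(url).hostname or "").lower()
    if not host or host in BLOCKED_HOSTS:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return True  # a hostname; a real check would also resolve DNS
    return addr not in BLOCKED_IPS


def is_password_complex(password: str) -> bool:
    """Require at least one uppercase letter, one lowercase letter, one digit."""
    return all(re.search(p, password) for p in (r"[A-Z]", r"[a-z]", r"[0-9]"))
```

Note that private ranges such as 192.168.x.x are deliberately not blocked, which is what keeps local providers working.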

Docker Image Size Reduction (v1.1.9)

Production images are now smaller because build-time dependencies are excluded from the runtime image.

Image size dropped from about 127 MB to about 86 MB.

Bug Fixes Since v1.1.3

Fixed Ollama being misidentified as a generic OpenAI-compatible provider, which prevented reasoning/thinking from being sent
Fixed the `localThinkingEnabled is not defined` scope bug introduced during the thinking-toggle work
Fixed mixed content warnings on HTTPS deployments caused by a hardcoded http://localhost:11434 in the frontend bundle
Fixed FAST category routing so it no longer grabs prompts that actually need real answers
Fixed session state not being fully cleared on logout in some edge cases

Edited April 16 by PikkonMG


March 26

Author

Edit (4/06/2026): Nexus Orchestrator is now on Community Applications (CA). A how-to guide for using it can be found here: https://github.com/PikkonMG/unraid-docker-templates/blob/main/docs/Nexus_UNRAID_GUIDE.md

Edited April 6 by PikkonMG
