March 21

This is the support thread for the Nexus Orchestrator.

GitHub Repo | Docker Hub | Unraid Template

What is Nexus Orchestrator?

Nexus Orchestrator is a self-hosted orchestration layer that routes each LLM request to the best local or cloud model automatically.

This started as something I personally wanted for my own setup. I used AI tools along the way to help design, iterate, and refine parts of it, but the goal and overall system came from solving my own use case. It's not perfect, but it works.

At its core, a lightweight router model classifies each prompt's intent (CODING, REASONING, CREATIVE, VISION, DOCUMENT, or GENERAL) and dispatches it to whichever model you've configured for that category. You can freely mix local Ollama models with cloud providers in any combination.

Quick Install (Unraid)

1. Copy the template to your flash drive: /boot/config/plugins/dockerMan/templates-user/nexus-orchestrator.xml
2. Go to your Unraid Docker tab → Add Container
3. Search for Nexus Orchestrator or select the template manually
4. Set your Admin API Key (required)
5. Set your Local Provider URL, e.g. http://YOUR_SERVER_IP:11434 for Ollama. Do not use localhost; it resolves inside the container.
6. Optionally set a cloud provider URL and API key for hybrid routing
7. Set a Router Model. Something small works fine: gemma3:4b, qwen2.5:3b, or nemotron-mini:4b

100% Local Setup

- Set LOCAL_URL to your Ollama instance IP
- Set ROUTER_MODEL to a small local model
- In the UI under the Models tab, assign local models to each category

Nothing will leave your network.

Environment Variables

| Variable | Required | Description |
| ADMIN_API_KEY | Yes | Password for the web UI and API |
| ENCRYPTION_SECRET | No | Encrypts stored data; derives from Admin key if blank |
| LOCAL_URL | No | Local provider base URL (default: http://localhost:11434) |
| LOCAL_KEY | No | API key for local provider if required |
| CLOUD_URL | No | OpenAI-compatible base URL for cloud provider |
| CLOUD_API_KEY | No | API key for cloud provider |
| ROUTER_MODEL | No | Model used for intent classification |
| ROUTER_URL | No | Custom router URL; defaults to LOCAL_URL if blank |
| ROUTER_KEY | No | API key for the router if different from local |

Completed: 4/6/2026

- CORS configuration
- Cookie-based auth
- SSRF protection
- LaTeX/KaTeX rendering
- Stop generation button
- SQLite migration
- Category Mappings cloud filter
- Model fallback
- Chat input UX improvements
- Input validation (Zod schemas)
- Rate limiting
- Tests (Vitest)
- Conversation pagination
- Router result caching (opt-in toggle)
- FAST category
- SECURITY category
- Error boundaries
- Projects (organize chats into folders)
- Multi-user support (per-user accounts, isolated config, conversations & projects)
- Request queuing (per-user FIFO queue, up to 5 pending per user)
- Web search via tool calling (SearXNG integration, LLM decides when to search)

Planned:

- Multiple local providers (Ollama + llama-swap + llama.cpp simultaneously)
- URL fetch/browse tool (companion to web search, to read a specific page)
- Ollama backend abort

Configuration Guide

This guide covers the basic configuration needed to get Nexus Orchestrator working with local and optional cloud models.

LOCAL MODEL / API PROVIDER

This should point to your local LLM backend, usually Ollama.

- Set the Provider URL to your machine's IP address
- Example: http://192.168.1.100:11434
- Do not use localhost, as that refers to the container itself
- Leave the API Key blank for standard Ollama setups
- Nexus will automatically detect Ollama and handle the connection

CLOUD MODEL / API PROVIDER (Optional)

This is for OpenAI-compatible providers such as OpenAI or OpenRouter.

- Set the Provider URL
- Enter your API key
- If left unconfigured, any categories set to Cloud will show a warning
- Local-only setups will still function normally

INTENT ROUTER

The router determines which category handles each request.

- A small model is sufficient
- Recommended: gemma3:4b, qwen2.5:3b
- Leave the Router URL blank to reuse your local provider
- Only set a custom URL if you want a separate routing endpoint

DISCOVERED MODELS

Nexus will list all models available from the Local Provider.

If models do not appear:

- Verify the Provider URL is correct
- Ensure the status shows Online

Models can be selected and assigned to categories from this list.

CATEGORY MAPPINGS

Categories define how requests are routed.

Each category includes:

- A provider (Local or Cloud)
- A pool of models
- A fallback order (first model is primary, others are used if it fails)

Default categories:

- GENERAL: general conversation
- CODING: programming and debugging
- REASONING: math, logic, and analysis
- CREATIVE: writing and brainstorming
- VISION: triggered by image input
- DOCUMENT: triggered by file input
- FAST: simple, low-latency responses
- SECURITY: security research and testing

Custom categories can be added from the Models tab.

View the changelog for patch notes: https://github.com/FaqFirebase/Nexus-Orchestrator/blob/master/CHANGELOG.md

Bugs are expected.

GitHub: https://github.com/FaqFirebase/Nexus-Orchestrator
Docker Hub: https://hub.docker.com/r/pikkonmg/nexus-orchestrator
Template repo: https://github.com/PikkonMG/unraid-docker-templates

Edited April 6 by PikkonMG
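For readers wondering how the fallback order under Category Mappings behaves, here is a minimal Python sketch of the idea. It is not Nexus code: the pool contents, the `dispatch` function, and the `call_model` callback are all hypothetical names chosen for illustration, and the real implementation may differ.

```python
from typing import Callable, Dict, List

# Hypothetical mapping: category -> ordered model pool (first entry is primary).
# The model names are placeholders, not Nexus defaults.
CATEGORY_POOLS: Dict[str, List[str]] = {
    "CODING": ["qwen2.5-coder:7b", "llama3.1:8b"],
    "GENERAL": ["llama3.1:8b"],
}


def dispatch(category: str, prompt: str,
             call_model: Callable[[str, str], str]) -> str:
    """Try each model in the category's pool in order; return the first success."""
    # Unknown categories fall back to the GENERAL pool (an assumption of this sketch).
    pool = CATEGORY_POOLS.get(category) or CATEGORY_POOLS["GENERAL"]
    errors = []
    for model in pool:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Here `call_model(model, prompt)` stands in for whatever function actually sends the request to Ollama or a cloud provider. If the primary model raises, the next model in the pool is tried, which mirrors the "first model is primary, others are used if it fails" rule.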
March 21 (Author)

Release Notes: Nexus v1.1.9

This update covers all major changes since v1.1.3. The biggest highlights are live reasoning display, multiple local provider support, a full security hardening pass, and a much cleaner UI.

Thinking / Reasoning Display (v1.1.9)

Nexus now shows live reasoning for models that support it.

For Ollama models, the server sends think: true through the native API and streams reasoning token-by-token as it generates. Models like DeepSeek R1 and QwQ that natively emit <think> tags are also supported. Reasoning appears in a collapsible purple section above the response and stays visible until you manually close it.

There are two levels of control:

- Global default: System → Settings → Show Model Thinking (enabled by default)
- Per-chat override: Brain icon in the chat input bar

Models that do not support thinking fall back silently to a normal response. The FAST category always skips thinking.

Multiple Local Providers (v1.1.5)

Nexus can now connect to multiple local backends at the same time, including Ollama, llama-swap, llama.cpp, LM Studio, Open WebUI, and other OpenAI-compatible endpoints.

Model discovery aggregates across all configured providers. Category assignments now store the provider URL alongside the model name so routing always hits the correct backend. Fallback chains also work across providers.

Existing single-provider setups migrate automatically with no manual action required.

Provider Compatibility Improvements (v1.1.6)

Provider health checks and model discovery now correctly handle endpoints whose base URL ends with /v1, such as llama-swap and LM Studio.

Other compatibility improvements:

- llama-swap display names now appear correctly in the UI
- the proper routing key is still used in API requests
- per-attempt chat timeout increased from 60s to 300s
- timeout remains configurable via CHAT_TIMEOUT_MS
- model-loading retries increased to 5 attempts with 30-second intervals to better support slow model swaps

Web Search Sources (v1.1.4)

After a web search completes, Nexus now shows a collapsible Sources section below the response. This includes the title, URL, and snippet for each SearXNG result that was used.

UI Improvements (v1.1.7 and v1.1.8)

Several quality-of-life improvements landed across the interface:

- Collapsible settings sections: All Models tab sections now collapse and expand
- Persistent section state: Collapse state survives page refresh
- Persistent active tab: The selected tab (Chat, Models, or System) is remembered across refreshes
- Discovered Models redesign: The old dense card grid was replaced with a provider-grouped collapsible list
- Better model readability: Models are grouped by source, size pills are color-coded by parameter tier, and the active router model is highlighted
- Copy code button: Every code block now gets a hover copy button with 2-second success feedback

Security Hardening (v1.1.8)

A full server-side security audit was performed. Major improvements include:

- CORS: Origin is now echoed explicitly, and credentials are only allowed when a matching origin is present
- SSRF protection: Cloud metadata endpoints such as AWS IMDS, GCP metadata, and the Kubernetes API are blocked
- LAN access preserved: Private LAN IPs remain allowed by design so local providers still work
- Security headers: Added CSP, X-Frame-Options, X-Content-Type-Options, HSTS, Referrer-Policy, and Permissions-Policy
- Session cleanup: Expired sessions are swept hourly
- Session caps: Maximum 10 concurrent sessions per user, with oldest eviction on overflow
- Password complexity: New passwords now require at least one uppercase letter, one lowercase letter, and one digit
- Body limits: Global body limit reduced to 1 MB; chat and conversation routes keep 20 MB for vision/image use
- API key decoupling: Changing the admin login password no longer breaks API clients using x-admin-key
- Rate limiting: Password-change endpoint is now rate-limited alongside login protections

Docker Image Size Reduction (v1.1.9)

Production images are now smaller because build-time dependencies are excluded from the runtime image. Image size dropped from about 127 MB to about 86 MB.

Bug Fixes Since v1.1.3

- Fixed Ollama being misidentified as a generic OpenAI-compatible provider, which prevented reasoning/thinking from being sent
- Fixed the "localThinkingEnabled is not defined" scope bug introduced during the thinking-toggle work
- Fixed mixed content warnings on HTTPS deployments caused by a hardcoded http://localhost:11434 in the frontend bundle
- Fixed FAST category routing so it no longer grabs prompts that actually need real answers
- Fixed session state not being fully cleared on logout in some edge cases

Edited April 16 by PikkonMG
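To illustrate the SSRF rule from the v1.1.8 security notes above (cloud metadata endpoints blocked, private LAN IPs still allowed), here is a minimal Python sketch. It is an assumption-laden illustration, not the actual Nexus check: the function name `is_allowed_provider_url` and the block lists are hypothetical, and the sketch does not resolve DNS, so a hostname pointing at a blocked address would slip through.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical block lists for illustration; the real Nexus lists are not shown in this thread.
BLOCKED_HOSTS = {"metadata.google.internal", "kubernetes.default.svc"}
# Link-local range, which contains the 169.254.169.254 metadata address.
BLOCKED_NETS = [ipaddress.ip_network("169.254.0.0/16")]


def is_allowed_provider_url(url: str) -> bool:
    """Return True if the URL looks safe to use as a provider endpoint."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname.lower()
    if host in BLOCKED_HOSTS:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Plain hostname: this sketch does not resolve it.
        return True
    # Private LAN ranges (e.g. 192.168.x.x) are deliberately not blocked.
    return not any(addr in net for net in BLOCKED_NETS)
```

With this sketch, a LAN address such as http://192.168.1.100:11434 passes, while http://169.254.169.254/ and http://metadata.google.internal/ are rejected. A production check would also need to validate the resolved IP at request time.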
March 26 (Author)

Edit (4/06/2026): Nexus Orchestrator is now on CA (Community Applications). The how-to guide for using it can be found here: https://github.com/PikkonMG/unraid-docker-templates/blob/main/docs/Nexus_UNRAID_GUIDE.md

Edited April 6 by PikkonMG