Hermes — Provider Router
Hermes is the capability-aware model router inside Zeus. Every call to
Zeus.Generate() or Zeus.Stream() delegates provider selection to
Hermes. This page covers the routing decision tree, complexity scoring, task-type overrides,
and fallback behaviour.
How Hermes picks a provider
Hermes.Route(capability, prompt) works through a prioritised decision tree.
Every node is a real conditional in the source — this is not aspirational documentation.
How local_threshold gates Ollama
When local_first: true (the default), Hermes calls
compression.ClassifyComplexity(prompt) before attempting Ollama.
The scorer is a pure-Go heuristic: it makes no model call, so it adds negligible latency.
Signals the scorer uses: token count estimate, presence of code fences, multi-step
instruction keywords (explain, architect, design, compare, evaluate), and length
relative to a complexity band. Returns a float in [0.0, 1.0].
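To make the signals above concrete, here is a minimal sketch of a heuristic scorer of this shape. It is not the code behind compression.ClassifyComplexity: the function name classifyComplexity, the weights, the four-characters-per-token estimate, and the 2000-token band are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// complexityKeywords are the multi-step instruction keywords named in the text.
var complexityKeywords = []string{"explain", "architect", "design", "compare", "evaluate"}

// classifyComplexity is a hypothetical stand-in for compression.ClassifyComplexity.
// All weights and the 2000-token band are illustrative assumptions.
func classifyComplexity(prompt string) float64 {
	lower := strings.ToLower(prompt)

	// Rough token estimate: about four characters per token (assumption).
	tokens := float64(len(prompt)) / 4.0

	// Length signal, scaled against the assumed band and capped at 0.5.
	score := tokens / 2000.0 * 0.5
	if score > 0.5 {
		score = 0.5
	}

	// A code fence (three backticks) suggests a code-heavy prompt.
	fence := strings.Repeat("`", 3)
	if strings.Contains(prompt, fence) {
		score += 0.2
	}

	// Each distinct keyword hit adds 0.1, capped at 0.3.
	hits := 0
	for _, kw := range complexityKeywords {
		if strings.Contains(lower, kw) {
			hits++
		}
	}
	kwScore := float64(hits) * 0.1
	if kwScore > 0.3 {
		kwScore = 0.3
	}
	score += kwScore

	// Clamp to the documented [0.0, 1.0] range.
	if score > 1.0 {
		score = 1.0
	}
	return score
}

func main() {
	prompts := []string{
		"hi",
		"Explain and compare two cache designs, then evaluate them.",
	}
	for _, p := range prompts {
		fmt.Printf("%.3f  %q\n", classifyComplexity(p), p)
	}
}
```

Because the function only inspects the string, it can run on every request without touching a provider.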
local_threshold controls how aggressive the local bias is:
| Value | Behaviour |
|---|---|
| 1.0 (default) | Everything goes to Ollama when available; cloud is pure fallback |
| 0.7 | Simple queries go local; high-complexity queries skip to cloud |
| 0.0 | Disable local-first; every request goes to cloud priority order |
# ~/.olympus/config.yaml
routing:
  local_first: true
  local_threshold: 1.0
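The gate described by the table can be sketched as a single comparison. This is an assumption about how the score and threshold interact, not the actual Hermes source: shouldTryLocal is a hypothetical helper, and treating 0.0 as "never local" (rather than "local only for a score of exactly 0.0") is inferred from the table.

```go
package main

import "fmt"

// shouldTryLocal is a hypothetical helper showing one way local_threshold
// could gate Ollama. The comparison direction is inferred from the table.
func shouldTryLocal(localFirst bool, threshold, score float64) bool {
	// local_first: false, or a threshold of 0.0, disables the local bias.
	if !localFirst || threshold <= 0 {
		return false
	}
	// Prompts at or below the threshold stay local; higher scores skip to cloud.
	return score <= threshold
}

func main() {
	fmt.Println(shouldTryLocal(true, 1.0, 0.95)) // threshold 1.0 keeps everything local
	fmt.Println(shouldTryLocal(true, 0.7, 0.95)) // high-complexity prompt skips to cloud
	fmt.Println(shouldTryLocal(true, 0.0, 0.10)) // local-first disabled
}
```

Ollama availability is a separate check and is left out of this sketch.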
Task-type routing overrides
When task_routes is configured, Hermes classifies each prompt into a task
type via keyword matching in ClassifyTaskType(), then looks up a named
primary/fallback route.
| Task type | Trigger keywords (sample) |
|---|---|
| code_generation | implement, refactor, fix bug, write a function, add endpoint |
| deep_reasoning | explain, analyze, architect, design a system, compare, evaluate |
| automation_pipeline | automate, pipeline, schedule, workflow, cron, bulk, batch process |
| batch_processing | batch processing, batch job, bulk run |
| large_reasoning | configured route key only; no keyword classifier yet |
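Keyword classification of this kind can be sketched as an ordered list of rules with first-match-wins semantics. The sketch below uses only the sample keywords from the table; the function name classifyTaskType, the rule order, and the empty-string result for "no match" are assumptions, and the real ClassifyTaskType() may differ. batch_processing is checked before automation_pipeline here because their keywords overlap ("batch process" is a prefix of "batch processing").

```go
package main

import (
	"fmt"
	"strings"
)

// taskRule pairs a task type with the keywords that trigger it.
type taskRule struct {
	taskType string
	keywords []string
}

// taskRules uses the sample keywords from the table. The order is an
// assumption: the more specific batch_processing rule is checked before
// automation_pipeline because their keywords overlap.
var taskRules = []taskRule{
	{"code_generation", []string{"implement", "refactor", "fix bug", "write a function", "add endpoint"}},
	{"deep_reasoning", []string{"explain", "analyze", "architect", "design a system", "compare", "evaluate"}},
	{"batch_processing", []string{"batch processing", "batch job", "bulk run"}},
	{"automation_pipeline", []string{"automate", "pipeline", "schedule", "workflow", "cron", "bulk", "batch process"}},
}

// classifyTaskType is a hypothetical stand-in for ClassifyTaskType().
// It returns the first task type whose keyword appears in the prompt,
// or "" when nothing matches.
func classifyTaskType(prompt string) string {
	lower := strings.ToLower(prompt)
	for _, rule := range taskRules {
		for _, kw := range rule.keywords {
			if strings.Contains(lower, kw) {
				return rule.taskType
			}
		}
	}
	return ""
}

func main() {
	for _, p := range []string{
		"Please refactor this handler",
		"Run the nightly batch job",
		"Schedule a cron workflow",
		"hello there",
	} {
		fmt.Printf("%q -> %q\n", p, classifyTaskType(p))
	}
}
```

With first-match-wins, rule order decides ambiguous prompts, so overlapping keyword sets are worth keeping in mind when reading the table.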
# ~/.olympus/config.yaml
routing:
  task_routes:
    deep_reasoning:
      primary: claude_pro
      fallback: copilot
    code_generation:
      primary: ollama
      fallback: claude_pro
claude_pro and claude_api are aliases — both resolve to the
claude provider internally. Which variant is active depends on whether an
OAuth token or an API key is configured.
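Alias resolution and route lookup can be sketched with two small maps. The type taskRoute, the helper resolveProvider, and the map layout are hypothetical; only the alias names, the claude target, and the route values come from the text and config above.

```go
package main

import "fmt"

// taskRoute mirrors one entry under routing.task_routes (hypothetical type).
type taskRoute struct {
	Primary  string
	Fallback string
}

// providerAliases maps config names to the internal provider name.
var providerAliases = map[string]string{
	"claude_pro": "claude",
	"claude_api": "claude",
}

// resolveProvider returns the internal provider name for a config name.
// Names without an alias entry are returned unchanged.
func resolveProvider(name string) string {
	if canonical, ok := providerAliases[name]; ok {
		return canonical
	}
	return name
}

func main() {
	// Routes copied from the example config above.
	routes := map[string]taskRoute{
		"deep_reasoning":  {Primary: "claude_pro", Fallback: "copilot"},
		"code_generation": {Primary: "ollama", Fallback: "claude_pro"},
	}

	r := routes["deep_reasoning"]
	fmt.Println(resolveProvider(r.Primary), resolveProvider(r.Fallback))
}
```

Whether the OAuth or API-key variant is used is decided elsewhere and is not modelled here.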
Summarization, Embeddings, and Fallback
Summarization and Embeddings capabilities always route to Ollama regardless
of local_first or complexity settings. This is a hard-coded preference — these
operations should never incur cloud token cost.
Hermes.Fallback(current) returns the next provider after
current in the cloud priority order. Zeus uses this when a provider call fails
mid-stream, allowing automatic retry on the next tier without restarting the request.
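A "next provider in priority order" lookup can be sketched as a linear scan. The free function fallback, its signature, the boolean "found" result, and the example priority list are assumptions; the actual Hermes.Fallback(current) may signal the end of the list differently.

```go
package main

import "fmt"

// fallback returns the provider that follows current in order.
// It reports false when current is the last entry or is not in the list.
// This is a hypothetical sketch, not the Hermes.Fallback implementation.
func fallback(order []string, current string) (string, bool) {
	for i, name := range order {
		if name == current && i+1 < len(order) {
			return order[i+1], true
		}
	}
	return "", false
}

func main() {
	// Illustrative cloud priority order; "example_cloud" is a placeholder.
	order := []string{"claude", "copilot", "example_cloud"}

	next, ok := fallback(order, "claude")
	fmt.Println(next, ok)

	next, ok = fallback(order, "example_cloud")
	fmt.Println(next, ok)
}
```

A caller would retry the request on the returned provider and stop when the second result is false.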
Offline mode (routing.offline_mode: true) bypasses the entire
decision tree and returns Ollama directly. All other routing logic is skipped.
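The two unconditional Ollama paths described in this section can be sketched as early returns that run before any other routing logic. The Capability type, its constant names, and the helper preRoute are hypothetical, and checking offline mode first is inferred from the statement that it bypasses the entire tree.

```go
package main

import "fmt"

// Capability is a hypothetical enum for the capability passed to the router.
type Capability int

const (
	Chat Capability = iota
	Summarization
	Embeddings
)

// preRoute returns ("ollama", true) when a shortcut applies and the rest of
// the decision tree can be skipped. The ordering is an assumption.
func preRoute(offlineMode bool, capability Capability) (string, bool) {
	// Offline mode bypasses everything else.
	if offlineMode {
		return "ollama", true
	}
	// Summarization and Embeddings always stay local.
	if capability == Summarization || capability == Embeddings {
		return "ollama", true
	}
	// No shortcut: continue with the normal decision tree.
	return "", false
}

func main() {
	fmt.Println(preRoute(true, Chat))
	fmt.Println(preRoute(false, Embeddings))
	fmt.Println(preRoute(false, Chat))
}
```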
What Hermes requires from providers
Every provider — built-in or plugin — must implement the providers.Provider
interface. Hermes calls four methods on every candidate before routing:
IsAvailable() is checked at call time, not cached. A provider that becomes
unavailable mid-session (e.g. Ollama stops) is skipped immediately, and routing moves on to the next tier.
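The effect of an uncached availability check can be sketched with a reduced interface. This sketch models only IsAvailable() from the text plus a hypothetical Name() method; it is not the full providers.Provider interface, and stubProvider and firstAvailable are illustrative names.

```go
package main

import "fmt"

// Provider is a reduced, hypothetical view of providers.Provider.
// Only IsAvailable() comes from the text; Name() is added for the demo.
type Provider interface {
	Name() string
	IsAvailable() bool
}

// stubProvider is a test double whose availability can be flipped at runtime.
type stubProvider struct {
	name string
	up   bool
}

func (s *stubProvider) Name() string      { return s.name }
func (s *stubProvider) IsAvailable() bool { return s.up }

// firstAvailable asks each tier for its availability at call time and
// returns the first provider that reports true.
func firstAvailable(tiers []Provider) (Provider, bool) {
	for _, p := range tiers {
		if p.IsAvailable() {
			return p, true
		}
	}
	return nil, false
}

func main() {
	ollama := &stubProvider{name: "ollama", up: true}
	claude := &stubProvider{name: "claude", up: true}
	tiers := []Provider{ollama, claude}

	p, _ := firstAvailable(tiers)
	fmt.Println(p.Name())

	// Simulate Ollama stopping mid-session: the next call skips it.
	ollama.up = false
	p, _ = firstAvailable(tiers)
	fmt.Println(p.Name())
}
```

Because nothing is cached, the second call observes the change without any invalidation step.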