GitHub - workweave/router: Model router for agentic systems. Routes every prompt to the right model in <50ms. Cut costs 40-70% with just an endpoint change.

**One endpoint. Every model. Always the right one.**
A drop-in proxy for Anthropic, OpenAI, and Gemini that picks the best model for _every_ request using a tiny on-box embedder, not a vibes-based prompt.

_Built by Weave: The #1 engineering intelligence platform, loved by Robinhood, PostHog, Reducto, and hundreds of others._
* * *
What it does
Point Claude Code, Codex, Cursor, or your own app at `localhost:8080`. The router:
- 🎯 **Routes per request.** A cluster scorer derived from Avengers-Pro[^1] picks the right model from your enabled providers, every turn.
- 🔌 **Speaks everyone's API.** Anthropic Messages, OpenAI Chat Completions, Gemini native. Streaming, tools, vision, the works.
- 🧠 **Knows OSS too.** DeepSeek, Kimi, GLM, Qwen, Llama, Mistral via OpenRouter (or any OpenAI-compatible endpoint).
- 🔒 **BYOK by default.** Provider keys stay on your box, encrypted at rest.
- 📊 **Observable.** OTLP traces out of the box. See them in the Weave dashboard (http://localhost:8080/ui/dashboard) or drop in Honeycomb, Datadog, Grafana, whatever.
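To make the "cluster scorer" idea concrete, here is a minimal sketch of cluster-based routing, loosely following the approach described in the cited Avengers-Pro paper: embed the prompt, find the nearest cluster, and pick the model with the best quality/efficiency trade-off for that cluster. This is an illustration only, not the router's actual implementation. The function names, the `alpha` weight, the two-dimensional "embeddings", and the model names and scores are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def pick_model(query_vec, clusters, alpha=0.7, top_k=1):
    """Return the model with the best blended score in the nearest clusters.

    clusters: list of {"centroid": [...], "scores": {model: (quality, efficiency)}}
    alpha:    hypothetical weight on quality; (1 - alpha) goes to efficiency.
    """
    nearest = sorted(
        clusters,
        key=lambda c: cosine(query_vec, c["centroid"]),
        reverse=True,
    )[:top_k]
    totals = {}
    for cluster in nearest:
        for model, (quality, efficiency) in cluster["scores"].items():
            blended = alpha * quality + (1 - alpha) * efficiency
            totals[model] = totals.get(model, 0.0) + blended
    return max(totals, key=totals.get)

# Made-up data: cluster 0 is "easy" prompts, cluster 1 is "hard" prompts.
CLUSTERS = [
    {"centroid": [1.0, 0.0],
     "scores": {"big-model": (0.90, 0.2), "small-model": (0.85, 0.9)}},
    {"centroid": [0.0, 1.0],
     "scores": {"big-model": (0.95, 0.2), "small-model": (0.30, 0.9)}},
]

easy_choice = pick_model([0.9, 0.1], CLUSTERS)
hard_choice = pick_model([0.1, 0.9], CLUSTERS)
```

With these made-up numbers, the prompt near the "easy" cluster goes to the cheaper model because its quality is nearly as good, while the prompt near the "hard" cluster goes to the larger one.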
30-second quickstart
The fastest way: point Claude Code, Codex, or opencode at the **hosted** Weave Router with one command. No clone, no Docker, no Postgres.
```bash
npx @workweave/router
```
That's it. The installer asks which tool (Claude Code, Codex, or opencode), walks you through scope (user vs. project), grabs a router key, and wires the right config file. Other flavors:
```bash
npx @workweave/router --claude          # skip the picker, Claude Code
npx @workweave/router --codex           # skip the picker, OpenAI Codex CLI
npx @workweave/router --opencode        # skip the picker, opencode
npx @workweave/router --scope project   # per-repo, commits settings.json (or .codex/ / opencode.json)
npx @workweave/router --local           # self-hosted localhost:8080
npx @workweave/router --base-url https://router.acme.internal
npx @workweave/router@0.1.0             # pin a version
```
Requires Node ≥ 18 (Claude Code and opencode paths also need `jq`). Full flag reference: install/npm/README.md.
Or: self-host the whole stack
If you want the router (and dashboard) running on your own box:
```bash
# 1. Drop a provider key in. OpenRouter is the recommended baseline.
echo "OPENROUTER_API_KEY=sk-or-v1-..." >> .env.local

# 2. Boot Postgres + router on :8080 and seed an rk_ key.
make full-setup
```
The router is up at http://localhost:8080, the dashboard at http://localhost:8080/ui/ (password: `admin`), and your `rk_...` key prints in the logs.
```bash
# Call it like Anthropic
curl -sS http://localhost:8080/v1/messages \
  -H "Authorization: Bearer rk_..." \
  -d '{"model":"claude-sonnet-4-5","max_tokens":256,
       "messages":[{"role":"user","content":"hi"}]}'

# ...or like OpenAI
curl -sS http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer rk_..." \
  -d '{"model":"gpt-4o-mini",
       "messages":[{"role":"user","content":"hi"}]}'

# Peek at the routing decision without proxying
curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..." -d '...'
```
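For "your own app", the Anthropic-style call above might look roughly like this in Python, using only the standard library. This is a sketch under assumptions: the helper names `build_messages_request` and `send` are made up for illustration, the `Content-Type` header is an addition that the curl example does not set, and the response is assumed to be a JSON body.

```python
import json
import urllib.request

ROUTER_URL = "http://localhost:8080"  # self-hosted default from this README

def build_messages_request(router_key, prompt,
                           model="claude-sonnet-4-5", max_tokens=256):
    """Build (but do not send) an Anthropic Messages request for the router."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ROUTER_URL + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {router_key}",  # the rk_... router key
            "Content-Type": "application/json",       # assumption, not in the curl example
        },
        method="POST",
    )

def send(request, timeout=30):
    """Send the request and decode the response, assumed here to be JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

Calling `send(build_messages_request("rk_...", "hi"))` needs a router listening on `localhost:8080`; building the request on its own does not touch the network.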
Wire it into your tools
**Claude Code.** Run `make install-cc` to point Claude Code at the local self-hosted router (it's also invoked automatically at the end of `make full-setup`). For the hosted router, use `npx @workweave/router` above.
**Codex** (OpenAI CLI). `npx @workweave/router --codex` patches `~/.codex/config.toml` (or `<repo>/.codex/config.toml` with `--scope project`) with a managed `[model_providers.weave]` block and sets `model_provider = "weave"`. Codex's existing `OPENAI_API_KEY` flows through to api.openai.com for the plan-based passthrough; the router key rides in an `X-Weave-Router-Key` HTTP header. Re-install and `--uninstall --codex` rewrite/remove only the managed block, leaving the rest of your Codex config untouched.
**opencode.** `npx @workweave/router --opencode` merges a `provider.weave` entry into `~/.config/opencode/opencode.json` (or `<repo>/opencode.json` with `--scope project`). It uses opencode's bundled `@ai-sdk/anthropic` provider pointed at the router's `/v1` endpoint. Because the router speaks the Anthropic Messages API natively, opencode works unmodified. The router key and identity headers ride alongside the provider config; re-install rewrites only the managed block and `--uninstall --opencode` strips it.
**Cursor** _(early beta, performance may not be the best)._ Settings → Models → _Override OpenAI Base URL_ → `http://localhost:8080/v1`, paste `rk_...` as the API key.
**Switching on/off.** After installing, `npx @workweave/router off --claude` (or `--codex` / `--opencode`) routes that client straight to its provider again without discarding the router config; `on` flips it back, and `status` reports which way it's pointing. Claude Code also gets `/router-off`, `/router-on`, and `/router-status` slash commands. Cursor toggles via the same Settings → Models override above. See install/README.md.
> Two keys, don't mix them up:
>
> * `sk-or-...` / `sk-ant-...` / `sk-...` = your **upstream** provider key. Lives in `.env.local`.
> * `rk_...` = your **router** key. Clients send this as a Bearer token.
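A client can guard against mixing the two keys up with a small prefix check. The sketch below is based only on the prefixes listed above; `classify_key` and `authorization_header` are hypothetical helper names, not part of the router.

```python
def classify_key(key):
    """Classify a key by prefix: 'router', 'upstream', or 'unknown'."""
    if key.startswith("rk_"):
        return "router"
    if key.startswith("sk-"):  # covers sk-or-..., sk-ant-..., and plain sk-...
        return "upstream"
    return "unknown"

def authorization_header(key):
    """Build the Bearer header for the router, refusing anything but an rk_ key."""
    kind = classify_key(key)
    if kind != "router":
        raise ValueError(f"expected an rk_ router key, got a key classified as {kind}")
    return {"Authorization": f"Bearer {key}"}
```

Failing fast here keeps an upstream provider key from being sent to the router as a Bearer token by accident.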
Endpoints
| Endpoint | Format |
| --- | --- |
| `POST /v1/messages` | Anthropic Messages, routed |
| `POST /v1/chat/completions` | OpenAI Chat Completions, routed |
| `POST /v1beta/models/:action` | Gemini `generateContent`, routed |
| `POST /v1/route` | Returns the decision, no upstream call |
| `GET /v1/models` · `POST /v1/messages/count_tokens` | Anthropic passthrough |
| `GET /health` · `GET /validate` | liveness + key check |
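In client code, the routed endpoints in the table can be captured as a small lookup, which is convenient when the base URL differs between the local and a self-hosted deployment. This is a minimal sketch: `ROUTED_PATHS` and `endpoint_url` are hypothetical names, and the Gemini route is left out because the exact form of its `:action` segment is not spelled out here.

```python
ROUTED_PATHS = {
    "anthropic": "/v1/messages",
    "openai": "/v1/chat/completions",
}

def endpoint_url(api_format, base_url="http://localhost:8080"):
    """Return the full routed URL for an API format ('anthropic' or 'openai')."""
    try:
        path = ROUTED_PATHS[api_format]
    except KeyError:
        raise ValueError(f"unknown API format: {api_format!r}") from None
    # Tolerate a trailing slash on the base URL so the path is not doubled up.
    return base_url.rstrip("/") + path
```

For example, `endpoint_url("anthropic", "https://router.acme.internal/")` yields the Messages endpoint on that host.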
Deeper docs
- 📐 **Configuration reference**: every env var, BYOK encryption, OTel knobs, cluster routing.
- 🛠️ **Contributing**: layering rules, hot-reload dev, migrations, tests, the whole engineering loop.
- 🏗️ **Architecture**: package layout, import contracts, recipes for adding endpoints / providers / strategies.
* * *
Footnotes
[^1]: Zhang, Y. et al. _Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing_ (Avengers-Pro). arXiv:2508.12631, 2025. https://arxiv.org/abs/2508.12631