GitHub - workweave/router: Model router for agentic systems. Routes every prompt to the right model in <50ms. Cut costs 40-70% with just an endpoint change.

Source: https://github.com/workweave/router


**One endpoint. Every model. Always the right one.**

A drop-in proxy for Anthropic, OpenAI, and Gemini that picks the best model for _every_ request using a tiny on-box embedder, not a vibes-based prompt.

[Weave Badge](https://app.workweave.ai/reports/repository/org_QWsHDcRQWQEs6RpkdEZrlFK8/https%3A%2F%2Fgithub.com/1222789989) · [Go](https://github.com/workweave/router/blob/main/go.mod) · [Tests](https://github.com/workweave/router/actions/workflows/test.yml) · [License: ELv2](https://www.elastic.co/licensing/elastic-license)

_Built by Weave: The #1 engineering intelligence platform, loved by Robinhood, PostHog, Reducto, and hundreds of others._

* * *

What it does

Point Claude Code, Codex, Cursor, or your own app at `localhost:8080`. The router:

- 🎯 **Routes per request.** A cluster scorer derived from Avengers-Pro [1] picks the right model from your enabled providers, every turn.

- 🔌 **Speaks everyone's API.** Anthropic Messages, OpenAI Chat Completions, Gemini native. Streaming, tools, vision, the works.

- 🧠 **Knows OSS too.** DeepSeek, Kimi, GLM, Qwen, Llama, Mistral via OpenRouter (or any OpenAI-compatible endpoint).

- 🔒 **BYOK by default.** Provider keys stay on your box, encrypted at rest.

- 📊 **Observable.** OTLP traces out of the box. See them in the Weave dashboard (http://localhost:8080/ui/dashboard) or drop in Honeycomb, Datadog, Grafana, whatever.
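The per-request routing bullet above mentions a cluster scorer. As a rough illustration of that general idea (not the router's actual implementation), the toy sketch below maps a prompt embedding to its nearest cluster centroid and then picks the model with the best quality/cost trade-off for that cluster. Every name and number here (`pick_model`, `alpha`, the centroids, scores, and costs) is hypothetical.

```python
import math

def nearest_cluster(embedding, centroids):
    """Return the index of the centroid closest to the embedding (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(embedding, centroids[i]))

def pick_model(embedding, centroids, scores, costs, alpha=0.8):
    """Pick the model with the best quality/cost trade-off in the nearest cluster.

    scores[c][m]: hypothetical quality of model m on cluster c, in [0, 1].
    costs[m]:     hypothetical relative cost of model m, in [0, 1] (lower is cheaper).
    alpha:        weight on quality; (1 - alpha) is the weight on cheapness.
    """
    cluster = nearest_cluster(embedding, centroids)

    def tradeoff(model):
        return alpha * scores[cluster][model] + (1 - alpha) * (1 - costs[model])

    return max(scores[cluster], key=tradeoff)

# Made-up data: cluster 0 holds "easy" prompts, cluster 1 holds "hard" ones.
centroids = [[0.0, 0.0], [1.0, 1.0]]
scores = [
    {"small": 0.90, "large": 0.95},  # both models do well on easy prompts
    {"small": 0.40, "large": 0.90},  # only the large model does well on hard ones
]
costs = {"small": 0.1, "large": 1.0}

easy_choice = pick_model([0.1, 0.0], centroids, scores, costs)
hard_choice = pick_model([0.9, 1.1], centroids, scores, costs)
```

With this data the easy prompt goes to the small model and the hard prompt to the large one. Lowering `alpha` shifts the balance toward cheaper models.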

30-second quickstart

The fastest way: point Claude Code, Codex, or opencode at the **hosted** Weave Router with one command. No clone, no Docker, no Postgres.

    npx @workweave/router

That's it. The installer asks which tool (Claude Code, Codex, or opencode), walks you through scope (user vs. project), grabs a router key, and wires the right config file. Other flavors:

    npx @workweave/router --claude          # skip the picker, Claude Code
    npx @workweave/router --codex           # skip the picker, OpenAI Codex CLI
    npx @workweave/router --opencode        # skip the picker, opencode
    npx @workweave/router --scope project   # per-repo, commits settings.json (or .codex/ / opencode.json)
    npx @workweave/router --local           # self-hosted localhost:8080
    npx @workweave/router --base-url https://router.acme.internal
    npx @workweave/router@0.1.0             # pin a version

Requires Node ≥ 18 (Claude Code and opencode paths also need `jq`). Full flag reference: install/npm/README.md.

Or: self-host the whole stack

If you want the router (and dashboard) running on your own box:

1. Drop a provider key in. OpenRouter is the recommended baseline.

echo "OPENROUTER_API_KEY=sk-or-v1-..." >> .env.local

2. Boot Postgres + router on :8080 and seed an rk_ key.

make full-setup

The router is up at http://localhost:8080, the dashboard at http://localhost:8080/ui/ (password: `admin`), and your `rk_...` key prints in the logs.

Call it like Anthropic

    curl -sS http://localhost:8080/v1/messages \
      -H "Authorization: Bearer rk_..." \
      -d '{"model":"claude-sonnet-4-5","max_tokens":256, "messages":[{"role":"user","content":"hi"}]}'

...or like OpenAI

    curl -sS http://localhost:8080/v1/chat/completions \
      -H "Authorization: Bearer rk_..." \
      -d '{"model":"gpt-4o-mini", "messages":[{"role":"user","content":"hi"}]}'

Peek at the routing decision without proxying

    curl -sS http://localhost:8080/v1/route -H "Authorization: Bearer rk_..." -d '...'
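If you'd rather call the router from your own app than from curl, the same requests can be built in Python. This is a minimal sketch using only the standard library. `build_request` and `send` are hypothetical helper names, the explicit `Content-Type` header is an assumption (the curl examples above don't set one), and the `/v1/route` payload is left to the caller because it is elided above.

```python
import json
import urllib.request

def build_request(base_url, router_key, path, payload):
    """Build (but do not send) a JSON POST to the router, authenticated with an rk_ key."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {router_key}",
            "Content-Type": "application/json",  # assumption: not shown in the curl examples
        },
        method="POST",
    )

def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Same bodies as the curl examples; "rk_..." stands in for your real router key.
anthropic_req = build_request(
    "http://localhost:8080", "rk_...", "/v1/messages",
    {"model": "claude-sonnet-4-5", "max_tokens": 256,
     "messages": [{"role": "user", "content": "hi"}]},
)
openai_req = build_request(
    "http://localhost:8080", "rk_...", "/v1/chat/completions",
    {"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]},
)

# With a router running locally and a real key:
#   reply = send(anthropic_req)
```

Building the request separately from sending it keeps the example easy to inspect without a running router.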

Wire it into your tools

**Claude Code.** Run `make install-cc` to point Claude Code at the local self-hosted router (it's also invoked automatically at the end of `make full-setup`). For the hosted router, use `npx @workweave/router` above.

**Codex** (OpenAI CLI). `npx @workweave/router --codex` patches `~/.codex/config.toml` (or `<repo>/.codex/config.toml` with `--scope project`) with a managed `[model_providers.weave]` block and sets `model_provider = "weave"`. Codex's existing `OPENAI_API_KEY` flows through to api.openai.com for the plan-based passthrough; the router key rides in an `X-Weave-Router-Key` HTTP header. Re-install and `--uninstall --codex` rewrite/remove only the managed block, leaving the rest of your Codex config untouched.

**opencode.** `npx @workweave/router --opencode` merges a `provider.weave` entry into `~/.config/opencode/opencode.json` (or `<repo>/opencode.json` with `--scope project`). It uses opencode's bundled `@ai-sdk/anthropic` provider pointed at the router's `/v1` endpoint. The router speaks the Anthropic Messages API natively, so opencode works unmodified. The router key and identity headers ride alongside the provider config; re-install rewrites only the managed block and `--uninstall --opencode` strips it.

**Cursor** _(early beta, performance may not be the best)._ Settings → Models → _Override OpenAI Base URL_ → `http://localhost:8080/v1`, paste `rk_...` as the API key.

**Switching on/off.** After installing, `npx @workweave/router off --claude` (or `--codex` / `--opencode`) routes that client straight to its provider again without discarding the router config; `on` flips it back, and `status` reports which way it's pointing. Claude Code also gets `/router-off`, `/router-on`, and `/router-status` slash commands. Cursor toggles via the same Settings → Models override above. See install/README.md.

> Two keys, don't mix them up:
>
> * `sk-or-...` / `sk-ant-...` / `sk-...` = your **upstream** provider key. Lives in `.env.local`.
> * `rk_...` = your **router** key. Clients send this as a Bearer token.
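A small client-side guard can catch the mix-up before a request goes out. This is a sketch that assumes the prefixes listed above are the only ones in play; `classify_key` and `require_router_key` are hypothetical names, not part of the router.

```python
# Prefixes taken from the note above.
UPSTREAM_PREFIXES = ("sk-or-", "sk-ant-", "sk-")

def classify_key(key):
    """Classify a key as 'router', 'upstream', or 'unknown' by its prefix."""
    if key.startswith("rk_"):
        return "router"
    if key.startswith(UPSTREAM_PREFIXES):
        return "upstream"
    return "unknown"

def require_router_key(key):
    """Return the key unchanged if it is a router key, otherwise raise ValueError."""
    kind = classify_key(key)
    if kind != "router":
        raise ValueError(f"expected an rk_ router key, got {kind} key")
    return key
```

Calling `require_router_key` just before setting the Bearer token keeps an upstream provider key from being sent to the router by accident.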

Endpoints

| Endpoint | Format |
| --- | --- |
| `POST /v1/messages` | Anthropic Messages, routed |
| `POST /v1/chat/completions` | OpenAI Chat Completions, routed |
| `POST /v1beta/models/:action` | Gemini `generateContent`, routed |
| `POST /v1/route` | Returns the decision, no upstream call |
| `GET /v1/models` · `POST /v1/messages/count_tokens` | Anthropic passthrough |
| `GET /health` · `GET /validate` | liveness + key check |
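The `GET /health` endpoint is handy for waiting until a self-hosted router is ready. A minimal sketch of a liveness probe follows. It assumes an HTTP 200 means the router is live (the response body is not documented here), and `is_healthy` with its injectable `opener` parameter is a hypothetical helper.

```python
import urllib.error
import urllib.request

def is_healthy(base_url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/health answers with HTTP 200, else False."""
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status raised by urlopen.
        return False

# With a router running locally:
#   is_healthy("http://localhost:8080")
```

The `opener` argument exists only so the probe can be exercised without a live server; in normal use the default `urllib.request.urlopen` is enough.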

Deeper docs


- 📐 **Configuration reference**: every env var, BYOK encryption, OTel knobs, cluster routing.

- 🛠️ **Contributing**: layering rules, hot-reload dev, migrations, tests, the whole engineering loop.

- 🏗️ **Architecture**: package layout, import contracts, recipes for adding endpoints / providers / strategies.


* * *

Footnotes

1. Zhang, Y. et al. _Beyond GPT-5: Making LLMs Cheaper and Better via Performance–Efficiency Optimized Routing_ (Avengers-Pro). arXiv:2508.12631, 2025. https://arxiv.org/abs/2508.12631