Providers & Auth
rupu speaks to four built-in model providers at once — plus any OpenAI-compatible endpoint you register — and because every agent picks its own provider, model, and credential, a single workflow can run a planner on one model and reviewers on another in the same run.
The multi-provider model
rupu is not pinned to one vendor. It ships working integrations for Anthropic, OpenAI, Gemini, and GitHub Copilot, and the choice is made per agent, not globally. Each agent file declares its provider, model, and (optionally) auth mode in its YAML frontmatter — so when a workflow fans out across several agents, those agents can each talk to a different provider concurrently.
That means a real workflow might dispatch a planner step on a Claude model, two reviewer steps on GPT-5 and Gemini, and a long-context search step on Gemini's 1M-token window — all inside one run, each authenticating with its own credential.
| Provider | What it's for | Auth options |
|---|---|---|
anthropic |
Claude models. The most-exercised provider in rupu. | Console API key or Claude.ai SSO (browser callback). |
openai |
GPT models via the Responses API. | Platform API key or ChatGPT SSO (browser callback). Different endpoints under the hood. |
gemini |
Gemini models with very long context windows. | SSO via Vertex / Gemini CLI OAuth (browser callback). API key via AI Studio is deferred. |
copilot |
Copilot-hosted models. Requires a paid Copilot subscription. | GitHub PAT (GITHUB_TOKEN) or GitHub device-code SSO. |
Those four are the built-in providers, with first-class auth flows and curated model lists. But you are not limited to them: beyond the built-ins, you can register any OpenAI-compatible HTTP endpoint as a named provider and mix it into the same workflows — see Bring your own provider below.
Choosing a provider/model per agent
Provider and model are agent-level decisions. An agent file is Markdown with a YAML frontmatter block; the relevant fields are provider, model, and the optional auth:
--- name: refactor description: Suggest minimal-diff refactors. provider: anthropic auth: sso model: claude-sonnet-4-6 --- You suggest minimal-diff refactors. Be concise.
If you omit model, rupu falls back to the provider's default_model from ~/.rupu/config.toml. If you omit auth, the credential resolver applies a default precedence (SSO if present and refreshable, then API key).
Authentication
Under the hood rupu has two neutral auth modes — api-key and sso — and the SSO mode resolves to the right OAuth flow per provider (a localhost browser callback for Anthropic / OpenAI / Gemini, or a GitHub device code for Copilot).
| Mode | CLI flag | How it works | Providers |
|---|---|---|---|
| API key | --mode api-key |
Paste the secret with --key or pipe it on stdin. Stored at rupu/<provider>/api-key. |
all four |
| SSO — browser callback | --mode sso |
Binds a localhost listener, opens the provider's authorize URL with a PKCE challenge, validates state, exchanges the code for tokens. No headless fallback. |
anthropic, openai, gemini |
| SSO — device code | --mode sso |
Prints a code + URL; you authorize from any browser anywhere while rupu polls. Works headless / over SSH. | copilot |
SSO access tokens (typically ~1 hour) are refreshed pre-emptively — the resolver refreshes when expires_at - now < 60s on a credential read, using the stored refresh token. There is no automatic fall-back from SSO to API key: if you chose SSO, a refresh failure points you back at rupu auth login.
How credentials are stored
- Primary — OS keychain. The
keyringcrate writes to the macOS Keychain (or Linux Secret Service over D-Bus). Credentials are never printed back. - Fallback —
~/.rupu/auth.jsonat mode 0600. Used when the keychain is unavailable (headless servers, no D-Bus). A one-time warning is printed and the mode bits are checked on every read. - Probe cache —
~/.rupu/cache/auth-backend.json. Records which backend was chosen so rupu doesn't re-probe on every invocation; invalidated on login. - Global only.
auth.jsonis never read from a project directory — credentials live at the user level.
rupu-keychain-acl shim), which eliminates the "Always Allow" first-prompt. Unsigned, freshly-built binaries are treated as a new code identity by macOS and may still prompt once per binary path.The rupu auth CLI
The rupu auth subcommand manages stored credentials and the storage backend:
rupu auth login— store credentials for a provider (--provider,--mode api-key|sso, optional--key).rupu auth logout— remove a stored credential (--provider[--mode], or--all[--yes]).rupu auth status— show configured providers and the active backend; never prints secrets.rupu auth backend— inspect or switch the storage backend betweenkeychainandfile.
# API key: pass it inline, or pipe it on stdin rupu auth login --provider anthropic --mode api-key --key sk-ant-XXX echo -n "$KEY" | rupu auth login --provider openai --mode api-key # SSO (browser callback or device code, chosen per provider) rupu auth login --provider copilot --mode sso # Inspect and clean up rupu auth status rupu auth logout --provider gemini --mode sso rupu auth logout --all --yes # Switch the storage backend if the keychain drops credentials rupu auth backend --use file
Models are discovered separately with rupu models: rupu models list shows the catalog across all four providers (with a --provider filter), and rupu models refresh re-fetches the live caches. An agent's model: value is resolved against custom config entries, the live cache, then a baked-in list.
Per-provider notes
Anthropic
Console API keys start with sk-ant- and are shown only once — copy at creation. SSO authenticates via Claude.ai's OAuth (claude.ai/oauth/authorize), good for Claude Pro subscribers. Access tokens last ~1 hour and refresh automatically.
OpenAI
Two paths hit different endpoints: a Platform API key (sk-...) targets api.openai.com/v1/responses and is billed via OpenAI; ChatGPT SSO targets the ChatGPT backend and is covered by a Plus/Pro subscription. rupu detects the chatgpt_account_id claim and routes accordingly. Org-scoped keys need org_id in config. Only the newer Responses API is supported — pre-Responses models won't work.
Gemini
Currently SSO only: the lifted client targets the Vertex AI / Gemini CLI OAuth path (accounts.google.com, cloud-platform scope). API-key auth via AI Studio is deferred (--mode api-key returns NotWiredInV0). You need a Google Cloud project with the Vertex AI API enabled and billing configured; set region in config for Vertex.
Copilot
Requires a paid Copilot Pro / Business / Enterprise subscription — a free GitHub account can't invoke the API even with a valid token. The simplest path is a GitHub PAT (ghp_...), or pick it up from GITHUB_TOKEN when --key is omitted. The device-code SSO flow is the same one gh auth login uses and is the only SSO flow that works headless. The exchanged Copilot token expires faster than the GitHub PAT; rupu re-mints it internally.
Bring your own provider (OpenAI-compatible)
The four built-ins are not the boundary. With the openai-compatible provider kind you can register any HTTP server that speaks the OpenAI /v1/chat/completions API as a named provider, then use it from agents exactly like a built-in — and mix it with the built-ins in one workflow. That covers:
- Local models — vLLM, llama.cpp, or Ollama running on your own machine or a GPU box.
- Gateways & aggregators — OpenRouter, Together, Groq, Fireworks, and similar OpenAI-shaped endpoints.
- Private / enterprise endpoints — Oracle GenAI or an internal company gateway.
Authentication is a single static Bearer API key per provider — there is no SSO flow for openai-compatible providers. The model catalog is whatever you declare in config; rupu does not call /v1/models on these endpoints.
Configure it
Add a [providers.<name>] table to ~/.rupu/config.toml with kind = "openai-compatible". base_url and default_model are required; stream is optional (defaults to true; set false for servers without an SSE endpoint). Add one [[providers.<name>.models]] block per private or fine-tuned model you want to select.
default_provider = "oracle" [providers.oracle] kind = "openai-compatible" base_url = "http://192.29.35.246:8080" default_model = "/raid/models/zai-org/GLM-5.2-FP8" stream = true # set false if the server has no SSE endpoint [[providers.oracle.models]] id = "/raid/models/zai-org/GLM-5.2-FP8" context_window = 131072 max_output = 8192
base_url may include or omit a trailing /v1 — rupu normalises both to <root>/v1/chat/completions. Each [[providers.<name>.models]] entry requires id; context_window and max_output are optional (defaulting to 32768 and 8192). Name the provider anything you like — oracle, vllm, together — except a reserved built-in name (anthropic, openai, gemini, copilot); rupu rejects the config otherwise.
Authenticate
Supply the static Bearer key with rupu auth login (stored in the keychain / auth.json, same as the built-ins), or — for CI and ephemeral environments — set the env var RUPU_<UPPERCASED_PROVIDER_NAME>_API_KEY and rupu reads it automatically.
# Interactive prompt (the key is not echoed), or pipe it on stdin rupu auth login --provider oracle --mode api-key # Or set the env var directly — pattern: RUPU_<PROVIDER>_API_KEY export RUPU_ORACLE_API_KEY=sk-...
Use it from an agent
Point an agent at the provider by name; the model: can be any id the endpoint serves, including the custom [[providers.<name>.models]] entries that make private or fine-tuned models selectable.
--- name: oracle-codereview description: Code review via Oracle GenAI. provider: oracle model: /raid/models/zai-org/GLM-5.2-FP8 --- You review code changes for correctness, style, and missing tests.
Then run it like any other agent, and confirm the custom catalog with rupu models list --provider oracle (entries show source custom):
rupu run --agent oracle-codereview
Example endpoints
Any of these can be dropped into the base_url above. When a path detail is unknown for your deployment, the generic http://host:port/v1 shape is a safe starting point.
| Target | Rough base_url |
|---|---|
| Ollama (local) | http://localhost:11434/v1 |
| vLLM (self-hosted) | http://host:8000/v1 |
| llama.cpp server | http://host:8080/v1 |
| OpenRouter | https://openrouter.ai/api/v1 |
| Together | https://api.together.xyz/v1 |
| Groq | https://api.groq.com/openai/v1 |
| Fireworks | https://api.fireworks.ai/inference/v1 |
| Oracle GenAI / internal gateway | http://host:port |
$0.00 in cost tracking (no pricing tables), and model listing returns only what you declare under [[providers.<name>.models]] — rupu never queries /v1/models on these endpoints.