Providers & Auth

rupu speaks to four built-in model providers at once — plus any OpenAI-compatible endpoint you register — and because every agent picks its own provider, model, and credential, a single workflow can run a planner on one model and reviewers on another in the same run.

The multi-provider model

rupu is not pinned to one vendor. It ships working integrations for Anthropic, OpenAI, Gemini, and GitHub Copilot, and the choice is made per agent, not globally. Each agent file declares its provider, model, and (optionally) auth mode in its YAML frontmatter — so when a workflow fans out across several agents, those agents can each talk to a different provider concurrently.

That means a real workflow might dispatch a planner step on a Claude model, two reviewer steps on GPT-5 and Gemini, and a long-context search step on Gemini's 1M-token window — all inside one run, each authenticating with its own credential.

Provider	What it's for	Auth options
`anthropic`	Claude models. The most-exercised provider in rupu.	Console API key or Claude.ai SSO (browser callback).
`openai`	GPT models via the Responses API.	Platform API key or ChatGPT SSO (browser callback). Different endpoints under the hood.
`gemini`	Gemini models with very long context windows.	SSO via Vertex / Gemini CLI OAuth (browser callback). API key via AI Studio is deferred.
`copilot`	Copilot-hosted models. Requires a paid Copilot subscription.	GitHub PAT (`GITHUB_TOKEN`) or GitHub device-code SSO.

Those four are the built-in providers, with first-class auth flows and curated model lists. But you are not limited to them: beyond the built-ins, you can register any OpenAI-compatible HTTP endpoint as a named provider and mix it into the same workflows — see Bring your own provider below.

Mixed providers in one run — each step authenticates with its own credential.

Choosing a provider/model per agent

Provider and model are agent-level decisions. An agent file is Markdown with a YAML frontmatter block; the relevant fields are provider, model, and the optional auth:

---
name: refactor
description: Suggest minimal-diff refactors.
provider: anthropic
auth: sso
model: claude-sonnet-4-6
---

You suggest minimal-diff refactors. Be concise.

If you omit model, rupu falls back to the provider's default_model from ~/.rupu/config.toml. If you omit auth, the credential resolver applies a default precedence (SSO if present and refreshable, then API key).

Note. Workflow steps inherit the provider/model/auth of the agent they invoke, and a step can override them — so the same agent can run on different models in different workflows without editing the agent file.

Authentication

Under the hood rupu has two neutral auth modes — api-key and sso — and the SSO mode resolves to the right OAuth flow per provider (a localhost browser callback for Anthropic / OpenAI / Gemini, or a GitHub device code for Copilot).

Mode	CLI flag	How it works	Providers
API key	`--mode api-key`	Paste the secret with `--key` or pipe it on stdin. Stored at `rupu/<provider>/api-key`.	all four
SSO — browser callback	`--mode sso`	Binds a localhost listener, opens the provider's authorize URL with a PKCE challenge, validates `state`, exchanges the code for tokens. No headless fallback.	anthropic, openai, gemini
SSO — device code	`--mode sso`	Prints a code + URL; you authorize from any browser anywhere while rupu polls. Works headless / over SSH.	copilot

SSO access tokens (typically ~1 hour) are refreshed pre-emptively — the resolver refreshes when expires_at - now < 60s on a credential read, using the stored refresh token. There is no automatic fall-back from SSO to API key: if you chose SSO, a refresh failure points you back at rupu auth login.

How credentials are stored

Primary — OS keychain. The keyring crate writes to the macOS Keychain (or Linux Secret Service over D-Bus). Credentials are never printed back.
Fallback — ~/.rupu/auth.json at mode 0600. Used when the keychain is unavailable (headless servers, no D-Bus). A one-time warning is printed and the mode bits are checked on every read.
Probe cache — ~/.rupu/cache/auth-backend.json. Records which backend was chosen so rupu doesn't re-probe on every invocation; invalidated on login.
Global only. auth.json is never read from a project directory — credentials live at the user level.

macOS no-prompt. rupu pre-populates each new keychain item's ACL with rupu's signing identity (via the rupu-keychain-acl shim), which eliminates the "Always Allow" first-prompt. Unsigned, freshly-built binaries are treated as a new code identity by macOS and may still prompt once per binary path.

The `rupu auth` CLI

The rupu auth subcommand manages stored credentials and the storage backend:

rupu auth login — store credentials for a provider (--provider, --mode api-key|sso, optional --key).
rupu auth logout — remove a stored credential (--provider [--mode], or --all [--yes]).
rupu auth status — show configured providers and the active backend; never prints secrets.
rupu auth backend — inspect or switch the storage backend between keychain and file.

# API key: pass it inline, or pipe it on stdin
rupu auth login --provider anthropic --mode api-key --key sk-ant-XXX
echo -n "$KEY" | rupu auth login --provider openai --mode api-key

# SSO (browser callback or device code, chosen per provider)
rupu auth login --provider copilot --mode sso

# Inspect and clean up
rupu auth status
rupu auth logout --provider gemini --mode sso
rupu auth logout --all --yes

# Switch the storage backend if the keychain drops credentials
rupu auth backend --use file

Models are discovered separately with rupu models: rupu models list shows the catalog across all four providers (with a --provider filter), and rupu models refresh re-fetches the live caches. An agent's model: value is resolved against custom config entries, the live cache, then a baked-in list.

Per-provider notes

Anthropic

Console API keys start with sk-ant- and are shown only once — copy at creation. SSO authenticates via Claude.ai's OAuth (claude.ai/oauth/authorize), good for Claude Pro subscribers. Access tokens last ~1 hour and refresh automatically.

OpenAI

Two paths hit different endpoints: a Platform API key (sk-...) targets api.openai.com/v1/responses and is billed via OpenAI; ChatGPT SSO targets the ChatGPT backend and is covered by a Plus/Pro subscription. rupu detects the chatgpt_account_id claim and routes accordingly. Org-scoped keys need org_id in config. Only the newer Responses API is supported — pre-Responses models won't work.

Gemini

Currently SSO only: the lifted client targets the Vertex AI / Gemini CLI OAuth path (accounts.google.com, cloud-platform scope). API-key auth via AI Studio is deferred (--mode api-key returns NotWiredInV0). You need a Google Cloud project with the Vertex AI API enabled and billing configured; set region in config for Vertex.

Copilot

Requires a paid Copilot Pro / Business / Enterprise subscription — a free GitHub account can't invoke the API even with a valid token. The simplest path is a GitHub PAT (ghp_...), or pick it up from GITHUB_TOKEN when --key is omitted. The device-code SSO flow is the same one gh auth login uses and is the only SSO flow that works headless. The exchanged Copilot token expires faster than the GitHub PAT; rupu re-mints it internally.

Bring your own provider (OpenAI-compatible)

The four built-ins are not the boundary. With the openai-compatible provider kind you can register any HTTP server that speaks the OpenAI /v1/chat/completions API as a named provider, then use it from agents exactly like a built-in — and mix it with the built-ins in one workflow. That covers:

Local models — vLLM, llama.cpp, or Ollama running on your own machine or a GPU box.
Gateways & aggregators — OpenRouter, Together, Groq, Fireworks, and similar OpenAI-shaped endpoints.
Private / enterprise endpoints — Oracle GenAI or an internal company gateway.

Authentication is a single static Bearer API key per provider — there is no SSO flow for openai-compatible providers. The model catalog is whatever you declare in config; rupu does not call /v1/models on these endpoints.

Configure it

Add a [providers.<name>] table to ~/.rupu/config.toml with kind = "openai-compatible". base_url and default_model are required; stream is optional (defaults to true; set false for servers without an SSE endpoint). Add one [[providers.<name>.models]] block per private or fine-tuned model you want to select.

default_provider = "oracle"

[providers.oracle]
kind = "openai-compatible"
base_url = "http://192.29.35.246:8080"
default_model = "/raid/models/zai-org/GLM-5.2-FP8"
stream = true   # set false if the server has no SSE endpoint

  [[providers.oracle.models]]
  id = "/raid/models/zai-org/GLM-5.2-FP8"
  context_window = 131072
  max_output = 8192

base_url may include or omit a trailing /v1 — rupu normalises both to <root>/v1/chat/completions. Each [[providers.<name>.models]] entry requires id; context_window and max_output are optional (defaulting to 32768 and 8192). Name the provider anything you like — oracle, vllm, together — except a reserved built-in name (anthropic, openai, gemini, copilot); rupu rejects the config otherwise.

Authenticate

Supply the static Bearer key with rupu auth login (stored in the keychain / auth.json, same as the built-ins), or — for CI and ephemeral environments — set the env var RUPU_<UPPERCASED_PROVIDER_NAME>_API_KEY and rupu reads it automatically.

# Interactive prompt (the key is not echoed), or pipe it on stdin
rupu auth login --provider oracle --mode api-key

# Or set the env var directly — pattern: RUPU_<PROVIDER>_API_KEY
export RUPU_ORACLE_API_KEY=sk-...

Use it from an agent

Point an agent at the provider by name; the model: can be any id the endpoint serves, including the custom [[providers.<name>.models]] entries that make private or fine-tuned models selectable.

---
name: oracle-codereview
description: Code review via Oracle GenAI.
provider: oracle
model: /raid/models/zai-org/GLM-5.2-FP8
---

You review code changes for correctness, style, and missing tests.

Then run it like any other agent, and confirm the custom catalog with rupu models list --provider oracle (entries show source custom):

rupu run --agent oracle-codereview

Example endpoints

Any of these can be dropped into the base_url above. When a path detail is unknown for your deployment, the generic http://host:port/v1 shape is a safe starting point.

Target	Rough `base_url`
Ollama (local)	`http://localhost:11434/v1`
vLLM (self-hosted)	`http://host:8000/v1`
llama.cpp server	`http://host:8080/v1`
OpenRouter	`https://openrouter.ai/api/v1`
Together	`https://api.together.xyz/v1`
Groq	`https://api.groq.com/openai/v1`
Fireworks	`https://api.fireworks.ai/inference/v1`
Oracle GenAI / internal gateway	`http://host:port`

Scope. openai-compatible providers report $0.00 in cost tracking (no pricing tables), and model listing returns only what you declare under [[providers.<name>.models]] — rupu never queries /v1/models on these endpoints.

← Sessions Integrations →

Providers & Auth

The multi-provider model

Choosing a provider/model per agent

Authentication

How credentials are stored

The rupu auth CLI

Per-provider notes

Anthropic

OpenAI

Gemini

Copilot

Bring your own provider (OpenAI-compatible)

Configure it

Authenticate

Use it from an agent

Example endpoints

The `rupu auth` CLI