Skip to content

Add an AI provider

If the app you're building calls an AI model from its server-side code (RAG, chatbots, agents), bind an AI provider so the deployed app has an API key.

This is a different thing from your harness's model

  • Your harness (Claude Code, Cursor, Roo) uses its own model config — that's the AI driving the development.
  • The AI your app calls at runtime is configured here, in agentry, so it's bound into every sandbox + deployment.

You can use different providers for each. Common pattern: Claude in your harness, OpenRouter bound for server-side calls.

OpenRouter fronts ~200 models behind a single OpenAI-compatible API. Bind it once, get one env var, swap models with a code change.

bash
agentry service bind openrouter

Prompts for your OpenRouter key. Stamps:

OPENROUTER_API_KEY=<your-key>
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1

Your app's code (using the OpenAI SDK against OpenRouter):

ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: process.env.OPENROUTER_BASE_URL,
  apiKey: process.env.OPENROUTER_API_KEY,
  defaultHeaders: {
    "HTTP-Referer": "https://your-app.agentry.live",
    "X-Title": "your-app",
  },
});

const res = await client.chat.completions.create({
  model: "anthropic/claude-3.5-sonnet",       // or "openai/gpt-4o", etc.
  messages: [{ role: "user", content: "hi" }],
});

Switch between Claude, GPT, Gemini, Llama, DeepSeek — change the model string. Done.

Anthropic

bash
agentry service bind anthropic

Stamps:

ANTHROPIC_API_KEY=<your-key>

Code:

ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const res = await client.messages.create({
  model: "claude-3-5-sonnet-latest",
  max_tokens: 1024,
  messages: [{ role: "user", content: "hi" }],
});

OpenAI

bash
agentry service bind openai

Stamps:

OPENAI_API_KEY=<your-key>

Code:

ts
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const res = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "hi" }],
});

Other providers

Groq, Mistral, Together, Fireworks, and most other model APIs work the same way — either via OpenRouter (one key, every provider) or with a dedicated binding. See agentry service list for what's available; if the provider you want isn't there, use a custom binding.

Self-hosted models

Running your own model on your server (Ollama, vLLM, llama.cpp)? Set the endpoint as a sandbox env var:

bash
agentry env set LOCAL_LLM_URL http://localhost:11434

Your app code reads LOCAL_LLM_URL directly. To carry it to production, set the same var in the deployment's Env tab when you ship.

If you'd rather only have it in production and not in dev sandboxes, skip the agentry env set and just add LOCAL_LLM_URL straight in the deployment's Env tab in the dashboard.

Switching providers mid-project

Because every major provider follows the OpenAI-shaped API (or has a wrapper), changing providers in code is usually a one-line change. The app keeps working; only the model strings (and possibly the SDK package) differ.

Example refactor request to the agent:

Switch this app from Anthropic to OpenRouter. Use the same models I'm using now (claude-3-5-sonnet). Read the key from OPENROUTER_API_KEY and the base URL from OPENROUTER_BASE_URL. Update the SDK import to openai. Then run the build.

The agent handles the migration in one turn.

Cost notes

agentry never sees your provider bill. You're billed by your provider directly:

  • Anthropic / OpenAI / etc. — by token, at their published rates.
  • OpenRouter — by token, at provider rates + a small (~5%) markup for the routing convenience.
  • Self-hosted — free except for your hardware.

The trade-off with OpenRouter is paying the markup in exchange for instant model-switching and a single bill. For early prototyping it's almost always worth it; for high-volume production you may want to bind providers individually and pay them direct.

Next

agentry — run AI-built apps on your own hardware.