Agent Adapters
Adapters are the bridge between an agent and the language model that powers it. Capx ships three built-in adapters (Claude, OpenAI, and HTTP) and exposes a typed interface for building your own.
How adapters work
Every agent in your roster has exactly one adapter. When the runtime needs the agent to think, whether executing a playbook step, evaluating a rubric, or processing a heartbeat, it sends the request through the adapter. The adapter handles serialization, API authentication, retries, token counting, and cost tracking.
| Provider | Connects to | Highlights |
|---|---|---|
| claude | Anthropic's Claude models | Default adapter. Extended thinking, tool use, and vision. |
| openai | OpenAI's model family | All chat-completion models, function calling in the OpenAI format. |
| http | Any HTTP endpoint | Self-hosted, fine-tuned, or third-party inference backends. |
| custom | Your own plugin | Implements the typed AgentAdapter interface, registered as a plugin. |
Adapters are configured in the adapter block of each agent definition in your company.yaml. You can also override the adapter at the step level within a playbook, which lets a single agent use different models for different tasks.
Claude adapter
The Claude adapter connects to Anthropic's Claude models. It is the default adapter and the one most Capx companies use. It supports the full range of Claude capabilities including extended thinking, tool use, and vision.
| Model | Context window | Strengths | Cost tier |
|---|---|---|---|
| claude-opus-4 | 200K tokens | Highest capability. Complex reasoning, long documents, nuanced decisions. | High |
| claude-sonnet-4 | 200K tokens | Best balance of capability and cost. Default for most roles. | Medium |
| claude-haiku-4 | 200K tokens | Fastest and cheapest. High-volume tasks, classification, simple generation. | Low |
agents:
strategist:
role: strategist
adapter:
provider: claude
model: claude-sonnet-4
max_tokens: 4096 # Max output tokens per request
temperature: 0.7 # 0.0 = deterministic, 1.0 = creative
top_p: 0.95 # Nucleus sampling threshold
stop_sequences: [] # Optional stop sequences
extended_thinking: false # Enable for complex multi-step reasoningextended_thinking for your strategist agent when it handles complex planning or multi-step decisions. This increases latency and cost but significantly improves the quality of reasoning on hard problems.Vision support
All Claude models support vision. If a playbook step passes image data (from a tool output or an input), the Claude adapter automatically includes it in the request. No additional configuration is needed.
OpenAI adapter
The OpenAI adapter connects to OpenAI's model family. It supports all chat-completion models and handles function calling in the OpenAI format.
| Model | Context window | Strengths | Cost tier |
|---|---|---|---|
| o3 | 200K tokens | Best reasoning model. Complex analysis, math, code. | High |
| gpt-4.1 | 1M tokens | Large context, strong general capability. | Medium |
| gpt-4.1-mini | 1M tokens | Good capability at lower cost. Solid default. | Low |
| gpt-4.1-nano | 1M tokens | Cheapest. Best for classification and simple tasks. | Lowest |
agents:
engineer:
role: engineer
adapter:
provider: openai
model: gpt-4.1
max_tokens: 4096
temperature: 0.5
top_p: 1.0
frequency_penalty: 0.0
presence_penalty: 0.0capx secret set OPENAI_API_KEYfrom the CLI or add it in the Casa dashboard under Settings > Secrets.HTTP adapter
The HTTP adapter sends requests to any endpoint that accepts HTTP. This lets you connect agents to self-hosted models, fine-tuned models, third-party inference APIs, or any custom backend. The adapter supports configurable request/response formats, authentication headers, timeouts, and retry policies.
agents:
classifier:
role: custom
system_prompt: "You classify inbound messages by intent and urgency."
adapter:
provider: http
endpoint: "https://inference.example.com/v1/completions"
method: POST
headers:
Authorization: "Bearer ${{ secrets.INFERENCE_API_KEY }}"
Content-Type: "application/json"
body_template: |
{
"model": "my-fine-tuned-model",
"messages": ${{ messages_json }},
"max_tokens": 512,
"temperature": 0.3
}
response_path: "choices[0].message.content"
timeout: 30s
retry:
max_attempts: 3
backoff: exponential
initial_delay: 1s
max_delay: 30sBody template
The body_template field accepts a JSON template string with Capx expression variables. The runtime provides these variables at request time:
messages_jsonstringsystem_promptstringlast_messagestringtools_jsonstringResponse path
The response_path field uses dot notation to extract the model's response from the HTTP response body. If your endpoint returns an OpenAI-compatible format, use choices[0].message.content. For a simple { "text": "..." } response, use text.
Building custom adapters
If the built-in adapters do not cover your use case, you can build a custom adapter. Custom adapters implement the AgentAdapter typed interface and are registered as plugins.
Implement the AgentAdapter interface
interface AgentAdapter {
// Unique adapter identifier
readonly name: string;
// Send a chat completion request and return the response
complete(request: CompletionRequest): Promise<CompletionResponse>;
// Estimate the cost of a request in credits (before execution)
estimateCost(request: CompletionRequest): number;
// Return true if the adapter supports tool/function calling
supportsTools(): boolean;
// Return true if the adapter supports image/vision inputs
supportsVision(): boolean;
// Health check, called during agent spawn
healthCheck(): Promise<boolean>;
}
interface CompletionRequest {
messages: Message[];
systemPrompt: string;
tools?: ToolDefinition[];
maxTokens: number;
temperature: number;
stopSequences?: string[];
}
interface CompletionResponse {
content: string;
toolCalls?: ToolCall[];
usage: {
inputTokens: number;
outputTokens: number;
};
model: string;
latencyMs: number;
}Register the adapter as a plugin
Register your custom adapter in the company configuration, then reference it by name from any agent's adapter block:
adapters:
custom:
my-llama-adapter:
plugin: "./adapters/llama-adapter.js"
config:
endpoint: "http://localhost:8080/v1/chat"
model: "llama-3.3-70b"
agents:
researcher:
role: custom
system_prompt: "You are a research agent."
adapter:
provider: my-llama-adapterPer-agent model switching
Each agent in your roster can use a different model. This is the primary way to optimize cost and capability across your company. Assign expensive, high-capability models to agents that do complex reasoning, and cheaper, faster models to agents that handle routine tasks.
agents:
cofounder:
role: strategist
adapter:
provider: claude
model: claude-opus-4 # Best reasoning for strategic decisions
writer:
role: marketer
adapter:
provider: claude
model: claude-sonnet-4 # Strong content generation
developer:
role: engineer
adapter:
provider: openai
model: gpt-4.1 # Large context for codebases
triager:
role: support
adapter:
provider: claude
model: claude-haiku-4 # Fast and cheap for ticket routingPer-step model switching
Sometimes a single agent needs different models for different tasks. A marketer might use Sonnet for writing blog posts but Haiku for categorizing inbound messages. You can override the agent's default adapter at the step level in any playbook.
steps:
- id: categorize_messages
agent: marketer
adapter: # Overrides agent default
provider: claude
model: claude-haiku-4 # Cheap model for classification
tool: classify_messages
with:
source: inbox
categories: [lead, support, spam, partnership]
- id: draft_response
agent: marketer
# No adapter override, uses agent's default (claude-sonnet-4)
tool: generate_copy
with:
type: email_reply
context: "${{ steps.categorize_messages.output }}"
rubric:
score_min: 7
criteria: "Professional tone, addresses the specific inquiry, under 200 words"Fallback chains
A fallback chain defines a sequence of adapters to try if the primary one fails. This is useful for high-availability setups where you want to guarantee execution even if a provider experiences an outage. The runtime tries each adapter in order, moving to the next one only after the previous one fails or times out.
agents:
strategist:
role: strategist
adapter:
provider: claude
model: claude-sonnet-4
fallback:
- provider: openai
model: gpt-4.1
- provider: claude
model: claude-haiku-4| Behavior | Description |
|---|---|
| Primary succeeds | Response is returned normally. Fallbacks are not invoked. |
| Primary fails (timeout, 5xx, rate limit) | Runtime tries the first fallback adapter. |
| First fallback fails | Runtime tries the second fallback adapter. |
| All adapters fail | Step is marked as failed. Retry/escalation policy takes over. |
| Cost tracking | Credits are charged for whichever adapter actually succeeds. |
| Rubric evaluation | The rubric is evaluated using the adapter that produced the output. |
Managing adapter secrets
API keys and other credentials required by adapters are stored in company secrets. Never put API keys directly in your company.yaml file. Use the ${{ secrets.KEY_NAME }} syntax to reference them.
# Set a secret capx secret set OPENAI_API_KEY --company my-company # List secrets (values are redacted) capx secret list --company my-company # Remove a secret capx secret remove OPENAI_API_KEY --company my-company
Capx encrypts all secrets at rest and in transit. Secrets are only decrypted at the moment of adapter invocation and are never logged or included in activity feeds.
Best practices
- Use the Claude adapter as your default. It offers the best integration with Capx features like rubric evaluation, extended thinking, and vision. The OpenAI adapter is a strong alternative when you need specific OpenAI capabilities or have existing OpenAI-based tooling.
- Always configure fallback chains for production companies. A single-provider setup means your entire company stops if that provider has an outage. Even a simple fallback from Claude to OpenAI (or vice versa) dramatically improves reliability.
- Test adapter changes in staging first. You can create a staging copy of your company with
capx company clone --env staging, change the adapter config, run a few playbooks, and compare rubric scores before updating production.
capx adapter benchmark --agent marketer --playbook content-pipeline. It runs the same playbook steps against multiple models and reports cost, latency, and rubric scores side by side.