# Providers
orchex supports 5 LLM providers out of the box. You bring your own API keys (BYOK) — orchex routes requests to the provider you've configured, and you pay the provider directly.
## Supported Providers
| Provider | Models | Env Variable | Context Limit |
|---|---|---|---|
| Anthropic | Claude Sonnet 4, Claude Haiku | `ANTHROPIC_API_KEY` | 200,000 tokens |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3-mini | `OPENAI_API_KEY` | 128,000 tokens |
| Google | Gemini 2.5 Pro, Gemini 2.0 Flash | `GOOGLE_API_KEY` / `GEMINI_API_KEY` | 1,000,000 tokens |
| DeepSeek | DeepSeek V3, DeepSeek R1, Coder | `DEEPSEEK_API_KEY` | 128,000 tokens |
| Ollama | Any local model | No key (local) | Model-dependent |
## BYOK (Bring Your Own Key)
orchex never stores, proxies, or logs your API keys. When running locally (MCP mode), your key goes directly from your machine to the provider's API. On orchex cloud, BYOK keys are encrypted with AES-256-GCM and used only for the duration of the request.
### Setting Up
Export your provider's API key:
```bash
# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# OpenAI
export OPENAI_API_KEY="sk-..."

# Google Gemini
export GOOGLE_API_KEY="AI..."

# DeepSeek
export DEEPSEEK_API_KEY="sk-..."

# Ollama (no key needed — runs locally)
# Just ensure Ollama is running: ollama serve
```

### Auto-Detection
orchex auto-detects the provider from whichever API key is set. If multiple keys are set, it uses the first one found (in the order listed above). To force a specific provider:
```bash
export ORCHEX_PROVIDER="openai"
```

## Per-Stream Provider Selection
Each stream can target a specific provider via the `provider` field:
```typescript
orchex.init({
  feature: "mixed-providers",
  streams: {
    "core-logic": {
      name: "Core Business Logic",
      owns: ["src/core.ts"],
      // Uses default provider (Anthropic in this case)
      plan: "Implement the payment processing pipeline"
    },
    "documentation": {
      name: "API Documentation",
      owns: ["docs/api.md"],
      provider: "deepseek", // Use DeepSeek for docs (cheaper)
      plan: "Write comprehensive API documentation"
    },
    "data-processing": {
      name: "Data Pipeline",
      owns: ["src/pipeline.ts"],
      provider: "gemini", // Use Gemini for large context
      plan: "Process and transform the dataset schema"
    }
  }
});
```

## Cost Optimization Strategy
Route streams based on complexity and cost:
| Use Case | Recommended Provider | Why |
|---|---|---|
| Complex logic, refactoring | Anthropic (Claude Sonnet) | Best code quality |
| Documentation, comments | DeepSeek V3 | Very low cost ($0.001/1K) |
| Large file context (>100K tokens) | Google Gemini | 1M token context window |
| Quick type definitions | OpenAI GPT-4o-mini | Fast, cheap |
| Privacy-sensitive code | Ollama (local) | Never leaves your machine |
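The routing table above can be encoded as a small helper, so streams don't hard-code provider strings. This is an illustrative sketch, not part of the orchex API; the `routeProvider` function and its category names are hypothetical:

```typescript
// Hypothetical helper: pick a provider for a stream based on task category.
// Mirrors the cost-optimization table above; not an orchex API.
type Category = "complex-logic" | "docs" | "large-context" | "types" | "sensitive";

function routeProvider(category: Category): string {
  switch (category) {
    case "complex-logic": return "anthropic"; // best code quality
    case "docs":          return "deepseek";  // lowest cost
    case "large-context": return "gemini";    // 1M token window
    case "types":         return "openai";    // fast, cheap (GPT-4o-mini)
    case "sensitive":     return "ollama";    // never leaves the machine
  }
}
```

A stream definition could then use `provider: routeProvider("docs")` instead of a literal string, keeping the routing policy in one place.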
## Rate Limit Strategy
When you configure multiple provider API keys, orchex can distribute parallel streams across providers. If one provider rate-limits, streams on other providers continue executing.
```
Wave 2: [auth-api]    → Anthropic (rate limited... waiting)
        [billing-api] → OpenAI    (executing)
        [docs]        → DeepSeek  (executing)
```

This is particularly useful for large orchestrations with many parallel streams.
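The distribution idea can be pictured with a short sketch. This is hypothetical: it assumes simple round-robin assignment over whichever providers have keys configured, while orchex's actual scheduler may also weigh live rate-limit state:

```typescript
// Hypothetical sketch: spread parallel streams across configured providers
// round-robin, so one provider's rate limit doesn't stall every stream.
function assignProviders(
  streams: string[],
  providers: string[],
): Map<string, string> {
  const assignment = new Map<string, string>();
  streams.forEach((stream, i) => {
    // Cycle through the available providers.
    assignment.set(stream, providers[i % providers.length]);
  });
  return assignment;
}

const plan = assignProviders(
  ["auth-api", "billing-api", "docs"],
  ["anthropic", "openai", "deepseek"],
);
// auth-api → anthropic, billing-api → openai, docs → deepseek
```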
## Context Limits
Each provider has different context window sizes. orchex tracks context budget per stream and warns when approaching limits:
| Provider | Context Limit | Soft Warning | Hard Limit |
|---|---|---|---|
| Anthropic | 200,000 | 140,000 | 180,000 |
| OpenAI | 128,000 | 89,600 | 115,200 |
| Gemini | 1,000,000 | 700,000 | 900,000 |
| DeepSeek | 128,000 | 89,600 | 115,200 |
| Ollama | 128,000 | 89,600 | 115,200 |
Streams that exceed the soft limit generate a warning. Streams that exceed the hard limit may fail or produce truncated output. See Context Budgets for details.
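The soft and hard thresholds in the table are 70% and 90% of each provider's context limit. A sketch of that check follows; the `contextStatus` function name is illustrative, not an orchex API:

```typescript
// Illustrative: soft warning at 70% and hard limit at 90% of the context
// window, matching the table above (e.g. 128,000 → 89,600 soft / 115,200 hard).
function contextStatus(
  tokensUsed: number,
  contextLimit: number,
): "ok" | "warn" | "over" {
  const soft = contextLimit * 0.7;
  const hard = contextLimit * 0.9;
  if (tokensUsed > hard) return "over"; // may fail or truncate output
  if (tokensUsed > soft) return "warn"; // stream generates a warning
  return "ok";
}
```

With Anthropic's 200,000-token window, for example, 150,000 used tokens falls between the 140,000 soft and 180,000 hard thresholds and would warn.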
## Tier Limits on Providers
The number of distinct providers you can use in a single orchestration is subject to tier limits:
| Tier | Max Providers |
|---|---|
| Local (Free) | 1 |
| Pro | 2 |
| Team | 3 |
| Enterprise | Unlimited |
This limit counts distinct `provider` values across all streams. Streams without an explicit `provider` field use the default (auto-detected) provider and don't count toward additional provider slots.
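The counting rule can be sketched as follows. This is illustrative only: the `StreamDef` shape mirrors the `orchex.init` example above, and the `usedProviderSlots` helper is hypothetical:

```typescript
// Illustrative check of the tier rule: streams without an explicit `provider`
// all share the default provider's slot; explicit values each take a slot.
interface StreamDef {
  provider?: string;
}

function usedProviderSlots(
  streams: Record<string, StreamDef>,
  defaultProvider: string,
): number {
  const used = new Set<string>();
  for (const s of Object.values(streams)) {
    used.add(s.provider ?? defaultProvider);
  }
  return used.size;
}
```

Under this sketch, two implicit streams plus one `"deepseek"` stream use two slots, which would fit the Pro tier but not Local (Free).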
## Ollama (Local Models)
Ollama lets you run open-source models locally — no API key needed, no data leaves your machine.
### Setup
- Install Ollama: ollama.com
- Pull a model:
  ```bash
  ollama pull codellama
  ```
- Start the server:
  ```bash
  ollama serve
  ```
- Set the base URL:
  ```bash
  export OLLAMA_BASE_URL="http://localhost:11434"
  ```

### Considerations
- Speed — Local models are slower than cloud APIs
- Quality — Code generation quality varies by model
- Context — Most local models have smaller context windows
- Privacy — Your code never leaves your machine
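Once the server is running, a privacy-sensitive stream can target Ollama while other streams stay on cloud providers. This sketch reuses the `orchex.init` shape from earlier; treating `"ollama"` as a valid `provider` value is an assumption based on the supported-providers table:

```typescript
// Sketch: route a privacy-sensitive stream to the local Ollama model while
// a non-sensitive stream uses a cheap cloud provider.
orchex.init({
  feature: "local-sensitive",
  streams: {
    "secrets-handling": {
      name: "Secrets Handling",
      owns: ["src/secrets.ts"],
      provider: "ollama", // code never leaves the machine
      plan: "Refactor credential storage"
    },
    "docs": {
      name: "Docs",
      owns: ["docs/secrets.md"],
      provider: "deepseek", // non-sensitive, low cost
      plan: "Document the credential API"
    }
  }
});
```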
## Choosing a Provider
### For Getting Started
Use Anthropic (Claude Sonnet) — it offers the best overall code quality, and orchex is optimized for Claude's output format.
### For Cost-Sensitive Workloads
Use DeepSeek for documentation, comments, and simple transformations. Save Claude/GPT-4 for complex logic.
### For Large Codebases
Use Google Gemini when streams need to read many large files. The 1M token context window handles what other providers can't.
### For Enterprise/Privacy
Use Ollama to keep all code local. Combine with cloud providers for non-sensitive streams.
## Related
- Streams — Stream definitions, including the `provider` field
- Context Budgets — Provider-aware token limits
- Configure MCP — Setting up API keys
- Cloud Execution — BYOK on orchex cloud