Providers

orchex supports 5 LLM providers out of the box. You bring your own API keys (BYOK) — orchex routes requests to the provider you've configured, and you pay the provider directly.

Supported Providers

| Provider  | Models                           | Env Variable                    | Context Limit    |
|-----------|----------------------------------|---------------------------------|------------------|
| Anthropic | Claude Sonnet 4, Claude Haiku    | ANTHROPIC_API_KEY               | 200,000 tokens   |
| OpenAI    | GPT-4o, GPT-4o-mini, o1, o3-mini | OPENAI_API_KEY                  | 128,000 tokens   |
| Google    | Gemini 2.5 Pro, Gemini 2.0 Flash | GOOGLE_API_KEY / GEMINI_API_KEY | 1,000,000 tokens |
| DeepSeek  | DeepSeek V3, DeepSeek R1, Coder  | DEEPSEEK_API_KEY                | 128,000 tokens   |
| Ollama    | Any local model                  | No key (local)                  | Model-dependent  |

BYOK (Bring Your Own Key)

orchex never stores, proxies, or logs your API keys. When running locally (MCP mode), your key goes directly from your machine to the provider's API. On orchex cloud, BYOK keys are encrypted with AES-256-GCM and used only for the duration of the request.
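
For illustration only, here is a minimal sketch of AES-256-GCM sealing and unsealing using Node's built-in crypto module. The function names and key-management details are hypothetical, not orchex internals; the sketch just shows the cipher mode the docs describe.

import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Hypothetical sketch of AES-256-GCM sealing; not orchex's actual implementation.
// masterKey must be 32 bytes (256 bits).
function sealKey(apiKey: string, masterKey: Buffer): Buffer {
  const iv = randomBytes(12);                      // 96-bit nonce, unique per encryption
  const cipher = createCipheriv("aes-256-gcm", masterKey, iv);
  const ciphertext = Buffer.concat([cipher.update(apiKey, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();                 // 16-byte authentication tag
  return Buffer.concat([iv, tag, ciphertext]);     // store nonce + tag with the ciphertext
}

function openKey(sealed: Buffer, masterKey: Buffer): string {
  const iv = sealed.subarray(0, 12);
  const tag = sealed.subarray(12, 28);
  const body = sealed.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", masterKey, iv);
  decipher.setAuthTag(tag);                        // decryption fails if the data was tampered with
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}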

Setting Up

Export your provider's API key:

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# OpenAI
export OPENAI_API_KEY="sk-..."

# Google Gemini
export GOOGLE_API_KEY="AI..."

# DeepSeek
export DEEPSEEK_API_KEY="sk-..."

# Ollama (no key needed — runs locally)
# Just ensure Ollama is running: ollama serve

Auto-Detection

orchex auto-detects the provider from whichever API key is set. If multiple keys are set, it uses the first one found (in the order listed above). To force a specific provider:

export ORCHEX_PROVIDER="openai"
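
In other words, detection is a first-match scan over the environment. The sketch below mirrors that order; the provider strings "deepseek", "gemini", and "openai" appear elsewhere in these docs, while "anthropic" and "ollama" are assumed values here:

// Illustrative sketch of the auto-detection order; not the actual orchex source.
const DETECTION_ORDER: Array<[provider: string, envVars: string[]]> = [
  ["anthropic", ["ANTHROPIC_API_KEY"]],            // assumed provider value
  ["openai",    ["OPENAI_API_KEY"]],
  ["gemini",    ["GOOGLE_API_KEY", "GEMINI_API_KEY"]],
  ["deepseek",  ["DEEPSEEK_API_KEY"]],
];

function detectProvider(): string {
  const forced = process.env.ORCHEX_PROVIDER;
  if (forced) return forced;                       // explicit override wins
  for (const [provider, envVars] of DETECTION_ORDER) {
    if (envVars.some((v) => !!process.env[v])) return provider;
  }
  return "ollama";                                 // no key set: fall back to local (assumed value)
}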

Per-Stream Provider Selection

Each stream can target a specific provider via the provider field:

orchex.init({
  feature: "mixed-providers",
  streams: {
    "core-logic": {
      name: "Core Business Logic",
      owns: ["src/core.ts"],
      // Uses default provider (Anthropic in this case)
      plan: "Implement the payment processing pipeline"
    },
    "documentation": {
      name: "API Documentation",
      owns: ["docs/api.md"],
      provider: "deepseek",     // Use DeepSeek for docs (cheaper)
      plan: "Write comprehensive API documentation"
    },
    "data-processing": {
      name: "Data Pipeline",
      owns: ["src/pipeline.ts"],
      provider: "gemini",       // Use Gemini for large context
      plan: "Process and transform the dataset schema"
    }
  }
});

Cost Optimization Strategy

Route streams based on complexity and cost:

| Use Case                          | Recommended Provider      | Why                              |
|-----------------------------------|---------------------------|----------------------------------|
| Complex logic, refactoring        | Anthropic (Claude Sonnet) | Best code quality                |
| Documentation, comments           | DeepSeek V3               | Very low cost ($0.001/1K tokens) |
| Large file context (>100K tokens) | Google Gemini             | 1M token context window          |
| Quick type definitions            | OpenAI GPT-4o-mini        | Fast, cheap                      |
| Privacy-sensitive code            | Ollama (local)            | Never leaves your machine        |
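
If you want to encode this table in configuration, a small routing map works. The task taxonomy and helper below are illustrative, not an orchex API; the chosen value goes into each stream's provider field from the previous section:

// Illustrative cost-based routing map; the task categories are hypothetical.
type TaskKind = "complex-logic" | "docs" | "large-context" | "types" | "sensitive";

const ROUTE: Record<TaskKind, string> = {
  "complex-logic": "anthropic",   // best code quality
  "docs":          "deepseek",    // very low cost
  "large-context": "gemini",      // 1M token window
  "types":         "openai",      // fast and cheap (GPT-4o-mini)
  "sensitive":     "ollama",      // never leaves your machine
};

// Example: pick a provider for a documentation stream.
const docsProvider = ROUTE["docs"];                // "deepseek"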

Rate Limit Strategy

When you configure multiple provider API keys, orchex can distribute parallel streams across providers. If one provider rate-limits, streams on other providers continue executing.

Wave 2: [auth-api]    → Anthropic (rate limited... waiting)
        [billing-api] → OpenAI (executing)
        [docs]        → DeepSeek (executing)

This is particularly useful for large orchestrations with many parallel streams.
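
Conceptually the scheduler behaves like this sketch: a throttled stream backs off and retries, while streams bound to other providers run to completion. The execute and sleep helpers are placeholders, not orchex APIs:

// Conceptual sketch of per-provider backoff; not orchex internals.
type Stream = { id: string; provider: string };

async function execute(stream: Stream): Promise<void> {
  // placeholder: dispatch the stream's plan to its provider's API
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function runWave(streams: Stream[]): Promise<void> {
  await Promise.all(streams.map(async (s) => {
    try {
      await execute(s);
    } catch (err: any) {
      if (err?.status === 429) {                   // rate limited
        await sleep(err.retryAfterMs ?? 30_000);   // only this stream waits
        await execute(s);                          // streams on other providers were never blocked
      } else {
        throw err;
      }
    }
  }));
}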

Context Limits

Each provider has a different context window size. orchex tracks each stream's context budget and warns as it approaches the limits:

| Provider  | Context Limit (tokens) | Soft Warning (70%) | Hard Limit (90%) |
|-----------|------------------------|--------------------|------------------|
| Anthropic | 200,000                | 140,000            | 180,000          |
| OpenAI    | 128,000                | 89,600             | 115,200          |
| Gemini    | 1,000,000              | 700,000            | 900,000          |
| DeepSeek  | 128,000                | 89,600             | 115,200          |
| Ollama    | 128,000                | 89,600             | 115,200          |

Streams that exceed the soft limit generate a warning. Streams that exceed the hard limit may fail or produce truncated output. See Context Budgets for details.
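
The thresholds follow a fixed rule: the soft warning sits at 70% of the window and the hard limit at 90% (for OpenAI, 0.7 × 128,000 = 89,600 and 0.9 × 128,000 = 115,200). A minimal budget check built on that rule might look like the following sketch; the limits match the tables above, and the helper itself is illustrative:

// Limits from the table above; the 70%/90% thresholds are derived from its rows.
const CONTEXT_LIMITS: Record<string, number> = {
  anthropic: 200_000,
  openai:    128_000,
  gemini:    1_000_000,
  deepseek:  128_000,
  ollama:    128_000,   // default used here; the actual limit is model-dependent
};

function checkBudget(provider: string, usedTokens: number): "ok" | "soft-warning" | "hard-limit" {
  const limit = CONTEXT_LIMITS[provider] ?? 128_000;
  if (usedTokens >= limit * 0.9) return "hard-limit";    // may fail or truncate
  if (usedTokens >= limit * 0.7) return "soft-warning";  // orchex warns here
  return "ok";
}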

Tier Limits on Providers

The number of distinct providers you can use in a single orchestration is subject to tier limits:

| Tier         | Max Providers |
|--------------|---------------|
| Local (Free) | 1             |
| Pro          | 2             |
| Team         | 3             |
| Enterprise   | Unlimited     |

This limit counts distinct provider values across all streams. Streams without an explicit provider field use the default (auto-detected) provider and don't count toward additional provider slots.
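
Under that rule, the slot count can be computed as below (an illustrative helper, not an orchex API). The mixed-providers example earlier uses three slots: the default plus deepseek plus gemini, so it needs the Team tier.

// Illustrative count of distinct provider slots, per the rule above.
function countProviderSlots(
  streams: Record<string, { provider?: string }>,
  defaultProvider: string
): number {
  const slots = new Set<string>();
  for (const stream of Object.values(streams)) {
    // Streams without an explicit provider share the default's slot.
    slots.add(stream.provider ?? defaultProvider);
  }
  return slots.size;
}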

Ollama (Local Models)

Ollama lets you run open-source models locally — no API key needed, no data leaves your machine.

Setup

  1. Install Ollama: ollama.com
  2. Pull a model: ollama pull codellama
  3. Start the server: ollama serve
  4. Set the base URL:
export OLLAMA_BASE_URL="http://localhost:11434"
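
Before kicking off streams, you can verify the server is reachable. The helper below is illustrative, though it hits Ollama's real /api/tags endpoint, which lists locally pulled models:

// Illustrative health check for a local Ollama server.
async function ollamaAvailable(): Promise<boolean> {
  const base = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434";
  try {
    const res = await fetch(`${base}/api/tags`);   // lists locally pulled models
    return res.ok;
  } catch {
    return false;                                  // server not running or unreachable
  }
}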

Considerations

  • Speed — Local models are slower than cloud APIs
  • Quality — Code generation quality varies by model
  • Context — Most local models have smaller context windows
  • Privacy — Your code never leaves your machine

Choosing a Provider

For Getting Started

Use Anthropic (Claude Sonnet): it offers the best overall code quality, and orchex is optimized for Claude's output format.

For Cost-Sensitive Workloads

Use DeepSeek for documentation, comments, and simple transformations. Reserve Claude and GPT-4o for complex logic.

For Large Codebases

Use Google Gemini when streams need to read many large files. The 1M token context window handles what other providers can't.

For Enterprise/Privacy

Use Ollama to keep all code local. Combine with cloud providers for non-sensitive streams.
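
A hybrid split might look like the sketch below, keeping sensitive code on Ollama and routing everything else to a cloud provider. The "ollama" provider value is assumed here, as in the detection sketch above:

orchex.init({
  feature: "hybrid-privacy",
  streams: {
    "crypto-core": {
      name: "Crypto Core",
      owns: ["src/crypto.ts"],
      provider: "ollama",      // sensitive code stays on your machine (assumed value)
      plan: "Harden the key-derivation routines"
    },
    "public-docs": {
      name: "Public Docs",
      owns: ["docs/usage.md"],
      provider: "deepseek",    // non-sensitive and cost-optimized
      plan: "Document the public API surface"
    }
  }
});

Two explicit providers means two provider slots, which fits within the Pro tier limit.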