Cloud Execution Guide

Note: Cloud execution features require an orchex.dev account and are NOT available in the local npm package (@wundam/orchex). The npm package provides local MCP orchestration using your own API keys (BYOK). See orchex.dev/pricing for cloud plans.

This guide helps you decide when to use cloud vs local execution, manage quotas and rate limits, and optimize costs.

When to Use Cloud vs Local

Use Cloud Execution When:

✅ Large or Complex Tasks

  • Tasks with multiple streams (5+ concurrent operations)
  • Long-running operations (>30 minutes)
  • Tasks requiring high parallelism
  • Complex workflows with many dependencies

✅ Team Collaboration

  • Multiple team members need to view progress
  • Sharing execution history and results
  • Coordinating work across different time zones
  • Reviewing team performance metrics

✅ Resource Constraints

  • Limited local compute resources
  • Poor local network connectivity
  • Running on low-powered devices
  • Need to preserve local battery life

✅ CI/CD Integration

  • Automated deployments and testing
  • Scheduled maintenance tasks
  • Continuous integration workflows
  • Infrastructure automation

Use Local Execution When:

✅ Quick Iterations

  • Small, fast tasks (<5 minutes)
  • Simple single-file changes
  • Exploratory work and prototyping
  • Testing manifest configurations

✅ Privacy & Security

  • Working with sensitive code or data
  • Compliance requirements for data locality
  • Private repositories without cloud access
  • Air-gapped or restricted environments

✅ Cost Sensitivity

  • Frequent small tasks (costs add up on cloud)
  • Budget constraints
  • Personal/hobby projects
  • Learning and experimentation

✅ Offline Work

  • No internet connectivity
  • Unreliable network conditions
  • VPN or firewall restrictions

Decision Matrix

Factor          Cloud       Local
Task Duration   >10 min     <10 min
Streams         3+          1-2
Team Size       2+          Solo
Budget          Available   Limited
Privacy         Standard    Sensitive
Network         Stable      Limited
Review Needs    High        Low
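
The matrix can be read as a simple vote across its rows. The sketch below is one possible encoding, not part of Orchex: the `Task` fields and the `recommend` function are hypothetical names, every row is weighted equally (an assumption), and a task of exactly 10 minutes is treated as local because the matrix does not define that boundary.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a task, one field per matrix row."""
    duration_min: float     # expected run time in minutes
    streams: int            # number of parallel streams
    team_size: int          # people who need to see the result
    budget_available: bool  # is there budget for cloud runs?
    sensitive: bool         # does the task touch sensitive code or data?
    network_stable: bool    # is the connection reliable?
    review_needed: bool     # does the run need team review?


def recommend(task: Task) -> str:
    """Return "cloud" or "local" by majority vote over the matrix rows."""
    cloud_votes = [
        task.duration_min > 10,
        task.streams >= 3,
        task.team_size >= 2,
        task.budget_available,
        not task.sensitive,
        task.network_stable,
        task.review_needed,
    ]
    # Seven rows, so a majority means at least four votes for cloud.
    return "cloud" if sum(cloud_votes) > len(cloud_votes) / 2 else "local"
```

In practice a single factor such as privacy may override the rest, so treat the result as a starting point rather than a rule.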

Quota Management

Understanding Your Quotas

Cloud quotas depend on your subscription tier. See orchex.dev/pricing for current limits. In summary:

  • Local (Free): 5 streams, 2 waves, single provider, BYOK
  • Pro ($19/mo): 100 runs/mo, 15 agents, 10 waves, 2 providers, learn, self-healing
  • Team ($49/user/mo): 500 runs/mo, 25 agents, 25 waves, 3 providers, shared orchestrations
  • Enterprise (Custom): Unlimited, self-hosted, SLA, dedicated support

Checking Quota Usage

# View current quota status
orchex cloud quota

# Example output:
# Executions: 45/100 (45%)
# This month: 45 executions
# Remaining: 55 executions
# Resets: Feb 28, 2026
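
If you want to act on this output in a script, the used and limit figures can be pulled out of the `Executions:` line. The following is a minimal sketch that assumes the text format matches the example output above exactly; `parse_quota` is a hypothetical helper, and the real output may differ.

```python
import re
from typing import Dict


def parse_quota(output: str) -> Dict[str, int]:
    """Extract usage figures from a line like 'Executions: 45/100 (45%)'."""
    match = re.search(r"Executions:\s*(\d+)/(\d+)", output)
    if match is None:
        raise ValueError("no 'Executions: used/limit' line found")
    used, limit = int(match.group(1)), int(match.group(2))
    return {
        "used": used,
        "limit": limit,
        "remaining": limit - used,
        # Guard against a zero limit to avoid dividing by zero.
        "percent_used": round(100 * used / limit) if limit else 0,
    }
```

Where machine-readable output is available, parsing that is likely more robust than matching the human-readable text.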

Quota Best Practices

1. Monitor Usage Regularly

# Check usage before large operations
orchex cloud quota --json | jq '.remaining'

# Set up alerts (Pro/Team tiers)
orchex cloud quota alert --threshold 80

2. Batch Similar Tasks

# Instead of 10 small executions:
# Run 1 execution with 10 streams
streams:
  - id: update-component-1
    prompt: "..."
  - id: update-component-2
    prompt: "..."
  # ... 8 more
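
When the stream list is repetitive, it can be generated rather than written by hand. This sketch builds only the `streams:` section shown above from a mapping of stream IDs to prompts; `build_streams` is a hypothetical helper, and it assumes the manifest needs nothing beyond the `id` and `prompt` keys used in this guide.

```python
import json
from typing import Dict


def build_streams(tasks: Dict[str, str]) -> str:
    """Render a 'streams:' YAML section from {stream_id: prompt}."""
    lines = ["streams:"]
    for stream_id, prompt in tasks.items():
        lines.append(f"  - id: {stream_id}")
        # json.dumps yields a double-quoted string, which is also a
        # valid YAML double-quoted scalar, so quotes are escaped safely.
        lines.append(f"    prompt: {json.dumps(prompt)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    components = ["header", "footer", "sidebar"]
    print(build_streams(
        {f"update-{name}": f"Update the {name} component" for name in components}
    ))
```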

3. Use Local for Small Tasks

# Single file changes - use local
orchex execute -f manifest.yaml

# Large refactoring - use cloud
orchex execute -f manifest.yaml --cloud

4. Share Team Quota Wisely

# Coordinate with team members
orchex cloud usage --team

# Reserve quota for critical tasks
# Use local for experiments

Rate Limits

Current Rate Limits

See orchex.dev/pricing for current rate limits by tier.

Handling Rate Limits

1. Automatic Retry with Backoff

Orchex automatically retries rate-limited requests:

// Automatic behavior:
// - First retry: after 1 second
// - Second retry: after 2 seconds
// - Third retry: after 4 seconds
// - Fails after 3 retries
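
The schedule above is exponential backoff with a doubling delay. The sketch below reproduces that schedule for your own scripts; it is not Orchex's internal implementation, and `RateLimited`, `backoff_delays` and `call_with_backoff` are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised when a request is rate limited."""


def backoff_delays(max_retries: int = 3, base_seconds: float = 1.0) -> List[float]:
    """Delays before each retry: 1s, 2s, 4s with the defaults."""
    return [base_seconds * 2 ** attempt for attempt in range(max_retries)]


def call_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on RateLimited with doubling delays."""
    for delay in backoff_delays(max_retries, base_seconds):
        try:
            return operation()
        except RateLimited:
            sleep(delay)  # wait, then fall through to the next attempt
    # Last retry: a RateLimited raised here propagates to the caller.
    return operation()
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.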

2. Spread Out Submissions

# Instead of submitting 5 executions at once:
for manifest in manifests/*.yaml; do
  orchex execute -f "$manifest" --cloud
  sleep 60  # Wait 1 minute between submissions
done

3. Use Stream Parallelism

# Better: 1 execution with 10 streams
# Than: 10 executions with 1 stream each
streams:
  - id: task-1
    prompt: "..."
  - id: task-2
    prompt: "..."
  # More streams = better parallelism

4. Monitor Rate Limit Status

# Check rate limit status
orchex cloud limits

# Output:
# Rate Limit: 45/60 per minute
# Reset: 15 seconds
# Burst Available: 15 requests

Cost Optimization

Understanding Costs

Execution Pricing:

  • Small execution (1-3 streams, <10 min): ~$0.50
  • Medium execution (4-10 streams, 10-30 min): ~$2.00
  • Large execution (more than 10 streams, >30 min): ~$5.00+

Cost Drivers:

  1. Number of streams (parallelism)
  2. Execution duration
  3. Token usage (input + output)
  4. Storage for history and artifacts
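
For rough planning, the approximate price bands above can be turned into a lookup. This is a sketch only: `estimate_tier` is a hypothetical helper, the dollar figures are the approximate ones listed in this guide rather than a billing formula, and two choices are assumptions: a run whose stream count and duration fall in different bands is placed in the higher one, and exactly 10 streams or 30 minutes counts as medium.

```python
from typing import Tuple


def estimate_tier(streams: int, duration_min: float) -> Tuple[str, float]:
    """Map a planned run to (band name, approximate cost in USD)."""
    if streams > 10 or duration_min > 30:
        return ("large", 5.00)   # listed as "~$5.00+", so treat as a floor
    if streams >= 4 or duration_min >= 10:
        return ("medium", 2.00)
    return ("small", 0.50)
```

Token usage and storage also drive cost, so actual charges can differ from the band estimate.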

Optimization Strategies

1. Right-Size Your Streams

Too Many Small Streams (Expensive):

streams:
  - id: fix-typo-1
    prompt: "Fix typo in file1.ts"
  - id: fix-typo-2
    prompt: "Fix typo in file2.ts"
  - id: fix-typo-3
    prompt: "Fix typo in file3.ts"
# Cost: 3 streams × overhead = $$$

Better: Combined Stream (Cheaper):

streams:
  - id: fix-typos
    prompt: |
      Fix typos in the following files:
      - file1.ts
      - file2.ts
      - file3.ts
# Cost: 1 stream = $

2. Optimize Context Size

# Expensive: Include everything
context:
  - "**/*"  # Sends entire codebase

# Better: Only what's needed
context:
  - "src/components/**/*.ts"
  - "src/types/component.ts"
  - "package.json"
# Can reduce token costs by 70-90%, depending on codebase size
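
To see how much a narrower context saves in your own repository, you can compare the total size of the files each pattern set matches. The sketch below is a local measurement only; `context_bytes` is a hypothetical helper, it assumes the manifest's glob patterns behave like Python's `pathlib` globs, and file size is only a proxy for token count.

```python
from pathlib import Path
from typing import List


def context_bytes(root: str, patterns: List[str]) -> int:
    """Total size in bytes of the files under root matched by the patterns."""
    root_path = Path(root)
    seen = set()
    total = 0
    for pattern in patterns:
        for path in root_path.glob(pattern):
            # Skip directories and files already counted by another pattern.
            if path.is_file() and path not in seen:
                seen.add(path)
                total += path.stat().st_size
    return total


if __name__ == "__main__":
    broad = context_bytes(".", ["**/*"])
    narrow = context_bytes(
        ".", ["src/components/**/*.ts", "src/types/component.ts", "package.json"]
    )
    print(f"broad: {broad} bytes, narrow: {narrow} bytes")
```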

3. Use Local for Iterations

# First attempt - local (free)
orchex execute -f manifest.yaml

# If successful, done!
# If it needs tweaking, fix the manifest locally

# Final run - cloud (for team/history)
orchex execute -f manifest.yaml --cloud

4. Leverage Caching

# Enable intelligent caching
settings:
  cache_context: true  # Reuse context across streams
  deduplicate_files: true  # Skip unchanged files

5. Set Budget Limits

# Set per-execution budget
orchex execute -f manifest.yaml --cloud --max-cost 5.00

# Set monthly budget (Pro/Team)
orchex cloud budget set 100.00

# Get budget alerts
orchex cloud budget alert --threshold 80
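
The threshold in the alert command is a percentage of the budget. As an illustration of that arithmetic, here is a minimal sketch you could use in your own tooling; `budget_status` is a hypothetical helper, and it assumes an alert fires once spend reaches the threshold percentage.

```python
def budget_status(spent: float, budget: float, threshold_percent: float = 80.0) -> str:
    """Classify spend against a budget as "ok", "alert" or "exceeded"."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    percent_used = 100 * spent / budget
    if percent_used >= 100:
        return "exceeded"
    if percent_used >= threshold_percent:
        return "alert"
    return "ok"
```

With a monthly budget of 100.00 and a threshold of 80, this reports "alert" once spend reaches 80.00.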

6. Clean Up Old Executions

# Delete old execution history (saves storage)
orchex cloud cleanup --older-than 90days

# Delete only failed executions (keeps successful ones)
orchex cloud cleanup --failed-only

Cost Comparison Examples

Example 1: Simple Bug Fix

# Local: $0 (free)
# Cloud: $0.50
# Recommendation: Use local

streams:
  - id: fix-validation-bug
    prompt: "Fix the email validation regex"

Example 2: Feature Implementation

# Local: $0 (but ties up your machine for 30 min)
# Cloud: $2.50 (parallel execution, team visibility)
# Recommendation: Use cloud

streams:
  - id: api-endpoint
    prompt: "Implement POST /api/users endpoint"
  - id: api-tests
    prompt: "Add tests for user endpoint"
  - id: api-docs
    prompt: "Update API documentation"
  - id: frontend-form
    prompt: "Create user registration form"

Example 3: Large Refactoring

# Local: $0 (but 2+ hours, blocks other work)
# Cloud: $5.00 (parallel, doesn't block local work)
# Recommendation: Use cloud

streams:
  - id: migrate-db
    prompt: "Migrate database schema"
  - id: update-models
    prompt: "Update all model files"
  - id: update-controllers
    prompt: "Update controller logic"
  - id: update-views
    prompt: "Update view components"
  - id: update-tests
    prompt: "Update test suites"

Monitoring and Alerts

Dashboard Monitoring

# Open cloud dashboard
orchex cloud dashboard

# View in terminal
orchex cloud status

Key Metrics to Watch:

  • Execution success rate
  • Average execution time
  • Cost per execution
  • Quota utilization
  • Rate limit hits
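
The first three metrics are simple aggregates over your execution history. The sketch below shows how they could be computed from exported records; the `Execution` record and its fields are hypothetical, since this guide does not specify an export format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Execution:
    """Hypothetical record for one cloud execution."""
    succeeded: bool
    duration_min: float
    cost_usd: float


def summarize(executions: List[Execution]) -> Dict[str, float]:
    """Compute success rate, average duration and cost per execution."""
    if not executions:
        raise ValueError("need at least one execution")
    count = len(executions)
    return {
        "success_rate": sum(e.succeeded for e in executions) / count,
        "avg_duration_min": sum(e.duration_min for e in executions) / count,
        "cost_per_execution": sum(e.cost_usd for e in executions) / count,
    }
```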

Setting Up Alerts

# Quota alerts
orchex cloud alert quota --threshold 80 --email you@company.com

# Budget alerts
# Quote the channel name: an unquoted # starts a shell comment
orchex cloud alert budget --threshold 90 --slack '#orchex-alerts'

# Failure alerts
orchex cloud alert failures --consecutive 3

Hybrid Workflows

Best of Both Worlds

Combine local and cloud execution for optimal results:

Development Workflow:

# 1. Prototype locally (fast iteration)
orchex execute -f manifest.yaml

# 2. Refine manifest based on results
# Edit manifest.yaml

# 3. Final execution in cloud (team visibility)
orchex execute -f manifest.yaml --cloud

CI/CD Workflow:

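# .github/workflows/deploy.yml
name: Deploy Feature

on:
  pull_request:
    branches: [main]
    # "closed" is required for the merged check below to ever be true
    types: [opened, synchronize, reopened, closed]

jobs:
  local-validation:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository so manifest.yaml is available
      - uses: actions/checkout@v4
      # Assumes the orchex CLI is already installed on the runner
      - name: Validate manifest
        run: orchex validate manifest.yaml

  cloud-execution:
    needs: local-validation
    runs-on: ubuntu-latest
    if: github.event.pull_request.merged == true
    steps:
      - uses: actions/checkout@v4
      - name: Execute in cloud
        run: orchex execute -f manifest.yaml --cloud
        env:
          ORCHEX_API_KEY: ${{ secrets.ORCHEX_API_KEY }}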

Troubleshooting

Quota Exceeded

# Error: Monthly quota exceeded

# Solution 1: Upgrade tier
orchex cloud upgrade --tier pro

# Solution 2: Use local execution
orchex execute -f manifest.yaml  # No --cloud flag

# Solution 3: Wait for reset
orchex cloud quota  # Check reset date

Rate Limited

# Error: Rate limit exceeded

# Solution 1: Wait and retry (automatic)
# Orchex retries automatically with backoff

# Solution 2: Reduce request rate
# Add delays between executions

# Solution 3: Upgrade tier for higher limits
orchex cloud upgrade --tier team

Unexpected Costs

# Review recent executions
orchex cloud usage --detailed

# Check cost breakdown
orchex cloud costs --execution <execution-id>

# Set budget limits
orchex cloud budget set 50.00

Summary

Quick Reference

Use Cloud When:

  • Complex/large tasks
  • Team collaboration needed
  • CI/CD automation
  • Resource constrained locally

Use Local When:

  • Quick iterations
  • Sensitive data
  • Learning/testing
  • Cost sensitive

Optimize Costs By:

  • Right-sizing streams
  • Minimizing context
  • Using local for iterations
  • Setting budget limits
  • Cleaning up old data

Manage Quotas By:

  • Monitoring regularly
  • Batching similar tasks
  • Coordinating team usage
  • Using local for small tasks

Additional Resources