# Context Budgets
Every LLM has a context window limit — the maximum number of tokens it can process in a single request. orchex tracks context usage per stream and enforces budgets to prevent truncation, degraded output, or outright failures.
## How Context Is Built
For each stream, orchex assembles a multi-layer context prompt:
```
┌──────────────────────────────────┐
│ 1. Project Context               │  File tree, dependencies, config
│    (~2,000-5,000 tokens)         │
├──────────────────────────────────┤
│ 2. Stream Context                │  Owned files (with line numbers)
│    (varies by file count/size)   │  + Read-only files
├──────────────────────────────────┤
│ 3. Dependency Context            │  Completed artifact summaries
│    (~500-2,000 per dependency)   │  from upstream streams
├──────────────────────────────────┤
│ 4. Instructions                  │  Artifact format rules,
│    (~1,000 tokens)               │  ownership constraints, plan
└──────────────────────────────────┘
```

The total token count across all four layers is the stream's context budget usage.
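The layered totals above can be approximated with a simple characters-per-token heuristic. This is an illustrative sketch, not orchex's actual estimator: the function and interface names are invented here, and the chars/4 ratio is a common rough approximation whose accuracy varies by tokenizer.

```typescript
// Rough token estimation for a stream's assembled context.
// Assumes ~4 characters per token, a common heuristic for English
// text and code; real provider tokenizers differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// The four context layers described in the diagram above.
interface ContextLayers {
  projectContext: string;    // file tree, dependencies, config
  streamContext: string;     // owned + read-only file contents
  dependencyContext: string; // upstream artifact summaries
  instructions: string;      // format rules, constraints, plan
}

// Context budget usage = total estimated tokens across all layers.
function estimateContextBudgetUsage(layers: ContextLayers): number {
  return (
    estimateTokens(layers.projectContext) +
    estimateTokens(layers.streamContext) +
    estimateTokens(layers.dependencyContext) +
    estimateTokens(layers.instructions)
  );
}
```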
## Provider-Aware Limits
Each LLM provider has a different context window. orchex sets soft and hard limits as percentages of the provider's capacity:
| Provider | Context Window | Soft Limit (70%) | Hard Limit (90%) |
|---|---|---|---|
| Anthropic | 200,000 | 140,000 | 180,000 |
| OpenAI | 128,000 | 89,600 | 115,200 |
| Gemini | 1,000,000 | 700,000 | 900,000 |
| DeepSeek | 128,000 | 89,600 | 115,200 |
| Ollama | 128,000 | 89,600 | 115,200 |
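Since both limits are fixed percentages of the window, they can be derived mechanically. A minimal sketch, assuming the windows from the table above (the constant and function names here are illustrative, not orchex's API):

```typescript
// Context windows from the provider table, in tokens.
const CONTEXT_WINDOWS: Record<string, number> = {
  anthropic: 200_000,
  openai: 128_000,
  gemini: 1_000_000,
  deepseek: 128_000,
  ollama: 128_000,
};

// Soft limit = 70% of the window, hard limit = 90%.
function budgetLimits(provider: string): { soft: number; hard: number } {
  const window = CONTEXT_WINDOWS[provider];
  if (window === undefined) throw new Error(`unknown provider: ${provider}`);
  return { soft: Math.floor(window * 0.7), hard: Math.floor(window * 0.9) };
}
```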
### Soft Limit
When a stream's estimated context exceeds the soft limit, orchex logs a warning but proceeds with execution:
```json
{
  "event": "budget_warning",
  "streamId": "large-refactor",
  "violationType": "soft",
  "estimatedTokens": 156000,
  "budgetLimit": 140000,
  "provider": "anthropic"
}
```

The LLM may still produce correct output, but quality can degrade as context grows.
### Hard Limit
When a stream exceeds the hard limit, orchex generates an error. The stream may fail or produce truncated output:
```json
{
  "event": "budget_exceeded",
  "streamId": "monolith-stream",
  "violationType": "hard",
  "estimatedTokens": 195000,
  "budgetLimit": 180000,
  "provider": "anthropic"
}
```

Action: Split the stream into smaller sub-streams, reduce owned/read files, or switch to a provider with a larger context window (e.g., Gemini).
## Configuring Budgets
You can configure per-stream context budgets in the stream definition:
"large-stream": {
name: "Large Refactor",
owns: ["src/core.ts"],
reads: ["src/types.ts", "src/config.ts"],
contextBudget: {
softLimitTokens: 100000,
hardLimitTokens: 150000,
enforcementLevel: "warn", // "warn" | "soft" | "hard"
warningThreshold: 0.8 // Warn at 80% of soft limit
}
}| Field | Default | Description |
|---|---|---|
softLimitTokens |
Provider-dependent | Warning threshold |
hardLimitTokens |
Provider-dependent | Failure threshold |
enforcementLevel |
"warn" |
"warn" = log only, "soft" = warn + degrade, "hard" = fail |
warningThreshold |
0.8 | Fraction of soft limit that triggers a warning (0.0-1.0) |
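One way to picture how the three thresholds interact is the ordered check below. This is a sketch under the semantics described in the table (warning threshold fires first, then soft, then hard), not orchex's actual implementation:

```typescript
type Violation = "none" | "warning" | "soft" | "hard";

// Classify an estimated context size against a stream's budget.
// Checks are ordered from most to least severe so each estimate
// maps to exactly one violation level.
function checkBudget(
  estimatedTokens: number,
  softLimitTokens: number,
  hardLimitTokens: number,
  warningThreshold = 0.8,
): Violation {
  if (estimatedTokens > hardLimitTokens) return "hard";
  if (estimatedTokens > softLimitTokens) return "soft";
  if (estimatedTokens > softLimitTokens * warningThreshold) return "warning";
  return "none";
}
```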
## Reducing Context Usage
### 1. Split Large Streams
The most effective strategy. If a stream owns 6+ files, split it:
```ts
// Before: One stream, high context
"full-api": {
  owns: ["src/routes/users.ts", "src/routes/posts.ts",
         "src/routes/auth.ts", "src/routes/billing.ts",
         "tests/api.test.ts"],
  reads: ["src/types/api.ts", "src/config.ts"]
}

// After: Focused streams, manageable context
"api-users": {
  owns: ["src/routes/users.ts"],
  reads: ["src/types/api.ts"]
},
"api-posts": {
  owns: ["src/routes/posts.ts"],
  reads: ["src/types/api.ts"]
}
```

### 2. Minimize Read Files
Each file in `reads` adds its entire content to the context. Include only files the LLM actually needs:
```ts
// Bad: Reading entire config
reads: ["src/config.ts"]           // 500 lines of config

// Better: Read only the relevant section
reads: ["src/config/database.ts"]  // 50 lines
```

### 3. Use Dependency Context Instead of Reads
If a file was created by an upstream stream, its artifact summary is automatically included in downstream context. You don't need to add it to reads unless the LLM needs the full file content:
"api-routes": {
deps: ["api-types"],
// api-types' artifact summary is automatically included
// Only add to reads if you need full file content
}4. Choose the Right Provider
For streams with inherently large context (many files, long code), use a provider with a larger window:
"massive-refactor": {
provider: "gemini", // 1M token context
owns: ["src/legacy/core.ts"],
reads: ["src/legacy/types.ts", "src/legacy/utils.ts",
"src/legacy/config.ts", "src/legacy/helpers.ts"]
}Stream Category Recommendations
orchex categorizes streams by their file patterns and recommends max file counts:
| Category | Max `owns` | Max `reads` | Notes |
|---|---|---|---|
| code | 4 | 4 | Implementation files |
| docs | 6 | 3 | Documentation pages |
| tutorial | 3 | 4 | Tutorial sections |
| test | 4 | 5 | Test files with imports |
| migration | 3 | 4 | Schema migrations |
These are guidelines, not hard limits. Exceeding them increases the risk of timeouts and degraded output.
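Since the recommendations are advisory, a checker would collect warnings rather than fail. A minimal sketch using the limits from the table above (the function and type names are illustrative, not part of orchex):

```typescript
interface CategoryLimits {
  maxOwns: number;
  maxReads: number;
}

// Recommended maximum file counts per category, from the table above.
const CATEGORY_LIMITS: Record<string, CategoryLimits> = {
  code:      { maxOwns: 4, maxReads: 4 },
  docs:      { maxOwns: 6, maxReads: 3 },
  tutorial:  { maxOwns: 3, maxReads: 4 },
  test:      { maxOwns: 4, maxReads: 5 },
  migration: { maxOwns: 3, maxReads: 4 },
};

// Return advisory warnings; an empty array means the stream is
// within the recommended bounds (or the category is unknown).
function categoryWarnings(
  category: string,
  owns: string[],
  reads: string[],
): string[] {
  const limits = CATEGORY_LIMITS[category];
  if (!limits) return [];
  const warnings: string[] = [];
  if (owns.length > limits.maxOwns)
    warnings.push(`owns ${owns.length} files (recommended max ${limits.maxOwns})`);
  if (reads.length > limits.maxReads)
    warnings.push(`reads ${reads.length} files (recommended max ${limits.maxReads})`);
  return warnings;
}
```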
## Adaptive Learning
orchex tracks context budget usage across executions and adapts thresholds based on your project's history:
- Per-category limits — Code streams vs. documentation vs. tutorials have different optimal budgets
- Confidence levels — Low (0-49 samples), Medium (50-99), High (100+)
- Persistent state — Saved in `.orchex/learn/thresholds.json`
As orchex accumulates execution history, its budget estimates become more accurate for your specific codebase and coding patterns.
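The confidence buckets above map directly to sample counts. A sketch based solely on the ranges stated (0-49 low, 50-99 medium, 100+ high); the function name is illustrative:

```typescript
type Confidence = "low" | "medium" | "high";

// Bucket a category's execution-sample count into a confidence level.
function confidenceLevel(samples: number): Confidence {
  if (samples >= 100) return "high";
  if (samples >= 50) return "medium";
  return "low";
}
```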
## Monitoring Budget Usage
### During `orchex learn`
The learn pipeline estimates token counts for each generated stream and warns about potential budget issues:
Stream "large-docs" estimated at 156,000 tokens (Anthropic soft limit: 140,000)
→ Consider splitting into focused sub-streams or using Gemini providerDuring `orchex execute`
Budget usage is reported in real-time:
```
Wave 1: [auth-types] context: 12,400 tokens (6% of 200K)
        [api-types]  context: 18,200 tokens (9% of 200K)
Wave 2: [auth-api]   context: 45,600 tokens (23% of 200K)
```

### In Telemetry
Context budget metrics are recorded in orchex's telemetry system:
- `contextTokensEstimated` — Pre-execution estimate
- `contextTokensActual` — Actual tokens used
- `contextBudgetUtilization` — Ratio of actual to limit (0.0-1.0)
- `budgetViolationType` — `"none"`, `"soft"`, or `"hard"`
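The derived metrics can be computed from actual usage and the stream's limits. This sketch assumes the utilization ratio is taken against the hard limit and clamped to the stated 0.0-1.0 range; both assumptions are mine, and the function name is illustrative:

```typescript
// Derive the telemetry fields listed above from actual token usage.
// Assumption: utilization is actual / hard limit, clamped to 1.0.
function budgetTelemetry(
  actualTokens: number,
  softLimit: number,
  hardLimit: number,
) {
  const contextBudgetUtilization = Math.min(1, actualTokens / hardLimit);
  const budgetViolationType =
    actualTokens > hardLimit ? "hard" :
    actualTokens > softLimit ? "soft" :
    "none";
  return { contextBudgetUtilization, budgetViolationType };
}
```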
## Related
- Streams — Stream definitions and file ownership
- Providers — Provider context limits
- orchex learn — Context budget warnings during plan parsing
- Wave Planning — Optimizing stream decomposition