Self-Healing
When a stream fails, orchex doesn't just retry blindly. It categorizes the error, generates a targeted fix stream with specific repair instructions, and retries with augmented context — up to 3 attempts.
How Self-Healing Works
Stream executes → Verify commands run → Failure detected
↓
Error categorized (1 of 10 types)
↓
Fix stream generated with:
• Original plan
• Error output
• Category-specific fix instructions
↓
Fix stream executes
↓
Pass → Continue | Fail → Retry (up to 3 attempts total)
Step by Step
1. Detection — A stream fails during execution or when its verify commands return non-zero exit codes
2. Categorization — orchex analyzes the error output and classifies it into one of 10 error categories
3. Fix Generation — A new fix stream is created with the original plan, the error details, and category-specific repair instructions
4. Inheritance — The fix stream inherits the parent stream's file ownership, dependencies, and verify commands
5. Execution — The fix stream runs with augmented context
6. Chain Limit — If the fix fails, steps 2-5 repeat. After 3 total attempts, the stream is marked as permanently failed
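The control flow of these steps can be sketched roughly as follows. This is a minimal illustration, not orchex's implementation: the `Stream` and `RunResult` shapes, the `runWithSelfHealing` function, and the `execute` and `categorize` callbacks are all hypothetical names introduced here.

```typescript
// Hypothetical shapes for illustration; these are not orchex's real types.
interface Stream {
  id: string;
  plan: string;
  parentStreamId?: string;
}

interface RunResult {
  ok: boolean;
  errorOutput?: string;
}

const MAX_ATTEMPTS = 3; // original attempt + 2 fix streams

function runWithSelfHealing(
  original: Stream,
  execute: (stream: Stream) => RunResult, // assumed to run the stream, then its verify commands
  categorize: (errorOutput: string) => string,
): { status: "complete" | "failed"; attempts: string[] } {
  const attempts: string[] = [];
  let current = original;

  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    attempts.push(current.id);
    const result = execute(current);
    if (result.ok) {
      return { status: "complete", attempts };
    }
    if (attempt === MAX_ATTEMPTS) {
      break; // chain limit reached: no further fix stream is generated
    }
    // Build the next fix stream with augmented context.
    const errorOutput = result.errorOutput ?? "";
    current = {
      id: `${original.id}_fix${attempt}`,
      parentStreamId: current.id, // link to the immediate parent
      plan: [
        `ERROR CATEGORY: ${categorize(errorOutput)}`,
        `ERROR OUTPUT: ${errorOutput}`,
        `ORIGINAL PLAN: ${original.plan}`,
      ].join("\n"),
    };
  }
  return { status: "failed", attempts };
}
```

The loop counts the original run as attempt 1, so at most two fix streams are ever generated before the stream is reported as failed.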
Error Categories
orchex recognizes 10 distinct error types, each with a targeted fix strategy:
| Category | Example Error | Fix Strategy |
|---|---|---|
| TIMEOUT | Stream exceeded time limit | Retry with increased timeout, suggest splitting the stream |
| TEST_FAILURE | vitest or jest tests failed | Include test output, ask LLM to fix specific failing assertions |
| LINT_ERROR | ESLint or Prettier violations | Include lint output, ask LLM to fix specific rules |
| TYPE_ERROR | TypeScript tsc compilation failed | Include compiler errors with line numbers, fix type mismatches |
| BUILD_ERROR | Build step (npm run build) failed | Include build output, fix configuration or missing dependencies |
| RUNTIME_ERROR | Code threw an exception at runtime | Include stack trace, fix logic errors |
| SYNTAX_ERROR | Invalid JavaScript/TypeScript syntax | Include parser error, fix syntax |
| IMPORT_ERROR | Module not found / import resolution failure | Fix import paths, add missing dependencies |
| PERMISSION_ERROR | File ownership violation | Fix file access patterns to respect owns boundaries |
| UNKNOWN | Unrecognized error | General retry with full error context |
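To make the idea of categorization concrete, here is one way a classifier over raw error output could look. The patterns, their ordering, and the `categorize` function are assumptions made for this sketch; the source does not describe how orchex actually matches errors.

```typescript
type ErrorCategory =
  | "TIMEOUT" | "TEST_FAILURE" | "LINT_ERROR" | "TYPE_ERROR" | "BUILD_ERROR"
  | "RUNTIME_ERROR" | "SYNTAX_ERROR" | "IMPORT_ERROR" | "PERMISSION_ERROR" | "UNKNOWN";

// Illustrative patterns only. The first matching rule wins, so more
// specific signals are listed before broader ones.
const RULES: Array<[ErrorCategory, RegExp]> = [
  ["TIMEOUT", /timed out|exceeded time limit/i],
  ["PERMISSION_ERROR", /ownership violation/i],
  ["IMPORT_ERROR", /cannot find module|module not found/i],
  ["TYPE_ERROR", /error TS\d+/], // tsc diagnostics carry a TS error code
  ["SYNTAX_ERROR", /SyntaxError/],
  ["LINT_ERROR", /eslint|prettier/i],
  ["TEST_FAILURE", /^\s*FAIL\s/m], // a line starting with FAIL, as in the test output shown in this document
  ["BUILD_ERROR", /npm run build|build failed/i],
  ["RUNTIME_ERROR", /^\s*at .+:\d+:\d+\)?$/m], // a JavaScript stack-trace frame
];

function categorize(errorOutput: string): ErrorCategory {
  for (const [category, pattern] of RULES) {
    if (pattern.test(errorOutput)) {
      return category;
    }
  }
  return "UNKNOWN"; // nothing matched: fall back to a general retry
}
```

A real classifier would likely also use the failing command and exit code, since text patterns alone can overlap (for example, a tsc "Cannot find module" diagnostic matches the import rule here before the type rule).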
Fix Stream Anatomy
When a test failure is detected, orchex generates a fix stream like this:
{
  id: "auth-middleware_fix1",
  name: "Fix: Auth Middleware (attempt 1)",
  owns: ["src/middleware/auth.ts"],      // Same as parent
  reads: ["src/types/auth.ts"],          // Same as parent
  parentStreamId: "auth-middleware",     // Links to parent
  plan: `
    The previous attempt to implement auth middleware failed.
    ERROR CATEGORY: TEST_FAILURE
    ERROR OUTPUT:
    FAIL tests/auth.test.ts
    ✕ returns 401 for expired tokens (12ms)
    Expected: 401
    Received: 500
    ORIGINAL PLAN:
    Create Express middleware that validates JWT tokens...
    FIX INSTRUCTIONS:
    The test expects a 401 response for expired tokens, but the middleware
    is returning 500. Wrap the jwt.verify() call in a try/catch and return
    res.status(401) when a TokenExpiredError is caught.
  `,
  verify: ["npx vitest run tests/auth.test.ts"]
}
Key properties of fix streams:
- Same ownership — The fix stream owns the same files as the original
- Same verify commands — It must pass the same checks
- Augmented plan — Includes the error output and targeted fix instructions
- Parent chain — parentStreamId links to the original stream for tracking
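The properties above suggest a simple construction: copy the inherited fields from the original stream and assemble an augmented plan. The sketch below assumes a `Stream` shape modeled on the example object and introduces a hypothetical `buildFixStream` helper and `Failure` type; none of these names come from orchex itself.

```typescript
// Field names mirror the example fix stream; the types are assumptions.
interface Stream {
  id: string;
  name: string;
  owns: string[];
  reads: string[];
  plan: string;
  verify: string[];
  parentStreamId?: string;
}

interface Failure {
  category: string;
  errorOutput: string;
  fixInstructions: string;
}

function buildFixStream(
  original: Stream,
  parentId: string, // id of the stream that just failed
  fixNumber: number,
  failure: Failure,
): Stream {
  return {
    id: `${original.id}_fix${fixNumber}`,
    name: `Fix: ${original.name} (attempt ${fixNumber})`,
    owns: [...original.owns], // same ownership
    reads: [...original.reads],
    verify: [...original.verify], // must pass the same checks
    parentStreamId: parentId,
    plan: [
      `ERROR CATEGORY: ${failure.category}`,
      "ERROR OUTPUT:",
      failure.errorOutput,
      "ORIGINAL PLAN:",
      original.plan,
      "FIX INSTRUCTIONS:",
      failure.fixInstructions,
    ].join("\n"),
  };
}
```

Copying the arrays rather than sharing them keeps a later change to the fix stream from silently altering the original definition.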
Chain Limits
Fix attempts are limited to prevent infinite loops:
- Maximum 3 total attempts (original + 2 fixes)
- Chain tracking — parentStreamId links to the immediate parent, and orchex traverses the full chain to count attempts
- Escalation — After 3 failures, the stream is marked as failed and requires manual intervention
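Counting attempts by walking parentStreamId links could look like the sketch below. The `StreamRef` type and the `countAttempts` and `canRetry` helpers are hypothetical, and the lookup table keyed by stream id is an assumption about how streams might be stored.

```typescript
interface StreamRef {
  id: string;
  parentStreamId?: string;
}

const MAX_ATTEMPTS = 3; // original + 2 fixes

// Walk from a stream back to the original, counting every link in the chain.
function countAttempts(streamId: string, streams: Record<string, StreamRef>): number {
  const seen: Record<string, boolean> = {}; // guards against accidental cycles
  let count = 0;
  let current: StreamRef | undefined = streams[streamId];
  while (current !== undefined && !seen[current.id]) {
    seen[current.id] = true;
    count++;
    current = current.parentStreamId !== undefined
      ? streams[current.parentStreamId]
      : undefined;
  }
  return count;
}

// A failed stream may spawn another fix only while the chain is below the limit.
function canRetry(failedStreamId: string, streams: Record<string, StreamRef>): boolean {
  return countAttempts(failedStreamId, streams) < MAX_ATTEMPTS;
}
```

For the chain in this section, `auth-middleware_fix2` is the third link, so `canRetry` returns false for it and the stream is escalated.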
auth-middleware (attempt 1) → FAIL
↓
auth-middleware_fix1 (attempt 2) → FAIL
↓
auth-middleware_fix2 (attempt 3) → FAIL
↓
Stream marked as FAILED (manual intervention needed)
When to Intervene
Self-healing handles most issues automatically. You should step in when:
The Same Error Repeats
If the same error appears across all 3 attempts, the LLM doesn't understand how to fix it. Common causes:
- Missing dependency not in reads
- Incorrect assumption in the plan
- API or library that doesn't exist
The Error is Architectural
Self-healing fixes code-level issues. It can't restructure your stream definitions:
- Wrong stream decomposition
- Missing streams
- Incorrect dependency ordering
A Dependency is Missing
If a stream needs a file that hasn't been created yet and isn't in its reads or deps, self-healing can't add the dependency. Update the stream definition manually.
Verify Command Isolation
orchex runs verify commands per-stream, not globally. This means:
- Stream A's verify failure doesn't block Stream B in the same wave
- Each fix stream re-runs only its own verify commands
- Cross-stream verify failures (e.g., a shared type check) are attributed to the stream that triggered them
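Per-stream verification can be pictured as below. This is only a sketch under assumptions: `verifyWave` and the `RunCommand` callback are hypothetical, and the command runner is injected so the example does not depend on how orchex actually spawns processes.

```typescript
interface Stream {
  id: string;
  verify: string[];
}

// Assumed runner: executes one command and returns its exit code.
type RunCommand = (command: string) => number;

function verifyWave(wave: Stream[], run: RunCommand): Record<string, "pass" | "fail"> {
  const results: Record<string, "pass" | "fail"> = {};
  for (const stream of wave) {
    // every() stops at the first non-zero exit code for this stream only;
    // other streams in the wave are still verified.
    const passed = stream.verify.every((command) => run(command) === 0);
    results[stream.id] = passed ? "pass" : "fail";
  }
  return results;
}
```

Because each result is keyed by stream id, a failure in one stream leaves the others' results untouched, and a fix stream only needs its own verify list re-run.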
Tier Gating
Self-healing is available on Pro tier and above. On the Free (Local) tier, failed streams are marked as failed without automatic fix attempts.
| Tier | Self-Healing |
|---|---|
| Local (Free) | Not available |
| Pro | Up to 3 attempts |
| Team | Up to 3 attempts |
| Enterprise | Up to 3 attempts (configurable) |
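Expressed as code, the tier table reduces to a lookup for the attempt limit. The `Tier` names, the `maxAttemptsForTier` helper, and the optional Enterprise override parameter are illustrative assumptions; the source does not say how the Enterprise limit is configured.

```typescript
type Tier = "local" | "pro" | "team" | "enterprise";

const DEFAULT_MAX_ATTEMPTS = 3;

function maxAttemptsForTier(tier: Tier, enterpriseLimit?: number): number {
  switch (tier) {
    case "local":
      return 1; // the original attempt only; no fix streams are generated
    case "pro":
    case "team":
      return DEFAULT_MAX_ATTEMPTS;
    case "enterprise":
      return enterpriseLimit ?? DEFAULT_MAX_ATTEMPTS; // configurable on this tier
  }
}
```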
Monitoring Self-Healing
During execution, orchex reports self-healing activity:
Wave 2: [auth-middleware] FAILED — TEST_FAILURE
→ Generating fix stream (attempt 2/3)
→ [auth-middleware_fix1] executing...
→ [auth-middleware_fix1] PASS
After execution, orchex.status() shows the self-healing chain:
{
  "streams": {
    "auth-middleware": {
      "status": "failed",
      "error": "TEST_FAILURE: Expected 401, received 500"
    },
    "auth-middleware_fix1": {
      "status": "complete",
      "parentStreamId": "auth-middleware"
    }
  }
}
Related
- Streams — Stream lifecycle and verification
- Ownership — How fix streams inherit ownership
- Providers — Self-healing works across all providers
- Error Handling (Advanced) — All 10 error categories in detail