Documentation
inference-relay routes expensive AI inference to your users' existing subscriptions. One line of code. Pennies for orchestration. 98% cost savings.
Getting Started
Prerequisites · Install · First call (non-streaming)+3 more
Security
The Dumb Pipe Guarantee · Compile-time proof · Credential handling+3 more
Enterprise
The Shadow AI problem · Compliant by default · Fleet policy+3 more
Providers
Claude CLI (Native Gateway) · Anthropic API · OpenAI+2 more
Routing
Default behavior · Cost-based routing · Custom routing function+1 more
Fallback
How cascade works · Health states · Error classification+1 more
Streaming
RelayStream interface · Event handlers · finalMessage()+1 more
Instruction Protocol
Two-envelope format · Opaque field mapping · compileInstruction()+1 more
API Reference
InferenceRelay · BaseProvider · FallbackEngine+2 more
Advisory Opinion
Executive Summary · The Terms Landscape · Why inference-relay Is Compliant+2 more
Quick Start
npm install inference-relay # Add to .env IR_LICENSE_KEY=ir_live_your_key_here # Add to your code import 'inference-relay/auto'; // Done. Your existing Anthropic SDK calls now relay automatically.