Documentation

inference-relay routes expensive AI inference to your users' existing subscriptions. One line of code. Pennies for orchestration. 98% cost savings.

Getting Started
Prerequisites · Install · First call (non-streaming)+3 more
Security
The Dumb Pipe Guarantee · Compile-time proof · Credential handling+3 more
Enterprise
The Shadow AI problem · Compliant by default · Fleet policy+3 more
Providers
Claude CLI (Native Gateway) · Anthropic API · OpenAI+2 more
Routing
Default behavior · Cost-based routing · Custom routing function+1 more
Fallback
How cascade works · Health states · Error classification+1 more
Streaming
RelayStream interface · Event handlers · finalMessage()+1 more
Instruction Protocol
Two-envelope format · Opaque field mapping · compileInstruction()+1 more
API Reference
InferenceRelay · BaseProvider · FallbackEngine+2 more
Advisory Opinion
Executive Summary · The Terms Landscape · Why inference-relay Is Compliant+2 more
Quick Start
npm install inference-relay

# Add to .env
IR_LICENSE_KEY=ir_live_your_key_here

# Add to your code
import 'inference-relay/auto';

// Done. Your existing Anthropic SDK calls now relay automatically.