Advisory Opinion

Disclaimer

This document is published for public interest and educational purposes only. It is not legal advice. It does not create an attorney-client relationship with any reader. The author publishes this analysis to contribute to the public discourse around AI platform terms, third-party tool ecosystems, and the legal boundaries of building on AI infrastructure.

Do not rely on this analysis for your own legal decisions. The application of contract terms depends on specific facts, jurisdiction, and circumstances. If you are building a product that interacts with AI platform subscriptions, APIs, or credentials, consult your own legal counsel for advice tailored to your situation.

The author is a licensed attorney sharing an analytical framework. Publication of this analysis is an exercise of professional commentary, not the practice of law on behalf of any reader.

TL;DR

inference-relay does not bypass Claude Code. It runs Claude Code. Anthropic's February-April 2026 enforcement targeted tools that replaced Claude Code and called the API directly, generating “unusual traffic patterns without any of the usual telemetry that the Claude Code harness provides” and burning 10-20x compute. inference-relay does the opposite: inference executes through Anthropic's own binary, on the user's own machine, using documented programmatic interfaces that Anthropic designed and published for exactly this kind of use. Both the developer and the user independently pay Anthropic. inference-relay's architecture is compliant with both the letter and spirit of Anthropic's terms.

Prepared: April 6, 2026
Author: Jonathan Blecher; published for public interest
Scope: Anthropic Consumer Terms, February 2026 legal clarification, Commercial Terms, AUP, April 3-4 2026 enforcement action
Posture: Compliance-oriented with candid identification of friction points

I. Executive Summary

inference-relay is a TypeScript library ($20-50/mo) that routes AI inference calls from a developer's application to the end user's existing Claude subscription. It does this by invoking Anthropic's own Claude Code CLI binary as a subprocess on the user's machine, using documented programmatic flags that Anthropic designed and published for non-interactive use. The developer separately maintains their own Anthropic API key for orchestration. Both parties are independent Anthropic customers.

In February-April 2026, Anthropic took enforcement action against tools, principally OpenClaw, that extracted OAuth tokens from the OS keychain and called api.anthropic.com directly, bypassing Claude Code entirely. These tools burned 10-20x more compute by circumventing prompt caching and generated “unusual traffic patterns without any of the usual telemetry that the Claude Code harness provides.”

Bottom line: inference-relay is architecturally, economically, and functionally distinct from the tools Anthropic targeted. It does not bypass Claude Code; it runs Claude Code. It does not burden Anthropic's infrastructure; it creates identical compute load to manual CLI use with full caching preserved. It does not enable freeloading; both the developer and the user independently pay Anthropic.

II. The Terms Landscape

A. Consumer Terms of Service (Effective October 8, 2025)

Automated Access Prohibition:

“Except when you are accessing our Services via an Anthropic API Key or where we otherwise explicitly permit it, to access the Services through automated or non-human means, whether through a bot, script, or otherwise.”

Competing Products / Resale:

“To develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models or resell the Services.”

B. The February 19, 2026 Legal-and-Compliance Clarification

“OAuth authentication is intended exclusively for purchasers of Claude Free, Pro, Max, Team, and Enterprise subscription plans and is designed to support ordinary use of Claude Code and other native Anthropic applications.”
“Developers building products or services that interact with Claude's capabilities, including those using the Agent SDK, should use API key authentication through Claude Console or a supported cloud provider. Anthropic does not permit third-party developers to offer Claude.ai login or to route requests through Free, Pro, or Max plan credentials on behalf of their users.

C. Commercial Terms (Govern the Developer)

“Customer may not and must not attempt to (a) access the Services to build a competing product or service, including to train competing AI models or resell the Services except as expressly approved by Anthropic; (b) reverse engineer or duplicate the Services.”

D. The April 3-4, 2026 Enforcement Action

Anthropic emailed affected users directly on April 3, enforcing on April 4. OpenClaw and similar tools extracted the OAuth token stored by Claude Code in the OS keychain and called the API directly, bypassing Claude Code entirely. Per community reports, a single OpenClaw instance could rack up $1K-$5K/day in API-equivalent compute on a $200/mo Max plan by burning 10-20x more compute per task without prompt caching.

What Anthropic did NOT do: Anthropic did not ban these tools outright. It required “extra usage” (pay-as-you-go billing) rather than subscription credits.

E. What Is a “Harness”?

A third-party harness in the AI context is a third-party application that replaces the provider's own client software as the interaction layer with the model. inference-relay is not a harness by any definition. It does not replace Claude Code. It does not manage the interaction loop with the model. It invokes Claude Code, which manages the interaction. inference-relay is to Claude Code what a cron job is to the program it calls: a scheduling and routing layer, not a replacement.

III. Why inference-relay Is Compliant

A. It Runs Claude Code; It Does Not Replace It

OpenClaw replaced Claude Code. Users ran OpenClaw instead of Claude Code. It was an alternative interface that bypassed Anthropic's software entirely.

inference-relay runs Claude Code. It invokes Anthropic's own CLI binary on the user's machine, through Anthropic's own infrastructure, using Anthropic's documented programmatic interfaces. Anthropic's monitoring systems see normal Claude Code traffic, because it is normal Claude Code traffic.

B. The CLI Was Designed for Programmatic Use

Anthropic's Claude Code CLI exposes an extensive suite of flags that serve no purpose in interactive human use: --print (described as “non-interactive/SDK mode”), --output-format stream-json, --json-schema, --max-budget-usd, --dangerously-skip-permissions, and dozens more. If Anthropic intended the CLI to be used only interactively by humans, these flags would not exist.

C. No Resale

Both parties independently pay Anthropic. The developer pays for API access (orchestration). The user pays for their Claude subscription (execution). inference-relay sells routing software, not Anthropic access. The analogies are well-established: Terraform provisions infrastructure in the user's own cloud account; a CI/CD pipeline invokes tools the user has installed; a database connection pool manages connections to the user's own database.

D. No Reverse Engineering

inference-relay uses the CLI's documented public interface. It never decompiles, disassembles, or inspects the binary. It treats the binary as a black box: send structured input through documented flags, receive structured output through designed interfaces.

E. No Competing Product

inference-relay is not a chatbot, coding assistant, collaborative workspace, AI model, or API provider. It is a routing library. Indeed, inference-relay drives subscription retention: every developer who integrates it creates users who need Claude subscriptions.

F. Zero Additional Compute Burden

Inference runs through Anthropic's own CLI binary. Prompt caching is fully preserved. Compute load is 1x, identical to manual CLI use. The programmatic interface is actually less compute per call, not more.

IV. The OpenClaw Distinction

FactorOpenClaw (blocked)inference-relay
Anthropic software usedNone: bypassed Claude CodeClaude Code CLI: Anthropic's own binary
API access methodDirect HTTP to api.anthropic.comCLI via documented programmatic interface
Prompt cachingBypassed: 10-20x computePreserved: 1x compute, verified
TelemetryUnusual traffic patternsNormal Claude Code telemetry
Replaces Claude Code?YesNo: invokes Claude Code
Revenue impactOne party paysBoth parties pay Anthropic

V. Remaining Friction Points

1. The “All Third-Party Harnesses” Language (Low Risk)

The language must be read in context. The email was addressed to OpenClaw users about OpenClaw-type tools. inference-relay's inference runs through Claude Code. The “harness” label does not describe inference-relay's architecture.

2. Keychain Credential Access (Low Risk)

The token is used exclusively to authenticate Anthropic's own CLI binary. It is never sent to inference-relay's servers, never written to disk, and disappears when the subprocess ends. The user explicitly consents via the OS permission dialog.

3. Marketing Framing (Moderate Risk, Easily Resolved)

Lead with privacy, enterprise compliance, zero-trust architecture, and local execution. Cost optimization as a secondary benefit, not the headline.

VI. Risk Matrix

IssueRisk LevelRationale
CLI programmatic useLOWDocumented flags including 'SDK mode'
Automated access provisionLOWCLI's programmatic interface is 'explicit permission'
'Third-party harness' classificationLOWRuns Claude Code, does not replace it
Resale characterizationLOWBoth parties independently pay Anthropic
Reverse engineeringLOWUses documented public interface
Competing productNEGLIGIBLEDrives subscription retention
Compute burdenNEGLIGIBLE1x compute, caching preserved
Keychain credential accessLOWUser's credential, user's consent
Marketing framingMODERATEEasily resolved; reframe before launch

VII. Recommended Actions

  1. Reframe marketing — Lead with privacy, enterprise compliance, zero-trust architecture, and local execution.
  2. Maintain the API-key path — For users who prefer to use a Commercial API key rather than their subscription.
  3. Document transparency — Security documentation should clearly explain credential handling.
  4. Monitor enforcement rollout — Track which tools are targeted and how Anthropic distinguishes permitted from prohibited use patterns.
  5. Draft downstream terms — Require developers to obtain their own API key, inform users, obtain consent, and comply with Anthropic's AUP.

VIII. Conclusion

The February-April 2026 enforcement action was a response to a specific abuse pattern: third-party tools that replaced Claude Code, bypassed its infrastructure, consumed 10-20x the compute Anthropic priced into flat-rate subscriptions, and generated unusual traffic patterns without standard telemetry. The enforcement was economically rational and proportionate.

inference-relay is the architectural opposite. It runs Anthropic's own software. It uses Anthropic's documented programmatic interface. It preserves Anthropic's infrastructure optimizations. It creates zero additional compute burden. It generates standard Claude Code telemetry. It creates two independent revenue streams for Anthropic instead of one.

Compliance is not merely arguable; it follows from the architecture. The product was designed around the Translation Layer principle: the app pays Anthropic for orchestration, the user's Claude Code executes the inference, and inference-relay sells the routing software that connects them. Both the letter and spirit of Anthropic's terms and enforcement actions are satisfied.

This opinion analyzes contractual provisions and enforcement posture as of April 6, 2026. It is published for public interest and educational purposes. See disclaimer above. The analysis is based on publicly available terms, publicly reported enforcement actions, and the documented technical architecture of inference-relay.