Overview
CheapestInference is an AI inference proxy that gives you access to every major model through a single API with flat monthly pricing. No per-token charges.
Base URL
Section titled “Base URL”https://api.cheapestinference.comHow it works
Section titled “How it works”CheapestInference routes your requests to the appropriate model provider (OpenAI, Anthropic, Google, Meta, DeepSeek, Qwen, Moonshot).
- You subscribe to a plan or top up credits
- You create API keys from the dashboard
- You use those keys with the OpenAI or Anthropic SDK — just change the base URL
- The platform validates your key, enforces rate limits, and forwards the request to the provider
Your API key works exactly like an OpenAI or Anthropic key. All routing, rate limiting, and spend tracking is handled automatically.
Supported endpoints
Section titled “Supported endpoints”| Endpoint | Description |
|---|---|
POST /v1/chat/completions | OpenAI-compatible chat (all models) |
POST /anthropic/v1/messages | Anthropic-compatible messages |
POST /v1/embeddings | Text embeddings |
GET /v1/models | List available models |
The response format matches the official OpenAI and Anthropic APIs exactly.
Rate limits
Section titled “Rate limits”Rate limits are enforced per key and reset every minute:
| Limit | Standard | Pro |
|---|---|---|
| Requests per minute (RPM) | 60 | 200 |
| Tokens per minute (TPM) | 3,333 | 13,333 |
Each API key has its own independent limits — one key hitting its limit does not affect other keys.
Payment options
Section titled “Payment options”| Method | How |
|---|---|
| Card | Visa, Mastercard, etc. via Stripe |
| USDC | Direct transfer on Base L2 (MetaMask, Coinbase Wallet) |
| Credits | Top up $5–$50, pay as you go, no subscription required |
Subscriptions last 30 days with no auto-renewal. You renew manually when ready.
x402 protocol
Section titled “x402 protocol”Requests without an API key receive a 402 Payment Required response with USDC pricing. AI agents can pay per request using the x402 protocol on Base L2 — no account needed.
No custom SDK required. Use the official OpenAI or Anthropic SDK in any language:
- Python:
openai,anthropic - Node.js:
openai,@anthropic-ai/sdk - Any OpenAI-compatible client (Go, Rust, Java, etc.)