x402 Protocol

The x402 protocol lets AI agents subscribe to a model pool or purchase credits on CheapestInference autonomously, with no human setup required. Agents pay with USDC on Base L2 and receive an API key instantly.

  1. Agent sends a request to any inference endpoint without an API key
  2. The server responds with 402 Payment Required including a product catalog
  3. Agent selects a product (subscription or credit package) and pays via USDC on Base
  4. Agent receives an API key and uses it for all subsequent requests

When a request arrives without an API key, the response includes available products:

{
  "error": "Payment Required",
  "message": "Subscribe or purchase credits to access AI inference. Use the endpoints below to get an API key.",
  "x402": {
    "version": 2,
    "network": "base",
    "asset": "USDC",
    "payTo": "0x...",
    "facilitator": "https://...",
    "products": [
      {
        "type": "subscription",
        "pool": "kimi26",
        "name": "Kimi K2.6, GLM 4.7, MiniMax M2.5",
        "models": ["kimi-k2.6", "glm-4.7", "MiniMax-M2.5"],
        "currency": "USDC",
        "duration": "30 days",
        "blocks": [
          { "block": "asia", "hoursUtc": "00:00-08:00 UTC", "priceUsdc": "39.00", "availableSeats": 100 },
          { "block": "europe", "hoursUtc": "08:00-16:00 UTC", "priceUsdc": "49.00", "availableSeats": 100 },
          { "block": "americas", "hoursUtc": "16:00-24:00 UTC", "priceUsdc": "45.00", "availableSeats": 100 }
        ],
        "agentCheckout": {
          "endpoint": "https://api.cheapestinference.com/api/agent/subscribe-pool",
          "method": "POST",
          "body": { "txHash": "<USDC_TX_HASH>", "poolSlug": "kimi26", "blocks": ["asia"] }
        }
      },
      {
        "type": "credits",
        "minAmount": "10.00",
        "currency": "USDC",
        "description": "Pay-as-you-go credits, never expire",
        "agentTopup": {
          "endpoint": "https://api.cheapestinference.com/api/agent/topup",
          "method": "POST",
          "body": { "txHash": "<USDC_TX_HASH>", "amount": "<AMOUNT_USD>" }
        }
      }
    ],
    "docs": "https://docs.cheapestinference.com/guides/x402"
  }
}
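A minimal sketch of how an agent might read this catalog, assuming the response body has the shape shown above. The helper names (`find_subscription`, `open_blocks`, `credits_min_amount`) are illustrative and not part of any CheapestInference SDK. Treating `availableSeats` of 0 as "sold out" is also an assumption.

```python
import json
from decimal import Decimal

# Trimmed copy of the example 402 body above; real responses carry live values.
SAMPLE_402_BODY = json.loads("""
{
  "error": "Payment Required",
  "x402": {
    "version": 2,
    "network": "base",
    "asset": "USDC",
    "payTo": "0x...",
    "products": [
      {
        "type": "subscription",
        "pool": "kimi26",
        "blocks": [
          {"block": "asia", "priceUsdc": "39.00", "availableSeats": 100},
          {"block": "europe", "priceUsdc": "49.00", "availableSeats": 100},
          {"block": "americas", "priceUsdc": "45.00", "availableSeats": 100}
        ]
      },
      {"type": "credits", "minAmount": "10.00", "currency": "USDC"}
    ]
  }
}
""")


def find_subscription(body: dict, pool: str) -> dict:
    """Return the subscription product for `pool`, or raise if it is absent."""
    for product in body["x402"]["products"]:
        if product.get("type") == "subscription" and product.get("pool") == pool:
            return product
    raise LookupError(f"no subscription product for pool {pool!r}")


def open_blocks(product: dict) -> dict:
    """Map block name -> price for blocks that still report seats (assumed meaning)."""
    return {
        b["block"]: Decimal(b["priceUsdc"])
        for b in product["blocks"]
        if b.get("availableSeats", 0) > 0
    }


def credits_min_amount(body: dict) -> Decimal:
    """Return the minimum credit purchase advertised in the catalog."""
    for product in body["x402"]["products"]:
        if product.get("type") == "credits":
            return Decimal(product["minAmount"])
    raise LookupError("no credits product in catalog")
```

Prices are parsed as `Decimal` rather than `float` so that string amounts such as "39.00" are handled exactly.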

Pool subscriptions are billed per 8-hour daily time block. Reserve one or more blocks; reserving all three gives full 24/7 access. Prices are returned live in the 402 catalog, so the values below are illustrative: always read blocks[].priceUsdc from the response and send the summed price of your chosen blocks.
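The summing step can be sketched as follows. The block list here reuses the illustrative prices from the example catalog, and the function names are hypothetical. The conversion to base units relies on USDC using 6 decimal places on-chain. How the transfer itself is signed and sent is outside this sketch.

```python
from decimal import Decimal

USDC_DECIMALS = 6  # USDC uses 6 decimal places on-chain

# Illustrative values copied from the example catalog; read live ones in practice.
EXAMPLE_BLOCKS = [
    {"block": "asia", "priceUsdc": "39.00"},
    {"block": "europe", "priceUsdc": "49.00"},
    {"block": "americas", "priceUsdc": "45.00"},
]


def total_price_usdc(blocks: list, chosen: list) -> Decimal:
    """Sum priceUsdc over the chosen block names; raise on unknown names."""
    prices = {b["block"]: Decimal(b["priceUsdc"]) for b in blocks}
    missing = [name for name in chosen if name not in prices]
    if missing:
        raise KeyError(f"blocks not in catalog: {missing}")
    return sum((prices[name] for name in chosen), Decimal("0"))


def to_base_units(amount: Decimal) -> int:
    """Convert a USDC amount to integer base units for a token transfer."""
    scaled = amount.scaleb(USDC_DECIMALS)
    if scaled != scaled.to_integral_value():
        raise ValueError("amount has more than 6 decimal places")
    return int(scaled)


total = total_price_usdc(EXAMPLE_BLOCKS, ["asia", "europe"])
print(total, to_base_units(total))
```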

| Product | Price (illustrative) | What you get |
|---|---|---|
| Pool block (Asia-Pacific) | ~$39 USDC | 30-day unlimited key for that 8h daily window, no budget cap, 1 concurrent request |
| Pool block (Europe) | ~$49 USDC | 30-day unlimited key for that 8h daily window |
| Pool block (Americas) | ~$45 USDC | 30-day unlimited key for that 8h daily window |
| All three blocks | sum of the three | Full 24/7 unlimited access |
| Credits | $10+ USDC | Pay-as-you-go balance, never expires |

# 1. Agent requests without a key and gets a 402 with the live product catalog
curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "kimi-k2.6", "messages": [{"role": "user", "content": "Hello"}]}'
# Response: 402 { "x402": { "version": 2, "products": [...], "payTo": "0x..." } }

# 2. Agent sends the summed USDC price of its chosen blocks to payTo on Base,
#    then subscribes to the pool (poolSlug + blocks from the catalog)
curl -X POST https://api.cheapestinference.com/api/agent/subscribe-pool \
  -H "Content-Type: application/json" \
  -d '{"txHash": "0xabc...", "poolSlug": "kimi26", "blocks": ["asia"]}'
# Response: { "success": true, "data": { "apiKey": "sk-...", "pool": "kimi26", "blocks": ["asia"], "amountUsdc": "39.00", "expiresAt": "..." } }

# 3. Agent uses the key for inference
curl https://api.cheapestinference.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model": "kimi-k2.6", "messages": [{"role": "user", "content": "Hello"}]}'
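The checkout call in step 2 could be prepared in Python roughly as below. This is a sketch, not an official client: the function names are hypothetical, the request is built from the `agentCheckout` entry of the catalog, and nothing is sent over the network here. The response shape passed to `extract_api_key` is taken from the example above.

```python
import json
import urllib.request


def build_subscribe_request(product: dict, tx_hash: str, blocks: list) -> urllib.request.Request:
    """Build (but do not send) the checkout request described by agentCheckout."""
    checkout = product["agentCheckout"]
    payload = {"txHash": tx_hash, "poolSlug": product["pool"], "blocks": blocks}
    return urllib.request.Request(
        checkout["endpoint"],
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=checkout["method"],
    )


def extract_api_key(response_body: str) -> str:
    """Pull data.apiKey out of a successful subscribe response."""
    parsed = json.loads(response_body)
    if not parsed.get("success"):
        raise RuntimeError(f"subscription failed: {parsed}")
    return parsed["data"]["apiKey"]


# Product entry trimmed from the example catalog.
EXAMPLE_PRODUCT = {
    "pool": "kimi26",
    "agentCheckout": {
        "endpoint": "https://api.cheapestinference.com/api/agent/subscribe-pool",
        "method": "POST",
    },
}

request = build_subscribe_request(EXAMPLE_PRODUCT, "0xabc...", ["asia"])
print(request.get_method(), request.full_url)
```

To actually send it, pass the request to `urllib.request.urlopen`. Note that `urlopen` raises `urllib.error.HTTPError` for non-2xx statuses, including the initial 402, and the catalog body is then available from the exception's `read()` method.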

For credit top-ups, agents use a similar flow:

# Send USDC to payTo on Base, then submit the transaction hash to credit the balance
curl -X POST https://api.cheapestinference.com/api/agent/topup \
  -H "Content-Type: application/json" \
  -d '{"txHash": "0xdef...", "amount": 50}'
# Response: { "success": true, "data": { "credited": true, "newBalance": 50 } }
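Before calling the top-up endpoint, an agent may want to validate its inputs locally. The sketch below is one way to do that under some assumptions. `build_topup_body` is a hypothetical helper. The minimum comes from `minAmount` in the catalog. The amount is sent as a JSON number to match the curl example, although the catalog template shows only a placeholder, so the accepted type is not confirmed. The hash check relies on Ethereum-style transaction hashes being `0x` followed by 64 hex characters.

```python
import re
from decimal import Decimal

# Ethereum-style transaction hash: 0x followed by 64 hex characters.
TX_HASH_RE = re.compile(r"0x[0-9a-fA-F]{64}")


def build_topup_body(tx_hash: str, amount: str, min_amount: str) -> dict:
    """Validate inputs and return the JSON body for the top-up request."""
    if not TX_HASH_RE.fullmatch(tx_hash):
        raise ValueError("txHash must be 0x followed by 64 hex characters")
    value = Decimal(amount)
    if value < Decimal(min_amount):
        raise ValueError(f"amount {value} is below the catalog minimum {min_amount}")
    # Assumption: send a JSON number, as in the curl example above.
    number = int(value) if value == value.to_integral_value() else float(value)
    return {"txHash": tx_hash, "amount": number}


body = build_topup_body("0x" + "ab" * 32, "50", "10.00")
print(body["amount"])
```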

CheapestInference serves an agent card at:

GET /.well-known/agent.json

This follows the Google A2A protocol and advertises:

  • Available skills (inference, model listing)
  • Supported auth methods (bearer key, x402)
  • Streaming capability
  • API endpoint URL

Agents can discover CheapestInference and autonomously decide to subscribe or purchase credits.
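A rough sketch of the discovery step, with several assumptions stated up front. The host `api.cheapestinference.com` is assumed, since the text above gives only the path. `SAMPLE_CARD` is a made-up card, not the one the service returns. The field names `url`, `skills`, and `capabilities.streaming` follow common A2A agent card usage. The auth section is left out because its layout differs between A2A versions.

```python
import json
from urllib.parse import urljoin

AGENT_CARD_PATH = "/.well-known/agent.json"


def agent_card_url(base_url: str) -> str:
    """Resolve the well-known agent card location against a base URL."""
    return urljoin(base_url, AGENT_CARD_PATH)


# Hypothetical card for illustration only; skill ids are invented.
SAMPLE_CARD = json.loads("""
{
  "name": "CheapestInference",
  "url": "https://api.cheapestinference.com/v1",
  "capabilities": {"streaming": true},
  "skills": [{"id": "inference"}, {"id": "list-models"}]
}
""")


def summarize_card(card: dict) -> dict:
    """Extract the endpoint, streaming flag, and skill ids from a card."""
    return {
        "endpoint": card.get("url"),
        "streaming": bool(card.get("capabilities", {}).get("streaming")),
        "skills": [skill.get("id") for skill in card.get("skills", [])],
    }


print(agent_card_url("https://api.cheapestinference.com"))
print(summarize_card(SAMPLE_CARD))
```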

Pay-per-request pricing is a poor fit for AI inference because request costs vary widely with the model and token count. For example, a 100K-token prompt sent to a model that costs $2/M tokens costs about $0.20 to serve, so a flat price of $0.002 per request would lose money on it. Subscriptions and credits keep pricing sustainable while still giving agents full autonomy.
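The arithmetic behind that example, as a small sketch. It counts only prompt tokens at the quoted rate, as the example does; output tokens would add to the cost. The function name is illustrative.

```python
from decimal import Decimal


def serving_cost(prompt_tokens: int, price_per_million: Decimal) -> Decimal:
    """Cost in USD of processing `prompt_tokens` at a per-million-token rate."""
    return Decimal(prompt_tokens) * price_per_million / Decimal(1_000_000)


cost = serving_cost(100_000, Decimal("2"))  # 100K tokens at $2 per million
flat_price = Decimal("0.002")               # hypothetical flat per-request price
loss = cost - flat_price
ratio = cost / flat_price

print(f"cost={cost} loss={loss} ratio={ratio}")
```

Under these numbers the request costs 100 times what the flat price would bring in.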