Plans & Limits

CheapestInference sells subscriptions to a model pool, billed per reserved time block, with unlimited usage during each block. The active frontier coding pool bundles Kimi K2.6, GLM 4.7, and MiniMax M2.5, with automatic failover across the direct providers (Moonshot, Z.ai, MiniMax).

How pricing works — reserve daily time blocks

The day is split into three 8-hour time blocks, aligned to global peak hours and shown in your local timezone. Reserve one, two, or all three blocks — all three gives full 24/7 coverage.

| Block | Window (UTC) | Price |
| --- | --- | --- |
| Asia-Pacific | 00:00 – 08:00 | $39/mo |
| Europe | 08:00 – 16:00 | $49/mo |
| Americas | 16:00 – 24:00 | $45/mo |

During your reserved hours, usage is unlimited: no tokens to count, no budget cap, and no overage charges. The only ceiling is one concurrent request per key; create more keys if you need parallelism. Outside your reserved blocks, the key does not serve traffic.
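To check locally whether a key should be serving traffic at a given moment, the block windows from the table above can be encoded directly. This is a minimal sketch, not part of the API: the function names are hypothetical, and it assumes each window is half-open, so 08:00 UTC belongs to Europe rather than Asia-Pacific.

```python
from datetime import datetime, timezone

# Block windows in UTC hours, copied from the pricing table.
# Assumption: each window is half-open, [start, end).
BLOCKS = {
    "Asia-Pacific": (0, 8),
    "Europe": (8, 16),
    "Americas": (16, 24),
}


def block_for(moment: datetime) -> str:
    """Return the name of the block that contains a timezone-aware moment."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    hour = moment.astimezone(timezone.utc).hour
    for name, (start, end) in BLOCKS.items():
        if start <= hour < end:
            return name
    raise ValueError(f"no block covers hour {hour}")  # not reachable for 0-23


def key_is_active(reserved: set[str], moment: datetime) -> bool:
    """True if a key with these reserved blocks would serve traffic at moment."""
    return block_for(moment) in reserved
```

For example, `key_is_active({"Europe"}, datetime.now(timezone.utc))` tells a scheduler whether to send work now or hold it until the next reserved window.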

  • Annual billing saves ~15% and is available per block.
  • The authoritative source for live pricing and availability is the pool page: cheapestinference.com/pools.
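As a rough illustration of the arithmetic, the sketch below sums the monthly block prices listed above and estimates an annual total. It treats the "~15%" annual saving as exactly 15%, which is an assumption; the pool page remains the source for actual prices.

```python
# Monthly prices in USD per block, copied from the pricing table.
MONTHLY_PRICE_USD = {"Asia-Pacific": 39, "Europe": 49, "Americas": 45}

# Assumption: the documented "~15%" annual saving is treated as exactly 15%.
ANNUAL_DISCOUNT = 0.15


def monthly_total(blocks) -> int:
    """Monthly cost in USD for a collection of block names (duplicates ignored)."""
    return sum(MONTHLY_PRICE_USD[name] for name in set(blocks))


def annual_total(blocks, discount: float = ANNUAL_DISCOUNT) -> float:
    """Estimated yearly cost in USD when every selected block is billed annually."""
    return round(12 * monthly_total(blocks) * (1 - discount), 2)
```

With these numbers, reserving all three blocks comes to $133/mo on monthly billing.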

All three models are available throughout your reserved blocks. Use GET /v1/models for the live list.
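A minimal sketch of calling that endpoint with the Python standard library follows. Only the path `GET /v1/models` comes from this page. The base URL is a placeholder, and the Bearer authorization header and the `{"data": [{"id": ...}]}` response shape are assumptions modelled on common OpenAI-style APIs, so check them against the API reference before relying on them.

```python
import json
import urllib.request

# Placeholder, not the real endpoint: substitute the base URL from your dashboard.
BASE_URL = "https://api.example.com"


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /v1/models request (assumes Bearer-token authentication)."""
    return urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key: str, base_url: str = BASE_URL) -> list[str]:
    """Fetch the live model list over the network and return the model IDs."""
    request = build_models_request(api_key, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```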

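| Provider | Model | Model ID | Input ($/M tokens) | Output ($/M tokens) |
| --- | --- | --- | --- | --- |
| Moonshot | Kimi K2.6 | kimi-k2.6 | $0.450 | $2.250 |
| Zhipu (Z.ai) | GLM 4.7 | glm-4.7 | $0.400 | $1.750 |
| MiniMax | MiniMax M2.5 | MiniMax-M2.5 | $0.270 | $0.950 |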

The prices above are the underlying per-token cost from the inference provider, shown for comparison only. On a time-block subscription you pay a flat monthly fee — not per-token charges. Per-model details: Kimi K2.6, GLM 4.7, MiniMax M2.5.
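To make the comparison concrete, the sketch below estimates what a month of traffic would cost at the listed provider rates, so it can be set against a flat block fee. The helper names are hypothetical and the token volumes you pass in are your own estimates.

```python
# Provider list prices in USD per million tokens: (input, output), from the table.
RATES_USD_PER_M = {
    "kimi-k2.6": (0.450, 2.250),
    "glm-4.7": (0.400, 1.750),
    "MiniMax-M2.5": (0.270, 0.950),
}


def per_token_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of the given token volume at the provider's per-token rates."""
    rate_in, rate_out = RATES_USD_PER_M[model]
    return (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000


def flat_fee_is_cheaper(model: str, input_tokens: int, output_tokens: int,
                        monthly_fee_usd: float) -> bool:
    """True if the flat monthly fee is below the equivalent per-token cost."""
    return monthly_fee_usd < per_token_cost(model, input_tokens, output_tokens)
```

For instance, 100M input tokens plus 10M output tokens on `kimi-k2.6` would cost $67.50 at the listed rates, which is more than the $49/mo Europe block.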

Additional pools open through community pledges. Pledge for a model at cheapestinference.com/pools; when enough subscribers commit to a specific model, a new pool activates. See the Unlimited Subscriptions API for the full subscribe flow and response shapes.

Don’t want a time-block subscription? Top up credits instead: the minimum top-up is $10, and there is no maximum. Credit keys are limited to 600 requests per minute (RPM) by default. The balance is drawn down as you use the API and does not reset, so top up again when it is depleted. See the Credits guide.

| Method | For |
| --- | --- |
| Card (Stripe) | Time-block subscriptions (monthly + annual) and credits |
| USDC on Base | Subscriptions and credits |
| x402 | Agent subscriptions and credits (no human setup needed) |

Subscriptions renew on their billing cycle (monthly or yearly). Cancel anytime — access continues to the end of the paid period.

Rate limits are enforced at the key level:

  • Concurrency: one in-flight request per key during your reserved block. Create additional keys to run requests in parallel.
  • RPM / TPM: credit keys are limited in requests per minute (RPM) and tokens per minute (TPM); exceeding a limit returns HTTP 429. Limits reset every minute and are not shared between keys.
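A client can respect both limits with a small wrapper: a lock keeps one request in flight per key, and a 429 response triggers a wait before retrying. The sketch below is illustrative only. `KeyClient` and the `send` callable are hypothetical, the transport is injected so no real endpoint is assumed, and the 60-second wait is a conservative choice based on limits resetting every minute.

```python
import threading
import time
from typing import Callable


class KeyClient:
    """Serialises requests on one API key and retries on HTTP 429."""

    def __init__(
        self,
        send: Callable[[dict], tuple[int, dict]],  # returns (status_code, body)
        max_retries: int = 3,
        wait_seconds: float = 60.0,  # limits reset every minute
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._send = send
        self._max_retries = max_retries
        self._wait_seconds = wait_seconds
        self._sleep = sleep
        self._lock = threading.Lock()  # enforces one in-flight request per key

    def request(self, payload: dict) -> tuple[int, dict]:
        """Send payload, waiting and retrying while the server answers 429."""
        with self._lock:
            status, body = self._send(payload)
            retries = 0
            while status == 429 and retries < self._max_retries:
                self._sleep(self._wait_seconds)
                status, body = self._send(payload)
                retries += 1
            return status, body
```

Creating one `KeyClient` per key mirrors the advice above: each key stays at one request in flight, and parallelism comes from using several keys side by side.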