Plans & Limits

CheapestInference sells time-block subscriptions with unlimited usage on a model pool. The active frontier coding pool bundles Kimi K2.6, GLM 4.7, and MiniMax M2.5, with automatic failover across the direct providers (Moonshot, Z.ai, MiniMax).

How pricing works — reserve daily time blocks

The day is split into three 8-hour time blocks, aligned to global peak hours and shown in your local timezone. Reserve one, two, or all three blocks — all three gives full 24/7 coverage.

Block	Window (UTC)	Price
Asia-Pacific	00:00 – 08:00	$39/mo
Europe	08:00 – 16:00	$49/mo
Americas	16:00 – 24:00	$45/mo

During your reserved hours, usage is unlimited: no tokens to count, no budget cap, and no overage charges. The only ceiling is one concurrent request per key (create more keys for parallelism). Outside your reserved blocks, the key does not serve traffic.
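To make the block windows concrete, here is a minimal sketch that maps a timestamp to the block containing it and checks whether a key with a given set of reserved blocks would serve traffic at that moment. The function names are hypothetical, and the half-open boundaries (08:00 UTC belongs to Europe, not Asia-Pacific) are an assumption; this page does not say how the exact boundary instant is treated.

```python
from datetime import datetime, timezone

# Block windows in UTC hours, copied from the table above: (name, start, end).
# Assumption: windows are half-open, so 08:00:00 UTC already counts as Europe.
BLOCKS = [
    ("Asia-Pacific", 0, 8),
    ("Europe", 8, 16),
    ("Americas", 16, 24),
]


def block_for(moment: datetime) -> str:
    """Return the name of the block that contains a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    hour = moment.astimezone(timezone.utc).hour
    for name, start, end in BLOCKS:
        if start <= hour < end:
            return name
    raise AssertionError("unreachable: the three blocks cover all 24 hours")


def key_serves_traffic(reserved: set[str], moment: datetime) -> bool:
    """True if `moment` falls inside one of the reserved blocks."""
    return block_for(moment) in reserved


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(block_for(now), key_serves_traffic({"Europe", "Americas"}, now))
```

A client could run a check like this before sending a request, so that calls outside the reserved window are skipped locally instead of being sent to a key that is not serving traffic.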

Annual billing saves ~15% and is available per block.
The authoritative source for live pricing and availability is the pool page: cheapestinference.com/pools.
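As a rough worked example of the pricing arithmetic, the sketch below sums the monthly prices for a set of blocks and estimates the annual total. It treats the "~15%" annual saving as exactly 15%, which is an assumption; the live figures on the pool page take precedence. The function names are illustrative only.

```python
# Monthly prices in USD, copied from the block table above.
MONTHLY_PRICE_USD = {"Asia-Pacific": 39, "Europe": 49, "Americas": 45}

# Assumption: the "~15%" annual saving is modeled as exactly 15%.
ANNUAL_DISCOUNT = 0.15


def monthly_total(blocks: list[str]) -> int:
    """Sum of monthly prices for the chosen blocks."""
    return sum(MONTHLY_PRICE_USD[name] for name in blocks)


def estimated_annual_total(blocks: list[str]) -> float:
    """Twelve months at the monthly price, minus the assumed annual discount."""
    return monthly_total(blocks) * 12 * (1 - ANNUAL_DISCOUNT)


if __name__ == "__main__":
    all_blocks = list(MONTHLY_PRICE_USD)
    print(monthly_total(all_blocks))  # 39 + 49 + 45 = 133
    print(round(estimated_annual_total(all_blocks), 2))
```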

Models in the pool

All three models are available throughout your reserved blocks. Use GET /v1/models for the live list.
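A minimal sketch of calling GET /v1/models with only the Python standard library follows. Several details are assumptions not stated on this page: the base URL (a placeholder is used here), Bearer-token authentication, and an OpenAI-style response body of the form `{"data": [{"id": ...}]}`. Check the API reference for the actual values.

```python
import json
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the GET /v1/models request (assumes Bearer-token auth)."""
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def parse_model_ids(payload: str) -> list[str]:
    """Extract model IDs, assuming a body shaped like {"data": [{"id": ...}]}."""
    body = json.loads(payload)
    return [item["id"] for item in body.get("data", [])]


def list_model_ids(base_url: str, api_key: str) -> list[str]:
    """Perform the request and return the model IDs."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_model_ids(response.read().decode("utf-8"))


# Usage (placeholder base URL and key, replace with real values):
# print(list_model_ids("https://api.example.invalid", "YOUR_API_KEY"))
```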

Provider	Model	Model ID	Input ($ per 1M tokens)	Output ($ per 1M tokens)
Moonshot	Kimi K2.6	`kimi-k2.6`	$0.450	$2.250
Zhipu (Z.ai)	GLM 4.7	`glm-4.7`	$0.400	$1.750
MiniMax	MiniMax M2.5	`MiniMax-M2.5`	$0.270	$0.950

The prices above are the underlying per-token cost from the inference provider, shown for comparison only. On a time-block subscription you pay a flat monthly fee — not per-token charges. Per-model details: Kimi K2.6, GLM 4.7, MiniMax M2.5.
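To compare the flat fee against the per-token prices in the table, a small calculation helps. The sketch below computes what a given month's token volume would cost at the listed provider rates; the function name and the example volumes are made up for illustration.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION_USD = {
    "kimi-k2.6": (0.450, 2.250),
    "glm-4.7": (0.400, 1.750),
    "MiniMax-M2.5": (0.270, 0.950),
}


def per_token_cost_usd(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of the given token volume at the listed per-token prices."""
    input_price, output_price = PRICE_PER_MILLION_USD[model_id]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical month: 100M input tokens and 10M output tokens on GLM 4.7.
    # 100 * 0.400 + 10 * 1.750 = 57.50 USD at per-token rates.
    cost = per_token_cost_usd("glm-4.7", 100_000_000, 10_000_000)
    print(f"{cost:.2f}")
```

At that hypothetical volume the per-token cost would exceed the $49/mo Europe block, while lighter usage would fall below it. The comparison only holds if the traffic fits inside the reserved hours.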

More pools via community pledges

Additional pools open through community pledges. Pledge for a model at cheapestinference.com/pools; when enough subscribers commit to a specific model, a new pool activates. See the Unlimited Subscriptions API for the full subscribe flow and response shapes.

Credits (pay-as-you-go)

Don’t want a time-block subscription? Top up credits instead. The minimum top-up is $10; any amount above that is accepted, with no maximum. Credit keys are limited to 600 requests per minute (RPM) by default. The credit balance is consumed as you use the API and never resets, so top up again when it is depleted. See the Credits guide.

Payment methods

Method	For
Card (Stripe)	Time-block subscriptions (monthly + annual) and credits
USDC on Base	Subscriptions and credits
x402	Agent subscriptions and credits (no human setup needed)

Subscriptions renew on their billing cycle (monthly or yearly). Cancel anytime — access continues to the end of the paid period.

How rate limits work

Rate limits are enforced at the key level:

Concurrency: 1 in-flight request per key during your reserved block. Create additional keys to run requests in parallel.
RPM / TPM (requests per minute / tokens per minute): Credit keys are rate-limited per minute; exceeding a limit returns HTTP 429. Limits reset every minute and are not shared between keys.
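Both limits can be handled on the client side. The sketch below is one possible approach, not an official client: a `KeyPool` hands each key to at most one caller at a time, which keeps every key at a single in-flight request, and `call_with_retry` retries with exponential backoff when a call returns 429. Both names are hypothetical, and `send` stands in for whatever function actually performs the HTTP request and returns a `(status, body)` pair.

```python
import queue
import time
from contextlib import contextmanager
from typing import Callable, Iterator, Tuple


class KeyPool:
    """Hands out each API key to at most one caller at a time."""

    def __init__(self, keys: list[str]) -> None:
        self._free: "queue.Queue[str]" = queue.Queue()
        for key in keys:
            self._free.put(key)

    @contextmanager
    def checkout(self) -> Iterator[str]:
        key = self._free.get()  # blocks until a key has no request in flight
        try:
            yield key
        finally:
            self._free.put(key)  # return the key even if the request failed


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send`, backing off exponentially while it returns HTTP 429."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
        status, body = send()
    return status, body
```

A worker thread would wrap each request as `with pool.checkout() as key: call_with_retry(lambda: send_request(key))`, where `send_request` is your own function. Because limits reset every minute, capping the backoff delay at about 60 seconds is a reasonable refinement.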