Plans & Limits
CheapestInference sells unlimited time-block subscriptions on a model pool. The active frontier coding pool bundles Kimi K2.6 + GLM 4.7 + MiniMax M2.5 with automatic failover across direct providers (Moonshot, Z.ai, MiniMax).
How pricing works — reserve daily time blocks
The day is split into three 8-hour time blocks, aligned to global peak hours and shown in your local timezone. Reserve one, two, or all three blocks; all three gives full 24/7 coverage.
| Block | Window (UTC) | Price |
|---|---|---|
| Asia-Pacific | 00:00 – 08:00 | $39/mo |
| Europe | 08:00 – 16:00 | $49/mo |
| Americas | 16:00 – 24:00 | $45/mo |
During your reserved hours, usage is truly unlimited — no tokens to count, no budget cap, no overage charges. The only ceiling is 1 concurrent request per key (create more keys for parallelism). Outside your reserved blocks, the key does not serve traffic.
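To make the reservation rule concrete, here is a minimal sketch that maps a timestamp to its block and checks whether a key with a given set of reserved blocks would serve traffic at that moment. The function names are hypothetical, and treating each window as half-open (08:00 UTC belongs to Europe, not Asia-Pacific) is an assumption; the service's exact boundary behavior is not stated here.

```python
from datetime import datetime, timezone

# Block windows in UTC hours, copied from the table above.
# Assumption: each window is half-open, [start, end).
BLOCKS = {
    "Asia-Pacific": (0, 8),
    "Europe": (8, 16),
    "Americas": (16, 24),
}


def block_for(moment: datetime) -> str:
    """Return the block whose UTC window contains a timezone-aware `moment`."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    hour = moment.astimezone(timezone.utc).hour
    for name, (start, end) in BLOCKS.items():
        if start <= hour < end:
            return name
    raise ValueError("hour outside 0-23")  # not reachable for valid datetimes


def key_serves_traffic(reserved: set[str], moment: datetime) -> bool:
    """True if `moment` falls inside one of the reserved blocks."""
    return block_for(moment) in reserved
```

A client could call `key_serves_traffic({"Europe"}, datetime.now(timezone.utc))` before sending a request to avoid hitting a key outside its window.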
- Annual billing saves ~15% and is available per block.
- The authoritative source for live pricing and availability is the pool page: cheapestinference.com/pools.
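As a rough worked example of the pricing above, the sketch below totals the monthly price for a set of blocks and estimates an annual bill. The helper names are hypothetical, and the 15% discount is an assumption taken from the approximate "~15%" figure; check the pool page for the actual annual price.

```python
# Monthly prices in USD, copied from the table above.
MONTHLY_PRICE_USD = {"Asia-Pacific": 39, "Europe": 49, "Americas": 45}

# Assumption: the page only says "~15%", so the exact rate may differ.
ANNUAL_DISCOUNT = 0.15


def monthly_total(blocks):
    """Sum the monthly price of the reserved blocks."""
    return sum(MONTHLY_PRICE_USD[b] for b in blocks)


def estimated_annual_total(blocks):
    """Estimate twelve months of the reserved blocks with the annual discount applied."""
    return round(monthly_total(blocks) * 12 * (1 - ANNUAL_DISCOUNT), 2)
```

Under these assumptions, all three blocks cost $133 per month, and annual billing would come to roughly $1,356.60 instead of $1,596.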
Models in the pool
All three models are available throughout your reserved blocks. Use GET /v1/models for the live list.
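A minimal sketch of calling GET /v1/models with the Python standard library follows. Several details are assumptions not confirmed on this page: the base URL is a placeholder, the key is sent as a Bearer token, and the response is assumed to follow the common `{"data": [{"id": ...}]}` list shape. Adjust all three to match the actual API reference.

```python
import json
import urllib.request

# Placeholder: substitute the real API base URL from the API reference.
BASE_URL = "https://api.example.invalid"


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /v1/models request (Bearer auth is an assumption)."""
    return urllib.request.Request(
        f"{base_url}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(payload: dict) -> list[str]:
    """Extract model IDs, assuming a {"data": [{"id": ...}, ...]} response body."""
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key: str, base_url: str = BASE_URL) -> list[str]:
    """Fetch the live model list and return its IDs."""
    request = build_models_request(api_key, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return model_ids(json.load(response))
```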
| Provider | Model | Model ID | Input ($/M tokens) | Output ($/M tokens) |
|---|---|---|---|---|
| Moonshot | Kimi K2.6 | kimi-k2.6 | $0.450 | $2.250 |
| Zhipu (Z.ai) | GLM 4.7 | glm-4.7 | $0.400 | $1.750 |
| MiniMax | MiniMax M2.5 | MiniMax-M2.5 | $0.270 | $0.950 |
The prices above are the underlying per-token cost from the inference provider, shown for comparison only. On a time-block subscription you pay a flat monthly fee — not per-token charges. Per-model details: Kimi K2.6, GLM 4.7, MiniMax M2.5.
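To see how the comparison prices relate to a flat fee, this sketch computes what a given token volume would cost at the provider rates in the table. The function name is hypothetical and the token volumes in the note below are made-up illustration values, not measured usage.

```python
# USD per million tokens as (input, output), copied from the table above.
PROVIDER_PRICES = {
    "kimi-k2.6": (0.450, 2.250),
    "glm-4.7": (0.400, 1.750),
    "MiniMax-M2.5": (0.270, 0.950),
}


def per_token_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of the given token counts at provider rates."""
    input_price, output_price = PROVIDER_PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For instance, 100 million input tokens plus 10 million output tokens on `kimi-k2.6` would cost about $67.50 at provider rates, which can then be compared against the flat monthly price of a block.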
More pools via community pledges
Additional pools open through community pledges. Pledge for a model at cheapestinference.com/pools; when enough subscribers commit to a specific model, a new pool activates. See the Unlimited Subscriptions API for the full subscribe flow and response shapes.
Credits (pay-as-you-go)
Don’t want a time-block subscription? Top up credits instead. The minimum top-up is $10; any amount above that is accepted, with no maximum. Credit keys are limited to 600 requests per minute (RPM) by default. Budget is consumed as you use the API and never resets, so top up again when it is depleted. See the Credits guide.
Payment methods
| Method | For |
|---|---|
| Card (Stripe) | Time-block subscriptions (monthly + annual) and credits |
| USDC on Base | Subscriptions and credits |
| x402 | Agent subscriptions and credits (no human setup needed) |
Subscriptions renew on their billing cycle (monthly or yearly). Cancel anytime — access continues to the end of the paid period.
How rate limits work
Rate limits are enforced at the key level:
- Concurrency: 1 in-flight request per key during your reserved block. Create additional keys to run requests in parallel.
- RPM / TPM: Credit keys are rate-limited per minute; exceeding the limit returns an HTTP 429 response. Limits reset every minute and are not shared between keys.
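One way a client might respect both limits is sketched below: a per-key lock keeps at most one request in flight for each key, and a 429 response triggers a retry with exponential backoff. The `KeyGate` class, the `send` callable, and the backoff schedule are all hypothetical; this page does not specify a retry policy or any retry-related response headers.

```python
import threading
import time


class KeyGate:
    """Client-side guard: one in-flight request per API key, with 429 retries."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, api_key):
        # Create each key's lock lazily and in a thread-safe way.
        with self._guard:
            return self._locks.setdefault(api_key, threading.Lock())

    def call(self, api_key, send, max_retries=3, base_delay=1.0, sleep=time.sleep):
        """Run send(api_key) -> (status, body), retrying on 429.

        The delays are base_delay * 2**attempt seconds (an assumed schedule).
        The key's lock is held across retries, so other threads using the
        same key wait rather than adding to the in-flight count.
        """
        with self._lock_for(api_key):
            for attempt in range(max_retries + 1):
                status, body = send(api_key)
                if status != 429 or attempt == max_retries:
                    return status, body
                sleep(base_delay * 2 ** attempt)
```

Threads that share a key are serialized by the gate, while threads using different keys proceed in parallel, which matches the advice to create more keys for parallelism.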