Pricing

Pricing Calculator

Estimate your monthly costs with BatchIn and compare against verified platform competitors.

API endpoint

https://api.batchin.tech/v1

Funding rails

Shown from the live account catalog

Capacity path

Shared core / reserved inference / dedicated 8+ GPU delivery

Production posture

Stable production-scale text and multimodal traffic

Global Access, Unified Ingress

Providing consistent performance and reliability across global and domestic markets.

BatchIn coordinates API gateways, security controls, and compute availability to support localized delivery with unified engineering standards.

View pricing

Global entry

BatchIn serves global developers and enterprise buyers through the English storefront and global API.

Use this path for USD pricing, public MCP discovery, OpenAI-compatible access, and global-facing sales delivery.

https://batchin.tech · https://api.batchin.tech/v1

Production posture

Built for stable production-scale traffic.

Traffic mix

Text plus vision, audio, image, and video workloads.

Streaming path

Regional ingress, stable streaming, and request continuity.

Control guardrails

Scoped limits, request isolation, and backpressure controls.

Public contract and readiness

OpenAI-compatible endpoints stay stable across chat, responses, embeddings, images, audio, and video.

Public MCP transport and tool discovery stay on the BatchIn contract instead of exposing execution details.

Traffic policy is designed for stable production text and multimodal workloads, not only demo-scale traffic.

Capability availability follows aligned usage, cost, billing, trace, and verification records.

Shared core

Self-serve developers and ordinary enterprise traffic run on the shared BatchIn control core.

This is where public Model API, batch, usage, billing, and public MCP contract stay consistent.

Private lanes

Reserved inference, dedicated endpoints, and larger enterprise traffic move into stricter capacity lanes.

Customer UI keeps one product truth while delivery, quota, and isolation can vary by contract.

Compute truth

Dedicated 8+ GPU delivery and smaller hourly rental both resolve against the same compute and capacity truth.

Public pages show inventory and availability only from the verified compute registry.

Edge ingress

Global traffic enters through a regional edge designed for resilient access and stable session continuity.

Latency work starts at the customer-facing edge before requests enter the primary execution path.

Streaming delivery

BatchIn maintains stable streaming behavior across cross-region and mixed-media workloads.

Connection reuse and consumer isolation are tuned to reduce jitter and long-tail failures.

Traffic policy

Traffic policy stays explicit through scoped protection, retry discipline, and graceful overload handling.

Customers see a simple API and clear limits while BatchIn handles traffic protection behind the scenes.

Funding and protocols

Keep customer funding, account credit, and settlement protocols separate

Funding and protocol status follows the live workspace catalog; this page does not claim funding or settlement availability without it.

Payment and protocol status is temporarily unavailable. The public surface does not claim funding or settlement availability without a live protocol catalog.

Concurrency and capability posture

Text and multimodal workloads share one commercial account model

Built for enabled accounts running text, vision, audio, image, and video workloads.

The global service prioritizes resilient ingress, connection reuse, streaming delivery, and regional rate control.

Ordinary enterprise traffic uses the shared core, while reserved inference, dedicated endpoints, and 8+ GPU delivery move into private lanes and stronger quota isolation.

Estimated monthly capacity

Committed spend is not a fixed token package

These token figures are planning estimates. Actual capacity changes with model mix, input-output ratio, cache use, multimodal jobs, and routing.

Monthly committed spend	Planning estimate
Contact sales	Entry production usage on lower-cost text models; actual token volume depends on model mix, input/output ratio, and cache hit rate.
Contact sales	Higher-volume text usage or production traffic across mixed models.
Contact sales	Higher-volume production, coding agents, multimodal workloads, and monthly reconciliation.
Contact sales	Higher self-serve committed spend; dedicated capacity or custom routing still requires review.
$10,000+	Custom routing, reserved capacity, and dedicated lanes with sales support.

Model pricing

On-demand model pricing

25 models

Pricing v39.0-public-pricing-2026-06-13

Model	Unit	PAYG	$5k committed	$10k committed	$20k committed	Enterprise	Availability
minimax-m2.5 MiniMax M2.5	1M tokens	In $0.112 / Out $0.447	In $0.099 / Out $0.398	In $0.09 / Out $0.36	In $0.081 / Out $0.323	Contact sales	Self-serve
minimax-m2.7 MiniMax M2.7	1M tokens	In $0.242 / Out $0.969	In $0.23 / Out $0.92	In $0.217 / Out $0.87	In $0.205 / Out $0.82	Contact sales	Self-serve
kimi-k2.5 Kimi K2.5	1M tokens	In $0.213 / Out $1.12	In $0.189 / Out $0.994	In $0.172 / Out $0.901	In $0.154 / Out $0.808	Contact sales	Self-serve
glm-5.1 GLM-5.1	1M tokens	In $0.692 / Out $2.77	In $0.657 / Out $2.63	In $0.621 / Out $2.49	In $0.586 / Out $2.34	Contact sales	Self-serve
deepseek-v4-pro DeepSeek V4 Pro	1M tokens	In $0.16 / Out $0.32	In $0.142 / Out $0.284	In $0.129 / Out $0.257	In $0.115 / Out $0.231	Contact sales	Self-serve
gpt-5.4-mini GPT-5.4 Mini	1M tokens	In $0.188 / Out $1.13	In $0.173 / Out $1.04	In $0.158 / Out $0.945	In $0.143 / Out $0.855	Contact sales	Self-serve
gpt-5.5 GPT-5.5	1M tokens	In $1.25 / Out $7.50	In $1.15 / Out $6.90	In $1.05 / Out $6.30	In $0.95 / Out $5.70	Contact sales	Self-serve
deepseek-v3.2 DeepSeek V3.2	1M tokens	In $0.107 / Out $0.16	In $0.095 / Out $0.142	In $0.086 / Out $0.129	In $0.077 / Out $0.115	Contact sales	Self-serve
qwen3.5-plus Qwen 3.5 Plus	1M tokens	In $0.043 / Out $0.256	In $0.038 / Out $0.227	In $0.034 / Out $0.206	In $0.031 / Out $0.185	Contact sales	Self-serve
qwen3.6-max-preview Qwen 3.6 Max Preview	1M tokens	In $1.04 / Out $6.23	In $0.985 / Out $5.91	In $0.932 / Out $5.59	In $0.879 / Out $5.27	Contact sales	Self-serve
qwen3.7-max Qwen 3.7 Max	1M tokens	In $1.60 / Out $4.79	In $1.54 / Out $4.63	In $1.49 / Out $4.47	In $1.44 / Out $4.31	Contact sales	Self-serve
kimi-k2.6 Kimi K2.6	1M tokens	In $0.346 / Out $1.44	In $0.308 / Out $1.28	In $0.279 / Out $1.16	In $0.25 / Out $1.04	Contact sales	Self-serve
mimo-v2.5-pro MiMo-V2.5-Pro	1M tokens	In $0.932 / Out $2.80	In $0.901 / Out $2.70	In $0.87 / Out $2.61	In $0.839 / Out $2.52	Request access	Preview access
mimo-v2.5 MiMo-V2.5	1M tokens	In $0.373 / Out $1.86	In $0.36 / Out $1.80	In $0.348 / Out $1.74	In $0.336 / Out $1.68	Request access	Preview access
claude-opus-4-6 Claude Opus 4.6	1M tokens	In $1.25 / Out $6.25	In $1.15 / Out $5.75	In $1.05 / Out $5.25	In $0.95 / Out $4.75	Contact sales	Self-serve
gemini-3.5-flash Gemini 3.5 Flash	1M tokens	In $0.375 / Out $2.25	In $0.345 / Out $2.07	In $0.315 / Out $1.89	In $0.285 / Out $1.71	Contact sales	Self-serve
gpt-5.4 GPT-5.4	1M tokens	In $0.625 / Out $3.75	In $0.575 / Out $3.45	In $0.525 / Out $3.15	In $0.475 / Out $2.85	Contact sales	Self-serve
claude-opus-4-8 Claude Opus 4.8	1M tokens	In $1.25 / Out $6.25	In $1.15 / Out $5.75	In $1.05 / Out $5.25	In $0.95 / Out $4.75	Contact sales	Self-serve

Multimodal pricing

Video, image, and asset jobs use job or media-unit pricing

Contact our team for custom tiers

Model / spec	Unit	PAYG	$5k committed	$10k committed	$20k committed	Enterprise	Availability
Seedance video
doubao-seedance-1-5-pro-251215 5s 480p, no reference $0.059 / job · 5s 480p, reference $0.118 / job · 5s 720p, no reference $0.059 / job	video task	$0.059-$1.48 / job	$0.059-$1.48 / job	$0.056-$1.41 / job	$0.056-$1.41 / job	from $0.053-$1.33 / job	Self-serve
doubao-seedance-2-0-260128 480p, no video input $0.068 / second · 480p, with video input $0.166 / second · 720p, no video input $0.147 / second	video second	$0.068-$0.367 / second	$0.068-$0.367 / second	$0.065-$0.349 / second	$0.065-$0.349 / second	from $0.062-$0.33 / second	Self-serve
GPT Image 2 image
gpt-image-2 GPT Image 2	1M image/text tokens	Text in $2.00 / Cached text in $0.50 / Image in $3.20 / Cached image in $0.80 / Image out $12.00	Text in $1.80 / Cached text in $0.45 / Image in $2.88 / Cached image in $0.72 / Image out $10.80	Text in $1.70 / Cached text in $0.425 / Image in $2.72 / Cached image in $0.68 / Image out $10.20	Text in $1.60 / Cached text in $0.40 / Image in $2.56 / Cached image in $0.64 / Image out $9.60	Contact sales	Self-serve
gpt-image-2-4k GPT Image 2 4K	image generation	$0.032 / job	$0.029 / job	$0.027 / job	$0.026 / job	Request access	Preview access
gpt-image-2-4k-auto GPT Image 2 4K Auto	image generation	$0.064 / job	$0.058 / job	$0.054 / job	$0.051 / job	Contact sales	Self-serve
Kling v3 video
kling-v3 Standard, no audio $0.084 / second · Standard, with audio $0.126 / second · Professional, no audio $0.112 / second	video second	$0.084-$0.422 / second	$0.082-$0.408 / second	$0.08-$0.399 / second	$0.078-$0.391 / second	from $0.075-$0.377 / second	Self-serve
Kling v3 Omni video
kling-v3-omni Standard, no reference, no audio $0.084 / second · Standard, no reference, with audio $0.112 / second · Standard, reference, no audio $0.126 / second	video second	$0.084-$0.422 / second	$0.082-$0.408 / second	$0.08-$0.399 / second	$0.078-$0.391 / second	from $0.075-$0.377 / second	Self-serve

PAYG

Status unavailable

Pay as you go

No monthly committed spend; built for tests, prototypes, and variable usage.

Standard pay-as-you-go pricing
Self-serve access follows account permissions and live model status
Real usage, real billing, and live balance updates

Monthly committed spend

Committed-spend rates

Contact sales

A monthly minimum spend commitment, not a fixed token package. If usage is lower, the invoice stays at the committed amount; if usage is higher, the invoice follows actual usage.

Self-serve committed spend follows the tiers visible to your account
Higher committed spend is confirmed with the BatchIn team
Team API keys, billing export, usage records, reliability controls, and spend policies

Coding workloads

Workload profile

No price change

For code generation, developer automation, and agent calls: usage breakdowns, access controls, usage records, and workload-specific limits.

Repo / task / branch / PR cost attribution
Coding task usage records and exports
Long-context budgets, coding call policies, and continuity controls

Multimodal workloads

Workload profile

No price change

For priced image understanding, image generation, video generation, image-to-video, and multimodal agent workflows.

Video job queue, pre-charge, refunds on failure, retries, and callbacks
Project / campaign / asset / job cost attribution
Video and image workloads include task status, output assets, and usage receipts

Dedicated Capacity

Commercial delivery

Contact sales

Quoted by 8-GPU whole-machine monthly price, with a 32-GPU minimum and a three-month minimum term.

Public monthly price per 8-GPU whole machine
32-GPU / four-machine minimum and three-month term
Dedicated endpoints, whole machines, and managed deployments are contact-sales only

Dedicated Capacity

Whole-machine GPU monthly pricing

Per-machine monthly prices, with a 32-GPU minimum, a three-month minimum term, and quote-led delivery.

GPU	GPU Count	Public Price
GB300 NVL72	NVL72 / 72 GPUs	Contact sales
GB200 NVL72	NVL72 / 72 GPUs	Contact sales
HGX B300	8 GPUs	Contact sales
HGX B200	8 GPUs	Contact sales
H200 141GB	8 GPUs	$12.0k / month
H100 80GB	8 GPUs	$9.2k / month
A100 80GB NVLink	8 GPUs	$7.1k / month
L40S 48GB	8 GPUs	$4.1k / month
RTX 5090 32GB	8 GPUs	$2.2k / month
RTX 4090 24GB	8 GPUs	$1.6k / month

Six public entries

Public pricing and capability access now center on these six entries

Model API

Text, coding, reasoning, and image-understanding calls

Multimodal API

Image, video, audio, speech, task status, and output assets

Billing

Balance, budgets, monthly committed spend, billing records, invoices, and exports

VaaS

Verifiable usage records and reconciliation

Dedicated Capacity

Dedicated endpoints, GPU capacity, and managed delivery

Agentic Payments

Service catalog, x402, wallet policies, and agent payment proofs