Pricing

Pricing Calculator

Estimate your monthly costs with BatchIn and compare against verified platform competitors.

API endpoint

https://api.batchin.tech/v1

Funding rails

Shown from the live account catalog

Capacity path

Shared core / reserved inference / dedicated 8+ GPU delivery

Production posture

Stable production-scale text and multimodal traffic

Global Access, Unified Ingress

Providing consistent performance and reliability across global and domestic markets.

BatchIn coordinates API gateways, security controls, and compute availability to support localized delivery with unified engineering standards.

View pricing

Global entry

BatchIn serves global developers and enterprise buyers through the English storefront and global API.

Use this path for USD pricing, public MCP discovery, OpenAI-compatible access, and global-facing sales delivery.

https://batchin.tech · https://api.batchin.tech/v1

Production posture

Built for stable production-scale traffic.

Traffic mix

Text plus vision, audio, image, and video workloads.

Streaming path

Regional ingress, stable streaming, and request continuity.

Control guardrails

Scoped limits, request isolation, and backpressure controls.

Public contract and readiness

OpenAI-compatible endpoints stay stable across chat, responses, embeddings, images, audio, and video.
Public MCP transport and tool discovery stay on the BatchIn contract instead of exposing execution details.
Traffic policy is designed for stable production text and multimodal workloads, not only demo-scale traffic.
Capability availability follows aligned usage, cost, billing, trace, and verification records.

Shared core

Self-serve developers and ordinary enterprise traffic run on the shared BatchIn control core.

This is where public Model API, batch, usage, billing, and public MCP contract stay consistent.

Private lanes

Reserved inference, dedicated endpoints, and larger enterprise traffic move into stricter capacity lanes.

Customer UI keeps one product truth while delivery, quota, and isolation can vary by contract.

Compute truth

Dedicated 8+ GPU delivery and smaller hourly rental both resolve against the same compute and capacity truth.

Public pages show inventory and availability only from the verified compute registry.

Edge ingress

Global traffic enters through a regional edge designed for resilient access and stable session continuity.

Latency work starts at the customer-facing edge before requests enter the primary execution path.

Streaming delivery

BatchIn maintains stable streaming behavior across cross-region and mixed-media workloads.

Connection reuse and consumer isolation are tuned to reduce jitter and long-tail failures.

Traffic policy

Traffic policy stays explicit through scoped protection, retry discipline, and graceful overload handling.

Customers see a simple API and clear limits while BatchIn handles traffic protection behind the scenes.

Funding and protocols

Keep customer funding, account credit, and settlement protocols separate

Funding and protocol status follows the live workspace catalog; this page does not claim funding or settlement availability without it.

Payment and protocol status is temporarily unavailable. The public surface does not claim funding or settlement availability without a live protocol catalog.

Concurrency and capability posture

Text and multimodal workloads share one commercial account model

Built for enabled accounts running text, vision, audio, image, and video workloads.
The global service prioritizes resilient ingress, connection reuse, streaming delivery, and regional rate control.
Ordinary enterprise traffic uses the shared core, while reserved inference, dedicated endpoints, and 8+ GPU delivery move into private lanes and stronger quota isolation.

Estimated monthly capacity

Committed spend is not a fixed token package

These token figures are planning estimates. Actual capacity changes with model mix, input-output ratio, cache use, multimodal jobs, and routing.

Monthly committed spendPlanning estimate
Contact salesEntry production usage on lower-cost text models; actual token volume depends on model mix, input/output ratio, and cache hit rate.
Contact salesHigher-volume text usage or production traffic across mixed models.
Contact salesHigher-volume production, coding agents, multimodal workloads, and monthly reconciliation.
Contact salesHigher self-serve committed spend; dedicated capacity or custom routing still requires review.
$10,000+Custom routing, reserved capacity, and dedicated lanes with sales support.

Model pricing

On-demand model pricing

25 models

Pricing v39.0-public-pricing-2026-06-13

ModelUnitPAYG$5k committed$10k committed$20k committedEnterpriseAvailability
minimax-m2.5
MiniMax M2.5
1M tokensIn $0.112 / Out $0.447In $0.099 / Out $0.398In $0.09 / Out $0.36In $0.081 / Out $0.323Contact salesSelf-serve
minimax-m2.7
MiniMax M2.7
1M tokensIn $0.242 / Out $0.969In $0.23 / Out $0.92In $0.217 / Out $0.87In $0.205 / Out $0.82Contact salesSelf-serve
kimi-k2.5
Kimi K2.5
1M tokensIn $0.213 / Out $1.12In $0.189 / Out $0.994In $0.172 / Out $0.901In $0.154 / Out $0.808Contact salesSelf-serve
glm-5.1
GLM-5.1
1M tokensIn $0.692 / Out $2.77In $0.657 / Out $2.63In $0.621 / Out $2.49In $0.586 / Out $2.34Contact salesSelf-serve
deepseek-v4-pro
DeepSeek V4 Pro
1M tokensIn $0.16 / Out $0.32In $0.142 / Out $0.284In $0.129 / Out $0.257In $0.115 / Out $0.231Contact salesSelf-serve
gpt-5.4-mini
GPT-5.4 Mini
1M tokensIn $0.188 / Out $1.13In $0.173 / Out $1.04In $0.158 / Out $0.945In $0.143 / Out $0.855Contact salesSelf-serve
gpt-5.5
GPT-5.5
1M tokensIn $1.25 / Out $7.50In $1.15 / Out $6.90In $1.05 / Out $6.30In $0.95 / Out $5.70Contact salesSelf-serve
deepseek-v3.2
DeepSeek V3.2
1M tokensIn $0.107 / Out $0.16In $0.095 / Out $0.142In $0.086 / Out $0.129In $0.077 / Out $0.115Contact salesSelf-serve
qwen3.5-plus
Qwen 3.5 Plus
1M tokensIn $0.043 / Out $0.256In $0.038 / Out $0.227In $0.034 / Out $0.206In $0.031 / Out $0.185Contact salesSelf-serve
qwen3.6-max-preview
Qwen 3.6 Max Preview
1M tokensIn $1.04 / Out $6.23In $0.985 / Out $5.91In $0.932 / Out $5.59In $0.879 / Out $5.27Contact salesSelf-serve
qwen3.7-max
Qwen 3.7 Max
1M tokensIn $1.60 / Out $4.79In $1.54 / Out $4.63In $1.49 / Out $4.47In $1.44 / Out $4.31Contact salesSelf-serve
kimi-k2.6
Kimi K2.6
1M tokensIn $0.346 / Out $1.44In $0.308 / Out $1.28In $0.279 / Out $1.16In $0.25 / Out $1.04Contact salesSelf-serve
mimo-v2.5-pro
MiMo-V2.5-Pro
1M tokensIn $0.932 / Out $2.80In $0.901 / Out $2.70In $0.87 / Out $2.61In $0.839 / Out $2.52Request accessPreview access
mimo-v2.5
MiMo-V2.5
1M tokensIn $0.373 / Out $1.86In $0.36 / Out $1.80In $0.348 / Out $1.74In $0.336 / Out $1.68Request accessPreview access
claude-opus-4-6
Claude Opus 4.6
1M tokensIn $1.25 / Out $6.25In $1.15 / Out $5.75In $1.05 / Out $5.25In $0.95 / Out $4.75Contact salesSelf-serve
gemini-3.5-flash
Gemini 3.5 Flash
1M tokensIn $0.375 / Out $2.25In $0.345 / Out $2.07In $0.315 / Out $1.89In $0.285 / Out $1.71Contact salesSelf-serve
gpt-5.4
GPT-5.4
1M tokensIn $0.625 / Out $3.75In $0.575 / Out $3.45In $0.525 / Out $3.15In $0.475 / Out $2.85Contact salesSelf-serve
claude-opus-4-8
Claude Opus 4.8
1M tokensIn $1.25 / Out $6.25In $1.15 / Out $5.75In $1.05 / Out $5.25In $0.95 / Out $4.75Contact salesSelf-serve

Multimodal pricing

Video, image, and asset jobs use job or media-unit pricing

Contact our team for custom tiers

Model / specUnitPAYG$5k committed$10k committed$20k committedEnterpriseAvailability
Seedance video
doubao-seedance-1-5-pro-251215
5s 480p, no reference $0.059 / job · 5s 480p, reference $0.118 / job · 5s 720p, no reference $0.059 / job
video task$0.059-$1.48 / job$0.059-$1.48 / job$0.056-$1.41 / job$0.056-$1.41 / jobfrom $0.053-$1.33 / jobSelf-serve
doubao-seedance-2-0-260128
480p, no video input $0.068 / second · 480p, with video input $0.166 / second · 720p, no video input $0.147 / second
video second$0.068-$0.367 / second$0.068-$0.367 / second$0.065-$0.349 / second$0.065-$0.349 / secondfrom $0.062-$0.33 / secondSelf-serve
GPT Image 2 image
gpt-image-2
GPT Image 2
1M image/text tokensText in $2.00 / Cached text in $0.50 / Image in $3.20 / Cached image in $0.80 / Image out $12.00Text in $1.80 / Cached text in $0.45 / Image in $2.88 / Cached image in $0.72 / Image out $10.80Text in $1.70 / Cached text in $0.425 / Image in $2.72 / Cached image in $0.68 / Image out $10.20Text in $1.60 / Cached text in $0.40 / Image in $2.56 / Cached image in $0.64 / Image out $9.60Contact salesSelf-serve
gpt-image-2-4k
GPT Image 2 4K
image generation$0.032 / job$0.029 / job$0.027 / job$0.026 / jobRequest accessPreview access
gpt-image-2-4k-auto
GPT Image 2 4K Auto
image generation$0.064 / job$0.058 / job$0.054 / job$0.051 / jobContact salesSelf-serve
Kling v3 video
kling-v3
Standard, no audio $0.084 / second · Standard, with audio $0.126 / second · Professional, no audio $0.112 / second
video second$0.084-$0.422 / second$0.082-$0.408 / second$0.08-$0.399 / second$0.078-$0.391 / secondfrom $0.075-$0.377 / secondSelf-serve
Kling v3 Omni video
kling-v3-omni
Standard, no reference, no audio $0.084 / second · Standard, no reference, with audio $0.112 / second · Standard, reference, no audio $0.126 / second
video second$0.084-$0.422 / second$0.082-$0.408 / second$0.08-$0.399 / second$0.078-$0.391 / secondfrom $0.075-$0.377 / secondSelf-serve

PAYG

Status unavailable

Pay as you go

No monthly committed spend; built for tests, prototypes, and variable usage.

  • Standard pay-as-you-go pricing
  • Self-serve access follows account permissions and live model status
  • Real usage, real billing, and live balance updates

Monthly committed spend

Committed-spend rates

Contact sales

A monthly minimum spend commitment, not a fixed token package. If usage is lower, the invoice stays at the committed amount; if usage is higher, the invoice follows actual usage.

  • Self-serve committed spend follows the tiers visible to your account
  • Higher committed spend is confirmed with the BatchIn team
  • Team API keys, billing export, usage records, reliability controls, and spend policies

Coding workloads

Workload profile

No price change

For code generation, developer automation, and agent calls: usage breakdowns, access controls, usage records, and workload-specific limits.

  • Repo / task / branch / PR cost attribution
  • Coding task usage records and exports
  • Long-context budgets, coding call policies, and continuity controls

Multimodal workloads

Workload profile

No price change

For priced image understanding, image generation, video generation, image-to-video, and multimodal agent workflows.

  • Video job queue, pre-charge, refunds on failure, retries, and callbacks
  • Project / campaign / asset / job cost attribution
  • Video and image workloads include task status, output assets, and usage receipts

Dedicated Capacity

Commercial delivery

Contact sales

Quoted by 8-GPU whole-machine monthly price, with a 32-GPU minimum and a three-month minimum term.

  • Public monthly price per 8-GPU whole machine
  • 32-GPU / four-machine minimum and three-month term
  • Dedicated endpoints, whole machines, and managed deployments are contact-sales only

Dedicated Capacity

Whole-machine GPU monthly pricing

Per-machine monthly prices, with a 32-GPU minimum, a three-month minimum term, and quote-led delivery.

GPUGPU CountPublic Price
GB300 NVL72NVL72 / 72 GPUsContact sales
GB200 NVL72NVL72 / 72 GPUsContact sales
HGX B3008 GPUsContact sales
HGX B2008 GPUsContact sales
H200 141GB8 GPUs$12.0k / month
H100 80GB8 GPUs$9.2k / month
A100 80GB NVLink8 GPUs$7.1k / month
L40S 48GB8 GPUs$4.1k / month
RTX 5090 32GB8 GPUs$2.2k / month
RTX 4090 24GB8 GPUs$1.6k / month