Pricing

Pricing Calculator

Estimate your monthly costs with BatchIn and compare against verified platform competitors.

API endpoint

https://api.batchin.tech/v1

Funding rails

USDC / enterprise billing / Alipay, subject to product and account state

Capacity path

Shared core / reserved inference / dedicated 8+ GPU delivery

Production posture

Stable production-scale text and multimodal traffic

Global Access, Unified Ingress

Providing consistent performance and reliability across global and domestic markets.

BatchIn coordinates API gateways, security controls, and compute availability to guarantee localized delivery with unified engineering standards.

Review live contract

Global entry

BatchIn serves global developers and enterprise buyers through the English storefront and global API.

Use this path for USD pricing, public MCP discovery, OpenAI-compatible access, and global-facing sales delivery.

https://batchin.tech · https://api.batchin.tech/v1

China entry

LuminaPath serves domestic customers through the Chinese storefront and domestic API contract.

Use this path for RMB-facing onboarding, domestic support posture, and China-specific commercial delivery without changing the BatchIn product experience.

https://luminapath.tech · https://api.luminapath.tech/v1

Production posture

Built for stable production-scale traffic.

Traffic mix

Text plus vision, audio, image, and video workloads.

Streaming path

Regional ingress, stable streaming, and request continuity.

Control guardrails

Scoped limits, request isolation, and backpressure controls.

Public contract and readiness

OpenAI-compatible endpoints stay stable across chat, responses, embeddings, images, audio, and video.
Public MCP transport and tool discovery stay on the BatchIn contract instead of exposing execution details.
Traffic policy is designed for stable production text and multimodal workloads, not only demo-scale traffic.
Capability availability follows aligned usage, cost, billing, trace, and verification records.

Shared core

Self-serve developers and ordinary enterprise traffic run on the shared BatchIn control core.

This is where public Model API, batch, usage, billing, and public MCP contract stay consistent.

Private lanes

Reserved inference, dedicated endpoints, and larger enterprise traffic move into stricter capacity lanes.

Customer UI keeps one product truth while delivery, quota, and isolation can vary by contract.

Compute truth

Dedicated 8+ GPU delivery and smaller hourly rental both resolve against the same compute and capacity truth.

Public pages show inventory and availability only from the verified compute registry.

Edge ingress

Global traffic enters through a regional edge designed for resilient access and stable session continuity.

Latency work starts at the customer-facing edge before requests enter the primary execution path.

Streaming delivery

BatchIn maintains stable streaming behavior across cross-region and mixed-media workloads.

Connection reuse and consumer isolation are tuned to reduce jitter and long-tail failures.

Traffic policy

Traffic policy stays explicit through scoped protection, retry discipline, and graceful overload handling.

Customers see a simple API and clear limits while BatchIn handles traffic protection behind the scenes.

Funding and protocols

Keep human funding, agent receipts, and settlement protocols separate

The pricing page keeps payment roles clear: USDC and x402 are English-site funding rails, ERC-8004 is for receipt verification, and Alipay stays on the Chinese storefront.

Payment and protocol status is temporarily unavailable. The public surface does not claim funding or settlement availability without a live protocol catalog.

Concurrency and capability posture

Text and multimodal workloads share one commercial account model

Built for stable production-scale text, vision, audio, image, and video traffic.
The global entry prioritizes edge ingress, connection reuse, streaming delivery, and regional rate control, while the China entry keeps localized onboarding and commercial delivery.
Ordinary enterprise traffic uses the shared core, while reserved inference, dedicated endpoints, and 8+ GPU delivery move into private lanes and stronger quota isolation.

Model pricing

Static pricing fallback

38 models

ModelStatusPublic priceAvailability
deepseek-v4-flash
DeepSeek V4 Flash
liveContact usAvailable
deepseek-v4-pro
DeepSeek V4 Pro
exclusiveContact usUnavailable
deepseek-v3-2
DeepSeek V3.2
liveContact usAvailable
qwen3-coder-480b-a35b
Qwen3-Coder-480B-A35B
liveContact usAvailable
qwen3-coder-30b-a3b
Qwen3-Coder-30B-A3B
liveContact usAvailable
qwen3-next-80b-a3b
Qwen3-Next-80B-A3B
liveContact usAvailable
qwen-3-6-plus
Qwen 3.6 Plus
exclusiveContact usUnavailable
qwen3-5-397b
Qwen3.5-397B
liveContact usAvailable
glm-5-1
GLM-5.1
liveContact usAvailable
glm-5
GLM-5
liveContact usAvailable
kimi-k2-6
Kimi K2.6
exclusiveContact usUnavailable
kimi-k2-5
Kimi K2.5
liveContact usAvailable
kimi-k2-instruct-0905
Kimi K2 Instruct 0905
liveContact usAvailable
minimax-m2-7
MiniMax M2.7
exclusiveContact usUnavailable
minimax-m2-5
MiniMax M2.5
exclusiveContact usUnavailable
mimo-v2-pro
MiMo-V2-Pro
liveContact usAvailable
mimo-v2-omni
MiMo-V2-Omni
liveContact usAvailable
mimo-v2-flash
MiMo-V2-Flash
liveContact usAvailable
gpt-oss-120b
GPT-OSS-120B
liveContact usAvailable
gpt-oss-20b
GPT-OSS-20B
liveContact usAvailable
ernie-4-5-300b
ERNIE-4.5-300B
liveContact usAvailable
kat-coder-pro-v2
KAT-Coder-Pro V2
liveContact usAvailable
seed-1-6-flash
Seed 1.6 Flash
liveContact usAvailable
grok-4-1-fast
Grok 4.1 Fast
exclusiveContact usUnavailable
gemini-3-1-pro-preview
Gemini 3.1 Pro Preview
exclusiveContact usUnavailable
gemini-3-flash-preview
Gemini 3 Flash Preview
exclusiveContact usUnavailable
gpt-5-5
GPT-5.5
exclusiveContact usUnavailable
gpt-5-4-mini
GPT-5.4 Mini
exclusiveContact usUnavailable
llama-4-scout
Llama 4 Scout
liveContact usAvailable
llama-3-3-70b
Llama 3.3 70B
liveContact usAvailable
thudm-glm-4-9b
GLM-4-9B
liveContact usAvailable
step-3-5-flash
Step 3.5 Flash
liveContact usAvailable
gemini-2-5-flash
Gemini 2.5 Flash
liveContact usAvailable
qwen3-embedding-8b
Qwen3 Embedding 8B
liveContact usAvailable
qwen3-235b-a22b-07-25
Qwen3 235B A22B Instruct 2507
liveContact usAvailable
gemma-4-26b-a4b-it
Gemma 4 26B A4B
liveContact usAvailable
glm-4-5-air
GLM 4.5 Air
liveContact usAvailable
gemma-4-31b-it
Gemma 4 31B
liveContact usAvailable

Model API

Status unavailable

Pay as you go

For testing, prototypes, and variable usage.

  • Instant API key creation
  • Real usage, real billing, and live balance updates
  • Unified model catalog and pricing page

Media API

Self-serve

On-demand checkout

Developer APIs for image, video, and audio generation jobs, task status, fallback, cost estimates, and receipts.

  • Generation jobs and task status
  • Failure reasons and fallback
  • Usage receipts and cost estimates

Spend Control

See pricing

See current price in Plans

Workspace budgets, API key limits, model allow/block controls, and customer or feature attribution.

  • Monthly workspace budgets
  • API key caps and request cost limits
  • Model allowlist and blocklist controls

VaaS Ledger

Public receipt path

On-demand checkout

Verifiable usage receipts for billable requests, downstream billing, and reconciliation.

  • Request-level receipts and trace links
  • Billing, audit, and dispute evidence
  • Cost attribution by customer and feature

Dedicated Capacity

Commercial delivery

Contact team

For customers that need larger long-term capacity, dedicated resource planning, or infrastructure-grade delivery. Dedicated GPU Capacity includes dedicated delivery support.

  • Dedicated endpoints stay on-demand where eligible
  • Card-hour GPU rental stays on-demand
  • 8+ GPU whole-node or whole-cluster capacity includes dedicated delivery