Doubao API Setup 2026: 19 ByteDance Models, $0.022/M Floor, Python in 5 Min

ByteDance ships 19 active Doubao API SKUs in 2026: chat tiers from $0.022/M input (Seed 1.6 Flash) up to $2.57/M output (Seed 2.0 Pro flagship), plus four Seedream image models and three Seedance video models. Every Seed-generation chat model (2.0, 1.8, 1.6) has a 256K context window. Seed 2.0 and Seed 1.6 chat models support vision, tool calls, JSON output, streaming, and thinking mode. Doubao 1.5 sits on a smaller 32K context.

The honest catch: Doubao’s direct API path (Volcano Engine Ark) gates registration behind a Chinese-mainland phone number and real-name verification. The OpenAI-compatible aggregator path (TokenMix) skips that gate but charges what amounts to a parity-routed price. All numbers in this guide are from the TokenMix model registry pulled 2026-05-14. The “cheapest tier” line: doubao-seed-1.6-flash at $0.022 input / $0.219 output per million tokens, which is about 12x cheaper on output (and about 23x cheaper on input) than Doubao Seed 2.0 Pro and roughly an order of magnitude cheaper than GPT-5.5.

What Is Doubao and Why It Matters
The 19-Model Doubao Lineup
Pricing Breakdown: What You Actually Pay
Direct Volcano Ark vs Aggregator Access
Supported LLM Providers and Model Routing
Quick Installation Guide
Known Limitations and Gotchas
When to Use Doubao (Decision Table)
FAQ

What Is Doubao and Why It Matters {#what-is-doubao}

Doubao is ByteDance’s foundation-model family, served from Volcano Engine (Ark). It is the largest Chinese-origin model lineup behind a single OpenAI-compatible endpoint and currently spans four generations:

Seed 2.0 (released 2026-02-14): flagship, multimodal, agentic-coding focus, 256K context. Four tiers: Pro, Code, Lite, Mini.
Seed 1.8 (2025-12-27) and Seed 1.6 (2025-10-14): same 256K context, vision + tools + thinking mode, cheaper baseline.
Doubao 1.5 (2025-01-14): older 32K-context series. Cheap output floor but limited context.
Seedream (image) and Seedance (video): separate per-generation pricing.

The performance claim: ByteDance positions Seed 2.0 Pro as leading multimodal + agentic reasoning with state-of-the-art vision benchmarks. Cross-vendor benchmarks against Claude/GPT/Gemini have not been published with comparable rigor, so treat agentic-leadership claims as vendor-stated until independent third parties weigh in.

The honest caveat: Doubao 1.5’s $0.044/$0.088 floor pricing on Lite looks attractive but the 32K context cap excludes most modern RAG, codebase, and long-document workloads. For new builds the realistic floor is doubao-seed-1.6-flash at $0.022/$0.219.

The 19-Model Doubao Lineup {#doubao-lineup}

All prices are USD per 1M tokens. Capabilities (V = vision, T = tools, R = reasoning) reflect the TokenMix model registry as of 2026-05-14.

Chat models (12 active SKUs)

short_id	Generation	Input	Output	Context	V	T	R	Released
doubao-seed-2.0-pro	Seed 2.0	$0.514	$2.57	256K	✓	✓	✓	2026-02-14
doubao-seed-2.0-code	Seed 2.0	$0.467	$2.34	256K	✓	✓	✓	2026-02-14
doubao-seed-2.0-lite	Seed 2.0	$0.088	$0.526	256K	✓	✓	✓	2026-02-14
doubao-seed-2.0-mini	Seed 2.0	$0.029	$0.292	256K	✓	✓	✓	2026-02-14
doubao-seed-1.8	Seed 1.8	$0.117	$1.168	256K	✓	✓	✓	2025-12-27
doubao-seed-1.6	Seed 1.6	$0.117	$1.168	256K	✓	✓	✓	2025-10-14
doubao-seed-1.6-lite	Seed 1.6	$0.044	$0.350	256K	✓	✓	✓	2025-10-14
doubao-seed-1.6-flash	Seed 1.6	$0.022	$0.219	256K	✓	✓	✓	2025-08-27
doubao-1.5-pro	1.5	$0.117	$0.292	32K	✗	✓	✗	2025-01-14
doubao-1.5-vision-pro	1.5	$0.438	$1.314	32K	✓	✓	✗	2025-01-14
doubao-1.5-lite	1.5	$0.044	$0.088	32K	✗	✓	✗	2025-01-14

doubao-seed-1.6-flash is the floor. New builds should default here.
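One way to make the table actionable is to keep it as data and query it. The sketch below copies a few rows from the table above into a dictionary and picks the model with the lowest output price that meets a set of requirements. The `ChatModel` class and the `cheapest` helper are illustrative names, not part of any SDK or of the TokenMix registry API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChatModel:
    input_usd: float   # USD per 1M input tokens
    output_usd: float  # USD per 1M output tokens
    context_k: int     # context window, in thousands of tokens
    vision: bool
    tools: bool
    reasoning: bool


# A subset of rows copied from the chat-model table (illustration only).
REGISTRY = {
    "doubao-seed-2.0-pro": ChatModel(0.514, 2.57, 256, True, True, True),
    "doubao-seed-2.0-lite": ChatModel(0.088, 0.526, 256, True, True, True),
    "doubao-seed-1.6-flash": ChatModel(0.022, 0.219, 256, True, True, True),
    "doubao-1.5-pro": ChatModel(0.117, 0.292, 32, False, True, False),
    "doubao-1.5-lite": ChatModel(0.044, 0.088, 32, False, True, False),
}


def cheapest(min_context_k=0, vision=False, reasoning=False):
    """Return the model id with the lowest output price that meets the requirements."""
    candidates = [
        (spec.output_usd, name)
        for name, spec in REGISTRY.items()
        if spec.context_k >= min_context_k
        and (spec.vision or not vision)
        and (spec.reasoning or not reasoning)
    ]
    if not candidates:
        raise ValueError("no model satisfies the requested capabilities")
    return min(candidates)[1]


print(cheapest())                   # doubao-1.5-lite
print(cheapest(min_context_k=256))  # doubao-seed-1.6-flash
```

With no constraints the 32K Doubao 1.5 Lite wins on output price. As soon as a 256K window is required, Seed 1.6 Flash becomes the cheapest option, which is the point the table makes.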

Image and video (7 models)

short_id	Type	Released	Notes
seedream-5.0	Image	2026-01-27	Latest text-to-image flagship
seedream-4.5	Image	2025-11-27	Previous flagship
seedream-4.0	Image	2025-08-27	Stable text-to-image
seedream-3.0-t2i	Image	2025-04-14	Earlier gen
seedance-2.0	Video	2026-01-27	Current video flagship
seedance-2.0-fast	Video	2026-01-27	Speed variant
seedance-1.5-pro	Video	2025-12-14	Previous Pro

Image/video are priced per generation rather than per token.

Pricing Breakdown: What You Actually Pay {#pricing}

Token economics matter more than headline rates because each model uses tokens differently. Below are scenario-based monthly costs at Doubao’s standard tier (uncached input baseline; Doubao does not currently expose cache-hit pricing through TokenMix).

Workload	Tokens in / out	Model	Monthly Cost
Support chatbot	100M / 30M	doubao-seed-1.6-flash	$8.77
RAG with 256K context	400M / 100M	doubao-seed-2.0-lite	$87.80
Agentic coding assistant	500M / 100M (80% Code + 20% Pro)	doubao-seed-2.0-code → Pro	$476.80
2-tier smart router	1B / 200M (90% Flash + 10% Pro)	flash → pro	$162.02
Same workload on Seed 2.0 Pro only	1B / 200M	doubao-seed-2.0-pro	$1,028

Key judgment: Running everything on Seed 2.0 Pro versus a 90/10 Flash/Pro router costs ~6.3x more. Default-then-escalate is the right pattern.
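The table’s arithmetic can be reproduced in a few lines, which is handy for plugging in your own volumes. This is a minimal sketch using the list prices above. `RATES` and `monthly_cost` are hypothetical names, and the traffic split is assumed to apply equally to input and output tokens.

```python
# USD per 1M tokens as (input, output), copied from the lineup table.
RATES = {
    "doubao-seed-1.6-flash": (0.022, 0.219),
    "doubao-seed-2.0-lite": (0.088, 0.526),
    "doubao-seed-2.0-code": (0.467, 2.34),
    "doubao-seed-2.0-pro": (0.514, 2.57),
}


def monthly_cost(tokens_in_m, tokens_out_m, mix):
    """Monthly cost in USD.

    tokens_in_m / tokens_out_m are millions of tokens per month; mix maps a
    model id to its share of traffic (shares are expected to sum to 1.0).
    """
    total = 0.0
    for model, share in mix.items():
        rate_in, rate_out = RATES[model]
        total += share * (tokens_in_m * rate_in + tokens_out_m * rate_out)
    return round(total, 2)


router = monthly_cost(
    1000, 200, {"doubao-seed-1.6-flash": 0.9, "doubao-seed-2.0-pro": 0.1}
)
pro_only = monthly_cost(1000, 200, {"doubao-seed-2.0-pro": 1.0})
print(router, pro_only, round(pro_only / router, 1))
```

Changing the shares in `mix` shows how quickly the bill grows as more traffic is escalated to Pro.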

Cost optimization paths:

Start at doubao-seed-1.6-flash for high-volume classification, extraction, draft generation
Escalate to doubao-seed-2.0-pro only when vision quality, long-context reasoning, or agentic-coding benchmarks justify the premium: about 12x on output price and about 23x on input
Use Seed 2.0 Code (doubao-seed-2.0-code) specifically for code generation steps
Skip Doubao 1.5 for new builds — 32K context kills modern RAG flows
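The default-then-escalate path in the list above can be sketched as a small wrapper. It is written against two injected callables so it does not depend on any particular SDK: `call_model` and `is_good_enough` are hypothetical hooks you would supply, and the acceptance check is entirely application-specific.

```python
from typing import Callable, Sequence, Tuple


def answer_with_escalation(
    prompt: str,
    call_model: Callable[[str, str], str],
    is_good_enough: Callable[[str], bool],
    tiers: Sequence[str] = ("doubao-seed-1.6-flash", "doubao-seed-2.0-pro"),
) -> Tuple[str, str]:
    """Try each tier in order and return (model_id, reply) for the first accepted reply.

    If no tier passes the check, the reply from the last tier is returned.
    """
    if not tiers:
        raise ValueError("tiers must not be empty")
    model, reply = tiers[0], ""
    for model in tiers:
        reply = call_model(model, prompt)
        if is_good_enough(reply):
            break
    return model, reply


# Stand-in for a real API call: the cheap tier "fails", the flagship succeeds.
def fake_call(model: str, prompt: str) -> str:
    return "" if model.endswith("flash") else "LABEL: billing"


result = answer_with_escalation(
    "Classify this ticket", fake_call, lambda reply: reply.startswith("LABEL:")
)
print(result)  # ('doubao-seed-2.0-pro', 'LABEL: billing')
```

In a real integration `call_model` would wrap `client.chat.completions.create(...)` and return the message content. Keeping the check separate makes it easy to tighten or loosen the escalation rate.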

Direct Volcano Ark vs Aggregator Access {#access-path}

Direct Volcano Ark gives the lowest theoretical per-token cost (raw vendor list price). The aggregator path removes the China-residency gate that blocks most non-Chinese developers. The right pick depends on whether your business entity is in mainland China.

Dimension	Volcano Ark Direct	OpenAI-Compatible Aggregator
Account requirement	Volcano account + Chinese mainland phone + real-name verification	Single signup, email-only
Free credits	500K-5M free tokens per model at signup	Pay-as-you-go from request 1
Models	Full Doubao + Seedream + Seedance catalog + Volcano-only third-party	19 active Doubao models alongside 150+ models from other providers
SDK	Volcano Ark SDK or OpenAI-compatible via `ark.cn-beijing.volces.com`	OpenAI-compatible via aggregator base_url — drop-in for any OpenAI SDK
Billing	RMB invoices	USD card or unified credit
Multi-region failover	Manual	Automatic where applicable
Where it wins	Per-token cost floor, Chinese-mainland builds	Anyone outside mainland China; multi-model workloads

Supported LLM Providers and Model Routing {#supported-providers}

If you are building a multi-model application, picking one provider per model family creates 5+ accounts, 5+ billing surfaces, and 5+ rate-limit dashboards. The aggregator pattern collapses this into one OpenAI-compatible endpoint.

TokenMix.ai is OpenAI-compatible and routes to 150+ models including Doubao Seed 2.0, Claude Opus 4.7, GPT-5.5, Gemini 3 Pro, DeepSeek V4, Kimi K2.6, and MiniMax M2.7 through one API key. The configuration is two environment variables:

export OPENAI_API_KEY="tkmx-..."
export OPENAI_BASE_URL="https://api.tokenmix.ai/v1"

Or for SDKs that take both inline:

from openai import OpenAI

client = OpenAI(
    api_key="tkmx-...",
    base_url="https://api.tokenmix.ai/v1",
)

The same client object now calls doubao-seed-2.0-pro, gpt-5.5, claude-opus-4-7, deepseek-v4-flash, and so on by changing only the model parameter per request. That makes Doubao a first-class choice in a routing strategy rather than an isolated experiment.

For Chinese-mainland production with regulatory requirements, go direct to Volcano Ark instead.

Quick Installation Guide {#installation}

Doubao via the OpenAI-compatible aggregator path takes about 5 minutes from zero. Direct Volcano Ark setup takes longer because of real-name verification but follows the same SDK pattern once the account is approved.

# 1. Install OpenAI SDK
pip install openai

# 2. Export credentials
export OPENAI_API_KEY="tkmx-..."           # from tokenmix.ai dashboard
export OPENAI_BASE_URL="https://api.tokenmix.ai/v1"

Cheapest tier call (doubao-seed-1.6-flash):

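from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY and OPENAI_BASE_URL from the environment

# Placeholder input; replace with the real ticket text.
ticket_body = "My invoice shows two charges for the same order."

response = client.chat.completions.create(
    model="doubao-seed-1.6-flash",
    messages=[
        {"role": "user", "content": "Summarize this support ticket in two sentences: " + ticket_body}
    ],
)
print(response.choices[0].message.content)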

Flagship tier with tools (doubao-seed-2.0-pro):

response = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "Plan the next 3 steps to fix this bug..."}],
    tools=[{"type": "function", "function": {
        "name": "run_tests",
        "description": "Execute the test suite",
        "parameters": {"type": "object", "properties": {}},
    }}],
)
# tool_calls is None when the model answers directly instead of calling a tool.
print(response.choices[0].message.tool_calls)
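The request above only gets as far as the model asking for a tool. A rough sketch of the follow-up turn, assuming the standard OpenAI Chat Completions message shapes, is shown here. The `run_tests` body and its return value are made up for illustration, and messages are plain dicts so the example runs without a network call.

```python
import json


def run_tests() -> dict:
    """Hypothetical local tool; a real one would invoke the test runner."""
    return {"passed": 12, "failed": 1}  # illustrative values only


TOOLS = {"run_tests": run_tests}


def tool_followup_messages(history, assistant_message):
    """Return history + the assistant tool-call turn + one tool result per call."""
    messages = list(history)
    messages.append(assistant_message)
    for call in assistant_message.get("tool_calls") or []:
        tool = TOOLS[call["function"]["name"]]
        arguments = json.loads(call["function"]["arguments"] or "{}")
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the id the model issued
            "content": json.dumps(tool(**arguments)),
        })
    return messages


# A dict shaped like an assistant turn that requests one tool call.
assistant = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "run_tests", "arguments": "{}"},
    }],
}
history = [{"role": "user", "content": "Plan the next 3 steps to fix this bug..."}]
followup = tool_followup_messages(history, assistant)
print([message["role"] for message in followup])  # ['user', 'assistant', 'tool']
```

The returned list is what you would pass as `messages` in the next `client.chat.completions.create(...)` call. With the SDK, the assistant turn comes back as an object rather than a dict, so it would need converting first (for example with the response message’s `model_dump(exclude_none=True)`).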

Vision input on Seed 2.0 (image + text):

response = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/img.png"}},
        ],
    }],
)
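When the image is a local file rather than a public URL, the OpenAI Chat Completions format also allows a base64 data URL in the same `image_url` field. Whether a given Doubao route accepts data URLs is an assumption to verify against your endpoint. The helper name `image_part` is hypothetical.

```python
import base64


def image_part(data: bytes, mime: str = "image/png") -> dict:
    """Build an image_url content part that embeds the image as a base64 data URL."""
    if not mime.startswith("image/"):
        raise ValueError(f"expected an image MIME type, got {mime!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}


# Tiny stand-in payload; real code would read the bytes from an image file.
part = image_part(b"abc")
print(part["image_url"]["url"])  # data:image/png;base64,YWJj
```

The returned dict replaces the `image_url` entry in the `content` list above, and the bytes would normally come from something like `Path("screenshot.png").read_bytes()`.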

Streaming mode (any chat model):

stream = client.chat.completions.create(
    model="doubao-seed-1.6-flash",
    messages=[{"role": "user", "content": "Write a haiku about API latency."}],
    stream=True,
)
for chunk in stream:
    # Some providers send chunks with no choices (for example, a final usage chunk).
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

Known Limitations and Gotchas {#limitations}

1. Doubao 1.5 is 32K context only. New RAG/coding/long-doc workloads should not target the 1.5 series despite its lower output price. The accuracy gains from keeping the full context in one call outweigh the per-token savings.

2. Vision is not on every chat model. Doubao 1.5 non-Vision SKUs (doubao-1.5-pro, doubao-1.5-lite) do not accept image input. Confirm support_vision=true in the registry before sending multimodal payloads.

3. Model IDs are case-sensitive. Use lowercase doubao-seed-2.0-pro exactly. Doubao-Seed-2.0-Pro will return model not found.

4. max_tokens parameter required for long output. SDK defaults can cap output at 4K even when the model supports 128K max output. Pass max_tokens explicitly when you need long completions.

5. Thinking mode adds output tokens you pay for. Seed 2.0 / 1.6 thinking mode emits reasoning traces alongside the final answer. Disable it on latency-sensitive paths where users only see the final answer.

6. Tool-call protocol requires both messages in next turn. When the model emits a tool_call, the next request must include the assistant message that contains the tool_calls and, for each call, a role “tool” message carrying the matching tool_call_id and the tool’s result. Missing either yields empty responses or errors.

7. Image and video models are per-generation priced, not per-token. Seedream and Seedance pricing does not follow the input/output token model. Pull current per-call rates before integrating high-volume image or video pipelines.
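Gotchas 2, 3, and 4 can be caught before a request leaves your process. The following is a rough pre-flight sketch under stated assumptions: `build_request` and `has_image` are hypothetical helpers, and the non-vision set contains only the two SKUs named in gotcha 2.

```python
# Non-vision chat SKUs named in gotcha 2.
NON_VISION_MODELS = {"doubao-1.5-pro", "doubao-1.5-lite"}


def has_image(messages):
    """True if any message carries an image_url content part."""
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):
            if any(part.get("type") == "image_url" for part in content):
                return True
    return False


def build_request(model, messages, max_tokens):
    """Return keyword arguments for client.chat.completions.create(...)."""
    model_id = model.strip().lower()  # gotcha 3: IDs are lowercase
    if has_image(messages) and model_id in NON_VISION_MODELS:
        raise ValueError(f"{model_id} does not accept image input")  # gotcha 2
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    # gotcha 4: always send max_tokens explicitly instead of relying on defaults
    return {"model": model_id, "messages": messages, "max_tokens": max_tokens}


request = build_request(
    "Doubao-Seed-2.0-Pro", [{"role": "user", "content": "Hi"}], max_tokens=8192
)
print(request["model"])  # doubao-seed-2.0-pro
```

The result can be splatted into the SDK call as `client.chat.completions.create(**request)`.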

When to Use Doubao (Decision Table) {#when-to-use}

Workload	Start with	Escalate to	Avoid
Classification, extraction	doubao-seed-1.6-flash	doubao-seed-1.6-lite if structure fails	Doubao 1.5 (context cap)
Customer support draft	doubao-seed-1.6-lite	doubao-seed-2.0-lite	Pro for first-pass replies
RAG with 256K context	doubao-seed-2.0-lite	doubao-seed-2.0-pro for hard queries	32K-only models
Agentic coding agent	doubao-seed-2.0-code	doubao-seed-2.0-pro for planning	Seed 1.6 for tool-heavy chains
Vision-heavy multimodal	doubao-seed-2.0-pro	—	Doubao 1.5 non-Vision
Long-document review	doubao-seed-2.0-pro (256K)	—	32K-only models
Text-to-image	seedream-5.0	seedream-4.5 for cost	Older Seedream 3.0
Short video generation	seedance-2.0-fast	seedance-2.0 for quality	1.0 series

Decision heuristic: start at the cheapest tier that meets your accuracy bar, then escalate per-call only when a failing step justifies the cost. A 90% Flash + 10% Pro router beats running everything on Pro by ~84% on monthly cost.

FAQ {#faq}

What is the cheapest Doubao chat model in 2026?

doubao-seed-1.6-flash at $0.022 input / $0.219 output per million tokens. It supports vision, tools, JSON, streaming, and thinking mode, with a 256K context window. It is the realistic floor for new Doubao builds — older Doubao 1.5 Lite is cheaper on output but capped at 32K context.

Which Doubao model is best for coding?

doubao-seed-2.0-code at $0.467 input / $2.34 output per million tokens, 256K context. For agentic coding loops that mix planning and execution, route planning to doubao-seed-2.0-pro and execution to Seed 2.0 Code or Seed 1.6 Flash.

Do I need a Chinese phone number to use Doubao?

You need one to register on Volcano Ark directly. You do not need one to access Doubao through an OpenAI-compatible aggregator — those route to ByteDance upstream without exposing the verification gate to the developer.

Is Doubao OpenAI-compatible?

Yes, both directly (ark.cn-beijing.volces.com exposes an OpenAI-style endpoint) and via aggregators like TokenMix.ai (api.tokenmix.ai/v1). You can use the standard OpenAI Python SDK by changing only base_url and model.

Does Doubao Seed 2.0 support tool calls and JSON mode?

All Seed 2.0 and Seed 1.6 chat models support tool calls (function calling), JSON mode output, structured output, and streaming. Doubao 1.5 supports tools but not reasoning/thinking mode.
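As a rough illustration of JSON mode, the sketch below builds a request using the OpenAI Chat Completions `response_format` convention and parses the reply defensively. That the parameter passes through unchanged on every Doubao route is an assumption, and `json_request` and `parse_json_reply` are hypothetical helper names.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, which some models wrap around JSON


def json_request(model, instruction):
    """Return keyword arguments that ask the model for a single JSON object."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Reply with a single JSON object."},
            {"role": "user", "content": instruction},
        ],
        "response_format": {"type": "json_object"},
    }


def parse_json_reply(text):
    """Parse a reply as JSON, tolerating a Markdown fence around the object."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        cleaned = cleaned.strip("`").strip()
        if cleaned.startswith("json"):
            cleaned = cleaned[len("json"):]
    return json.loads(cleaned)


print(parse_json_reply('{"label": "billing"}'))
```

`json.loads` still raises `json.JSONDecodeError` on malformed output, which is a reasonable signal to retry or escalate to a larger tier.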

How does Doubao pricing compare to DeepSeek and Qwen?

DeepSeek V4-Flash ($0.14 input / $0.28 output per MTok) is roughly 73% cheaper input and 89% cheaper output than Doubao Seed 2.0 Pro. Doubao’s advantage is multimodal vision + agentic-coding positioning. Qwen offers more multilingual tiers. A multi-model setup with all three through one API key is typically cheaper than committing to any single family.
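The percentages quoted above follow directly from the list prices, and the same calculation works for any pair of models. A minimal check, using only the prices quoted in this guide (`pct_cheaper` is an illustrative helper name):

```python
def pct_cheaper(price, baseline):
    """Percentage by which price undercuts baseline, rounded to a whole percent."""
    return round((1 - price / baseline) * 100)


# DeepSeek V4-Flash vs Doubao Seed 2.0 Pro, USD per 1M tokens.
input_saving = pct_cheaper(0.14, 0.514)
output_saving = pct_cheaper(0.28, 2.57)
print(input_saving, output_saving)  # 73 89
```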

Can I use Seedream image and Seedance video models the same way?

Yes — both are listed in the registry and routable through OpenAI-compatible aggregators. Pricing is per generation rather than per token, so check live rates before integrating high-volume image or video pipelines.

Author: TokenMix Research Lab | Last Updated: 2026-05-14 | Data Sources: TokenMix Model Registry, Volcano Engine Doubao, Volcano Pricing Docs | Original article: tokenmix.ai/blog/doubao-api-getting-started