Doubao API Setup 2026: 19 ByteDance Models, $0.022/M Floor, Python in 5 Min

ByteDance ships 19 active Doubao API SKUs in 2026: chat tiers from $0.022/M input and $0.219/M output (Seed 1.6 Flash) up to $0.514/M input and $2.57/M output (Seed 2.0 Pro flagship), plus four Seedream image models and three Seedance video models. The Seed 2.0, Seed 1.8, and Seed 1.6 chat models share a 256K context window. Seed 2.0 and Seed 1.6 chat models support vision, tool calls, JSON output, streaming, and thinking mode. Doubao 1.5 sits on a smaller 32K context.

The honest catch: Doubao’s direct API path (Volcano Engine Ark) gates registration behind a Chinese-mainland phone number and real-name verification. The OpenAI-compatible aggregator path (TokenMix) skips that gate but charges what amounts to a parity-routed price. All numbers in this guide are from the TokenMix model registry pulled 2026-05-14. The “cheapest tier” line: doubao-seed-1.6-flash at $0.022 input / $0.219 output per million tokens, which is about 12x cheaper on output (and about 23x cheaper on input) than Doubao Seed 2.0 Pro and roughly an order of magnitude cheaper than GPT-5.5.

Table of Contents

  • What Is Doubao and Why It Matters
  • The 19-Model Doubao Lineup
  • Pricing Breakdown: What You Actually Pay
  • Direct Volcano Ark vs Aggregator Access
  • Supported LLM Providers and Model Routing
  • Quick Installation Guide
  • Known Limitations and Gotchas
  • When to Use Doubao (Decision Table)
  • FAQ

What Is Doubao and Why It Matters {#what-is-doubao}

Doubao is ByteDance’s foundation-model family, served from Volcano Engine (Ark). It is the largest Chinese-origin model lineup behind a single OpenAI-compatible endpoint and currently spans four generations:

  • Seed 2.0 (released 2026-02-14): flagship, multimodal, agentic-coding focus, 256K context. Four tiers: Pro, Code, Lite, Mini.
  • Seed 1.8 (2025-12-27) and Seed 1.6 (2025-10-14): same 256K context, vision + tools + thinking mode, cheaper baseline.
  • Doubao 1.5 (2025-01-14): older 32K-context series. Cheap output floor but limited context.
  • Seedream (image) and Seedance (video): separate per-generation pricing.

The performance claim: ByteDance positions Seed 2.0 Pro as a leader in multimodal and agentic reasoning, with state-of-the-art vision benchmarks. Cross-vendor benchmarks against Claude/GPT/Gemini have not been published with comparable rigor, so treat agentic-leadership claims as vendor-stated until independent third parties weigh in.

The honest caveat: Doubao 1.5’s $0.044/$0.088 floor pricing on Lite looks attractive but the 32K context cap excludes most modern RAG, codebase, and long-document workloads. For new builds the realistic floor is doubao-seed-1.6-flash at $0.022/$0.219.

The 19-Model Doubao Lineup {#doubao-lineup}

All prices are USD per 1M tokens. Capabilities (V = vision, T = tools, R = reasoning) reflect the TokenMix model registry as of 2026-05-14.

Chat models (12 active SKUs)

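| short_id | Generation | Input | Output | Context | V | T | R | Released |
|---|---|---|---|---|---|---|---|---|
| doubao-seed-2.0-pro | Seed 2.0 | $0.514 | $2.57 | 256K | yes | yes | yes | 2026-02-14 |
| doubao-seed-2.0-code | Seed 2.0 | $0.467 | $2.34 | 256K | yes | yes | yes | 2026-02-14 |
| doubao-seed-2.0-lite | Seed 2.0 | $0.088 | $0.526 | 256K | yes | yes | yes | 2026-02-14 |
| doubao-seed-2.0-mini | Seed 2.0 | $0.029 | $0.292 | 256K | yes | yes | yes | 2026-02-14 |
| doubao-seed-1.8 | Seed 1.8 | $0.117 | $1.168 | 256K | yes | yes | yes | 2025-12-27 |
| doubao-seed-1.6 | Seed 1.6 | $0.117 | $1.168 | 256K | yes | yes | yes | 2025-10-14 |
| doubao-seed-1.6-lite | Seed 1.6 | $0.044 | $0.350 | 256K | yes | yes | yes | 2025-10-14 |
| **doubao-seed-1.6-flash** | Seed 1.6 | $0.022 | $0.219 | 256K | yes | yes | yes | 2025-08-27 |
| doubao-1.5-pro | 1.5 | $0.117 | $0.292 | 32K | no | yes | no | 2025-01-14 |
| doubao-1.5-vision-pro | 1.5 | $0.438 | $1.314 | 32K | yes | yes | no | 2025-01-14 |
| doubao-1.5-lite | 1.5 | $0.044 | $0.088 | 32K | no | yes | no | 2025-01-14 |

Bold = the floor (doubao-seed-1.6-flash). New builds should default here.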

Image and video (7 models)

| short_id | Type | Released | Notes |
|---|---|---|---|
| seedream-5.0 | Image | 2026-01-27 | Latest text-to-image flagship |
| seedream-4.5 | Image | 2025-11-27 | Previous flagship |
| seedream-4.0 | Image | 2025-08-27 | Stable text-to-image |
| seedream-3.0-t2i | Image | 2025-04-14 | Earlier gen |
| seedance-2.0 | Video | 2026-01-27 | Current video flagship |
| seedance-2.0-fast | Video | 2026-01-27 | Speed variant |
| seedance-1.5-pro | Video | 2025-12-14 | Previous Pro |

Image/video are priced per generation rather than per token.

Pricing Breakdown: What You Actually Pay {#pricing}

Token economics matter more than headline rates because each model uses tokens differently. Below are scenario-based monthly costs at Doubao’s standard tier (uncached input baseline; Doubao does not currently expose cache-hit pricing through TokenMix).

| Workload | Tokens in / out (monthly) | Model | Monthly cost |
|---|---|---|---|
| Support chatbot | 100M / 30M | doubao-seed-1.6-flash | $8.77 |
| RAG with 256K context | 400M / 100M | doubao-seed-2.0-lite | $87.80 |
| Agentic coding assistant | 500M / 100M (80% Code + 20% Pro) | doubao-seed-2.0-code → Pro | $476.80 |
| 2-tier smart router | 1B / 200M (90% Flash + 10% Pro) | flash → pro | $162.02 |
| Same workload on Seed 2.0 Pro only | 1B / 200M | doubao-seed-2.0-pro | $1,028 |

Key judgment: Running everything on Seed 2.0 Pro versus a 90/10 Flash/Pro router costs ~6.3x more. Default-then-escalate is the right pattern.
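The scenario costs can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming each model in a mix receives the same share of input and output tokens; `PRICES` and `monthly_cost` are illustrative names, and the rates are copied from the chat-model table in this guide.

```python
# Prices from the chat-model table (USD per 1M tokens: input, output).
PRICES = {
    "doubao-seed-2.0-pro": (0.514, 2.57),
    "doubao-seed-2.0-code": (0.467, 2.34),
    "doubao-seed-2.0-lite": (0.088, 0.526),
    "doubao-seed-1.6-flash": (0.022, 0.219),
}


def monthly_cost(input_mtok: float, output_mtok: float, mix: dict) -> float:
    """Blended monthly cost in USD for a traffic mix.

    input_mtok / output_mtok are monthly volumes in millions of tokens.
    mix maps model name -> traffic share (shares should sum to 1.0).
    Assumption: every model sees the same share of input and output.
    """
    total = 0.0
    for model, share in mix.items():
        price_in, price_out = PRICES[model]
        total += share * (input_mtok * price_in + output_mtok * price_out)
    return round(total, 2)


router = monthly_cost(
    1000, 200, {"doubao-seed-1.6-flash": 0.9, "doubao-seed-2.0-pro": 0.1}
)
pro_only = monthly_cost(1000, 200, {"doubao-seed-2.0-pro": 1.0})
print(f"router ${router}, pro only ${pro_only}, ratio {pro_only / router:.1f}x")
```

Swapping in your own volumes and mix is usually more informative than the headline per-token rates.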

Cost optimization paths:

  1. Start at doubao-seed-1.6-flash for high-volume classification, extraction, draft generation
  2. Escalate to doubao-seed-2.0-pro only when vision quality, long-context accuracy, or agentic-coding performance justifies the roughly 12x output-price (23x input-price) premium over Flash
  3. Use Seed 2.0 Code (doubao-seed-2.0-code) specifically for code generation steps
  4. Skip Doubao 1.5 for new builds — 32K context kills modern RAG flows
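One way to express the default-then-escalate pattern in code is a loop over tiers that stops at the first acceptable answer. This is a sketch, not a prescribed implementation: `answer_with_escalation`, `call_model`, and `is_good_enough` are hypothetical names, and the acceptance check is whatever validation fits your workload (schema check, length check, a grader).

```python
from typing import Callable, Sequence, Tuple


def answer_with_escalation(
    prompt: str,
    call_model: Callable[[str, str], str],
    is_good_enough: Callable[[str], bool],
    tiers: Sequence[str] = ("doubao-seed-1.6-flash", "doubao-seed-2.0-pro"),
) -> Tuple[str, str]:
    """Try each tier from cheapest to most expensive.

    call_model(model, prompt) performs the actual API call and returns text.
    Returns (model_used, answer); if no tier passes the check, the answer
    from the last tier is returned as-is.
    """
    if not tiers:
        raise ValueError("at least one model tier is required")
    model, answer = tiers[0], ""
    for model in tiers:
        answer = call_model(model, prompt)
        if is_good_enough(answer):
            break
    return model, answer


# Offline demo with a stand-in for the API call (no network needed).
def fake_call(model: str, prompt: str) -> str:
    return "" if model == "doubao-seed-1.6-flash" else "three-step plan"


print(answer_with_escalation("Plan the fix", fake_call, lambda a: bool(a.strip())))
```

In production, `call_model` would wrap `client.chat.completions.create`, so only the requests that fail the check pay the higher-tier price.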

Direct Volcano Ark vs Aggregator Access {#access-path}

Direct Volcano Ark gives the lowest theoretical per-token cost (raw vendor list price). The aggregator path removes the China-residency gate that blocks most non-Chinese developers. The right pick depends on whether your business entity is in mainland China.

| Dimension | Volcano Ark Direct | OpenAI-Compatible Aggregator |
|---|---|---|
| Account requirement | Volcano account + Chinese mainland phone + real-name verification | Single signup, email-only |
| Free credits | 500K-5M free tokens per model at signup | Pay-as-you-go from request 1 |
| Models | Full Doubao + Seedream + Seedance catalog + Volcano-only third-party | 19 active Doubao models alongside 150+ models from other providers |
| SDK | Volcano Ark SDK or OpenAI-compatible via ark.cn-beijing.volces.com | OpenAI-compatible via aggregator base_url; drop-in for any OpenAI SDK |
| Billing | RMB invoices | USD card or unified credit |
| Multi-region failover | Manual | Automatic where applicable |
| Where it wins | Per-token cost floor, Chinese-mainland builds | Anyone outside mainland China; multi-model workloads |
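Because both paths speak the OpenAI protocol, switching between them can be reduced to a small configuration lookup. The sketch below is illustrative only: `ACCESS_PATHS` and `client_kwargs` are hypothetical names, the `/api/v3` path and the `ARK_API_KEY` variable name for the direct route are assumptions to verify against the Volcano Engine docs, and model identifiers on the direct path may differ from the short_ids used in this guide.

```python
import os

# Hypothetical lookup table for this sketch.
# The aggregator URL is the one used in this guide; the Ark path suffix
# and env-var name are assumptions, not confirmed by this article.
ACCESS_PATHS = {
    "aggregator": {
        "base_url": "https://api.tokenmix.ai/v1",
        "key_env": "OPENAI_API_KEY",
    },
    "ark": {
        "base_url": "https://ark.cn-beijing.volces.com/api/v3",
        "key_env": "ARK_API_KEY",
    },
}


def client_kwargs(path: str, env=None) -> dict:
    """Return keyword arguments suitable for OpenAI(**kwargs)."""
    if path not in ACCESS_PATHS:
        raise ValueError(f"unknown access path: {path!r}")
    env = os.environ if env is None else env
    settings = ACCESS_PATHS[path]
    api_key = env.get(settings["key_env"])
    if not api_key:
        raise RuntimeError(f"set {settings['key_env']} before creating the client")
    return {"base_url": settings["base_url"], "api_key": api_key}


# Offline demo with an explicit environment mapping.
print(client_kwargs("aggregator", {"OPENAI_API_KEY": "tkmx-..."}))
```

With the `openai` package installed and a real key exported, `OpenAI(**client_kwargs("aggregator"))` would build the client for the chosen path.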

Supported LLM Providers and Model Routing {#supported-providers}

If you are building a multi-model application, picking one provider per model family creates 5+ accounts, 5+ billing surfaces, and 5+ rate-limit dashboards. The aggregator pattern collapses this into one OpenAI-compatible endpoint.

TokenMix.ai is OpenAI-compatible and routes to 150+ models including Doubao Seed 2.0, Claude Opus 4.7, GPT-5.5, Gemini 3 Pro, DeepSeek V4, Kimi K2.6, and MiniMax M2.7 through one API key. The configuration comes down to two environment variables:

export OPENAI_API_KEY="tkmx-..."
export OPENAI_BASE_URL="https://api.tokenmix.ai/v1"

Or for SDKs that take both inline:

from openai import OpenAI

client = OpenAI(
    api_key="tkmx-...",
    base_url="https://api.tokenmix.ai/v1",
)

The same client object now calls doubao-seed-2.0-pro, gpt-5.5, claude-opus-4-7, deepseek-v4-flash, and so on by changing only the model parameter per request. That makes Doubao a first-class choice in a routing strategy rather than an isolated experiment.
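To illustrate what "changing only the model parameter" looks like, here is a small sketch that sends one prompt to several models through the same client. `ask_each` is a hypothetical helper, and the fake client below exists only so the example runs without network access; with a real `OpenAI` client the function body stays the same.

```python
from types import SimpleNamespace


def ask_each(client, models, prompt):
    """Send one prompt to several models via a single OpenAI-style client."""
    replies = {}
    for model in models:
        response = client.chat.completions.create(
            model=model,  # the only thing that changes per request
            messages=[{"role": "user", "content": prompt}],
        )
        replies[model] = response.choices[0].message.content
    return replies


# Stand-in client that mimics the response shape (illustration only).
def _fake_create(model, messages):
    text = f"[{model}] {messages[0]['content']}"
    message = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

for name, reply in ask_each(
    fake_client, ["doubao-seed-2.0-pro", "gpt-5.5"], "ping"
).items():
    print(name, "->", reply)
```

A side-by-side run like this is a cheap way to check whether a lower tier is good enough before wiring up a router.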

For Chinese-mainland production with regulatory requirements, go direct to Volcano Ark instead.

Quick Installation Guide {#installation}

Doubao via the OpenAI-compatible aggregator path takes about 5 minutes from zero. Direct Volcano Ark setup takes longer because of real-name verification but follows the same SDK pattern once the account is approved.

# 1. Install OpenAI SDK
pip install openai

# 2. Export credentials
export OPENAI_API_KEY="tkmx-..."           # from tokenmix.ai dashboard
export OPENAI_BASE_URL="https://api.tokenmix.ai/v1"

Cheapest tier call (doubao-seed-1.6-flash):

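from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY and OPENAI_BASE_URL from the environment

# Placeholder ticket text; replace with your own data.
ticket_body = "Customer reports that password reset emails arrive after a 30-minute delay."

response = client.chat.completions.create(
    model="doubao-seed-1.6-flash",
    messages=[
        {"role": "user", "content": "Summarize this support ticket in two sentences: " + ticket_body}
    ],
)
print(response.choices[0].message.content)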

Flagship tier with tools (doubao-seed-2.0-pro):

response = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{"role": "user", "content": "Plan the next 3 steps to fix this bug..."}],
    tools=[{"type": "function", "function": {
        "name": "run_tests",
        "description": "Execute the test suite",
        "parameters": {"type": "object", "properties": {}},
    }}],
)

Vision input on Seed 2.0 (image + text):

response = client.chat.completions.create(
    model="doubao-seed-2.0-pro",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/img.png"}},
        ],
    }],
)
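The call above passes a public image URL. For local files, the OpenAI Chat Completions format also allows a base64 data URL in the same image_url field; whether a given Doubao route accepts data URLs is an assumption to verify before relying on it. A minimal sketch, with `image_part_from_bytes` and `vision_message` as hypothetical helper names:

```python
import base64
import mimetypes


def image_part_from_bytes(data: bytes, filename: str) -> dict:
    """Build an image_url content part from raw image bytes.

    The MIME type is guessed from the file extension.
    """
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image file: {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }


def vision_message(question: str, data: bytes, filename: str) -> dict:
    """Combine a text question and one image into a single user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            image_part_from_bytes(data, filename),
        ],
    }


# Demo with placeholder bytes; in practice use open(path, "rb").read().
message = vision_message("What is in this image?", b"abc", "sample.png")
print(message["content"][1]["image_url"]["url"])
```

The resulting dict can be passed as the single entry of `messages` in the vision call shown above.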

Streaming mode (any chat model):

stream = client.chat.completions.create(
    model="doubao-seed-1.6-flash",
    messages=[{"role": "user", "content": "Write a haiku about API latency."}],
    stream=True,
)
for chunk in stream:
    # Some chunks (for example a final usage chunk) can arrive with no choices.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

Known Limitations and Gotchas {#limitations}

1. Doubao 1.5 is 32K context only. New RAG/coding/long-doc workloads should not target the 1.5 series despite its lower output price. The accuracy gained by keeping the full context in one call outweighs the per-token savings.

2. Vision is not on every chat model. Doubao 1.5 non-Vision SKUs (doubao-1.5-pro, doubao-1.5-lite) do not accept image input. Confirm support_vision=true in the registry before sending multimodal payloads.

3. Model IDs are case-sensitive. Use lowercase doubao-seed-2.0-pro exactly. Doubao-Seed-2.0-Pro will return model not found.

4. max_tokens parameter required for long output. SDK defaults can cap output at 4K even when the model supports 128K max output. Pass max_tokens explicitly when you need long completions.

5. Thinking mode adds output tokens you pay for. Seed 2.0 / 1.6 thinking mode emits reasoning traces alongside the final answer. Disable it on latency-sensitive paths where users only see the final answer.

6. Tool-call protocol requires both messages in the next turn. When the model emits a tool call, you must pass back the assistant message that contains tool_calls AND a tool result message (role set to tool, carrying the matching tool_call_id) in the next request. Missing either yields empty responses or errors.

7. Image and video models are per-generation priced, not per-token. Seedream and Seedance pricing does not follow the input/output token model. Pull current per-call rates before integrating high-volume image or video pipelines.
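The tool-call round trip from item 6 is the gotcha that most benefits from a concrete shape. The following sketch builds the follow-up message list from an assistant message expressed as a plain dict in the OpenAI Chat Completions format. `LOCAL_TOOLS`, `run_tests`, and `append_tool_round_trip` are hypothetical names, and the test result is a stand-in value.

```python
import json


def run_tests() -> dict:
    # Stand-in for a real test runner (hypothetical result).
    return {"passed": 12, "failed": 0}


# Maps tool names the model may call to local Python functions.
LOCAL_TOOLS = {"run_tests": run_tests}


def append_tool_round_trip(messages: list, assistant_message: dict) -> list:
    """Return a new message list with the assistant turn and tool results.

    Both parts are required: the assistant message that holds tool_calls,
    then one role="tool" message per call with the matching tool_call_id.
    """
    followup = list(messages)
    followup.append(assistant_message)
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"] or "{}")
        result = LOCAL_TOOLS[name](**args)
        followup.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            }
        )
    return followup


history = [{"role": "user", "content": "Plan the next 3 steps to fix this bug..."}]
assistant = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "run_tests", "arguments": "{}"},
        }
    ],
}
for msg in append_tool_round_trip(history, assistant):
    print(msg["role"])
```

With the OpenAI Python SDK, the assistant dict can typically be obtained from `response.choices[0].message.model_dump(exclude_none=True)`; the returned list is then sent as `messages` in the next request.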

When to Use Doubao (Decision Table) {#when-to-use}

| Workload | Start with | Escalate to | Avoid |
|---|---|---|---|
| Classification, extraction | doubao-seed-1.6-flash | doubao-seed-1.6-lite if structure fails | Doubao 1.5 (context cap) |
| Customer support draft | doubao-seed-1.6-lite | doubao-seed-2.0-lite | Pro for first-pass replies |
| RAG with 256K context | doubao-seed-2.0-lite | doubao-seed-2.0-pro for hard queries | 32K-only models |
| Agentic coding agent | doubao-seed-2.0-code | doubao-seed-2.0-pro for planning | Seed 1.6 for tool-heavy chains |
| Vision-heavy multimodal | doubao-seed-2.0-pro | n/a | Doubao 1.5 non-Vision |
| Long-document review | doubao-seed-2.0-pro (256K) | n/a | 32K-only models |
| Text-to-image | seedream-5.0 | seedream-4.5 for cost | Older Seedream 3.0 |
| Short video generation | seedance-2.0-fast | seedance-2.0 for quality | 1.0 series |

Decision heuristic: start at the cheapest tier that meets your accuracy bar, then escalate per-call only when a failing step justifies the cost. A 90% Flash + 10% Pro router beats running everything on Pro by ~84% on monthly cost.
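The chat rows of the decision table translate naturally into a lookup that a router can consult per call. This is a sketch under stated assumptions: the workload keys are hypothetical labels invented for illustration, and `pick_model` simply returns the escalation tier once a first attempt has failed.

```python
# (start_with, escalate_to) per workload, taken from the decision table.
# The workload keys are hypothetical labels for this sketch.
DECISION_TABLE = {
    "classification": ("doubao-seed-1.6-flash", "doubao-seed-1.6-lite"),
    "support_draft": ("doubao-seed-1.6-lite", "doubao-seed-2.0-lite"),
    "rag_256k": ("doubao-seed-2.0-lite", "doubao-seed-2.0-pro"),
    "agentic_coding": ("doubao-seed-2.0-code", "doubao-seed-2.0-pro"),
    "vision": ("doubao-seed-2.0-pro", None),
    "long_document": ("doubao-seed-2.0-pro", None),
}


def pick_model(workload: str, previous_attempt_failed: bool = False) -> str:
    """Return the starting model, or the escalation tier after a failure.

    Workloads without an escalation tier stay on their starting model.
    """
    start, escalate = DECISION_TABLE[workload]
    if previous_attempt_failed and escalate is not None:
        return escalate
    return start


print(pick_model("rag_256k"))
print(pick_model("rag_256k", previous_attempt_failed=True))
```

Keeping the table as data makes it easy to adjust tiers when prices or model quality change, without touching routing code.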

FAQ {#faq}

What is the cheapest Doubao chat model in 2026?

doubao-seed-1.6-flash at $0.022 input / $0.219 output per million tokens. It supports vision, tools, JSON, streaming, and thinking mode, with a 256K context window. It is the realistic floor for new Doubao builds — older Doubao 1.5 Lite is cheaper on output but capped at 32K context.

Which Doubao model is best for coding?

doubao-seed-2.0-code at $0.467 input / $2.34 output per million tokens, 256K context. For agentic coding loops that mix planning and execution, route planning to doubao-seed-2.0-pro and execution to Seed 2.0 Code or Seed 1.6 Flash.

Do I need a Chinese phone number to use Doubao?

You need one to register on Volcano Ark directly. You do not need one to access Doubao through an OpenAI-compatible aggregator — those route to ByteDance upstream without exposing the verification gate to the developer.

Is Doubao OpenAI-compatible?

Yes, both directly (ark.cn-beijing.volces.com exposes an OpenAI-style endpoint) and via aggregators like TokenMix.ai (api.tokenmix.ai/v1). You can use the standard OpenAI Python SDK by changing only base_url and model.

Does Doubao Seed 2.0 support tool calls and JSON mode?

All Seed 2.0 and Seed 1.6 chat models support tool calls (function calling), JSON mode output, structured output, and streaming. Doubao 1.5 supports tools but not reasoning/thinking mode.

How does Doubao pricing compare to DeepSeek and Qwen?

DeepSeek V4-Flash ($0.14 input / $0.28 output per MTok) is roughly 73% cheaper input and 89% cheaper output than Doubao Seed 2.0 Pro. Doubao’s advantage is multimodal vision + agentic-coding positioning. Qwen offers more multilingual tiers. A multi-model setup with all three through one API key is typically cheaper than committing to any single family.
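The percentages above follow directly from the listed rates. A quick check, using only prices quoted in this guide (`percent_cheaper` is an illustrative helper name):

```python
def percent_cheaper(price: float, baseline: float) -> int:
    """How much cheaper `price` is than `baseline`, as a whole percent."""
    return round((1 - price / baseline) * 100)


# DeepSeek V4-Flash vs Doubao Seed 2.0 Pro (USD per 1M tokens).
input_saving = percent_cheaper(0.14, 0.514)
output_saving = percent_cheaper(0.28, 2.57)
print(f"input: {input_saving}% cheaper, output: {output_saving}% cheaper")
```

The same helper works for comparing Doubao tiers against each other, for example Seed 1.6 Flash output ($0.219) against Seed 2.0 Pro output ($2.57).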

Can I use Seedream image and Seedance video models the same way?

Yes — both are listed in the registry and routable through OpenAI-compatible aggregators. Pricing is per generation rather than per token, so check live rates before integrating high-volume image or video pipelines.

Author: TokenMix Research Lab | Last Updated: 2026-05-14 | Data Sources: TokenMix Model Registry, Volcano Engine Doubao, Volcano Pricing Docs | Original article: tokenmix.ai/blog/doubao-api-getting-started