All models
Save 91%
574 models across multiple vendors and groups
Input: $0.4500 / 1M Tokens
Output: $2.7000 / 1M Tokens
GPT Image 2 is our state-of-the-art image generation model, enabling fast, high-quality image generation and editing. It supports flexible image resolutions and high-fidelity image inputs.
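All per-token prices on this page are quoted per 1 million tokens. A minimal sketch of how a request's cost follows from those rates, using the GPT Image 2 prices above; the token counts are made up purely for illustration:

```python
# Worked example: cost of a single request at the rates listed above
# ($0.45 per 1M input tokens, $2.70 per 1M output tokens).
# Token counts are illustrative, not measured.

INPUT_PRICE_PER_M = 0.45    # USD per 1,000,000 input tokens
OUTPUT_PRICE_PER_M = 2.70   # USD per 1,000,000 output tokens

input_tokens = 120_000
output_tokens = 8_000

cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
print(f"Estimated cost: ${cost:.4f}")   # -> Estimated cost: $0.0756
```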
Input: $0.7500 / 1M Tokens
Output: $3.7500 / 1M Tokens
Claude Opus 4.7 supports a 1-million-token context window, a maximum output of 128,000 tokens, adaptive reasoning, and the same tools and platform capabilities as Claude Opus 4.6.
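A minimal sketch of calling this model through the Anthropic Python SDK. The model id "claude-opus-4-7" is an assumption (the exact id is not given on this page); max_tokens is set to the 128,000-token output ceiling mentioned above.

```python
# Minimal sketch using the Anthropic Python SDK (pip install anthropic).
# The model id is assumed; check the provider's model list before use.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-7",   # assumed id for Claude Opus 4.7
    max_tokens=128_000,        # output ceiling stated above
    messages=[
        {"role": "user", "content": "Summarize this repository's architecture."},
    ],
)
print(message.content[0].text)
```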
Input: $25.5000 / 1M Tokens
Output: $153.0000 / 1M Tokens
GPT-5.5pro is now available for Responses API requests, including through the Batch API, and can carry out multiple model turns internally before responding to a request. Additional advanced API features will be rolled out in the future. Because GPT-5.5pro is designed to tackle complex tasks, some requests may take several minutes to process; to prevent timeouts, we recommend using background mode.
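Since background mode is the recommended way to avoid timeouts on long-running requests, here is a minimal sketch with the OpenAI Python SDK; the model id "gpt-5.5-pro" is an assumption based on this listing.

```python
# Minimal sketch of a background-mode Responses API call (pip install openai).
# The model id is assumed; check the provider's model list before use.
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.responses.create(
    model="gpt-5.5-pro",   # assumed id for GPT-5.5pro
    input="Draft a migration plan for moving our monolith to services.",
    background=True,       # run asynchronously to avoid request timeouts
)

# Poll until the background run finishes, then read the output.
while resp.status in ("queued", "in_progress"):
    time.sleep(5)
    resp = client.responses.retrieve(resp.id)
print(resp.output_text)
```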
Input: $0.1500 / 1M Tokens
Output: $0.3000 / 1M Tokens
DeepSeek-V4-Flash is a lightweight variant of the DeepSeek V4 family, emphasizing cost-effectiveness and high throughput. It is well suited to general-purpose dialogue and foundational text tasks, and supports million-token contexts with efficient inference.
Input: $1.8000 / 1M Tokens
Output: $3.6000 / 1M Tokens
DeepSeek-V4-Pro is a high-performance open-source large model released by DeepSeek, featuring state-of-the-art reasoning and agent capabilities. It supports ultra-long context windows, is optimized for Huawei Ascend chips, and offers exceptional value for money.
Input: $0.2100 / 1M Tokens
Output: $4.2000 / 1M Tokens
The Gemini-3.1-Flash-TTS-Preview text-to-speech audio model has been optimized to deliver cost-effective, low-latency, and controllable speech synthesis.
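A minimal speech-synthesis sketch through the google-genai Python SDK. The model id "gemini-3.1-flash-tts-preview" and the "Kore" voice name are assumptions, and the exact config fields may differ for this preview model.

```python
# Minimal TTS sketch with the google-genai SDK (pip install google-genai).
# Model id and voice name are assumed; consult the provider's TTS docs.
from google import genai
from google.genai import types

client = genai.Client()  # reads the Gemini API key from the environment

response = client.models.generate_content(
    model="gemini-3.1-flash-tts-preview",   # assumed id for this preview model
    contents="Read this in a calm, friendly tone: Your build has finished.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
            )
        ),
    ),
)

# The synthesized audio comes back as raw bytes in the first response part.
pcm_bytes = response.candidates[0].content.parts[0].inline_data.data
with open("out.pcm", "wb") as f:
    f.write(pcm_bytes)
```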
Input: $0.4500 / 1M Tokens
Output: $2.7000 / 1M Tokens
GPT-5.5 is OpenAI’s flagship large language model, released on April 24, 2026. Positioned as a new generation of AI tailored for real-world work and autonomous agents, it breaks new ground in autonomously planning and executing multi-step, complex tasks. It excels in programming, computer operation, and scientific research and analysis, and delivers higher efficiency with significantly reduced token consumption.
Input: $0.1800 / 1M Tokens
Output: $0.5400 / 1M Tokens
Grok 4.20 is X.AI’s latest flagship model, delivering industry-leading speed and advanced tool-call capabilities. It combines the lowest hallucination rate in the market with rigorous adherence to factual accuracy, ensuring consistently precise and trustworthy responses.
Input: $0.0675 / 1M Tokens
Output: $0.4050 / 1M Tokens
GPT-5.4mini integrates the strengths of GPT-5.4 into a faster, more efficient model designed for high-volume workloads.
Input: $0.0180 / 1M Tokens
Output: $0.1125 / 1M Tokens
GPT-5.4 Nano is the lightest and fastest variant of GPT-5.4, specifically designed for tasks with extremely high demands on speed and cost efficiency.
$0.0300 / call
Grok video model
Input: $0.4500 / 1M Tokens
Output: $2.7000 / 1M Tokens
The Qwen3.6 27B native vision-language Dense model builds upon the 3.5-27B version, with key improvements in agentic coding capabilities and enhanced STEM reasoning. In the vision modality, it delivers significant advancements in spatial intelligence, object localization, and detection, while video understanding, document OCR, and visual agent capabilities continue to improve steadily.
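Vision inputs such as object localization are typically sent through an OpenAI-compatible chat endpoint. A minimal sketch; the base URL, the model id "qwen3.6-vl-27b", and the image URL are all hypothetical.

```python
# Minimal sketch of a vision request (object-localization style prompt) via an
# OpenAI-compatible chat endpoint. Base URL, model id, and image are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",   # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="qwen3.6-vl-27b",                  # assumed id for the 27B VL model
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/shelf.jpg"}},
            {"type": "text",
             "text": "Locate every bottle and return bounding boxes as JSON."},
        ],
    }],
)
print(resp.choices[0].message.content)
```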
Input: $0.2700 / 1M Tokens
Output: $1.6200 / 1M Tokens
The Qwen3.6 35B-A3B is a native vision-language model built on a hybrid architecture that integrates linear attention mechanisms with a sparse mixture-of-experts framework, delivering superior inference efficiency. Compared with the 3.5-35B-A3B, this model demonstrates markedly improved agentic coding capabilities, mathematical and code reasoning skills, spatial intelligence, as well as object localization and object detection performance.
Input: $0.3150 / 1M Tokens
Output: $1.2600 / 1M Tokens
MiniMax-M2.7 matches or surpasses state-of-the-art (SOTA) performance in productivity scenarios such as programming, tool invocation, search, and office work.
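Tool invocation for models like this one is usually exposed through OpenAI-style function calling. A minimal sketch; the base URL, the model id "minimax-m2.7", and the get_weather tool are all hypothetical.

```python
# Minimal sketch of OpenAI-style tool calling. Base URL, model id, and the
# "get_weather" tool are hypothetical; adapt them to the provider's docs.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="minimax-m2.7",   # assumed id for MiniMax-M2.7
    messages=[{"role": "user", "content": "Do I need an umbrella in Shanghai?"}],
    tools=tools,
)

# If the model decided to call the tool, inspect the arguments it produced.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```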
Input: $0.6720 / 1M Tokens
Output: $3.3600 / 1M Tokens
Doubao-Seed-2.0-Code is optimized for enterprise-level programming needs. Building on Seed 2.0’s outstanding agent and VLM capabilities, it significantly enhances code-generation performance. It excels in front-end development and is specially tuned to meet common multi-language coding requirements in enterprises, making it ideal for integration with a wide range of AI-powered coding tools.
Input: $0.1260 / 1M Tokens
Output: $0.7560 / 1M Tokens
Doubao-Seed-2.0-lite is a balanced model designed for high-frequency enterprise use cases, striking an optimal trade-off between performance and cost. It outperforms its predecessor, Doubao-Seed-1.8, in overall capabilities. The model excels at production-oriented tasks such as unstructured information processing, content generation, search and recommendation, and data analytics. It supports long-context understanding, multi-source information fusion, multi-step instruction execution, and high-fidelity structured output. While maintaining stable performance, it also significantly reduces costs.
Input: $0.0420 / 1M Tokens
Output: $0.4200 / 1M Tokens
Doubao-Seed-2.0-mini is designed for low-latency, high-concurrency, and cost-sensitive scenarios, delivering exceptional model inference speed. Its performance is comparable to that of Doubao-Seed-1.6. It supports a 256K token context window, four levels of reasoning length, and multimodal understanding, making it ideal for lightweight tasks where cost efficiency and speed are paramount.