
All models


574 models across multiple vendors and groups
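All token prices below are quoted in USD per one million tokens. As a minimal sketch of how to read them (Python; the token counts are made up, the rates are GPT-5.5's from this page, and it is an assumption that the "Save X%" badge is measured against the vendor's undiscounted list price):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Estimate the cost of one request; rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

def implied_list_price(discounted_rate: float, save_pct: float) -> float:
    """Back out the undiscounted rate from a 'Save X%' badge
    (assumes the badge is relative to the vendor's list price)."""
    return discounted_rate / (1 - save_pct / 100)

# gpt-5.5 rates from this page: $0.45 in / $2.70 out per 1M tokens, "Save 91%"
cost = request_cost(10_000, 2_000, 0.45, 2.70)   # ~ $0.0099 per request
list_in = implied_list_price(0.45, 91)           # ~ $5.00 per 1M input tokens
```

A request with 10k input and 2k output tokens on GPT-5.5 thus costs roughly a penny at these rates.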

Save 91%
OpenAI
gpt-image-2
OpenAI

Input: $0.4500 / 1M Tokens

Output: $2.7000 / 1M Tokens

GPT Image 2 is OpenAI's state-of-the-art image generation model, enabling fast, high-quality image generation and editing. It supports flexible image resolutions and high-fidelity image inputs.

Volume billing · Painting · DALL-E 3 format
Save 85%
Claude
claude-opus-4-7
Anthropic

Input: $0.7500 / 1M Tokens

Output: $3.7500 / 1M Tokens

Claude Opus 4.7 supports a 1-million-token context window, a maximum output of 128,000 tokens, adaptive reasoning, and the same tools and platform capabilities as Claude Opus 4.6.

Volume billing · Dialogue · Image recognition · Tool
Save 15%
OpenAI
gpt-5.5-pro
OpenAI

Input: $25.5000 / 1M Tokens

Output: $153.0000 / 1M Tokens

GPT-5.5 Pro is now available for Responses API requests, including through the Batch API, and can carry out multi-turn model interactions before returning a response. Additional advanced API features will roll out in the future. Because GPT-5.5 Pro is designed to tackle complex tasks, some requests may take several minutes to process; to prevent timeouts, we recommend using background mode.

Volume billing · Dialogue · Tool
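The background-mode recommendation for long-running GPT-5.5 Pro requests can be sketched as a request body for an OpenAI-compatible Responses endpoint. The `background` flag mirrors OpenAI's public Responses API; whether this aggregator forwards it unchanged is an assumption, and the prompt text is just a placeholder:

```python
import json

# Sketch of a background-mode Responses API request body. With background=True
# the request is accepted immediately and processed asynchronously, so a
# multi-minute task cannot hit the usual HTTP timeout.
payload = {
    "model": "gpt-5.5-pro",
    "input": "Summarize the trade-offs between these two database designs...",
    "background": True,  # run asynchronously; poll the response id for status
}
body = json.dumps(payload)
# POST `body` to the /v1/responses endpoint, then poll the returned response
# object until its status is "completed" and read its output.
```

The client then polls (or listens for a webhook, where supported) instead of holding one connection open for the full run.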
Save 85%
DeepSeek
deepseek-v4-flash
DeepSeek

Input: $0.1500 / 1M Tokens

Output: $0.3000 / 1M Tokens

DeepSeek-V4-Flash is a lightweight variant of the DeepSeek V4 family, emphasizing cost-effectiveness and high throughput. It is well-suited for general-purpose dialogue and foundational text tasks, while also supporting million-token-long contexts and efficient inference.

Volume billing · Dialogue · Tool
Save 85%
DeepSeek
deepseek-v4-pro
DeepSeek

Input: $1.8000 / 1M Tokens

Output: $3.6000 / 1M Tokens

DeepSeek-V4-Pro is a high-performance open-source large model released by DeepSeek, featuring state-of-the-art reasoning and agent capabilities. It supports ultra-long context windows, is optimized for Huawei's Ascend chips, and offers exceptional value for money.

Volume billing · Dialogue · Thinking · Tool
Save 79%
Gemini
gemini-3.1-flash-tts-preview
Google

Input: $0.2100 / 1M Tokens

Output: $4.2000 / 1M Tokens

The Gemini-3.1-Flash-TTS-Preview text-to-speech audio model has been optimized to deliver cost-effective, low-latency, and controllable speech synthesis.

Volume billing · Audio
Save 91%
OpenAI
gpt-5.5
OpenAI

Input: $0.4500 / 1M Tokens

Output: $2.7000 / 1M Tokens

GPT-5.5 is OpenAI’s flagship large language model, released on April 24, 2026. Positioned as a new generation of AI tailored for real-world work and autonomous agents, its key breakthrough lies in autonomous planning and execution of multi-step, complex tasks. It excels in programming, computer operations, scientific research and analysis, and delivers higher efficiency with significantly reduced token consumption.

Volume billing · Dialogue · Image recognition
Save 91%
Grok
grok-4-20-non-reasoning
Grok (xAI)

Input: $0.1800 / 1M Tokens

Output: $0.5400 / 1M Tokens

Grok 4.20 is xAI's latest flagship model, delivering industry-leading speed and advanced tool-call capabilities. It combines the lowest hallucination rate in the market with rigorous adherence to factual accuracy, ensuring consistently precise and trustworthy responses.

Volume billing · Dialogue · Tool
Save 91%
Grok
grok-4-20-reasoning
Grok (xAI)

Input: $0.1800 / 1M Tokens

Output: $0.5400 / 1M Tokens

Grok 4.20 is xAI's latest flagship model, delivering industry-leading speed and advanced tool-call capabilities. It combines the lowest hallucination rate in the market with rigorous adherence to factual accuracy, ensuring consistently precise and trustworthy responses.

Volume billing · Dialogue · Tool
Save 91%
OpenAI
gpt-5.4-mini
OpenAI

Input: $0.0675 / 1M Tokens

Output: $0.4050 / 1M Tokens

GPT-5.4 Mini integrates the strengths of GPT-5.4 into a faster, more efficient model designed specifically for high-volume workloads.

Volume billing · Dialogue · Image recognition · Thinking · +1 more
Save 91%
OpenAI
gpt-5.4-mini-2026-03-17
OpenAI

Input: $0.0675 / 1M Tokens

Output: $0.4050 / 1M Tokens

GPT-5.4 Mini integrates the strengths of GPT-5.4 into a faster, more efficient model designed specifically for high-volume workloads.

Volume billing · Dialogue · Image recognition · Thinking · +1 more
Save 91%
OpenAI
gpt-5.4-nano
OpenAI

Input: $0.0180 / 1M Tokens

Output: $0.1125 / 1M Tokens

GPT-5.4 Nano is the lightest and fastest variant of GPT-5.4, specifically designed for tasks with extremely high demands on speed and cost efficiency.

Volume billing · Dialogue · Image recognition · Thinking · +1 more
Save 91%
OpenAI
gpt-5.4-nano-2026-03-17
OpenAI

Input: $0.0180 / 1M Tokens

Output: $0.1125 / 1M Tokens

GPT-5.4 Nano is the lightest and fastest variant of GPT-5.4, specifically designed for tasks with extremely high demands on speed and cost efficiency.

Volume billing · Dialogue · Image recognition · Thinking · +1 more
Save 85%
Grok
grok-videos
Grok (xAI)

$0.0300 / call

Grok's video generation model.

Per-call billing · Audio and video · Asynchronous · Generate videos with multiple image references
Save 85%
Qwen
qwen3.6-27b
Bailian (Alibaba Cloud)

Input: $0.4500 / 1M Tokens

Output: $2.7000 / 1M Tokens

The Qwen3.6 27B native vision-language Dense model builds upon the 3.5-27B version, with key improvements in agentic coding capabilities and enhanced STEM reasoning. In the vision modality, it delivers significant advancements in spatial intelligence, object localization, and detection, while video understanding, document OCR, and visual agent capabilities continue to improve steadily.

Volume billing · Dialogue · Image recognition
Save 85%
Qwen
qwen3.6-35b-a3b
Bailian (Alibaba Cloud)

Input: $0.2700 / 1M Tokens

Output: $1.6200 / 1M Tokens

The Qwen3.6 35B-A3B is a native vision-language model built on a hybrid architecture that integrates linear attention mechanisms with a sparse mixture-of-experts framework, delivering superior inference efficiency. Compared with the 3.5-35B-A3B, this model demonstrates markedly improved agentic coding capabilities, mathematical and code reasoning skills, spatial intelligence, as well as object localization and object detection performance.

Volume billing · Image recognition · Dialogue
Save 85%
Minimax
MiniMax-M2.7
Minimax

Input: $0.3150 / 1M Tokens

Output: $1.2600 / 1M Tokens

MiniMax-M2.7 achieves or surpasses state-of-the-art (SOTA) performance in productivity scenarios such as programming, tool invocation, search, and office work.

Volume billing · Dialogue · Tool
Save 79%
Doubao
doubao-seed-2-0-code-preview-260215
Doubao

Input: $0.6720 / 1M Tokens

Output: $3.3600 / 1M Tokens

Doubao-Seed-2.0-Code is optimized for enterprise-level programming needs. Building on Seed 2.0’s outstanding agent and VLM capabilities, it significantly enhances code-generation performance. It excels in front-end development and is specially tuned to meet common multi-language coding requirements in enterprises, making it ideal for integration with a wide range of AI-powered coding tools.

Volume billing · Dialogue · Thinking · Image recognition
Save 79%
Doubao
doubao-seed-2-0-lite-260215
Doubao

Input: $0.1260 / 1M Tokens

Output: $0.7560 / 1M Tokens

Doubao-Seed-2.0-lite is a balanced model designed for high-frequency enterprise use cases, striking an optimal trade-off between performance and cost. It outperforms its predecessor, Doubao-Seed-1.8, in overall capabilities. The model excels at production-oriented tasks such as unstructured information processing, content generation, search and recommendation, and data analytics. It supports long-context understanding, multi-source information fusion, multi-step instruction execution, and high-fidelity structured output. While maintaining stable performance, it also significantly reduces costs.

Volume billing · Dialogue · Image recognition
Save 79%
Doubao
doubao-seed-2-0-mini-260215
Doubao

Input: $0.0420 / 1M Tokens

Output: $0.4200 / 1M Tokens

Doubao-Seed-2.0-mini is designed for low-latency, high-concurrency, and cost-sensitive scenarios, delivering exceptional model inference speed. Its performance is comparable to that of Doubao-Seed-1.6. It supports a 256K token context window, four levels of reasoning length, and multimodal understanding, making it ideal for lightweight tasks where cost efficiency and speed are paramount.

Volume billing · Dialogue · Image recognition
Showing 1–20 of 574