All models
Save 91%
574 models across multiple vendors and groups
Input: $0.4500 / 1M Tokens
Output: $2.7000 / 1M Tokens
GPT Image 2 is our state-of-the-art image generation model, enabling fast, high-quality image generation and editing. It supports flexible image resolutions and high-fidelity image inputs.
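All per-token prices on this page are quoted per 1 million tokens. A minimal sketch of how a request's cost follows from those rates, using the GPT Image 2 prices above; the token counts are made up purely for illustration:

```python
# Worked example: cost of a single request at the rates listed above
# ($0.45 per 1M input tokens, $2.70 per 1M output tokens).
# Token counts are illustrative, not measured.

INPUT_PRICE_PER_M = 0.45    # USD per 1,000,000 input tokens
OUTPUT_PRICE_PER_M = 2.70   # USD per 1,000,000 output tokens

input_tokens = 120_000
output_tokens = 8_000

cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
     + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
print(f"Estimated cost: ${cost:.4f}")   # -> Estimated cost: $0.0756
```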
Input: $0.7500 / 1M Tokens
Output: $3.7500 / 1M Tokens
Claude Opus 4.7 supports a 1-million-token context window, a maximum output of 128,000 tokens, adaptive reasoning, and the same tools and platform capabilities as Claude Opus 4.6.
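A minimal sketch of calling this model through the Anthropic Python SDK. The model id "claude-opus-4-7" is an assumption (the exact id is not given on this page); max_tokens is set to the 128,000-token output ceiling mentioned above.

```python
# Minimal sketch using the Anthropic Python SDK (pip install anthropic).
# The model id is assumed; check the provider's model list before use.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-opus-4-7",   # assumed id for Claude Opus 4.7
    max_tokens=128_000,        # output ceiling stated above
    messages=[
        {"role": "user", "content": "Summarize this repository's architecture."},
    ],
)
print(message.content[0].text)
```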
Input: $25.5000 / 1M Tokens
Output: $153.0000 / 1M Tokens
GPT-5.5pro is now available for Responses API requests, including through the Batch API, and can carry out multiple model turns internally before responding to a request. Additional advanced API features will be rolled out in the future. Because GPT-5.5pro is designed to tackle complex tasks, some requests may take several minutes to process; to prevent timeouts, we recommend using background mode.
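Since background mode is the recommended way to avoid timeouts on long-running requests, here is a minimal sketch with the OpenAI Python SDK; the model id "gpt-5.5-pro" is an assumption based on this listing.

```python
# Minimal sketch of a background-mode Responses API call (pip install openai).
# The model id is assumed; check the provider's model list before use.
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.responses.create(
    model="gpt-5.5-pro",   # assumed id for GPT-5.5pro
    input="Draft a migration plan for moving our monolith to services.",
    background=True,       # run asynchronously to avoid request timeouts
)

# Poll until the background run finishes, then read the output.
while resp.status in ("queued", "in_progress"):
    time.sleep(5)
    resp = client.responses.retrieve(resp.id)
print(resp.output_text)
```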
Input: $0.1500 / 1M Tokens
Output: $0.3000 / 1M Tokens
DeepSeek-V4-Flash is a lightweight variant of the DeepSeek V4 family, emphasizing cost-effectiveness and high throughput. It is well suited to general-purpose dialogue and foundational text tasks, and supports million-token contexts with efficient inference.
Input: $1.8000 / 1M Tokens
Output: $3.6000 / 1M Tokens
DeepSeek-V4-Pro is a high-performance open-source large model released by DeepSeek, featuring state-of-the-art reasoning and agent capabilities. It supports ultra-long context windows, is optimized for Huawei Ascend chips, and offers exceptional value for money.
Input: $0.2100 / 1M Tokens
Output: $4.2000 / 1M Tokens
The Gemini-3.1-Flash-TTS-Preview text-to-speech audio model has been optimized to deliver cost-effective, low-latency, and controllable speech synthesis.
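A minimal speech-synthesis sketch through the google-genai Python SDK. The model id "gemini-3.1-flash-tts-preview" and the "Kore" voice name are assumptions, and the exact config fields may differ for this preview model.

```python
# Minimal TTS sketch with the google-genai SDK (pip install google-genai).
# Model id and voice name are assumed; consult the provider's TTS docs.
from google import genai
from google.genai import types

client = genai.Client()  # reads the Gemini API key from the environment

response = client.models.generate_content(
    model="gemini-3.1-flash-tts-preview",   # assumed id for this preview model
    contents="Read this in a calm, friendly tone: Your build has finished.",
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
            )
        ),
    ),
)

# The synthesized audio comes back as raw bytes in the first response part.
pcm_bytes = response.candidates[0].content.parts[0].inline_data.data
with open("out.pcm", "wb") as f:
    f.write(pcm_bytes)
```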
Input: $0.4500 / 1M Tokens
Output: $2.7000 / 1M Tokens
GPT-5.5 is OpenAI’s flagship large language model, released on April 24, 2026. Positioned as a new generation of AI tailored for real-world work and autonomous agents, it breaks new ground in autonomously planning and executing multi-step, complex tasks. It excels in programming, computer operation, and scientific research and analysis, and delivers higher efficiency with significantly reduced token consumption.
Input: $0.1800 / 1M Tokens
Output: $0.5400 / 1M Tokens
Grok 4.20 is X.AI’s latest flagship model, delivering industry-leading speed and advanced tool-call capabilities. It combines the lowest hallucination rate in the market with rigorous adherence to factual accuracy, ensuring consistently precise and trustworthy responses.
Input: $0.0675 / 1M Tokens
Output: $0.4050 / 1M Tokens
GPT-5.4mini integrates the strengths of GPT-5.4 into a faster, more efficient model designed for high-volume workloads.
Input: $0.0180 / 1M Tokens
Output: $0.1125 / 1M Tokens
GPT-5.4 Nano is the lightest and fastest variant of GPT-5.4, specifically designed for tasks with extremely high demands on speed and cost efficiency.
$0.0300 / call
Grok video model
Input: $0.4500 / 1M Tokens
Output: $2.7000 / 1M Tokens
The Qwen3.6 27B native vision-language Dense model builds upon the 3.5-27B version, with key improvements in agentic coding capabilities and enhanced STEM reasoning. In the vision modality, it delivers significant advancements in spatial intelligence, object localization, and detection, while video understanding, document OCR, and visual agent capabilities continue to improve steadily.
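Vision inputs such as object localization are typically sent through an OpenAI-compatible chat endpoint. A minimal sketch; the base URL, the model id "qwen3.6-vl-27b", and the image URL are all hypothetical.

```python
# Minimal sketch of a vision request (object-localization style prompt) via an
# OpenAI-compatible chat endpoint. Base URL, model id, and image are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",   # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="qwen3.6-vl-27b",                  # assumed id for the 27B VL model
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",
             "image_url": {"url": "https://example.com/shelf.jpg"}},
            {"type": "text",
             "text": "Locate every bottle and return bounding boxes as JSON."},
        ],
    }],
)
print(resp.choices[0].message.content)
```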
Input: $0.2700 / 1M Tokens
Output: $1.6200 / 1M Tokens
The Qwen3.6 35B-A3B is a native vision-language model built on a hybrid architecture that integrates linear attention mechanisms with a sparse mixture-of-experts framework, delivering superior inference efficiency. Compared with the 3.5-35B-A3B, this model demonstrates markedly improved agentic coding capabilities, mathematical and code reasoning skills, spatial intelligence, as well as object localization and object detection performance.
Input: $0.3150 / 1M Tokens
Output: $1.2600 / 1M Tokens
MiniMax-M2.7 matches or surpasses state-of-the-art (SOTA) performance in productivity scenarios such as programming, tool invocation, search, and office work.
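Tool invocation for models like this one is usually exposed through OpenAI-style function calling. A minimal sketch; the base URL, the model id "minimax-m2.7", and the get_weather tool are all hypothetical.

```python
# Minimal sketch of OpenAI-style tool calling. Base URL, model id, and the
# "get_weather" tool are hypothetical; adapt them to the provider's docs.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_API_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="minimax-m2.7",   # assumed id for MiniMax-M2.7
    messages=[{"role": "user", "content": "Do I need an umbrella in Shanghai?"}],
    tools=tools,
)

# If the model decided to call the tool, inspect the arguments it produced.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```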
Input: $0.6720 / 1M Tokens
Output: $3.3600 / 1M Tokens
Doubao-Seed-2.0-Code is optimized for enterprise-level programming needs. Building on Seed 2.0’s outstanding agent and VLM capabilities, it significantly enhances code-generation performance. It excels in front-end development and is specially tuned to meet common multi-language coding requirements in enterprises, making it ideal for integration with a wide range of AI-powered coding tools.
Input: $0.1260 / 1M Tokens
Output: $0.7560 / 1M Tokens
Doubao-Seed-2.0-lite is a balanced model designed for high-frequency enterprise use cases, striking an optimal trade-off between performance and cost. It outperforms its predecessor, Doubao-Seed-1.8, in overall capabilities. The model excels at production-oriented tasks such as unstructured information processing, content generation, search and recommendation, and data analytics. It supports long-context understanding, multi-source information fusion, multi-step instruction execution, and high-fidelity structured output. While maintaining stable performance, it also significantly reduces costs.
Input: $0.0420 / 1M Tokens
Output: $0.4200 / 1M Tokens
Doubao-Seed-2.0-mini is designed for low-latency, high-concurrency, and cost-sensitive scenarios, delivering exceptional model inference speed. Its performance is comparable to that of Doubao-Seed-1.6. It supports a 256K token context window, four levels of reasoning length, and multimodal understanding, making it ideal for lightweight tasks where cost efficiency and speed are paramount.