Create Chat Completion (qwen-mt-turbo)
Given a prompt, the model returns one or more predicted completions and can also provide the probabilities of alternative tokens at each position. This interface is primarily designed for conversing with the model and handling natural language tasks such as machine translation.
Official Documentation: Aliyun Model Studio - Machine Translation Model
Endpoint Information
- Protocol: HTTP/HTTPS
- Method: POST
- Path: https://api.codingplanx.ai/v1/chat/completions
Request Headers
| Parameter | Required | Type | Example | Description |
|---|---|---|---|---|
| Content-Type | Yes | string | application/json | Data format |
| Accept | Yes | string | application/json | Expected response format |
| Authorization | Yes | string | Bearer {{YOUR_API_KEY}} | API authentication credential |
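For reference, a minimal sketch of assembling these headers in Python; the requests library and the API key placeholder are assumptions for illustration, not part of the interface:

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder; substitute your actual key

# Headers as described in the table above; note the space after "Bearer".
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": f"Bearer {API_KEY}",
}
```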
Request Body
The request body format is application/json.
| Parameter | Required | Type | Description |
|---|---|---|---|
| model | Yes | string | The ID of the model to use. e.g., qwen-mt-turbo. |
| messages | Yes | array | A list of messages comprising the conversation so far. Includes role and content. |
| └ role | Yes | string | The role of the message sender: user, assistant, or system. |
| └ content | Yes | string | The content of the message. |
| tools | No | array | A list of tools the model may call. Currently, only functions are supported. Used to provide a list of functions for which the model can generate JSON inputs. |
| tool_choice | No | string/object | Controls which function the model calls (if any). none means no call; auto lets the model choose. Use {"type": "function", "function": {"name": "my_function"}} to force a specific call. |
| temperature | No | number | Sampling temperature between 0 and 2. Higher values (e.g., 0.8) make output more random; lower values (e.g., 0.2) make it more focused. Recommended to modify either this or top_p, but not both. |
| top_p | No | number | Nucleus sampling parameter between 0 and 1. 0.1 means only tokens comprising the top 10% probability mass are considered. Recommended to modify either this or temperature, but not both. |
| n | No | integer | How many chat completion choices to generate for each input message. Defaults to 1. |
| stream | No | boolean | Whether to enable streaming output. If true, partial message increments are sent via Server-Sent Events (SSE), followed by data: [DONE] upon completion. Defaults to false. |
| stop | No | string/array | Up to 4 sequences where the API will stop generating further tokens. |
| max_tokens | No | integer | The maximum number of tokens to generate in the completion. The total length is limited by the model's context window. |
| presence_penalty | No | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the likelihood of talking about new topics. |
| frequency_penalty | No | number | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text, decreasing the likelihood of repetition. |
| logit_bias | No | object/null | Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object mapping token IDs to bias values (-100 to 100). |
| user | No | string | A unique identifier representing your end-user, helping to monitor and detect abuse. |
| response_format | No | object | Specifies the output format. Use {"type": "json_object"} to enable JSON mode. Note: You must also instruct the model via a message to generate JSON. |
| seed | No | integer | Beta feature. Seed for deterministic sampling. Requests with the same seed and parameters should return the same result. |
Request Example (Machine Translation Scenario)
{
    "model": "qwen-mt-turbo",
    "messages": [
        {
            "role": "user",
            "content": "看完这个视频我没有笑"
        }
    ],
    "translation_options": {
        "source_lang": "auto",
        "target_lang": "English"
    }
}
(Note: translation_options is an extended parameter specific to the qwen-mt-turbo translation model.)
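Putting the headers and body together, a minimal end-to-end sketch of the request above in Python; the requests library is an assumption, and the payload mirrors the JSON example exactly:

```python
import requests

url = "https://api.codingplanx.ai/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": "Bearer YOUR_API_KEY",  # placeholder key
}
payload = {
    "model": "qwen-mt-turbo",
    "messages": [{"role": "user", "content": "看完这个视频我没有笑"}],
    # Extended parameter specific to the translation models (see note above).
    "translation_options": {"source_lang": "auto", "target_lang": "English"},
}

resp = requests.post(url, headers=headers, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```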
Response Body
The response format is application/json.
| Parameter | Type | Description |
|---|---|---|
| id | string | Unique identifier for the chat completion. |
| object | string | Object type, typically chat.completion. |
| created | integer | Unix timestamp (seconds) when the completion was created. |
| choices | array | A list of completion choices. |
| └ index | integer | The index of the choice in the list. |
| └ message | object | The message object generated by the model. Includes role and content. |
| └ finish_reason | string | The reason the model stopped generating (e.g., stop for natural completion, length for reaching the token limit). |
| usage | object | Token usage statistics for the request. |
| └ prompt_tokens | integer | Number of tokens in the prompt. |
| └ completion_tokens | integer | Number of tokens in the generated content. |
| └ total_tokens | integer | Total tokens consumed. |
Response Example (200 OK)
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I didn't laugh after watching this video."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 12,
        "total_tokens": 21
    }
}
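Continuing from the request sketch above, a short example of extracting the translated text and token usage; the field names are exactly those documented in the table, and resp is assumed to be the response object from the earlier call:

```python
data = resp.json()  # parsed response body from the request above

# The translated text lives in the first choice's message.
translation = data["choices"][0]["message"]["content"]
finish_reason = data["choices"][0]["finish_reason"]

# Token accounting for billing and monitoring.
usage = data["usage"]
print(translation)            # "I didn't laugh after watching this video."
print(finish_reason)          # "stop"
print(usage["total_tokens"])  # 21
```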
FAQs
1. What are the primary use cases for the qwen-mt-turbo model?
qwen-mt-turbo is specifically optimized for Machine Translation (MT). While it utilizes the standard Chat Completions interface, it delivers high-quality translation results when provided with source text and configured source/target languages.
2. How can I implement a "typewriter" effect (Streaming Output)?
Set the stream parameter to true in the request body. The interface will then push data chunks via the Server-Sent Events (SSE) protocol instead of returning a single JSON response. A data: [DONE] message indicates the end of the stream.
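A hedged sketch of consuming the stream with the requests library, reusing the url and headers from the earlier example; it assumes the chunks follow the OpenAI-compatible format where incremental text arrives under choices[0].delta.content:

```python
import json
import requests

payload = {
    "model": "qwen-mt-turbo",
    "messages": [{"role": "user", "content": "看完这个视频我没有笑"}],
    "stream": True,  # enables Server-Sent Events
}
with requests.post(url, headers=headers, json=payload, stream=True, timeout=60) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip blank separator lines and keep-alives
        chunk = line[len("data: "):]
        if chunk == "[DONE]":  # end-of-stream sentinel
            break
        delta = json.loads(chunk)["choices"][0].get("delta", {})
        print(delta.get("content", ""), end="", flush=True)
```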
3. Why am I receiving a 401 Unauthorized error?
Please verify the Authorization field in your headers. Ensure the format is Bearer <YOUR_API_KEY> (with a space between "Bearer" and the key), and that your API Key is active and has permissions for this model.
4. How should I configure temperature and top_p?
Both parameters control the randomness of the output. It is officially recommended to modify only one of them while keeping the other at its default. For rigorous and consistent translations or Q&A, use a lower temperature (e.g., 0.1 - 0.2); for more creative content, use a higher value (e.g., 0.8).
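For instance, a translation-oriented payload would set only temperature and leave top_p at its default; the value here is the illustrative one from the answer above:

```python
payload = {
    "model": "qwen-mt-turbo",
    "messages": [{"role": "user", "content": "看完这个视频我没有笑"}],
    "temperature": 0.2,  # low randomness for consistent translations
    # top_p deliberately omitted: adjust only one of the two parameters
}
```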
5. Is the translation_options parameter in the example mandatory?
It is an extended configuration supported by specialized translation models like qwen-mt-turbo to explicitly define source_lang and target_lang. While you can prompt the model to translate using natural language within messages, using translation_options provides clearer instructions and more stable results. It can be omitted for standard non-translation dialogues.