Create Chat Completion (qwen-mt-turbo)

Given a prompt, the model returns one or more predicted completions and can also provide the probabilities of alternative tokens at each position. This interface is primarily designed for conversing with the model and handling natural language tasks such as machine translation.

Official Documentation: Aliyun Model Studio - Machine Translation Model

Endpoint Information

  • Protocol: HTTP/HTTPS
  • Method: POST
  • Path: https://api.codingplanx.ai/v1/chat/completions

Request Headers

  • Content-Type (string, required): application/json. Data format.
  • Accept (string, required): application/json. Expected response format.
  • Authorization (string, required): Bearer {{YOUR_API_KEY}}. API authentication credential.

Request Body

The request body format is application/json.

  • model (string, required): The ID of the model to use, e.g., qwen-mt-turbo.
  • messages (array, required): A list of messages comprising the conversation so far. Each message includes role and content.
      ◦ role (string, required): The role of the message sender: user, assistant, or system.
      ◦ content (string, required): The content of the message.
  • tools (array, optional): A list of tools the model may call. Currently, only functions are supported. Used to provide a list of functions for which the model can generate JSON inputs.
  • tool_choice (object, optional): Controls which function the model calls (if any). none means no call; auto means the model chooses. Use {"type": "function", "function": {"name": "my_function"}} to force a specific call.
  • temperature (number, optional): Sampling temperature between 0 and 2. Higher values (e.g., 0.8) make output more random; lower values (e.g., 0.2) make it more focused. Recommended to modify either this or top_p, but not both.
  • top_p (number, optional): Nucleus sampling parameter. 0.1 means only tokens comprising the top 10% probability mass are considered. Recommended to modify either this or temperature, but not both.
  • n (integer, optional): How many chat completion choices to generate for each input message. Defaults to 1.
  • stream (boolean, optional): Whether to enable streaming output. If true, partial message increments are sent via Server-Sent Events (SSE), followed by data: [DONE] upon completion. Defaults to false.
  • stop (string or array, optional): Up to 4 sequences where the API will stop generating further tokens.
  • max_tokens (integer, optional): The maximum number of tokens to generate in the completion. The total length is limited by the model's context window.
  • presence_penalty (number, optional): Number between -2.0 and 2.0. Positive values penalize tokens that have already appeared in the text so far, increasing the likelihood of introducing new topics.
  • frequency_penalty (number, optional): Number between -2.0 and 2.0. Positive values penalize tokens in proportion to their existing frequency in the text, decreasing the likelihood of repetition.
  • logit_bias (object or null, optional): Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object mapping token IDs to bias values (-100 to 100).
  • user (string, optional): A unique identifier representing your end-user, helping to monitor and detect abuse.
  • response_format (object, optional): Specifies the output format. Use {"type": "json_object"} to enable JSON mode. Note: you must also instruct the model via a message to generate JSON.
  • seed (integer, optional): Beta feature. Seed for deterministic sampling. Requests with the same seed and parameters should return the same result.

Request Example (Machine Translation Scenario)

{
  "model": "qwen-mt-turbo",
  "messages": [
    {
      "role": "user",
      "content": "看完这个视频我没有笑"
    }
  ],
  "translation_options": {
    "source_lang": "auto",
    "target_lang": "English"
  }
}

(Note: translation_options is an extended parameter specific to the qwen-mt-turbo translation model.)
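As a sketch, the request above could be assembled and sent from Python using only the standard library. The helper names below are illustrative, and the API key placeholder is an assumption; only the URL, headers, and payload shape come from this document:

```python
import json
import urllib.request

API_URL = "https://api.codingplanx.ai/v1/chat/completions"

def build_translation_request(text, target_lang, source_lang="auto",
                              model="qwen-mt-turbo"):
    """Assemble the JSON payload for a translation call, as in the example."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        "translation_options": {
            "source_lang": source_lang,
            "target_lang": target_lang,
        },
    }

def send_request(api_key, payload):
    """POST the payload with the required headers (performs a network call)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, build_translation_request("看完这个视频我没有笑", "English") reproduces the request body shown above, ready to pass to send_request with your key.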

Response Body

The response format is application/json.

  • id (string): Unique identifier for the chat completion.
  • object (string): Object type, typically chat.completion.
  • created (integer): Unix timestamp (seconds) when the completion was created.
  • choices (array): A list of completion choices.
      ◦ index (integer): The index of the choice in the list.
      ◦ message (object): The message generated by the model; includes role and content.
      ◦ finish_reason (string): The reason the model stopped generating (e.g., stop for completion, length for reaching max_tokens).
  • usage (object): Token usage statistics for the request.
      ◦ prompt_tokens (integer): Number of tokens in the prompt.
      ◦ completion_tokens (integer): Number of tokens in the generated content.
      ◦ total_tokens (integer): Total tokens consumed.

Response Example (200 OK)

{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I didn't laugh after watching this video."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 12,
        "total_tokens": 21
    }
}
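A minimal sketch of consuming this response: pull out the first choice's text, the token totals, and whether the output was truncated. The helper name extract_reply is illustrative; the field names match the response body documented above:

```python
import json

# The 200 OK example from above, used here as test input.
sample = json.loads("""
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant",
                        "content": "I didn't laugh after watching this video."},
            "finish_reason": "stop"
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}
""")

def extract_reply(response):
    """Return (text, total_tokens, truncated) from a completion response."""
    choice = response["choices"][0]
    # finish_reason "length" means the model hit max_tokens mid-output.
    truncated = choice["finish_reason"] == "length"
    return (choice["message"]["content"],
            response["usage"]["total_tokens"],
            truncated)
```

Checking the truncated flag is useful in practice: when it is set, retrying with a larger max_tokens usually recovers the full translation.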

FAQs

1. What are the primary use cases for the qwen-mt-turbo model?

qwen-mt-turbo is specifically optimized for Machine Translation (MT). While it utilizes the standard Chat Completions interface, it delivers high-quality translation results when provided with source text and configured source/target languages.

2. How can I implement a "typewriter" effect (Streaming Output)?

Set the stream parameter to true in the request body. The interface will then push data chunks via the Server-Sent Events (SSE) protocol instead of returning a single JSON response. A data: [DONE] message indicates the end of the stream.
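The SSE lines can be consumed with a small parser like the sketch below. It assumes each chunk follows the OpenAI-style streaming delta shape ({"choices": [{"delta": {"content": "..."}}]}), which this document does not spell out; only the data: prefix and the [DONE] sentinel are stated above:

```python
import json

def iter_sse_content(lines):
    """Yield content fragments from 'data: ...' SSE lines, stopping at [DONE].

    The delta chunk shape is an assumption based on OpenAI-style
    streaming; verify against actual responses from the endpoint.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        content = delta.get("content")
        if content:
            yield content  # print incrementally for a typewriter effect
```

Printing each yielded fragment as it arrives produces the typewriter effect, with the full text available by joining the fragments.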

3. Why am I receiving a 401 Unauthorized error?

Please verify the Authorization field in your headers. Ensure the format is Bearer <YOUR_API_KEY> (with a space between "Bearer" and the key), and that your API Key is active and has permissions for this model.

4. How should I configure temperature and top_p?

Both parameters control the randomness of the output. It is officially recommended to modify only one of them while keeping the other at its default. For rigorous and consistent translations or Q&A, use a lower temperature (e.g., 0.1 - 0.2); for more creative content, use a higher value (e.g., 0.8).

5. Is the translation_options parameter in the example mandatory?

It is an extended configuration supported by specialized translation models like qwen-mt-turbo to explicitly define source_lang and target_lang. While you can prompt the model to translate using natural language within messages, using translation_options provides clearer instructions and more stable results. It can be omitted for standard non-translation dialogues.