Create Chat Completion (qwen-mt-turbo)

Given a prompt, the model returns one or more predicted completions and can also provide the probabilities of alternative tokens at each position. This interface is primarily designed for conversing with the model and handling natural language tasks such as machine translation.

Official Documentation: Aliyun Model Studio - Machine Translation Model

Endpoint Information

  • Protocol: HTTP/HTTPS
  • Method: POST
  • Path: https://api.codingplanx.ai/v1/chat/completions

Request Headers

  • Content-Type (string, required): application/json. Data format.
  • Accept (string, required): application/json. Expected response format.
  • Authorization (string, required): Bearer {{YOUR_API_KEY}}. API authentication credential.

Request Body

The request body format is application/json.

  • model (string, required): The ID of the model to use, e.g., qwen-mt-turbo.
  • messages (array, required): A list of messages comprising the conversation so far. Each message includes role and content.
      ◦ role (string, required): The role of the message sender: user, assistant, or system.
      ◦ content (string, required): The content of the message.
  • tools (array, optional): A list of tools the model may call. Currently, only functions are supported. Used to provide a list of functions for which the model can generate JSON inputs.
  • tool_choice (object, optional): Controls which function the model calls (if any). none means no call; auto means the model chooses. Use {"type": "function", "function": {"name": "my_function"}} to force a specific call.
  • temperature (number, optional): Sampling temperature between 0 and 2. Higher values (e.g., 0.8) make output more random; lower values (e.g., 0.2) make it more focused. Recommended to modify either this or top_p, but not both.
  • top_p (number, optional): Nucleus sampling parameter. 0.1 means only tokens comprising the top 10% probability mass are considered. Recommended to modify either this or temperature, but not both.
  • n (integer, optional): How many chat completion choices to generate for each input message. Defaults to 1.
  • stream (boolean, optional): Whether to enable streaming output. If true, partial message increments are sent via Server-Sent Events (SSE), followed by data: [DONE] upon completion. Defaults to false.
  • stop (string or array, optional): Up to 4 sequences where the API will stop generating further tokens.
  • max_tokens (integer, optional): The maximum number of tokens to generate in the completion. The total length is limited by the model's context window.
  • presence_penalty (number, optional): Number between -2.0 and 2.0. Positive values penalize tokens that have already appeared in the text so far, increasing the likelihood of introducing new topics.
  • frequency_penalty (number, optional): Number between -2.0 and 2.0. Positive values penalize tokens in proportion to their existing frequency in the text, decreasing the likelihood of repetition.
  • logit_bias (object or null, optional): Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object mapping token IDs to bias values (-100 to 100).
  • user (string, optional): A unique identifier representing your end-user, helping to monitor and detect abuse.
  • response_format (object, optional): Specifies the output format. Use {"type": "json_object"} to enable JSON mode. Note: you must also instruct the model via a message to generate JSON.
  • seed (integer, optional): Beta feature. Seed for deterministic sampling. Requests with the same seed and parameters should return the same result.

Request Example (Machine Translation Scenario)

{
  "model": "qwen-mt-turbo",
  "messages": [
    {
      "role": "user",
      "content": "看完这个视频我没有笑"
    }
  ],
  "translation_options": {
    "source_lang": "auto",
    "target_lang": "English"
  }
}

(Note: translation_options is an extended parameter specific to the qwen-mt-turbo translation model.)
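As a sketch, the request above could be assembled and sent from Python using only the standard library. The helper names below are illustrative, and the API key placeholder is an assumption; only the URL, headers, and payload shape come from this document:

```python
import json
import urllib.request

API_URL = "https://api.codingplanx.ai/v1/chat/completions"

def build_translation_request(text, target_lang, source_lang="auto",
                              model="qwen-mt-turbo"):
    """Assemble the JSON payload for a translation call, as in the example."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
        "translation_options": {
            "source_lang": source_lang,
            "target_lang": target_lang,
        },
    }

def send_request(api_key, payload):
    """POST the payload with the required headers (performs a network call)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, build_translation_request("看完这个视频我没有笑", "English") reproduces the request body shown above, ready to pass to send_request with your key.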

Response Body

The response format is application/json.

  • id (string): Unique identifier for the chat completion.
  • object (string): Object type, typically chat.completion.
  • created (integer): Unix timestamp (seconds) when the completion was created.
  • choices (array): A list of completion choices.
      ◦ index (integer): The index of the choice in the list.
      ◦ message (object): The message generated by the model; includes role and content.
      ◦ finish_reason (string): The reason the model stopped generating (e.g., stop for completion, length for reaching max_tokens).
  • usage (object): Token usage statistics for the request.
      ◦ prompt_tokens (integer): Number of tokens in the prompt.
      ◦ completion_tokens (integer): Number of tokens in the generated content.
      ◦ total_tokens (integer): Total tokens consumed.

Response Example (200 OK)

{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I didn't laugh after watching this video."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 12,
        "total_tokens": 21
    }
}
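A minimal sketch of consuming this response: pull out the first choice's text, the token totals, and whether the output was truncated. The helper name extract_reply is illustrative; the field names match the response body documented above:

```python
import json

# The 200 OK example from above, used here as test input.
sample = json.loads("""
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant",
                        "content": "I didn't laugh after watching this video."},
            "finish_reason": "stop"
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}
""")

def extract_reply(response):
    """Return (text, total_tokens, truncated) from a completion response."""
    choice = response["choices"][0]
    # finish_reason "length" means the model hit max_tokens mid-output.
    truncated = choice["finish_reason"] == "length"
    return (choice["message"]["content"],
            response["usage"]["total_tokens"],
            truncated)
```

Checking the truncated flag is useful in practice: when it is set, retrying with a larger max_tokens usually recovers the full translation.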

FAQs

1. What are the primary use cases for the qwen-mt-turbo model?

qwen-mt-turbo is specifically optimized for Machine Translation (MT). While it utilizes the standard Chat Completions interface, it delivers high-quality translation results when provided with source text and configured source/target languages.

2. How can I implement a "typewriter" effect (Streaming Output)?

Set the stream parameter to true in the request body. The interface will then push data chunks via the Server-Sent Events (SSE) protocol instead of returning a single JSON response. A data: [DONE] message indicates the end of the stream.
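The SSE lines can be consumed with a small parser like the sketch below. It assumes each chunk follows the OpenAI-style streaming delta shape ({"choices": [{"delta": {"content": "..."}}]}), which this document does not spell out; only the data: prefix and the [DONE] sentinel are stated above:

```python
import json

def iter_sse_content(lines):
    """Yield content fragments from 'data: ...' SSE lines, stopping at [DONE].

    The delta chunk shape is an assumption based on OpenAI-style
    streaming; verify against actual responses from the endpoint.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip comments, blank keep-alive lines, etc.
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        content = delta.get("content")
        if content:
            yield content  # print incrementally for a typewriter effect
```

Printing each yielded fragment as it arrives produces the typewriter effect, with the full text available by joining the fragments.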

3. Why am I receiving a 401 Unauthorized error?

Please verify the Authorization field in your headers. Ensure the format is Bearer <YOUR_API_KEY> (with a space between "Bearer" and the key), and that your API Key is active and has permissions for this model.

4. How should I configure temperature and top_p?

Both parameters control the randomness of the output. It is officially recommended to modify only one of them while keeping the other at its default. For rigorous and consistent translations or Q&A, use a lower temperature (e.g., 0.1 - 0.2); for more creative content, use a higher value (e.g., 0.8).

5. Is the translation_options parameter in the example mandatory?

It is an extended configuration supported by specialized translation models like qwen-mt-turbo to explicitly define source_lang and target_lang. While you can prompt the model to translate using natural language within messages, using translation_options provides clearer instructions and more stable results. It can be omitted for standard non-translation dialogues.