Chat Completions API Documentation
This API accepts a list of conversation messages and returns one or more predicted completions from the model. It supports streaming output, tool calling (Function Calling), and a range of sampling-parameter adjustments.
1. Interface Information
- Interface Name: Official N Test
- HTTP Method: POST
- Endpoint URL: https://api.codingplanx.ai/v1/chat/completions
- Content-Type: application/json
- Authentication: Bearer {{YOUR_API_KEY}}
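The endpoint, headers, and Bearer authentication above can be assembled with the standard library alone — a minimal sketch (the `sk-xxxxxx` key is a placeholder; `build_request` is a helper name chosen here, not part of the API):

```python
import json
import urllib.request

API_URL = "https://api.codingplanx.ai/v1/chat/completions"

def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Assemble an authenticated POST request with the headers this endpoint requires."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_request(
    "sk-xxxxxx",
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "Hi"}]},
)
# Sending it requires a valid key: urllib.request.urlopen(req)
```

Any HTTP client works the same way; the only requirements are the JSON body, the `Content-Type` header, and the `Authorization: Bearer` token.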
2. Request Headers
| Parameter | Required | Type | Description | Example |
|---|---|---|---|---|
| Content-Type | Yes | string | Media type identifier | application/json |
| Accept | Yes | string | Response format accepted by the client | application/json |
| Authorization | Yes | string | API Access Token | Bearer sk-xxxxxx |
3. Request Body
| Parameter | Required | Type | Description |
|---|---|---|---|
| model | Yes | string | ID of the model to use (e.g., gpt-4o, gpt-3.5-turbo). |
| messages | Yes | array | A list of messages comprising the conversation. Each object contains role (system/user/assistant) and content. |
| tools | No | array | A list of tools the model may call. Currently, only functions are supported. |
| tool_choice | No | string / object | Controls which (if any) tool is called by the model. Options: none, auto, or an object naming a specific function. |
| temperature | No | number | Sampling temperature (0-2). Higher values mean more random, lower values mean more deterministic. It is recommended to alter this or top_p but not both. |
| top_p | No | number | Nucleus sampling. 0.1 means only tokens comprising the top 10% probability mass are considered. |
| n | No | integer | Defaults to 1. The number of chat completion choices to generate for each input message. |
| stream | No | boolean | Defaults to false. If true, tokens are sent as data-only server-sent events as they become available. |
| stop | No | string / array | One or more stop sequences. The model stops generating further tokens when any of them is encountered. |
| max_tokens | No | integer | The maximum number of tokens to generate. Defaults to inf. |
| presence_penalty | No | number | Between -2.0 and 2.0. Penalizes new tokens based on whether they appear in the text so far, increasing the likelihood of talking about new topics. |
| frequency_penalty | No | number | Between -2.0 and 2.0. Penalizes new tokens based on their existing frequency in the text, decreasing the likelihood of repetition. |
| logit_bias | No | object | Modifies the likelihood of specified tokens appearing in the completion. |
| user | No | string | A unique identifier representing your end-user, which can help in monitoring and detecting abuse. |
| response_format | No | object | Specifies the format that the model must output. For example, { "type": "json_object" } enables JSON mode. |
| seed | No | integer | Experimental feature. If specified, the system will make a best effort to sample deterministically for reproducible outputs. |
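The tools and tool_choice parameters work together: tools declares the functions the model may call, and tool_choice controls whether it does. A minimal request-body sketch, assuming a hypothetical get_current_weather function (the function name and schema are illustrative, not part of this API):

```python
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Boston?"}],
    "tools": [
        {
            "type": "function",  # currently the only supported tool type
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    # JSON Schema describing the function's arguments
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
```

Setting tool_choice to "none" suppresses calls entirely; an object such as {"type": "function", "function": {"name": "get_current_weather"}} forces a specific one.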
4. Response Parameters
| Parameter | Type | Description |
|---|---|---|
| id | string | A unique identifier for the chat completion. |
| object | string | The object type, which is always chat.completion. |
| created | integer | The Unix timestamp (in seconds) of when the chat completion was created. |
| choices | array | A list of chat completion choices. |
| ├─ index | integer | The index of the choice in the list of choices. |
| ├─ message | object | A chat completion message generated by the model. |
| │ ├─ role | string | The role of the author of this message (usually assistant). |
| │ └─ content | string | The contents of the message. |
| └─ finish_reason | string | The reason the model stopped generating (e.g., stop, length, tool_calls). |
| usage | object | Usage statistics for the completion request. |
| ├─ prompt_tokens | integer | Number of tokens in the prompt. |
| ├─ completion_tokens | integer | Number of tokens in the generated completion. |
| └─ total_tokens | integer | Total number of tokens used in the request (prompt + completion). |
5. Example
Request Example
```json
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "who are u?"
    }
  ],
  "n": 1,
  "max_tokens": 100,
  "temperature": 0.8,
  "stream": false
}
```
Response Example
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\n\nHello there, how may I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
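The response is plain JSON, so the fields described in Section 4 can be extracted with standard parsing. A minimal sketch using the example response's values:

```python
import json

# The example response from Section 5, as a raw JSON string.
raw = '''{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "\\n\\nHello there, how may I assist you today?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
}'''

resp = json.loads(raw)
choice = resp["choices"][0]          # first (and here only) completion choice
answer = choice["message"]["content"].strip()
total = resp["usage"]["total_tokens"]
# answer == "Hello there, how may I assist you today?", total == 21
```

With n greater than 1, iterate over resp["choices"] instead of taking only index 0.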
6. FAQs
Q: Why do I get an error when I set response_format: { "type": "json_object" }?
A: When using JSON mode, you must explicitly instruct the model to produce JSON via a system or user message (e.g., "Please reply in JSON format"). Otherwise the request may error, or the model may emit an unending stream of whitespace until it hits the token limit.
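A request body that satisfies both requirements — the response_format flag and an explicit JSON instruction in a message — might look like this (the message wording is illustrative):

```python
payload = {
    "model": "gpt-4o",
    "messages": [
        # The explicit JSON instruction is required whenever JSON mode is on.
        {"role": "system",
         "content": "You are a helpful assistant. Reply in JSON format."},
        {"role": "user", "content": "List three primary colors."},
    ],
    "response_format": {"type": "json_object"},  # enables JSON mode
}
```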
Q: Can I adjust temperature and top_p at the same time?
A: Technically yes, but it is generally recommended to adjust only one of them. Adjusting one is usually sufficient to change the randomness of the output; adjusting both can lead to unpredictable results.
Q: How do I receive data when stream: true is enabled?
A: When streaming is enabled, the server sends a series of Server-Sent Events (SSE). The data portion of each event is a JSON object, and the stream ends with data: [DONE]. You need to use a streaming library (like Python's response.iter_lines()) to process the response line by line.
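The SSE framing described above can be handled with a small parser: keep only `data:` lines, stop at `[DONE]`, and decode the rest as JSON. A sketch (the chunk shape follows the common streaming format with a `delta` field; the sample lines below are simulated, not captured from this API):

```python
import json

def iter_sse_chunks(lines):
    """Yield parsed JSON objects from 'data: ...' SSE lines, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        yield json.loads(data)

# Simulated stream (in practice, feed it response.iter_lines() decoded to str):
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in iter_sse_chunks(sample))
# text == "Hello"
```

Concatenating each chunk's delta content reconstructs the full reply as it streams in.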
Q: How is token consumption calculated?
A: For English, 1 token is approximately 4 characters or 0.75 words. For Chinese, one character may correspond to 1-2 tokens. The final consumption is based on the usage field in the response body.
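The ~4-characters-per-token rule gives a rough pre-flight estimate for English text only; it is a heuristic, and the usage field in the response remains the authoritative count:

```python
def estimate_tokens(text: str) -> int:
    """Very rough English token estimate (~4 characters per token).
    For billing and limits, always rely on the response's usage field."""
    return max(1, round(len(text) / 4))

estimate_tokens("Hello there, how may I assist you today?")  # -> 10
```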
Q: What does finish_reason: "length" mean?
A: This indicates that generation stopped because it reached the max_tokens limit or the model's maximum context length, so the output was truncated.
Q: Does the API support Function Calling?
A: Yes. By defining function prototypes via the tools parameter, the model will return tool_calls in choices[0].message instead of plain content when a function call is required.
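When the model decides to call a function, the message carries tool_calls instead of content, and finish_reason is "tool_calls". A sketch of reading such a response — the function name and arguments below are illustrative, and the field layout follows the common chat-completions shape:

```python
import json

# A hypothetical tool-call response (simulated, not captured from this API).
raw = json.dumps({
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": None,  # no plain text when a tool call is returned
            "tool_calls": [{
                "id": "call_abc123",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": "{\"city\": \"Boston\"}",  # JSON-encoded string
                },
            }],
        },
        "finish_reason": "tool_calls",
    }],
})

msg = json.loads(raw)["choices"][0]["message"]
if msg.get("tool_calls"):
    call = msg["tool_calls"][0]
    # arguments arrive as a JSON string and need a second decode
    args = json.loads(call["function"]["arguments"])
    # args == {"city": "Boston"}
```

Your code then executes the named function with those arguments and sends the result back in a follow-up message so the model can produce its final answer.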