# Official Function Calling API Documentation
This endpoint creates a Chat Completion. Given a list of messages and an optional list of tools (functions), the model either generates a text response or produces arguments that conform to the JSON Schemas you define, so that you can invoke external functions.
## 1. Interface Information
- Endpoint: `https://api.codingplanx.ai/v1/chat/completions`
- Method: `POST`
- Content-Type: `application/json`
- Authentication: `Authorization: Bearer {{YOUR_API_KEY}}`
## 2. Request Parameters
### 2.1 Headers
| Parameter | Type | Required | Example | Description |
|---|---|---|---|---|
| Content-Type | string | Yes | application/json | Must be application/json |
| Accept | string | No | application/json | Recommended to be application/json |
| Authorization | string | Yes | Bearer sk-xxxxxx | API Key authentication |
### 2.2 Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | ID of the model to use (e.g., gpt-4o, gpt-3.5-turbo). |
| messages | array | Yes | - | A list of messages comprising the conversation so far. Each object contains role ("system", "user", "assistant", "tool") and content. |
| tools | array | No | - | A list of tools the model may call. Currently, only type: "function" is supported. Used to define function names, descriptions, and parameters (as a JSON Schema). |
| tool_choice | object/string | No | auto | Controls which (if any) function is called by the model. none: no call; auto: the model decides; {"type": "function", "function": {"name": "xxx"}}: force a specific function call. |
| temperature | number | No | 1 | Sampling temperature (0-2). Higher is more random, lower is more deterministic. |
| top_p | number | No | 1 | Nucleus sampling probability. |
| n | integer | No | 1 | Number of chat completion choices to generate for each input message. |
| stream | boolean | No | false | If set to true, tokens will be sent as server-sent events (SSE). |
| stop | string/array | No | null | Sequences where the API will stop generating further tokens. |
| max_tokens | integer | No | inf | The maximum number of tokens to generate. |
| presence_penalty | number | No | 0 | Number between -2.0 and 2.0. Increases the model's likelihood to talk about new topics. |
| frequency_penalty | number | No | 0 | Number between -2.0 and 2.0. Decreases the model's likelihood to repeat the same line verbatim. |
| response_format | object | No | - | Specifies the output format. E.g., {"type": "json_object"} enables JSON mode. |
| seed | integer | No | - | If specified, the system will make a best effort to sample deterministically. |
| user | string | No | - | A unique identifier representing your end-user, used for monitoring and preventing abuse. |
## 3. Request Example
```json
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like in Beijing today?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a specific location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city name, e.g., Beijing"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "temperature": 0.8
}
```
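As a client-side sketch, the request above can be sent with Python's standard library. The `create_chat_completion` helper is an illustrative name, not part of the API; replace `YOUR_API_KEY` with a real key before calling it:

```python
import json
import urllib.request

API_URL = "https://api.codingplanx.ai/v1/chat/completions"

def create_chat_completion(payload, api_key):
    """POST the payload to the Chat Completions endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# The same payload as the JSON example above, expressed as a Python dict.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "What's the weather like in Beijing today?"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a specific location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city name, e.g., Beijing"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }],
    "tool_choice": "auto",
    "temperature": 0.8,
}
```

Calling `create_chat_completion(payload, "sk-xxxxxx")` returns the response body described in Section 4.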
## 4. Response Parameters
### 4.1 Response Body Structure
| Parameter | Type | Description |
|---|---|---|
| id | string | A unique identifier for the request. |
| object | string | The object type, usually chat.completion. |
| created | integer | The Unix timestamp (in seconds) of when the completion was created. |
| choices | array | A list of chat completion choices. |
| └ message | object | Contains role, content, and potentially tool_calls. |
| └ finish_reason | string | Reason the model stopped: stop (natural stop), length (max_tokens reached), tool_calls (tool invocation required). |
| usage | object | Token usage statistics: prompt_tokens, completion_tokens, total_tokens. |
### 4.2 Response Example (Normal Text)
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It is sunny in Beijing today with a temperature of 25 degrees Celsius."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
## 5. FAQs
Q1: What is Function Calling (Tool Calling)?
A: It is a way to give models access to external capabilities. The model does not execute code directly; instead, based on your descriptions, it generates a JSON object containing a function name and the required arguments. You extract this JSON, execute the function locally or on your server, and then pass the result back to the model.
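The round trip described above can be sketched as follows. The `response_message` dict mirrors the shape of `choices[0].message` when `finish_reason` is `tool_calls` (the `id` and argument values are illustrative), and `get_current_weather` stands in for your own local implementation:

```python
import json

def get_current_weather(location, unit="celsius"):
    # Local stand-in for a real weather lookup.
    return {"location": location, "temperature": 25, "unit": unit}

# Illustrative assistant message containing a tool call.
response_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}",
        },
    }],
}

# Echo the assistant message back first, then one "tool" message per call.
follow_up_messages = [response_message]
for call in response_message["tool_calls"]:
    args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
    result = get_current_weather(**args)
    follow_up_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(result),
    })
# follow_up_messages is then appended to the original messages list and re-sent,
# so the model can compose a final text answer from the tool result.
```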
Q2: Why did the model not return tool_calls even though I set tools?
A: This can happen for several reasons:
- The model determined that the user's query does not require calling that function.
- Your function description is not clear enough for the model to match the user's intent.
- tool_choice is explicitly set to none.
- Model capability limitations (it is recommended to use models such as gpt-4o or gpt-3.5-turbo, which officially support tool calling).
Q3: How can I force the model to call a specific function?
A: You can set tool_choice to a specific object, for example:
"tool_choice": {"type": "function", "function": {"name": "get_current_weather"}}. This forces the model to generate parameters for that specific function even if it deems it unnecessary.
Q4: What should I do if the finish_reason is length?
A: This means the generated response exceeded the max_tokens limit or the model's context window. You can try increasing the max_tokens parameter or shortening the messages history.
Q5: What are the precautions for using JSON Mode?
A: When setting response_format: { "type": "json_object" }, you must also instruct the model via a System or User message to output JSON (e.g., "respond in JSON format"). Otherwise, the model might output an endless stream of whitespace.
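A minimal JSON-mode request sketch, assuming the payload fields from Section 2.2 (note that the system prompt explicitly mentions JSON, per the caveat above; `sample_content` is an illustrative return value, not real API output):

```python
import json

payload = {
    "model": "gpt-4o",
    "response_format": {"type": "json_object"},
    "messages": [
        # The prompt must instruct the model to output JSON, or JSON mode can misbehave.
        {"role": "system", "content": "You are a helpful assistant. Respond in JSON format."},
        {"role": "user", "content": "List three cities in China."},
    ],
}

# In JSON mode, choices[0].message.content is a JSON string; parse it before use.
sample_content = '{"cities": ["Beijing", "Shanghai", "Shenzhen"]}'
parsed = json.loads(sample_content)
```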
Q6: How do I handle multiple tool calls?
A: In a single request, the model may generate multiple tool calls (e.g., checking the weather for Beijing and Shanghai simultaneously). You should iterate through the choices[0].message.tool_calls array and execute the local logic for each call.
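One common pattern for this, sketched below, is a dispatch table mapping function names to local handlers. The two parallel tool calls are illustrative, shaped like entries of `choices[0].message.tool_calls`:

```python
import json

def get_current_weather(location, unit="celsius"):
    # Local stand-in for a real weather lookup.
    return {"location": location, "temperature": 25, "unit": unit}

# Map each declared tool name to the local function that implements it.
DISPATCH = {"get_current_weather": get_current_weather}

# Illustrative parallel calls: weather for Beijing and Shanghai in one response.
tool_calls = [
    {"id": "call_1", "type": "function",
     "function": {"name": "get_current_weather", "arguments": '{"location": "Beijing"}'}},
    {"id": "call_2", "type": "function",
     "function": {"name": "get_current_weather", "arguments": '{"location": "Shanghai"}'}},
]

tool_messages = []
for call in tool_calls:
    handler = DISPATCH[call["function"]["name"]]
    result = handler(**json.loads(call["function"]["arguments"]))
    tool_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],  # ties each result back to the matching call
        "content": json.dumps(result),
    })
```

Every tool call must get its own "tool" message carrying the matching tool_call_id before the conversation is sent back to the model.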