# Official Function Calling API Documentation
This endpoint creates a Chat Completion. Given a list of messages and an optional list of tools (functions), the model either generates a text response or produces arguments that conform to the JSON Schemas you define, so that you can invoke external functions.
## 1. Interface Information
- Endpoint: `https://api.codingplanx.ai/v1/chat/completions`
- Method: `POST`
- Content-Type: `application/json`
- Authentication: `Authorization: Bearer {{YOUR_API_KEY}}`
## 2. Request Parameters
### 2.1 Headers
| Parameter | Type | Required | Example | Description |
|---|---|---|---|---|
| Content-Type | string | Yes | application/json | Must be application/json |
| Accept | string | No | application/json | Recommended to be application/json |
| Authorization | string | Yes | Bearer sk-xxxxxx | API Key authentication |
### 2.2 Request Body
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | ID of the model to use (e.g., gpt-4o, gpt-3.5-turbo). |
| messages | array | Yes | - | A list of messages comprising the conversation so far. Each object contains role ("system", "user", "assistant", "tool") and content. |
| tools | array | No | - | A list of tools the model may call. Currently, only type: "function" is supported. Used to define function names, descriptions, and parameters (as a JSON Schema). |
| tool_choice | object/string | No | auto | Controls which (if any) function is called by the model. none: no call; auto: the model decides; {"type": "function", "function": {"name": "xxx"}}: force a specific function call. |
| temperature | number | No | 1 | Sampling temperature (0-2). Higher is more random, lower is more deterministic. |
| top_p | number | No | 1 | Nucleus sampling probability. |
| n | integer | No | 1 | Number of chat completion choices to generate for each input message. |
| stream | boolean | No | false | If set to true, tokens will be sent as server-sent events (SSE). |
| stop | string/array | No | null | Sequences where the API will stop generating further tokens. |
| max_tokens | integer | No | inf | The maximum number of tokens to generate. |
| presence_penalty | number | No | 0 | Number between -2.0 and 2.0. Increases the model's likelihood to talk about new topics. |
| frequency_penalty | number | No | 0 | Number between -2.0 and 2.0. Decreases the model's likelihood to repeat the same line verbatim. |
| response_format | object | No | - | Specifies the output format. E.g., {"type": "json_object"} enables JSON mode. |
| seed | integer | No | - | If specified, the system will make a best effort to sample deterministically. |
| user | string | No | - | A unique identifier representing your end-user, used for monitoring and preventing abuse. |
## 3. Request Example
```json
{
  "model": "gpt-4o",
  "messages": [
    {
      "role": "user",
      "content": "What's the weather like in Beijing today?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather for a specific location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city name, e.g., Beijing"
            },
            "unit": {
              "type": "string",
              "enum": ["celsius", "fahrenheit"]
            }
          },
          "required": ["location"]
        }
      }
    }
  ],
  "tool_choice": "auto",
  "temperature": 0.8
}
```
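As a client-side sketch, the request above can be sent with Python's standard library. The `create_chat_completion` helper is an illustrative name, not part of the API; replace `YOUR_API_KEY` with a real key before calling it:

```python
import json
import urllib.request

API_URL = "https://api.codingplanx.ai/v1/chat/completions"

def create_chat_completion(payload, api_key):
    """POST the payload to the Chat Completions endpoint and return the parsed JSON."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# The same payload as the JSON example above, expressed as a Python dict.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "What's the weather like in Beijing today?"}
    ],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a specific location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "The city name, e.g., Beijing"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }],
    "tool_choice": "auto",
    "temperature": 0.8,
}
```

Calling `create_chat_completion(payload, "sk-xxxxxx")` returns the response body described in Section 4.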
## 4. Response Parameters
### 4.1 Response Body Structure
| Parameter | Type | Description |
|---|---|---|
| id | string | A unique identifier for the request. |
| object | string | The object type, usually chat.completion. |
| created | integer | The Unix timestamp (in seconds) of when the completion was created. |
| choices | array | A list of chat completion choices. |
| └ message | object | Contains role, content, and potentially tool_calls. |
| └ finish_reason | string | Reason the model stopped: stop (natural stop), length (max_tokens reached), tool_calls (tool invocation required). |
| usage | object | Token usage statistics: prompt_tokens, completion_tokens, total_tokens. |
### 4.2 Response Example (Normal Text)
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It is sunny in Beijing today with a temperature of 25 degrees Celsius."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
```
## 5. FAQs
Q1: What is Function Calling (Tool Calling)?
A: It is a way to give models access to external capabilities. The model does not execute code directly; instead, based on your descriptions, it generates a JSON object containing a function name and the required arguments. You extract this JSON, execute the function locally or on your server, and then pass the result back to the model.
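The round trip described above can be sketched as follows. The `response_message` dict mirrors the shape of `choices[0].message` when `finish_reason` is `tool_calls` (the `id` and argument values are illustrative), and `get_current_weather` stands in for your own local implementation:

```python
import json

def get_current_weather(location, unit="celsius"):
    # Local stand-in for a real weather lookup.
    return {"location": location, "temperature": 25, "unit": unit}

# Illustrative assistant message containing a tool call.
response_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}",
        },
    }],
}

# Echo the assistant message back first, then one "tool" message per call.
follow_up_messages = [response_message]
for call in response_message["tool_calls"]:
    args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
    result = get_current_weather(**args)
    follow_up_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],
        "content": json.dumps(result),
    })
# follow_up_messages is then appended to the original messages list and re-sent,
# so the model can compose a final text answer from the tool result.
```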
Q2: Why did the model not return tool_calls even though I set tools?
A: This can happen for several reasons:
- The model determined that the user's query does not require calling that function.
- Your function description is not clear enough for the model to match the user's intent.
- tool_choice is explicitly set to none.
- Model capability limitations (it is recommended to use models such as gpt-4o or gpt-3.5-turbo, which officially support tool calling).
Q3: How can I force the model to call a specific function?
A: You can set tool_choice to a specific object, for example:
"tool_choice": {"type": "function", "function": {"name": "get_current_weather"}}. This forces the model to generate parameters for that specific function even if it deems it unnecessary.
Q4: What should I do if the finish_reason is length?
A: This means the generated response exceeded the max_tokens limit or the model's context window. You can try increasing the max_tokens parameter or shortening the messages history.
Q5: What are the precautions for using JSON Mode?
A: When setting response_format: { "type": "json_object" }, you must also instruct the model via a System or User message to output JSON (e.g., "respond in JSON format"). Otherwise, the model might output an endless stream of whitespace.
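A minimal JSON-mode request sketch, assuming the payload fields from Section 2.2 (note that the system prompt explicitly mentions JSON, per the caveat above; `sample_content` is an illustrative return value, not real API output):

```python
import json

payload = {
    "model": "gpt-4o",
    "response_format": {"type": "json_object"},
    "messages": [
        # The prompt must instruct the model to output JSON, or JSON mode can misbehave.
        {"role": "system", "content": "You are a helpful assistant. Respond in JSON format."},
        {"role": "user", "content": "List three cities in China."},
    ],
}

# In JSON mode, choices[0].message.content is a JSON string; parse it before use.
sample_content = '{"cities": ["Beijing", "Shanghai", "Shenzhen"]}'
parsed = json.loads(sample_content)
```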
Q6: How do I handle multiple tool calls?
A: In a single request, the model may generate multiple tool calls (e.g., checking the weather for Beijing and Shanghai simultaneously). You should iterate through the choices[0].message.tool_calls array and execute the local logic for each call.
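One common pattern for this, sketched below, is a dispatch table mapping function names to local handlers. The two parallel tool calls are illustrative, shaped like entries of `choices[0].message.tool_calls`:

```python
import json

def get_current_weather(location, unit="celsius"):
    # Local stand-in for a real weather lookup.
    return {"location": location, "temperature": 25, "unit": unit}

# Map each declared tool name to the local function that implements it.
DISPATCH = {"get_current_weather": get_current_weather}

# Illustrative parallel calls: weather for Beijing and Shanghai in one response.
tool_calls = [
    {"id": "call_1", "type": "function",
     "function": {"name": "get_current_weather", "arguments": '{"location": "Beijing"}'}},
    {"id": "call_2", "type": "function",
     "function": {"name": "get_current_weather", "arguments": '{"location": "Shanghai"}'}},
]

tool_messages = []
for call in tool_calls:
    handler = DISPATCH[call["function"]["name"]]
    result = handler(**json.loads(call["function"]["arguments"]))
    tool_messages.append({
        "role": "tool",
        "tool_call_id": call["id"],  # ties each result back to the matching call
        "content": json.dumps(result),
    })
```

Every tool call must get its own "tool" message carrying the matching tool_call_id before the conversation is sent back to the model.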