Official Function Calling API Documentation

This endpoint creates a Chat Completion. Given a list of messages and an optional list of tools (functions), the model either generates a text response or produces arguments conforming to a specific JSON Schema so that you can invoke an external function.

1. Interface Information

  • Endpoint: https://api.codingplanx.ai/v1/chat/completions
  • Method: POST
  • Content-Type: application/json
  • Authentication: Authorization: Bearer {{YOUR_API_KEY}}

2. Request Parameters

2.1 Headers

| Parameter | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| Content-Type | string | Yes | application/json | Must be application/json. |
| Accept | string | No | application/json | Recommended to be application/json. |
| Authorization | string | Yes | Bearer sk-xxxxxx | API key authentication. |

2.2 Request Body

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| model | string | Yes | - | ID of the model to use (e.g., gpt-4o, gpt-3.5-turbo). |
| messages | array | Yes | - | A list of messages comprising the conversation so far. Each object contains role ("system", "user", "assistant", "tool") and content. |
| tools | array | No | - | A list of tools the model may call. Currently, only type: "function" is supported. Used to define function names, descriptions, and parameters. |
| tool_choice | object / string | No | auto | Controls which (if any) function is called by the model. none: no call; auto: the model decides; {"type": "function", "function": {"name": "xxx"}}: force a specific function call. |
| temperature | number | No | 1 | Sampling temperature (0-2). Higher values are more random, lower values are more deterministic. |
| top_p | number | No | 1 | Nucleus sampling probability. |
| n | integer | No | 1 | Number of chat completion choices to generate for each input message. |
| stream | boolean | No | false | If set to true, tokens are sent as server-sent events (SSE). |
| stop | string / array | No | null | Sequences where the API will stop generating further tokens. |
| max_tokens | integer | No | inf | The maximum number of tokens to generate. |
| presence_penalty | number | No | 0 | Number between -2.0 and 2.0. Positive values increase the model's likelihood to talk about new topics. |
| frequency_penalty | number | No | 0 | Number between -2.0 and 2.0. Positive values decrease the model's likelihood to repeat the same line verbatim. |
| response_format | object | No | - | Specifies the output format. E.g., {"type": "json_object"} enables JSON mode. |
| seed | integer | No | - | If specified, the system will make a best effort to sample deterministically. |
| user | string | No | - | A unique identifier representing your end user, used for monitoring and preventing abuse. |

3. Request Example

{
    "model": "gpt-4o",
    "messages": [
        {
            "role": "user",
            "content": "What's the weather like in Beijing today?"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather for a specific location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city name, e.g., Beijing"
                        },
                        "unit": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ],
    "tool_choice": "auto",
    "temperature": 0.8
}
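A request like the one above can be assembled and sent from Python. The sketch below uses only the standard library; the endpoint and key placeholder come from Section 1, and the helper name `build_chat_request` is ours. The tools array from the example would be added to the payload in the same way.

```python
import json
import urllib.request

API_URL = "https://api.codingplanx.ai/v1/chat/completions"

def build_chat_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Wrap the JSON payload in a POST request carrying the required headers."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "What's the weather like in Beijing today?"}
    ],
    "tool_choice": "auto",
    "temperature": 0.8,
}
request = build_chat_request("sk-xxxxxx", payload)

# To actually send it:
# with urllib.request.urlopen(request) as resp:
#     completion = json.loads(resp.read())
```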

4. Response Parameters

4.1 Response Body Structure

| Parameter | Type | Description |
| --- | --- | --- |
| id | string | A unique identifier for the request. |
| object | string | The object type, usually chat.completion. |
| created | integer | The Unix timestamp (in seconds) of when the completion was created. |
| choices | array | A list of chat completion choices. |
| └ message | object | Contains role, content, and potentially tool_calls. |
| └ finish_reason | string | Reason the model stopped: stop (natural stop), length (max_tokens reached), tool_calls (tool invocation required). |
| usage | object | Token usage statistics: prompt_tokens, completion_tokens, total_tokens. |

4.2 Response Example (Normal Text)

{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "It is sunny in Beijing today with a temperature of 25 degrees Celsius."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 12,
        "total_tokens": 21
    }
}
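In client code, the fields above are typically consumed by branching on finish_reason. The Python sketch below walks over the example response; the parsing logic is ours, not part of the API.

```python
# The example response from Section 4.2, as a Python dict.
response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "It is sunny in Beijing today with a temperature of 25 degrees Celsius.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
}

choice = response["choices"][0]
if choice["finish_reason"] == "tool_calls":
    # The model wants one or more functions executed (see Section 5, Q6).
    pending_calls = choice["message"]["tool_calls"]
elif choice["finish_reason"] == "length":
    # Output was truncated; consider raising max_tokens (see Section 5, Q4).
    text = choice["message"]["content"]
else:  # "stop": a normal text answer
    text = choice["message"]["content"]

total = response["usage"]["total_tokens"]
```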

5. FAQs

Q1: What is Function Calling (Tool Calling)? A: It is a way to provide models with external capabilities. The model does not execute code directly; instead, based on your description, it generates a JSON object containing a function name and required arguments. You need to extract this JSON, execute the function locally or on your server, and then pass the result back to the model.
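The round trip described in Q1 can be sketched in Python. The assistant message below is a hand-written example following the message shape this document describes, and the weather result is a stand-in for your real function's output:

```python
import json

# A hand-written assistant message requesting one tool call.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}",
        },
    }],
}

call = assistant_message["tool_calls"][0]
args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string

# Run your real function here; this result is a stand-in.
result = {"location": args["location"], "temperature": 25, "unit": args["unit"]}

# Pass the result back to the model as a "tool" role message in the next request:
tool_message = {
    "role": "tool",
    "tool_call_id": call["id"],
    "content": json.dumps(result),
}
```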

Q2: Why did the model not return tool_calls even though I set tools? A: This can happen for several reasons:

  1. The model determines the user's query does not require calling that function.
  2. Your function description is not clear enough for the model to match the user's intent.
  3. tool_choice is explicitly set to none.
  4. Model capability limitations (it is recommended to use models like gpt-4o or gpt-3.5-turbo which officially support tool calling).

Q3: How can I force the model to call a specific function? A: You can set tool_choice to a specific object, for example: "tool_choice": {"type": "function", "function": {"name": "get_current_weather"}}. This forces the model to generate parameters for that specific function even if it deems it unnecessary.

Q4: What should I do if the finish_reason is length? A: This means the generated response exceeded the max_tokens limit or the model's context window. You can try increasing the max_tokens parameter or shortening the messages history.

Q5: What are the precautions for using JSON Mode? A: When setting response_format: { "type": "json_object" }, you must also instruct the model via a System or User message to output JSON (e.g., "respond in JSON format"). Otherwise, the model might output an endless stream of whitespace.
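A minimal JSON-mode request body, sketched as a Python dict (the instruction wording in the system message is just an example):

```python
import json

payload = {
    "model": "gpt-4o",
    "response_format": {"type": "json_object"},
    "messages": [
        # The explicit "JSON" instruction must accompany response_format.
        {"role": "system", "content": "You are a helpful assistant. Always respond in JSON format."},
        {"role": "user", "content": "List three primary colors."},
    ],
}

# Once a response arrives, its content can be parsed directly:
# data = json.loads(response["choices"][0]["message"]["content"])
```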

Q6: How do I handle multiple tool calls? A: In a single request, the model may generate multiple tool calls (e.g., checking the weather for Beijing and Shanghai simultaneously). You should iterate through the choices[0].message.tool_calls array and execute the local logic for each call.
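Iterating over several calls in one turn can be sketched like this; the registry and the weather function are hypothetical stand-ins for your local logic:

```python
import json

def get_current_weather(location, unit="celsius"):
    # Hypothetical stand-in; replace with a real lookup.
    return {"location": location, "temperature": 25, "unit": unit}

LOCAL_TOOLS = {"get_current_weather": get_current_weather}

def run_tool_calls(message: dict) -> list:
    """Execute every requested tool and return the follow-up `tool` messages."""
    results = []
    for call in message.get("tool_calls", []):
        fn = LOCAL_TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return results

# Example: the model asked for Beijing and Shanghai in one turn.
message = {"tool_calls": [
    {"id": "call_1", "type": "function",
     "function": {"name": "get_current_weather", "arguments": "{\"location\": \"Beijing\"}"}},
    {"id": "call_2", "type": "function",
     "function": {"name": "get_current_weather", "arguments": "{\"location\": \"Shanghai\"}"}},
]}
follow_ups = run_tool_calls(message)  # two `tool` messages to append before re-sending
```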