API Documentation: Create Chat Function Calling
Interface Description
Given a prompt, the model returns one or more predicted completions and can also return the probabilities of alternative tokens at each position. This interface specifically supports the Function Calling feature, allowing the model to choose intelligently between generating a message and calling a specified function.
- Official Reference: Function Calling Guide
Base Information
- Request Method: POST
- Request URL: https://api.codingplanx.ai/v1/chat/completions
- Data Format: application/json
Request Headers
| Parameter | Type | Required | Example | Description |
|---|---|---|---|---|
| Content-Type | string | Yes | application/json | Media type of the request body |
| Accept | string | Yes | application/json | Expected response media type |
| Authorization | string | Yes | Bearer {{YOUR_API_KEY}} | Authentication credentials (your API key) |
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use (e.g., gpt-4o). Refer to the model compatibility table for details. |
| messages | array | Yes | A list of messages comprising the conversation so far. Each object must include role and content. |
| tools | array | Yes | A list of tools the model may call. Currently, only functions are supported as tools. Use this to supply the functions for which the model can generate JSON arguments. |
| tool_choice | object/string | No | Controls which function (if any) the model calls. none disables calling; auto (the default when tools are present) lets the model choose. You can also force a specific function via {"type": "function", "function": {"name": "my_function"}}. |
| temperature | number | No | Sampling temperature between 0 and 2. Higher values (e.g., 0.8) make output more random; lower values (e.g., 0.2) make it more deterministic. Avoid modifying this and top_p simultaneously. |
| top_p | number | No | Nucleus sampling: 0.1 means only tokens comprising the top 10% probability mass are considered. Avoid modifying this and temperature simultaneously. |
| n | integer | No | How many chat completion choices to generate for each input message. Defaults to 1. |
| stream | boolean | No | Defaults to false. If set to true, partial message deltas are sent as Server-Sent Events (SSE), terminating with data: [DONE]. |
| stop | string/array | No | Up to 4 sequences where the API will stop generating further tokens. Defaults to null. |
| max_tokens | integer | No | The maximum number of tokens to generate in the chat completion. Total length is limited by the model's context length. |
| presence_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize tokens that have already appeared in the text, increasing the likelihood of introducing new topics. |
| frequency_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize tokens based on their existing frequency in the text, decreasing the likelihood of repetition. |
| logit_bias | object | No | Modifies the likelihood of specified tokens appearing in the completion. Maps token IDs to bias values (-100 to 100). |
| user | string | No | A unique identifier representing your end user, which can help monitor and detect abuse. |
| response_format | object | No | Specifies the format the model must output. For example, { "type": "json_object" } enables JSON mode. |
| seed | integer | No | Beta feature. If specified, the system attempts to sample deterministically, so that requests with the same seed and parameters return the same result. |
Request Body Example
{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."
},
{
"role": "user",
"content": "Hi, can you tell me the delivery date for my order?"
},
{"role": "assistant", "content": "Hi there! I can help with that. Can you please provide your order ID?"},
{"role": "user", "content": "i think it is order_12345"}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_delivery_date",
"description": "Get the delivery date for a customer's order. Call this whenever you need to know the delivery date, for example when a customer asks 'Where is my package'",
"parameters": {
"type": "object",
"properties": {
"order_id": {
"type": "string",
"description": "The customer's order ID."
}
},
"required": [
"order_id"
],
"additionalProperties": false
}
}
}
]
}
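The request above can be assembled and serialized with nothing but the standard library. A minimal sketch, assuming the endpoint and header requirements documented above; the API key value is a placeholder you must replace:

```python
import json

API_URL = "https://api.codingplanx.ai/v1/chat/completions"
API_KEY = "YOUR_API_KEY"  # placeholder; substitute your real key

# Headers as specified in the Request Headers table above.
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json",
    "Authorization": f"Bearer {API_KEY}",
}

# A trimmed version of the request body example above.
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful customer support assistant."},
        {"role": "user", "content": "i think it is order_12345"},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_delivery_date",
                "description": "Get the delivery date for a customer's order.",
                "parameters": {
                    "type": "object",
                    "properties": {"order_id": {"type": "string"}},
                    "required": ["order_id"],
                    "additionalProperties": False,
                },
            },
        }
    ],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

# Serialized body, ready to POST to API_URL with any HTTP client.
body = json.dumps(payload)
```

Pass `body` and `headers` to the HTTP client of your choice (e.g., `urllib.request` or `requests`) to issue the POST.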
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| id | string | A unique identifier for the chat completion. |
| object | string | The object type, usually chat.completion. |
| created | integer | The Unix timestamp of when the chat completion was created. |
| choices | array | A list of chat completion choices. |
| ∟ index | integer | The index of the choice in the list of choices. |
| ∟ message | object | The message object generated by the model, containing role and content. When the model decides to call a function, it instead contains tool_calls. |
| ∟ finish_reason | string | The reason the model stopped generating tokens (e.g., stop, length, or tool_calls). |
| usage | object | Usage statistics for the completion request. |
| ∟ prompt_tokens | integer | Number of tokens in the prompt (input). |
| ∟ completion_tokens | integer | Number of tokens in the generated completion (output). |
| ∟ total_tokens | integer | Total number of tokens used in the request. |
Response Example (200 OK)
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "\r
\r
Hello there, how may I assist you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 9,
"completion_tokens": 12,
"total_tokens": 21
}
}
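When finish_reason is tool_calls, the client is expected to parse the function arguments and run the function itself. A sketch of that dispatch step, using a hand-written response dict whose shape follows the common OpenAI-compatible tool_calls format (an assumption, since the example above only shows a plain text reply); get_delivery_date here is a local stand-in:

```python
import json

# Hypothetical response in which the model chose to call the tool
# rather than reply with text.
response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_abc123",
                        "type": "function",
                        "function": {
                            "name": "get_delivery_date",
                            "arguments": "{\"order_id\": \"order_12345\"}",
                        },
                    }
                ],
            },
            "finish_reason": "tool_calls",
        }
    ]
}

def get_delivery_date(order_id: str) -> str:
    """Stand-in for a real lookup against your order system."""
    return f"Order {order_id} ships soon."

choice = response["choices"][0]
result = None
if choice["finish_reason"] == "tool_calls":
    call = choice["message"]["tool_calls"][0]
    # The arguments field arrives as a JSON *string*, not an object.
    args = json.loads(call["function"]["arguments"])
    if call["function"]["name"] == "get_delivery_date":
        result = get_delivery_date(**args)
```

The function's return value is then typically sent back to the model in a follow-up message so it can produce the final user-facing reply.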
FAQs
Q1: How do I force the model to call a specific function?
A: You can achieve this by setting the tool_choice parameter. For example, if your function name is get_weather, set tool_choice to: {"type": "function", "function": {"name": "get_weather"}}. This forces the model to ignore a standard reply and output the arguments for that specific function.
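Continuing the FAQ's get_weather example (a hypothetical function), a forced call looks like this in the request body:

```python
# tool_choice value that forces the model to call get_weather
# (hypothetical function name taken from the FAQ above).
forced_choice = {"type": "function", "function": {"name": "get_weather"}}

payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": forced_choice,  # model must call get_weather, not reply in text
}
```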
Q2: Why is my response content truncated?
A: Check the value of choices[0].finish_reason in the response. If it is "length", it means the generated text exceeded the max_tokens limit or the model's maximum context length. You can try increasing the max_tokens value.
Q3: Both temperature and top_p control randomness; how should I use them?
A: Both parameters adjust the diversity of the output. Higher values make the output more creative, while lower values make it more deterministic. Official documentation recommends adjusting only one of these parameters and keeping the other at its default value to avoid unpredictable results.
Q4: What is the data format when stream: true is enabled?
A: When streaming is enabled, the interface no longer returns a single JSON object. Instead, it returns a stream of Server-Sent Events (SSE). Data is sent in chunks as data: {"id": "...", "choices": [{"delta": {"content": "..."}}]}. The final message will be data: [DONE].
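A client consuming the stream accumulates the delta.content fragments until the sentinel arrives. A minimal sketch, using hand-assembled SSE lines in the chunk shape described above (a real client would read these line by line from the HTTP response):

```python
import json

# Simulated SSE lines as they might arrive with stream: true.
sse_lines = [
    'data: {"id": "chatcmpl-123", "choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"id": "chatcmpl-123", "choices": [{"delta": {"content": " there"}}]}',
    "data: [DONE]",
]

def collect_stream(lines):
    """Concatenate delta.content fragments until the [DONE] sentinel."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip SSE comments and keep-alive blanks
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

Note that `data: [DONE]` must be special-cased before JSON parsing, since it is not a valid JSON payload.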
Q5: What should I keep in mind when enabling response_format for JSON mode?
A: When setting response_format to { "type": "json_object" }, you must also explicitly instruct the model via a message (system or user prompt) to generate JSON. Failure to do so may cause the model to output endless whitespace, leading to request timeouts or token limit exhaustion.
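A request body that satisfies both halves of that requirement might look like this (the prompt wording is illustrative):

```python
import json

# JSON-mode request: response_format is set AND the system message
# explicitly instructs the model to emit JSON, as Q5 requires.
payload = {
    "model": "gpt-4o",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant. Respond only with a JSON object.",
        },
        {"role": "user", "content": "List two primary colors."},
    ],
    "response_format": {"type": "json_object"},
}

body = json.dumps(payload)  # serialized request body
```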