Create Translation (Not Supported)
Note: The current endpoint is marked as "(Not Supported)". This may indicate that it is deprecated, temporarily unavailable, or in a testing phase. Please confirm its availability with the backend administrator before integrating it into a production environment.
1. Basic Endpoint Information
- Description: Translates audio files into English text.
- Request Method: POST
- Request URL: https://api.codingplanx.ai/v1/audio/translations
- Content-Type: multipart/form-data
2. Request Parameters
2.1 Header Parameters
An authentication token is typically required (e.g., Authorization: Bearer <Your-Token>). Please refer to the global authentication specifications for details.
2.2 Body Parameters (form-data)
| Parameter | Type | Required | Description | Example Value |
|---|---|---|---|---|
| file | file | Yes | The audio file object to translate (Note: You must upload the actual file, not a file name string).<br>Supported formats include: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm. | test.m4a |
| model | string | Yes | ID of the model to use.<br>Note: Usually whisper-1. | gpt-4o-transcribe |
| prompt | string | No | An optional text to guide the model's style or continue a previous audio segment.<br>Note: The prompt must be in English. | Translate the following German speech into English. |
| response_format | string | No | Specifies the output format of the translation.<br>Supported values: json, text, srt, verbose_json, or vtt. Defaults to json. | json |
| temperature | number | No | The sampling temperature, between 0 and 1. Defaults to 0.<br>- Higher values (e.g., 0.8) will make the output more random.<br>- Lower values (e.g., 0.2) will make it more focused and deterministic.<br>- If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. | 0 |
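
The constraints in the table above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the helper name `validate_translation_params` and its error messages are hypothetical, while the allowed formats and the temperature range are copied from the table. The server remains the authority on what it accepts.

```python
from pathlib import Path

# Values copied from the Body Parameters table.
SUPPORTED_AUDIO_FORMATS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
SUPPORTED_RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def validate_translation_params(file_name, model, response_format="json", temperature=0.0):
    """Return a list of problems; an empty list means the inputs look valid.

    Hypothetical client-side helper. It only mirrors the documented
    constraints and does not contact the server.
    """
    problems = []
    extension = Path(file_name).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_AUDIO_FORMATS:
        problems.append(f"unsupported audio format: {extension or '(none)'}")
    if not model:
        problems.append("model is required")
    if response_format not in SUPPORTED_RESPONSE_FORMATS:
        problems.append(f"unsupported response_format: {response_format}")
    if not 0 <= temperature <= 1:
        problems.append(f"temperature must be between 0 and 1, got {temperature}")
    return problems
```

Calling `validate_translation_params("test.m4a", "gpt-4o-transcribe")` returns an empty list, so the request can proceed.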
3. Response Description
3.1 Successful Response Format
When response_format is omitted or set to json, the following JSON object is returned:
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The translated and transcribed plain English text. |
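
Only the json shape is documented here. The sketch below shows one way to read the result under two assumptions that this document does not confirm: that verbose_json also carries a top-level `text` field, and that text, srt, and vtt return the body as plain text rather than JSON. The helper name `extract_translation` is hypothetical.

```python
import json


def extract_translation(body, response_format="json"):
    """Return the translated text from a raw response body (a str).

    The "json" branch follows the documented schema. The handling of
    verbose_json and of the non-JSON formats is an assumption.
    """
    if response_format in ("json", "verbose_json"):
        payload = json.loads(body)
        if not isinstance(payload, dict) or "text" not in payload:
            raise ValueError("response has no 'text' field")
        return payload["text"]
    # text, srt, vtt: assumed to be returned verbatim as the body
    return body
```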
4. Request & Response Examples
4.1 cURL Request Example
curl --location --request POST 'https://api.codingplanx.ai/v1/audio/translations' \
--header 'Authorization: Bearer <YOUR_API_KEY>' \
--form 'file=@"/path/to/test.m4a"' \
--form 'model="gpt-4o-transcribe"' \
--form 'response_format="json"' \
--form 'temperature="0"'
4.2 Successful Response Example (HTTP 200)
{
"text": "Hello, my name is Wolfgang and I come from Germany. Where are you heading today?"
}
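
For callers who are not using cURL, the same request can be sketched in Python with only the standard library. This is an illustrative sketch, not an official client: the function names `build_multipart` and `translate_audio` are hypothetical, the file part is sent as `application/octet-stream` (an assumption about what the server accepts), and the response is assumed to follow the json format shown above.

```python
import json
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://api.codingplanx.ai/v1/audio/translations"


def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    file_header = (
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{file_name}"\r\nContent-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(file_header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def translate_audio(path, api_key, model="gpt-4o-transcribe"):
    """POST the audio file and return the "text" field of the JSON response."""
    audio = Path(path)
    body, content_type = build_multipart(
        {"model": model, "response_format": "json", "temperature": "0"},
        "file",
        audio.name,
        audio.read_bytes(),
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

A call such as `translate_audio("test.m4a", "<YOUR_API_KEY>")` would then return the translated string, provided the endpoint is available.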
5. Frequently Asked Questions (FAQs)
Q1: The endpoint is marked as "Not Supported". Can I still use it?
A: This label usually indicates that the model or endpoint is in beta testing, is not fully compatible, or has had its maintenance paused on the gateway/platform. Contact the platform administrator to confirm before use. If text in the audio's original language is sufficient for your use case, the standard Transcriptions API is an alternative.
Q2: What is the difference between the Translations API and the Transcriptions API?
A: The Transcriptions API transcribes the audio into text in its original language. The Translations API recognizes the input audio language (e.g., German, French, Chinese) and outputs English text directly.
Q3: Why must the prompt parameter be in English?
A: Because the target output language of the Translations API is English. An English prompt gives the model context, spelling guidance for proper nouns, and tonal references, which can improve the accuracy of the English output.
Q4: Is there a file size limit for uploaded audio?
A: This document does not state a limit. Based on the standard limitations of underlying models (like Whisper), keep each audio file at or below 25 MB. If your file is larger, consider compressing it or splitting the audio into chunks before uploading.
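
A size check before uploading can be sketched as follows. The 25 MB figure is the assumed limit from the answer above, not a confirmed property of this endpoint, and interpreting it as 25 × 1024 × 1024 bytes is a further assumption. The helper name `needs_splitting` is hypothetical.

```python
import os

# Assumed limit; this document does not confirm it for this endpoint.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_splitting(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is larger than `limit` bytes."""
    return os.path.getsize(path) > limit
```

When this returns True, the audio should be compressed or split at the audio level (for example, by duration) rather than by cutting raw bytes, since a byte-level cut would generally not produce valid audio files.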
Q5: What is the best setting for temperature?
A: For audio translation, the goal is accuracy rather than creativity. Therefore, it is strongly recommended to keep the default value of 0, which gives the most focused and deterministic output the model allows. Only consider slightly increasing this value (e.g., to 0.2) if you notice the translation getting stuck in a loop or failing to transcribe certain specific dialects.