Audio transcription is a Pro feature. Upgrade to access.
Endpoint
Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `file` | file | Yes | Audio file to transcribe (max 25MB). |
| `model` | string | Yes | Use `whisper-1` for standard transcription. |
| `language` | string | No | ISO-639-1 code (e.g., `en`, `es`, `fr`). Auto-detected if omitted. |
| `prompt` | string | No | Optional context to guide transcription style. |
| `response_format` | string | No | Output format: `json`, `text`, `srt`, `vtt`, `verbose_json`. |
| `timestamp_granularities` | array | No | Include word and/or segment timestamps. |
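
The parameter rules in the table can be checked on the client before any upload. The following is a minimal sketch, not part of the API itself: the helper name `build_fields` is hypothetical, and the granularity values `word` and `segment` are an assumption inferred from the description of `timestamp_granularities`.

```python
# Allowed values taken from the parameter table above.
RESPONSE_FORMATS = {"json", "text", "srt", "vtt", "verbose_json"}
# Assumption: the array accepts the literal strings "word" and "segment".
GRANULARITIES = {"word", "segment"}


def build_fields(model, language=None, prompt=None,
                 response_format=None, timestamp_granularities=None):
    """Validate the non-file parameters and return them as a dict (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    fields = {"model": model}

    if language is not None:
        code = language.lower()
        # ISO-639-1 codes are two ASCII letters; this checks the shape only,
        # not whether the code is actually assigned to a language.
        if not (len(code) == 2 and code.isascii() and code.isalpha()):
            raise ValueError(f"language must be a two-letter ISO-639-1 code: {language!r}")
        fields["language"] = code

    if prompt is not None:
        fields["prompt"] = prompt

    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        fields["response_format"] = response_format

    if timestamp_granularities is not None:
        unknown = set(timestamp_granularities) - GRANULARITIES
        if unknown:
            raise ValueError(f"unsupported granularities: {sorted(unknown)}")
        fields["timestamp_granularities"] = list(timestamp_granularities)

    return fields
```

Omitted optional parameters are left out of the result entirely, so server-side defaults such as language auto-detection still apply.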
Supported Audio Formats
mp3, mp4, mpeg, mpga, m4a, wav, webm
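
A quick pre-flight check can catch unsupported files before they are uploaded. This sketch is illustrative only: `check_audio_file` is a hypothetical helper, it judges the format by file extension alone, and it assumes "25MB" means 25 × 1024 × 1024 bytes, which the documentation does not specify.

```python
from pathlib import Path

# Extensions from the supported formats list above.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
# Assumption: the 25MB limit is interpreted as 25 MiB.
MAX_BYTES = 25 * 1024 * 1024


def check_audio_file(name, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(name).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_BYTES}")
    return problems
```

For a file on disk, the size can be read with `Path(path).stat().st_size` and passed in alongside the file name.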
Example Usage
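
As a rough sketch of what a request could look like, the code below builds a multipart upload using only the Python standard library. Several details are assumptions rather than documented behavior: `API_URL` is a placeholder and must be replaced with the real endpoint, the `Authorization: Bearer` header is an assumed auth scheme, and the helper names are hypothetical.

```python
import io
import uuid
import urllib.request

# Placeholder URL (assumption): substitute the endpoint documented for your account.
API_URL = "https://api.example.com/v1/audio/transcriptions"


def encode_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one `file` part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    for name, value in fields.items():
        body.write(f"--{boundary}\r\n".encode())
        body.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        body.write(f"{value}\r\n".encode())
    # The audio goes in the part named "file", matching the parameter table.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(file_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


def build_transcription_request(api_key, file_name, file_bytes, language=None):
    """Build (but do not send) a POST request for a transcription."""
    fields = {"model": "whisper-1", "response_format": "json"}
    if language:
        fields["language"] = language
    body, content_type = encode_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": content_type,
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` and the response body decoded according to the chosen `response_format`.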
Response Formats
| Format | Description |
|---|---|
| `json` | Simple JSON with text only. |
| `text` | Plain text output. |
| `srt` | SubRip subtitle format. |
| `vtt` | WebVTT subtitle format. |
| `verbose_json` | JSON with word-level timestamps and metadata. |
Verbose JSON Response
Best Practices
- Audio Quality: Clear audio with minimal background noise yields best results
- File Size: Keep files under 25MB (use compression if needed)
- Language Hints: Specify `language` for non-English audio to improve accuracy
- Prompt Context: Use `prompt` to provide domain-specific terminology
Batch Processing
Need to process large volumes? Contact us for batch API access.