Audio transcription is a Pro feature. Upgrade to access.
Transform audio files into accurate text transcriptions using our optimized Whisper implementation.

Endpoint

POST https://api.llm.kiwi/v1/audio/transcriptions

Request Parameters

Parameter                Type    Required  Description
file                     file    Yes       Audio file to transcribe (max 25MB).
model                    string  Yes       Use whisper-1 for standard transcription.
language                 string  No        ISO-639-1 code (e.g., en, es, fr). Auto-detected if omitted.
prompt                   string  No        Optional context to guide transcription style.
response_format          string  No        Output format: json, text, srt, vtt, verbose_json.
timestamp_granularities  array   No        Include word and/or segment timestamps.

Supported Audio Formats

mp3, mp4, mpeg, mpga, m4a, wav, webm
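
The size limit and format list above can be checked locally before uploading, which avoids a wasted request. The following is a minimal sketch, not part of the API: the helper name check_audio_file is hypothetical, "25MB" is assumed to mean 25 * 1024 * 1024 bytes, and the format is inferred from the file extension only.

```python
from pathlib import Path

# Limits taken from the parameter table and the supported-format list.
# Assumption: "25MB" is read as 25 * 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def check_audio_file(path: str) -> list[str]:
    """Return a list of problems found; an empty list means the file looks OK."""
    problems = []
    p = Path(path)

    # The extension is only a hint; the file contents are not inspected.
    ext = p.suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")

    if not p.is_file():
        problems.append("file does not exist")
    elif p.stat().st_size > MAX_BYTES:
        problems.append(f"file is larger than {MAX_BYTES} bytes")

    return problems
```

Calling check_audio_file("recording.mp3") before opening the file lets a script report every problem at once instead of failing on the first one.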

Example Usage

from openai import OpenAI

# Point the OpenAI SDK at the llm.kiwi endpoint.
client = OpenAI(
    base_url="https://api.llm.kiwi/v1",
    api_key="YOUR_API_KEY"
)

# Open the audio in binary mode and request word and segment timestamps.
with open("recording.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        response_format="verbose_json",
        timestamp_granularities=["word", "segment"]
    )

# The full transcription text.
print(transcript.text)

Response Formats

Format        Description
json          Simple JSON with text only.
text          Plain text output.
srt           SubRip subtitle format.
vtt           WebVTT subtitle format.
verbose_json  JSON with word-level timestamps and metadata.

Verbose JSON Response

{
  "text": "Hello, welcome to our podcast.",
  "language": "en",
  "duration": 5.2,
  "words": [
    { "word": "Hello", "start": 0.0, "end": 0.5 },
    { "word": "welcome", "start": 0.6, "end": 1.0 },
    { "word": "to", "start": 1.1, "end": 1.2 },
    { "word": "our", "start": 1.3, "end": 1.5 },
    { "word": "podcast", "start": 1.6, "end": 2.1 }
  ]
}
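
The words array makes it possible to look up what was said in a given time window. The sketch below assumes the verbose JSON response has already been parsed into a plain Python dict with the shape shown above (with the OpenAI SDK, the response object would likely need converting first, for example with model_dump()). The helper names format_timestamp and words_between are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (the timestamp style used by SRT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_between(response: dict, start: float, end: float) -> list[str]:
    """Return the words that fall entirely inside the window [start, end]."""
    return [
        w["word"]
        for w in response.get("words", [])
        if w["start"] >= start and w["end"] <= end
    ]


# Sample data copied from the verbose JSON response above.
sample = {
    "text": "Hello, welcome to our podcast.",
    "language": "en",
    "duration": 5.2,
    "words": [
        {"word": "Hello", "start": 0.0, "end": 0.5},
        {"word": "welcome", "start": 0.6, "end": 1.0},
        {"word": "to", "start": 1.1, "end": 1.2},
        {"word": "our", "start": 1.3, "end": 1.5},
        {"word": "podcast", "start": 1.6, "end": 2.1},
    ],
}

print(words_between(sample, 0.5, 1.5))       # ['welcome', 'to', 'our']
print(format_timestamp(sample["duration"]))  # 00:00:05,200
```

A response without a words key yields an empty list rather than an error, since response.get falls back to an empty list.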

Best Practices

  • Audio Quality: Clear audio with minimal background noise yields the best results.
  • File Size: Keep files under 25MB (compress or re-encode larger files if needed).
  • Language Hints: Set the language parameter for non-English audio to improve accuracy.
  • Prompt Context: Use the prompt parameter to supply domain-specific terminology.
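
The language and prompt practices above can be folded into a small helper that assembles the optional request parameters. This is a sketch under stated assumptions: the helper name build_transcription_kwargs and the "Glossary: ..." prompt wording are illustrative choices, not something the API prescribes, and the language check only verifies that the code is two letters.

```python
def build_transcription_kwargs(language=None, terms=None):
    """Assemble keyword arguments for a transcription request.

    language: optional ISO-639-1 code such as "es".
    terms: optional list of domain-specific words to pass via the prompt.
    """
    kwargs = {"model": "whisper-1"}

    if language is not None:
        code = language.strip().lower()
        # ISO-639-1 codes are two letters; this does not check the code exists.
        if len(code) != 2 or not code.isalpha():
            raise ValueError(f"expected a two-letter ISO-639-1 code, got {language!r}")
        kwargs["language"] = code

    if terms:
        # Hypothetical prompt format: a plain comma-separated glossary.
        kwargs["prompt"] = "Glossary: " + ", ".join(terms)

    return kwargs


# Usage, with the client and audio_file from the example above:
# client.audio.transcriptions.create(
#     file=audio_file,
#     **build_transcription_kwargs("es", ["Kubernetes", "gRPC"]),
# )
```

Keeping the parameters in one place makes it easy to reuse the same glossary across many files.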

Batch Processing

Need to process large volumes? Contact us for batch API access.