Chat Completions

The Chat Completions API is your primary interface for building chatbots, virtual assistants, and text generation applications.

Endpoint

POST https://api.llm.kiwi/v1/chat/completions

Request Parameters

Parameter	Type	Required / Default	Description
`model`	string	required	Model ID: `default` (Free) or any Pro model (e.g. `gpt-oss-20b`).
`messages`	array	required	Conversation history as a list of message objects.
`stream`	boolean	default: `false`	Enable streaming responses.
`temperature`	number	default: `1`	Sampling temperature, from 0 to 2.
`max_tokens`	integer	optional	Maximum number of tokens to generate.
`response_format`	object	optional	Set `{ "type": "json_object" }` for JSON mode. (Pro)
`tools`	array	optional	Function definitions for tool calling. (Pro)
`tool_choice`	string or object	optional	Controls tool selection behavior. (Pro)
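As an illustration of how the parameters above fit together, here is a minimal sketch that assembles a request body as a plain dictionary. The helper name `build_chat_request` and its `json_mode` flag are hypothetical and not part of the API; the sketch only assumes the parameter names, defaults, and the 0 to 2 temperature range listed above.

```python
import json


def build_chat_request(messages, model="default", temperature=1.0,
                       max_tokens=None, stream=False, json_mode=False):
    """Assemble a request body for POST /v1/chat/completions.

    Hypothetical helper for illustration; not part of any SDK.
    """
    if not messages:
        raise ValueError("messages must contain at least one message")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body = {
        "model": model,
        "messages": messages,
        "stream": stream,
        "temperature": temperature,
    }
    # Optional parameters are only included when explicitly set.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if json_mode:
        # JSON mode is documented as a Pro feature.
        body["response_format"] = {"type": "json_object"}
    return body


body = build_chat_request(
    [{"role": "user", "content": "Hello!"}],
    temperature=0.2,
    max_tokens=100,
)
print(json.dumps(body, indent=2))
```

Validating the temperature locally is a design choice of this sketch; the server may apply its own checks.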

Message Format

Each message has a role and content:

Role	Description
`system`	Sets the assistant’s behavior and persona.
`user`	Input from the end user.
`assistant`	Previous model responses for context.
`tool`	Results from function/tool calls.

Example Conversation

[
  {"role": "system", "content": "You are a helpful coding assistant."},
  {"role": "user", "content": "How do I reverse a string in Python?"}
]
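For multi-turn chats, the conversation history grows by one message per turn. The sketch below shows one way to maintain it; `add_turn` is a hypothetical helper, and the assistant reply is hard-coded here rather than fetched from the API.

```python
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def add_turn(history, role, content):
    """Return a new history list with one more message appended.

    Hypothetical helper for illustration; it leaves the input list unchanged.
    """
    if role not in ALLOWED_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return history + [{"role": role, "content": content}]


history = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "How do I reverse a string in Python?"},
]

# In a real application this text would come from the model's response.
history = add_turn(history, "assistant", "Use slicing: `my_string[::-1]`")
history = add_turn(history, "user", "And how do I reverse a list?")

for message in history:
    print(f"{message['role']}: {message['content']}")
```

The full `history` list is what would be sent as `messages` on the next request, so the model sees the earlier turns as context.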

Basic Request

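from openai import OpenAI

# Point the OpenAI SDK at the llm.kiwi endpoint.
client = OpenAI(
    base_url="https://api.llm.kiwi/v1",
    api_key="YOUR_API_KEY"
)

# Send a single user message to the free default model.
response = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "Hello!"}]
)

# The generated text is in the first choice.
print(response.choices[0].message.content)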

Streaming Responses

Set stream to true to receive tokens as they are generated. This example reuses the client from Basic Request:

stream = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True
)

for chunk in stream:
    # Some chunks may carry no choices or an empty delta; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
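If the full text is needed after streaming finishes, the pieces can be collected as they arrive. The sketch below is one way to do that; `collect_stream` and `fake_chunk` are hypothetical names, and the fake chunks only mimic the `choices[0].delta.content` shape used in the loop above so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text pieces from an iterable of streamed chunks."""
    parts = []
    for chunk in chunks:
        # Skip chunks without choices or without text in the delta.
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for a streamed chunk, for offline demonstration only."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo_chunks = [
    fake_chunk("Once "),
    fake_chunk(None),                # a chunk with no text
    fake_chunk("upon a time."),
    SimpleNamespace(choices=[]),     # a chunk with no choices
]

story = collect_stream(demo_chunks)
print(story)  # Once upon a time.
```

With a live stream, the same function could be called as `collect_stream(stream)` in place of the printing loop.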

Response Object

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Use slicing: `my_string[::-1]`"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 20,
    "total_tokens": 32
  }
}
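To show where the useful fields sit in this object, here is a minimal sketch that parses the sample response as raw JSON and pulls out the reply text, the finish reason, and the token count. `summarize_response` is a hypothetical helper; the field names are taken from the sample above.

```python
import json


def summarize_response(response):
    """Extract the commonly used fields from a chat completion response dict."""
    choice = response["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": response["usage"]["total_tokens"],
    }


raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Use slicing: `my_string[::-1]`"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 20, "total_tokens": 32}
}
"""

summary = summarize_response(json.loads(raw))
print(summary["text"])
print(summary["finish_reason"], summary["total_tokens"])
```

When using the OpenAI SDK as in Basic Request, the same fields are available as attributes, for example `response.choices[0].message.content`.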
