The Chat Completions API is your primary interface for building chatbots, virtual assistants, and text generation applications.
Endpoint
POST https://api.llm.kiwi/v1/chat/completions
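If you are not using an SDK, the endpoint can be reached with a plain HTTP POST. The sketch below only builds the request with Python's standard library and does not send it. The Bearer-token Authorization header is an assumption based on the OpenAI-compatible client used elsewhere on this page, and build_request is a hypothetical helper name.

```python
import json
import urllib.request

API_URL = "https://api.llm.kiwi/v1/chat/completions"

def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer-token auth, as in OpenAI-compatible APIs.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_request("YOUR_API_KEY", {
    "model": "default",
    "messages": [{"role": "user", "content": "Hello!"}],
})
# To send it, pass req to urllib.request.urlopen() and decode the JSON body.
```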
Request Parameters
model: Model ID. Use default (Free) or any Pro model (e.g. gpt-oss-20b).
messages: Conversation history as a list of message objects.
stream: Enable streaming responses.
temperature: Sampling temperature (0-2).
max_tokens: Maximum number of tokens to generate.
response_format: Set to { "type": "json_object" } for JSON mode. (Pro)
tools: Function definitions for tool calling. (Pro)
tool_choice: Controls tool selection behavior. (Pro)
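The request parameters can be assembled into keyword arguments before calling the API. This is a minimal sketch, not an official helper: build_chat_kwargs is a hypothetical function, and the names temperature, max_tokens, and response_format are assumed to follow the OpenAI-compatible convention.

```python
def build_chat_kwargs(messages, model="default", temperature=None,
                      max_tokens=None, json_mode=False, stream=False):
    """Assemble keyword arguments for client.chat.completions.create()."""
    # The documented sampling range is 0-2, so reject anything outside it early.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    kwargs = {"model": model, "messages": messages}
    if stream:
        kwargs["stream"] = True
    if temperature is not None:
        kwargs["temperature"] = temperature
    if max_tokens is not None:
        kwargs["max_tokens"] = max_tokens  # assumed parameter name
    if json_mode:
        # JSON mode is listed as a Pro feature.
        kwargs["response_format"] = {"type": "json_object"}
    return kwargs

kwargs = build_chat_kwargs(
    [{"role": "user", "content": "List three colors as JSON."}],
    temperature=0.2, max_tokens=100, json_mode=True,
)
```

With a configured client, the result would be passed as client.chat.completions.create(**kwargs).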
Each message has a role and content:
Role        Description
system      Sets the assistant’s behavior and persona.
user        Input from the end user.
assistant   Previous model responses for context.
tool        Results from function/tool calls.
Example Conversation
[
  { "role": "system", "content": "You are a helpful coding assistant." },
  { "role": "user", "content": "How do I reverse a string in Python?" }
]
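Because the conversation history is sent with each request, a multi-turn chat is built by appending every new message to the same list. The following sketch uses a hypothetical add_message helper to illustrate this. It covers only role and content; tool messages may require additional fields that this page does not describe.

```python
VALID_ROLES = {"system", "user", "assistant", "tool"}

def add_message(history, role, content):
    """Return a new history list with one message appended."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    return history + [{"role": role, "content": content}]

history = [{"role": "system", "content": "You are a helpful coding assistant."}]
history = add_message(history, "user", "How do I reverse a string in Python?")
# After a response arrives, store it so the next request has the full context.
history = add_message(history, "assistant", "Use slicing: my_string[::-1]")
history = add_message(history, "user", "And how do I reverse a list?")
```

The resulting list can be passed as the messages value of the next request.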
Basic Request
from openai import OpenAI

# Point the OpenAI SDK at the llm.kiwi endpoint.
client = OpenAI(
    base_url="https://api.llm.kiwi/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "Hello!"}],
)

# The generated text is in the first choice.
print(response.choices[0].message.content)
Streaming Responses
Enable streaming for real-time token delivery:
stream = client.chat.completions.create(
    model="default",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
)

# Print each piece of text as soon as it arrives.
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
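If you also need the complete text once streaming ends, the deltas can be collected as they arrive. The sketch below assumes chunks have the shape used in the loop above (choices[0].delta.content). collect_stream and fake_chunk are hypothetical helpers, and fake_chunk exists only so the example runs without a network call.

```python
from types import SimpleNamespace

def collect_stream(stream):
    """Concatenate the content deltas of a streamed response into one string."""
    parts = []
    for chunk in stream:
        # Some chunks may carry no choices or an empty delta; skip those.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

def fake_chunk(text):
    """Hypothetical stand-in for an SDK chunk object, for offline testing."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

full_text = collect_stream(
    [fake_chunk("Once "), fake_chunk(None), fake_chunk("upon a time")]
)
```

In real use, the stream object returned by the API call would be passed to collect_stream in place of the fake chunks.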
Response Object
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Use slicing: `my_string[::-1]`"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 20,
    "total_tokens": 32
  }
}
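The fields of the response object can be read with ordinary dictionary access when working with the raw JSON. The sketch below parses the sample response shown above. summarize is a hypothetical helper, and treating a finish_reason of "length" as a truncated reply is an assumption carried over from the OpenAI-compatible convention.

```python
import json

raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706745600,
  "model": "gpt-oss-20b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Use slicing: `my_string[::-1]`"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 20, "total_tokens": 32}
}
"""

def summarize(response: dict) -> dict:
    """Pull out the reply text, a truncation flag, and the token total."""
    choice = response["choices"][0]
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" means the token limit cut the reply short.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": response["usage"]["total_tokens"],
    }

summary = summarize(json.loads(raw))
```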
Pro Features
JSON Mode: Get structured JSON responses with guaranteed valid output.
Function Calling: Enable models to call your functions and APIs.
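As an illustration of function calling, the sketch below defines one tool and shows how a requested call might be executed locally and wrapped as a tool message. The tool schema and the tool_call_id field are assumed to follow the OpenAI-compatible convention, which this page does not confirm. get_weather, REGISTRY, and run_tool_call are hypothetical names, and the weather data is a placeholder.

```python
import json

def get_weather(city: str) -> dict:
    """Hypothetical local function the model is allowed to call."""
    return {"city": city, "forecast": "sunny"}  # placeholder, not a real lookup

# Tool definition in the OpenAI-compatible schema (assumed to apply here).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Map tool names to the Python callables that implement them.
REGISTRY = {"get_weather": get_weather}

def run_tool_call(call_id: str, name: str, arguments_json: str) -> dict:
    """Execute one requested tool call and wrap the result as a tool message."""
    result = REGISTRY[name](**json.loads(arguments_json))
    return {"role": "tool", "tool_call_id": call_id, "content": json.dumps(result)}

tool_message = run_tool_call("call_1", "get_weather", '{"city": "Wellington"}')
```

In a full exchange, tools would be sent with the request, and tool_message would be appended to the conversation history before requesting the final answer.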