Rate Limits - LLM.kiwi

To ensure platform stability and fair access for all users, llm.kiwi implements rate limits across all API endpoints.

Rate Limits by Tier

Daily request limits reset at midnight UTC. Requests-per-minute (RPM) limits apply on top of the daily quota.

Tier	Daily Requests	Requests/Min (RPM)	Max Tokens
Anonymous	100 (shared)	10	2,048
Free (registered)	500	20	4,096
Pro	Unlimited*	100	32,768
Enterprise	Custom	Custom	Custom

*Pro tier has soft limits for abuse prevention. Contact support if you need higher throughput.
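
One way to stay under a tier's RPM limit is to pace requests on the client side. The following is a minimal sketch, assuming the RPM values from the table above; `MinIntervalPacer` is a hypothetical helper and not part of any llm.kiwi SDK.

```python
import time


class MinIntervalPacer:
    """Hypothetical helper: spaces calls at least 60/rpm seconds apart."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        if rpm <= 0:
            raise ValueError("rpm must be positive")
        self.min_interval = 60.0 / rpm
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None:
            delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # The call starts at now + delay; schedule the following slot after it.
        self._next_allowed = now + delay + self.min_interval
        return delay
```

Calling `pacer.wait()` before each request with `MinIntervalPacer(rpm=20)` spaces requests at least 3 seconds apart, which matches the Free tier's 20 RPM. Even pacing is a conservative choice: it gives up short bursts in exchange for never exceeding the per-minute limit.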

Rate Limit Headers

Each API response includes headers to help you track your usage:

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed per minute
`X-RateLimit-Remaining`	Requests remaining in the current window
`X-RateLimit-Reset`	Unix timestamp when the limit resets
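
As an illustration, the headers above could be read like this. This is a sketch under two assumptions: the header values are integer strings, and `X-RateLimit-Reset` is expressed in seconds. The function name `parse_rate_limit_headers` is made up for this example.

```python
import time


def parse_rate_limit_headers(headers, now=None):
    """Return (limit, remaining, seconds_until_reset) from response headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    limit = int(lowered["x-ratelimit-limit"])
    remaining = int(lowered["x-ratelimit-remaining"])
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = int(lowered["x-ratelimit-reset"])
    now = time.time() if now is None else now
    seconds_until_reset = max(0.0, float(reset_at - now))
    return limit, remaining, seconds_until_reset
```

If `remaining` reaches 0, sleeping for `seconds_until_reset` before the next call avoids a request that would be rejected.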

Handling Rate Limits

When a rate limit is reached, the API returns a 429 Too Many Requests response.
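
In client code it can help to turn a 429 status into a dedicated exception that retry logic can catch. The sketch below is library-agnostic: `RateLimitError` and `check_rate_limit` are hypothetical names, and it assumes a 429 response carries the `X-RateLimit-Reset` header, falling back to `None` if it does not.

```python
class RateLimitError(Exception):
    """Hypothetical exception for a 429 Too Many Requests response."""

    def __init__(self, reset_at=None):
        super().__init__("429 Too Many Requests")
        # Unix timestamp from X-RateLimit-Reset, or None if the header is absent.
        self.reset_at = reset_at


def check_rate_limit(status_code, headers):
    """Raise RateLimitError for a 429 response; do nothing otherwise."""
    if status_code != 429:
        return
    reset = headers.get("X-RateLimit-Reset")
    raise RateLimitError(reset_at=int(reset) if reset is not None else None)
```

Call `check_rate_limit(response_status, response_headers)` right after each request, passing whatever status code and header mapping your HTTP client exposes.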

Retry Strategy

Retry with exponential backoff and jitter, so that the delay grows after each failed attempt and clients do not all retry at the same moment:

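import random
import time


class RateLimitError(Exception):
    """Placeholder: replace with the exception your HTTP client raises on a 429."""


def make_request_with_retry(func, max_retries=5):
    """Call func(), retrying with exponential backoff and jitter on RateLimitError."""
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            # Out of retries: let the caller see the error.
            if attempt == max_retries - 1:
                raise
            # Waits about 1s, 2s, 4s, ... plus up to 1s of random jitter.
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait_time)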

Best Practices

Exponential Backoff: Wait progressively longer between retries (about 1s, 2s, 4s, and so on, plus jitter)
Request Batching: Combine multiple operations when possible
Token Monitoring: Check `X-RateLimit-Remaining` and `X-RateLimit-Reset` in each response to anticipate limits before you hit them
Caching: Cache responses for repeated identical queries
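
The caching practice above can be sketched as a small in-memory cache keyed on the request payload. `ResponseCache` and the `fetch` callable are hypothetical, and the sketch assumes payloads are JSON-serializable and that reusing an earlier response is acceptable for your use case.

```python
import hashlib
import json


class ResponseCache:
    """Hypothetical in-memory cache for repeated identical requests."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable that performs the real API request
        self._store = {}

    def get(self, payload):
        # Sorting keys makes logically identical payloads hash to the same key.
        serialized = json.dumps(payload, sort_keys=True).encode("utf-8")
        key = hashlib.sha256(serialized).hexdigest()
        if key not in self._store:
            self._store[key] = self._fetch(payload)
        return self._store[key]
```

Only cache misses count against the rate limit. The cache grows without bound here, so a long-running process would want an eviction policy or an expiry time.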

Increasing Your Limits

Upgrade to Pro

Unlock unlimited daily requests (subject to soft abuse-prevention limits), 100 requests per minute, and a 32,768 max-token limit.

Enterprise

Custom limits tailored to your organization’s needs.
