
Two Types of Limits

Ardie enforces two types of limits on API usage:
  1. Rate Limits — Requests per minute
  2. Usage Caps — Total queries per billing period

Rate Limits

To ensure fair usage and system stability, API requests are rate-limited:
| Limit | Value |
| --- | --- |
| Requests per minute | 60 |
| Per API key | Yes |
Rate limits are applied per API key, not per account. If you have multiple keys, each has its own rate limit.
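Because each key has its own limit, a client can track its request budget per key and avoid most 429 responses. The following is a minimal sketch of a client-side limiter, not part of the Ardie API: `MinuteRateLimiter` and `limiter_for` are hypothetical names, and the sliding 60-second window is an assumption, since the server's exact windowing is not described here.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding-window limiter (hypothetical helper)."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Return how many seconds to wait before the next request is allowed."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The window is full: wait until the oldest request ages out.
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self._clock())


# One limiter per API key, mirroring the per-key limit described above.
_limiters = {}


def limiter_for(api_key):
    return _limiters.setdefault(api_key, MinuteRateLimiter())
```

Before each request, sleep for `limiter_for(api_key).wait_time()` seconds, then call `record()` once the request is sent. The server remains the source of truth, so 429 responses still need handling.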

Rate Limit Headers

Every response includes headers showing your rate limit status:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1704067200
| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests per minute |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when the limit resets |
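As an illustration, these headers can be read into a small status dictionary. This is a sketch: `parse_rate_limit` is a hypothetical helper name, and it assumes all three headers are present with the exact casing shown above. The `headers` attribute of a `requests` response is case-insensitive, but a plain `dict` is not.

```python
import time


def parse_rate_limit(headers, now=None):
    """Extract rate limit status from response headers (a str -> str mapping)."""
    now = time.time() if now is None else now
    reset_at = int(headers["X-RateLimit-Reset"])
    return {
        "limit": int(headers["X-RateLimit-Limit"]),
        "remaining": int(headers["X-RateLimit-Remaining"]),
        "reset_at": reset_at,
        # Never negative, even if the reset time has already passed.
        "seconds_until_reset": max(0.0, reset_at - now),
    }
```

When `remaining` reaches 0, a client can pause for `seconds_until_reset` before sending the next request instead of waiting for a 429.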

Rate Limit Exceeded

When you exceed the rate limit, the API returns a 429 response with a body like this:
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please wait before retrying.",
  "retry_after": 15
}
Best practice: Implement exponential backoff when you receive a 429 response.
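One way to compute the wait between retries is sketched below. `backoff_delay` is a hypothetical helper. The `base` and `cap` values, the random jitter, and treating `retry_after` as a number of seconds are all assumptions rather than documented Ardie behavior.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Prefer the server's hint when the response body provides one.
        return float(retry_after)
    # Exponential growth: base, 2*base, 4*base, ... limited to `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick uniformly in [0, ceiling] so that many clients
    # do not all retry at the same moment.
    return random.uniform(0, ceiling)
```

The jitter is a design choice rather than a requirement. Without it, clients that were rate limited together tend to retry together.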

Usage Caps

Your plan includes a fixed number of queries per billing period. This is a hard cap: once you reach it, requests are rejected until the next billing period begins or you upgrade your plan.

Checking Your Usage

Monitor usage via the Dashboard or check the response headers:
X-Usage-Used: 450
X-Usage-Cap: 500
X-Usage-Remaining: 50
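To act on these headers before the cap is reached, a client can compute how much of the cap is used and flag when it is close. This is a minimal sketch: `usage_status` is a hypothetical helper and the 90% warning threshold is an arbitrary assumption.

```python
def usage_status(headers, warn_fraction=0.9):
    """Summarize query usage from response headers (a str -> str mapping)."""
    used = int(headers["X-Usage-Used"])
    cap = int(headers["X-Usage-Cap"])
    remaining = int(headers["X-Usage-Remaining"])
    # Treat a zero cap as fully used to avoid dividing by zero.
    fraction_used = used / cap if cap else 1.0
    return {
        "used": used,
        "cap": cap,
        "remaining": remaining,
        "fraction_used": fraction_used,
        "near_cap": fraction_used >= warn_fraction,
    }
```

With the example headers above (450 of 500 used), `near_cap` is true at the default threshold.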

Cap Exceeded Response

When you hit your query cap:
{
  "error": "query_cap_exceeded",
  "message": "You have exceeded your monthly query cap of 500 queries.",
  "usage": {
    "used": 500,
    "cap": 500
  },
  "upgrade_url": "https://ardie.ai/billing"
}
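The two error bodies call for different handling: a rate limit clears after a short wait, while a cap does not clear until the next period or an upgrade. The sketch below maps a parsed error body to a distinct exception type so callers can tell them apart. The exception classes and `error_from_429` are hypothetical names, and the sketch assumes both errors arrive as the JSON bodies shown on this page.

```python
class RateLimited(Exception):
    """Temporary: retry after waiting."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after


class QueryCapExceeded(Exception):
    """Hard cap reached: retrying will not help."""

    def __init__(self, message, used=None, cap=None, upgrade_url=None):
        super().__init__(message)
        self.used = used
        self.cap = cap
        self.upgrade_url = upgrade_url


def error_from_429(body):
    """Map a parsed error body (dict) to an exception, or None if unrecognized."""
    code = body.get("error")
    message = body.get("message", "")
    if code == "rate_limit_exceeded":
        return RateLimited(message, retry_after=body.get("retry_after"))
    if code == "query_cap_exceeded":
        usage = body.get("usage", {})
        return QueryCapExceeded(
            message,
            used=usage.get("used"),
            cap=usage.get("cap"),
            upgrade_url=body.get("upgrade_url"),
        )
    return None
```

A caller can then retry on `RateLimited` and surface `QueryCapExceeded` to the user along with `upgrade_url`.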

Handling Limits in Code

import time
import requests

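def query_with_retry(kb_id, query, api_key, max_retries=3):
    """Query a knowledge base, retrying with backoff when rate limited."""
    url = f"https://api.ardie.ai/kb/{kb_id}/query"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    for attempt in range(max_retries):
        # The 30-second timeout is an arbitrary choice; without one,
        # requests can wait indefinitely on a stalled connection
        response = requests.post(
            url, headers=headers, json={"query": query}, timeout=30
        )

        if response.status_code == 200:
            return response.json()

        if response.status_code == 429:
            error = response.json()
            # The query cap is a hard limit, so retrying cannot succeed
            if error.get("error") == "query_cap_exceeded":
                raise RuntimeError("Query cap exceeded. Upgrade your plan.")

            # Rate limited: wait for the server's retry_after hint,
            # falling back to exponential backoff (1s, 2s, 4s, ...)
            retry_after = error.get("retry_after", 2 ** attempt)
            if attempt < max_retries - 1:
                # Skip the sleep after the final attempt
                time.sleep(retry_after)
            continue

        # Any other error status (4xx or 5xx) raises requests.HTTPError
        response.raise_for_status()

    raise RuntimeError("Max retries exceeded")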

Next Steps