Two Types of Limits
Ardie enforces two types of limits on API usage:
Rate Limits — Requests per minute
Usage Caps — Total queries per billing period
Rate Limits
To ensure fair usage and system stability, API requests are rate-limited:
Limit                 Value
Requests per minute   60
Per API key           Yes
Rate limits are applied per API key, not per account. If you have multiple keys, each has its own rate limit.
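Because each key has its own limit, a client that holds several keys can track them independently. The following is a minimal client-side sketch, not part of the Ardie API: `PerKeyThrottle` is a hypothetical helper, and it assumes a sliding 60-second window, which may be stricter than the window the server actually uses.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class PerKeyThrottle:
    """Tracks recent request times separately for each API key (hypothetical helper)."""

    def __init__(self, max_requests: int = 60, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self._sent: Dict[str, Deque[float]] = defaultdict(deque)

    def wait_time(self, api_key: str) -> float:
        """Seconds to wait before this key may send another request."""
        now = self.clock()
        sent = self._sent[api_key]
        # Drop timestamps that have left the window.
        while sent and now - sent[0] >= self.window:
            sent.popleft()
        if len(sent) < self.max_requests:
            return 0.0
        # The oldest request in the window determines when a slot frees up.
        return self.window - (now - sent[0])

    def record(self, api_key: str) -> None:
        """Note that a request was just sent with this key."""
        self._sent[api_key].append(self.clock())
```

Call `wait_time(key)` before each request, sleep for the returned duration, then call `record(key)` once the request is sent.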
Every response includes headers showing your rate limit status:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1704067200
Header                   Description
X-RateLimit-Limit        Maximum requests per minute
X-RateLimit-Remaining    Requests remaining in current window
X-RateLimit-Reset        Unix timestamp when the limit resets
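As a rough sketch, these headers can be read into a small structure to work out how long to pause before the window resets. `RateLimitStatus` and `parse_rate_limit_headers` are illustrative names, not part of the Ardie API, and the sketch assumes every header value is an integer, as in the example.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int       # maximum requests per minute
    remaining: int   # requests left in the current window
    reset_at: int    # Unix timestamp (seconds) when the limit resets

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the window resets; never negative."""
        current = time.time() if now is None else now
        return max(0.0, self.reset_at - current)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitStatus]:
    """Return the rate limit status, or None if a header is missing or malformed."""
    try:
        return RateLimitStatus(
            limit=int(headers["X-RateLimit-Limit"]),
            remaining=int(headers["X-RateLimit-Remaining"]),
            reset_at=int(headers["X-RateLimit-Reset"]),
        )
    except (KeyError, ValueError):
        return None
```

With the `requests` library, `response.headers` can be passed in directly because its lookups are case-insensitive; a plain `dict` must use the exact header names shown.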
Rate Limit Exceeded
When you exceed the rate limit, the API returns a 429 response with the following body:
{
  "error": "rate_limit_exceeded",
  "message": "Too many requests. Please wait before retrying.",
  "retry_after": 15
}
Best practice: Implement exponential backoff when you receive a 429 response.
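One way to compute the wait between retries is sketched below. It prefers the `retry_after` value from the 429 body when present and otherwise doubles the delay on each attempt. The `backoff_delay` name, the 1-second base, the 60-second ceiling, and the random jitter are all assumptions made for illustration; the documentation above asks only for exponential backoff.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying; `attempt` is 0 for the first retry."""
    if retry_after is not None:
        # Prefer the server's hint when the 429 body provides one.
        return float(retry_after)
    # Exponential growth: base, 2*base, 4*base, ... limited to `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # Jitter (an addition, not required by the API): pick a random delay up to
    # the ceiling so that many clients do not all retry at the same moment.
    return random.uniform(0, ceiling)
```

Jitter makes individual waits shorter on average than plain doubling; drop the `random.uniform` call and return `ceiling` if deterministic delays are preferred.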
Usage Caps
Your plan includes a fixed number of queries per billing period. This is a hard cap: once it is reached, requests are rejected until the next billing period begins or you upgrade.
Checking Your Usage
Monitor usage via the Dashboard or check the response headers:
X-Usage-Used: 450
X-Usage-Cap: 500
X-Usage-Remaining: 50
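A small check against these headers can warn before the cap is reached. This is a sketch only: `usage_fraction`, `is_near_cap`, and the 90% threshold are hypothetical choices, and it assumes the header values are integers as in the example.

```python
from typing import Mapping, Optional


def usage_fraction(headers: Mapping[str, str]) -> Optional[float]:
    """Fraction of the query cap consumed, or None if headers are missing or invalid."""
    try:
        used = int(headers["X-Usage-Used"])
        cap = int(headers["X-Usage-Cap"])
    except (KeyError, ValueError):
        return None
    if cap <= 0:
        return None
    return used / cap


def is_near_cap(headers: Mapping[str, str], threshold: float = 0.9) -> bool:
    """True when usage has reached `threshold` of the cap (0.9 is an arbitrary default)."""
    fraction = usage_fraction(headers)
    return fraction is not None and fraction >= threshold
```

With the example headers (450 of 500 used), `is_near_cap` returns `True` at the default threshold.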
Cap Exceeded Response
When you hit your query cap:
{
  "error": "query_cap_exceeded",
  "message": "You have exceeded your monthly query cap of 500 queries.",
  "usage": {
    "used": 500,
    "cap": 500
  },
  "upgrade_url": "https://ardie.ai/billing"
}
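Since waiting does not clear a cap error, it can help to surface it as a distinct exception that carries the usage figures and upgrade link. The sketch below assumes the body has the shape shown above; `QueryCapExceeded` and `raise_if_cap_exceeded` are hypothetical names, not part of any Ardie client library.

```python
from typing import Any, Mapping


class QueryCapExceeded(Exception):
    """Raised when a response body reports `query_cap_exceeded`."""

    def __init__(self, message: str, used: int, cap: int, upgrade_url: str):
        super().__init__(message)
        self.used = used
        self.cap = cap
        self.upgrade_url = upgrade_url


def raise_if_cap_exceeded(body: Mapping[str, Any]) -> None:
    """Raise QueryCapExceeded if `body` is a cap-exceeded error; otherwise do nothing."""
    if body.get("error") != "query_cap_exceeded":
        return
    usage = body.get("usage") or {}
    raise QueryCapExceeded(
        body.get("message", "Query cap exceeded."),
        used=usage.get("used", 0),
        cap=usage.get("cap", 0),
        upgrade_url=body.get("upgrade_url", ""),
    )
```

Callers can catch `QueryCapExceeded` separately from rate limit errors and show `upgrade_url` to the user instead of retrying.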
Handling Limits in Code
import time
import requests


def query_with_retry(kb_id, query, api_key, max_retries=3):
    """Query a knowledge base, retrying when the request is rate limited."""
    url = f"https://api.ardie.ai/kb/{kb_id}/query"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json={"query": query})
        if response.status_code == 200:
            return response.json()
        if response.status_code == 429:
            error = response.json()
            # A cap error will not clear by waiting, so fail immediately.
            if error.get("error") == "query_cap_exceeded":
                raise Exception("Query cap exceeded. Upgrade your plan.")
            # Rate limited: wait for the server's hint, else back off exponentially.
            retry_after = error.get("retry_after", 2 ** attempt)
            time.sleep(retry_after)
            continue
        # Other statuses are not retried; raise on 4xx/5xx responses.
        response.raise_for_status()
    raise Exception("Max retries exceeded")
Next Steps
Error Handling: Handle all error types gracefully
Pricing Plans: Upgrade for more queries