The Replyful API rate-limits requests per API key using a sliding-window counter. Stay under the limit by reading the response headers we return on every call.
Limit
| Bucket | Limit |
|---|---|
| Per API key | 100 requests per 60 seconds |
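To stay under this limit without waiting for the server to push back, a client can keep its own count of recent calls. The following is a minimal sketch of a client-side sliding-window log. It is not necessarily how the Replyful server counts requests, and the class name `SlidingWindowLimiter` and its methods are hypothetical names chosen for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard: allow at most `limit` calls in any `window`-second span.

    Hypothetical helper, not part of any Replyful SDK.
    """

    def __init__(self, limit=100, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock    # injectable so the class can be tested without sleeping
        self.sent = deque()   # timestamps of calls still inside the window

    def wait_time(self):
        """Return seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # The oldest recorded call leaves the window at sent[0] + window.
        return self.sent[0] + self.window - now

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())
```

A caller would sleep for `wait_time()` seconds, send the request, then call `record()`. Because the local count can drift from the server's view, the response headers remain the authoritative signal.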
Response headers
Every response (success or failure) includes the IETF RateLimit headers:

| Header | Meaning |
|---|---|
| `RateLimit-Limit` | Total requests allowed in the current window. |
| `RateLimit-Remaining` | Requests remaining before throttling kicks in. |
| `RateLimit-Reset` | Seconds until the bucket refills. |
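As an illustration, the three headers can be read into a small structure right after each call. This sketch assumes each header value is a plain integer, as the table suggests. `RateLimitState` and `parse_rate_limit` are hypothetical names, and `headers` stands for whatever header mapping your HTTP client exposes.

```python
from dataclasses import dataclass


@dataclass
class RateLimitState:
    limit: int       # RateLimit-Limit: total requests allowed in the window
    remaining: int   # RateLimit-Remaining: requests left before throttling
    reset: int       # RateLimit-Reset: seconds until the bucket refills


def parse_rate_limit(headers):
    """Build a RateLimitState from a mapping of response headers.

    Assumes all three headers are present and hold plain integers.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitState(
        limit=int(lowered["ratelimit-limit"]),
        remaining=int(lowered["ratelimit-remaining"]),
        reset=int(lowered["ratelimit-reset"]),
    )
```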
Throttled responses
When you exceed the limit, the API returns `429 Too Many Requests` with a `Retry-After` header whose value is the number of seconds to wait before retrying.
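A client might detect a throttled response along these lines. This is a sketch only: the function name `retry_after_seconds` and the one-second fallback for a missing or non-numeric header are assumptions, not documented Replyful behavior.

```python
def retry_after_seconds(status, headers, default=1.0):
    """Return how long to wait after a throttled response, or None if not throttled.

    `default` is an assumed fallback used when Retry-After is missing or unparsable.
    """
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing (None) or not numeric: fall back to the default.
        return default
```

For example, `retry_after_seconds(429, {"Retry-After": "12"})` returns `12.0`, while any non-429 status returns `None`.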
Backoff strategy
A robust client honors `Retry-After` and adds jitter on repeated throttles to avoid thundering herds.
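One way to sketch that strategy is shown below. It assumes a hypothetical `send` callable that performs one request and returns a `(status, headers, body)` tuple, with `headers` keyed by the exact name `Retry-After` (many HTTP clients provide a case-insensitive mapping). The parameter names and default values are illustrative choices, not Replyful recommendations.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base=1.0, cap=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call send() until it is not throttled or attempts run out.

    send is a hypothetical callable returning (status, headers, body).
    Returns the last response tuple, which may still be a 429.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep again
        # Honor the server's hint first, falling back to `base` if it is absent.
        try:
            wait = float(headers.get("Retry-After", base))
        except ValueError:
            wait = base
        # Add random jitter whose ceiling doubles with each consecutive throttle,
        # so clients throttled together do not all retry at the same instant.
        jitter = rng() * min(cap, base * 2 ** attempt)
        sleep(wait + jitter)
    return status, headers, body
```

The `sleep` and `rng` parameters exist only so the function can be exercised without real delays or randomness. In normal use they keep their defaults.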
Practical tips
- Watch `RateLimit-Remaining`. Slow down preemptively when it dips below 10; it's cheaper than recovering from a `429`.
- Coalesce reads. If you need many conversations, list with `?limit=100` and walk pages. That costs one request per page instead of one per conversation.
- Spread bursts. Schedule background jobs with small delays between calls rather than firing them in a tight loop.
- One key per workload. Using separate keys for separate workloads keeps a noisy job from starving a critical path.
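The first tip can be sketched as a small pacing function. The threshold of 10 comes from the tip itself. Spreading the remaining budget evenly over the time left until reset is one possible policy, assumed here for illustration, and `pacing_delay` is a hypothetical name.

```python
def pacing_delay(remaining, reset_seconds, threshold=10):
    """Return seconds to pause before the next request.

    At or above the threshold, send immediately. Below it, spread the
    remaining budget evenly across the time left until the window resets.
    """
    if remaining >= threshold:
        return 0.0
    if remaining <= 0:
        return float(reset_seconds)   # budget exhausted: wait for the reset
    return reset_seconds / remaining
```

With 5 requests remaining and 30 seconds until reset, `pacing_delay(5, 30)` returns `6.0`, so the client sends roughly one request every six seconds instead of running into a `429`.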