Rate limits

The Replyful API rate-limits requests per API key using a sliding-window counter. Stay under the limit by reading the response headers we return on every call.

Limit

Bucket	Limit
Per API key	100 requests per 60 seconds

If you need a higher ceiling, contact support — limits are per plan and can be raised.
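To stay under the limit without relying on a 429, a client can keep its own sliding-window count. The following is a minimal client-side sketch, not a description of how the Replyful servers count requests. The name `createSlidingWindowLimiter` and its parameters are hypothetical; the defaults mirror the 100 requests per 60 seconds shown in the table.

```typescript
// Hypothetical client-side helper: remembers the timestamps of recent
// requests and reports how long to wait before the next one fits.
function createSlidingWindowLimiter(limit = 100, windowMs = 60_000) {
  let timestamps: number[] = [];

  return {
    // Returns 0 and records the request if it fits in the window.
    // Otherwise returns the milliseconds until the oldest recorded
    // request slides out of the window (nothing is recorded).
    tryAcquire(now: number = Date.now()): number {
      const cutoff = now - windowMs;
      // Keep only requests that are still inside the window.
      timestamps = timestamps.filter((t) => t > cutoff);

      if (timestamps.length < limit) {
        timestamps.push(now);
        return 0;
      }
      const oldest = Math.min(...timestamps);
      return oldest + windowMs - now;
    },
  };
}
```

A caller would check `tryAcquire()` before each request and sleep for the returned number of milliseconds when it is non-zero. The server's headers remain the source of truth; a local counter like this only reduces how often the client runs into them.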

Response headers

Every response (success or failure) includes the RateLimit headers defined in the IETF draft specification:

Header	Meaning
`RateLimit-Limit`	Total requests allowed in the current window.
`RateLimit-Remaining`	Requests remaining before throttling kicks in.
`RateLimit-Reset`	Seconds until the bucket refills.

Example:

HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 87
RateLimit-Reset: 30
Content-Type: application/json
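Reading these headers in a client might look like the sketch below. The names `HeaderSource`, `RateLimitState`, and `readRateLimit` are illustrative and not part of any Replyful SDK; `HeaderSource` is a small interface that the standard `Headers` object from `fetch` already satisfies.

```typescript
// Anything with a get(name) method works, including fetch's Headers.
interface HeaderSource {
  get(name: string): string | null;
}

interface RateLimitState {
  limit: number;        // RateLimit-Limit
  remaining: number;    // RateLimit-Remaining
  resetSeconds: number; // RateLimit-Reset
}

// Parses one numeric header; returns null if it is missing or not a number.
function readNumber(headers: HeaderSource, name: string): number | null {
  const raw = headers.get(name);
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

// Returns the rate-limit state, or null if any header is absent or malformed.
function readRateLimit(headers: HeaderSource): RateLimitState | null {
  const limit = readNumber(headers, "RateLimit-Limit");
  const remaining = readNumber(headers, "RateLimit-Remaining");
  const resetSeconds = readNumber(headers, "RateLimit-Reset");
  if (limit === null || remaining === null || resetSeconds === null) {
    return null;
  }
  return { limit, remaining, resetSeconds };
}
```

With a `fetch` response, this would be called as `readRateLimit(res.headers)`.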

Throttled responses

When you exceed the limit, the API returns 429 Too Many Requests with a Retry-After header (in seconds):

HTTP/1.1 429 Too Many Requests
Retry-After: 30
RateLimit-Limit: 100
RateLimit-Remaining: 0
RateLimit-Reset: 30
Content-Type: application/json

{
  "error": {
    "type": "rate_limit_error",
    "code": "too_many_requests",
    "message": "Rate limit exceeded. Slow down and try again shortly.",
    "docUrl": "https://docs.replyful.com/errors/too_many_requests",
    "requestId": "req_..."
  }
}
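A client may want to tell this error apart from other failures before deciding to retry. The sketch below checks the body shape shown above; `RateLimitErrorInfo` and `parseRateLimitError` are hypothetical names, and the check assumes every rate-limit error carries the same fields as the example.

```typescript
interface RateLimitErrorInfo {
  code: string;
  message: string;
  requestId: string;
}

// Returns the error details if the parsed JSON body is a rate_limit_error,
// or null for any other shape.
function parseRateLimitError(body: unknown): RateLimitErrorInfo | null {
  if (typeof body !== "object" || body === null) return null;
  const error = (body as Record<string, unknown>)["error"];
  if (typeof error !== "object" || error === null) return null;

  const fields = error as Record<string, unknown>;
  if (fields["type"] !== "rate_limit_error") return null;

  const code = fields["code"];
  const message = fields["message"];
  const requestId = fields["requestId"];
  if (
    typeof code !== "string" ||
    typeof message !== "string" ||
    typeof requestId !== "string"
  ) {
    return null;
  }
  return { code, message, requestId };
}
```

Logging the `requestId` alongside the retry makes it easier to reference a specific throttled call when contacting support.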

Backoff strategy

A robust client honors Retry-After, backs off exponentially on repeated throttles, and adds jitter to avoid thundering herds:

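async function withRetry<T>(
  fn: () => Promise<Response>,
  maxAttempts = 5,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fn();
    if (res.status !== 429) {
      // Anything other than a throttle is final: throw on errors, parse on success.
      if (!res.ok) {
        throw new Error(`Request failed: ${res.status}`);
      }
      return res.json() as Promise<T>;
    }

    // No point sleeping after the final attempt.
    if (attempt === maxAttempts - 1) break;

    // Retry-After is in seconds; fall back to 1 second if it is missing or not numeric.
    const parsed = Number(res.headers.get("Retry-After"));
    const retryAfter = Number.isFinite(parsed) && parsed > 0 ? parsed : 1;
    // Double the wait on each repeated throttle, plus up to 250 ms of random jitter.
    const backoff = retryAfter * 1000 * 2 ** attempt;
    const jitter = Math.random() * 250;
    await new Promise((r) => setTimeout(r, backoff + jitter));
  }

  throw new Error("Rate limit retries exhausted");
}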

Practical tips

Watch RateLimit-Remaining. Slow down preemptively when it dips below 10 — it’s cheaper than recovering from a 429.
Coalesce reads. If you need many conversations, list with ?limit=100 and walk pages — that costs one request per page instead of one per conversation.
Spread bursts. Schedule background jobs with small delays between calls rather than firing them in a tight loop.
One key per workload. Using separate keys for separate workloads keeps a noisy job from starving a critical path.
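The first and third tips can be combined into a small pacing helper. This is one possible sketch: `preemptiveDelayMs` is a hypothetical name, the threshold of 10 comes from the tip above, and spreading the remaining requests evenly over the time left is an assumption about what "slow down" should mean, not documented Replyful behavior.

```typescript
// Returns how many milliseconds to pause before the next request, given the
// values of RateLimit-Remaining and RateLimit-Reset from the last response.
function preemptiveDelayMs(
  remaining: number,
  resetSeconds: number,
  threshold = 10,
): number {
  // Plenty of headroom: no delay.
  if (remaining >= threshold) return 0;
  // Nothing left: wait for the reset.
  if (remaining <= 0) return resetSeconds * 1000;
  // Running low: spread the remaining requests evenly until the reset.
  return Math.ceil((resetSeconds * 1000) / remaining);
}
```

For example, with 5 requests remaining and 30 seconds until reset, the helper suggests pausing 6000 ms between calls. A background job can sleep for the returned value between requests instead of firing them in a tight loop.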
