Humanizer API: Rate Limits and Best Practices

TL;DR: The Free tier allows 10 requests per minute (RPM), Starter 60 RPM, and Pro 300 RPM. When you receive a 429 response, respect the Retry-After header; for other transient failures, apply exponential backoff with jitter. A batch request counts as one request against your limit, so at 60 RPM with 100 texts per batch you can process 6,000 humanizations per minute.

Every API has limits. The Humanizer API is no exception. Understanding how rate limits work, how to handle errors gracefully, and which patterns keep your integration reliable at scale is the difference between a production-ready system and one that breaks under load.

This guide covers everything you need: rate limit tiers, error handling patterns, retry strategies, and the best practices that experienced integrators follow.

Rate Limit Tiers

Rate limits depend on your plan tier. The free tier allows 10 requests per minute and 10,000 words per month. Paid tiers raise this: Starter to 60 requests per minute and Pro to 300, with higher monthly word limits. Enterprise plans offer custom rate limits based on your volume needs.

Rate limits apply to both individual and batch endpoints. A single batch request containing 100 texts counts as 1 request against your per-minute limit, not 100. This is why the batch endpoint is so much more efficient for high-volume processing.
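
The arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming a batch size of 100 (the figure used in this guide; the actual maximum batch size is not stated here). The helper names chunk_into_batches and max_items_per_minute are hypothetical and not part of the API.

```python
from typing import Iterator, List


def chunk_into_batches(texts: List[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def max_items_per_minute(requests_per_minute: int, batch_size: int = 100) -> int:
    """Upper bound on texts per minute when every request is a full batch."""
    return requests_per_minute * batch_size


texts = [f"text {i}" for i in range(250)]
batches = list(chunk_into_batches(texts))
print(len(batches))              # 3 requests instead of 250
print(max_items_per_minute(60))  # 6000
```

Each yielded slice would become the payload of one batch request, so 250 texts cost 3 requests against the per-minute limit rather than 250.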

Check the API documentation for current limits by plan. Rate limits are enforced per API key, not per IP address. If you’re running multiple services with the same key, they share the same limits.
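
Because the limit is shared per key, it can help to throttle on the client side so you rarely see a 429 at all. The following is a rough sketch of a sliding-window limiter, not something the API or SDK is known to provide; the class name and parameters are assumptions. It only coordinates calls within one process, so services that share a key across processes would still need a shared store or a split of the budget.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window` seconds."""

    def __init__(self, max_requests, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock    # injectable so the limiter can be tested
        self.sleep = sleep
        self.sent = deque()   # timestamps of requests still inside the window

    def wait_time(self):
        """Seconds until the next request is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window - (now - self.sent[0])

    def acquire(self):
        """Block until a slot is free, then record the request."""
        while True:
            delay = self.wait_time()
            if delay <= 0:
                break
            self.sleep(delay)
        self.sent.append(self.clock())
```

Call limiter.acquire() immediately before each request; for example, SlidingWindowLimiter(10) would match the free tier's 10 requests per minute.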

HTTP 429: Too Many Requests

When you exceed your rate limit, the API returns a 429 status code with a Retry-After header indicating how many seconds to wait before retrying. Your code must handle this response correctly.

import time
import requests

response = requests.post(api_url, headers=headers, json=payload)

if response.status_code == 429:
    # Fall back to 5 seconds if the header is missing
    retry_after = int(response.headers.get('Retry-After', 5))
    print(f"Rate limited. Waiting {retry_after} seconds...")
    time.sleep(retry_after)
    # Retry the request here (the backoff section below shows a full loop)
elif response.status_code == 200:
    result = response.json()
else:
    handle_error(response)

Never ignore 429 responses. If you keep sending requests after being rate limited, subsequent requests may be rejected for longer periods. Respect the Retry-After header and your integration will stay healthy.
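
One detail worth guarding against: the HTTP specification allows Retry-After to be either a number of seconds or an HTTP date, and int() fails on the date form. Whether the Humanizer API ever sends the date form is not stated in this guide, so treat the following as a defensive sketch. The function name parse_retry_after and its default of 5 seconds are assumptions for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=5.0, now=None):
    """Return the number of seconds to wait for a Retry-After header value."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "120"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        retry_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no wait is needed.
    return max(0.0, (retry_at - now).total_seconds())
```

With this helper, the wait becomes time.sleep(parse_retry_after(response.headers.get('Retry-After'))).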

Exponential Backoff with Jitter

For transient errors (500, 502, 503, 504, and network timeouts), exponential backoff is the standard retry pattern. Wait 1 second after the first failure, 2 seconds after the second, 4 seconds after the third. Add random jitter to prevent multiple clients from retrying at exactly the same time.

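import random
import time

import requests

class APIError(Exception):
    """Raised when a request fails and should not be retried further."""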

def request_with_backoff(url, headers, payload, max_retries=4):
    for attempt in range(max_retries):
        try:
            response = requests.post(
                url, headers=headers, json=payload, timeout=30
            )

            if response.status_code == 200:
                return response.json()

            if response.status_code == 429:
                wait = int(response.headers.get('Retry-After', 5))
            elif response.status_code >= 500:
                wait = (2 ** attempt) + random.uniform(0, 1)
            else:
                # Client error (4xx), don't retry
                raise APIError(f"Client error: {response.status_code}")

            time.sleep(wait)

        except requests.Timeout:
            wait = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(wait)

    raise APIError("Max retries exceeded")

The jitter (random.uniform(0, 1)) is critical. Without it, if 50 clients all fail at the same time, they all retry at the same time, causing another failure. Jitter spreads retries out and reduces contention.
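
To see the spreading effect, here is a small sketch that computes the delays on their own, without making any requests. It uses the same formula as the function above; the cap parameter and the function name backoff_delay are additions for illustration and not part of the original code.

```python
import random


def backoff_delay(attempt, base=1.0, max_jitter=1.0, cap=60.0, rng=random):
    """Delay before retry `attempt` (0-based): base * 2**attempt plus jitter."""
    delay = base * (2 ** attempt) + rng.uniform(0, max_jitter)
    # The cap (an assumption here) keeps late retries from waiting too long.
    return min(delay, cap)


rng = random.Random(42)  # seeded only so the demo is repeatable
for client in range(3):
    delays = [round(backoff_delay(n, rng=rng), 2) for n in range(4)]
    print(f"client {client}: {delays}")
```

Each simulated client gets delays near 1, 2, 4, and 8 seconds, but offset by a different random fraction, so their retries no longer land at the same instant.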

Circuit Breaker Pattern

If the API is experiencing extended downtime, retrying every request wastes resources and adds latency to your application. The circuit breaker pattern solves this by “opening” the circuit after a threshold of consecutive failures, then periodically testing whether the API has recovered.

import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failures = 0
        self.threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.last_failure_time = 0
        # closed = normal, open = failing, half-open = testing recovery
        self.state = "closed"

    def can_request(self):
        if self.state == "closed":
            return True

        # Check if recovery period has passed
        if time.time() - self.last_failure_time > self.recovery_timeout:
            self.state = "half-open"
            return True

        return False

    def record_success(self):
        self.failures = 0
        self.state = "closed"

    def record_failure(self):
        self.failures += 1
        self.last_failure_time = time.time()
        if self.failures >= self.threshold:
            self.state = "open"

Wrap your API calls with the circuit breaker. If the circuit is open, return a cached result or queue the request for later instead of waiting for a timeout.
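
One way to do the wrapping is sketched below. The helper call_with_breaker is a hypothetical name; it accepts any object that exposes can_request(), record_success(), and record_failure(), such as the CircuitBreaker class above, and treats any exception from the call as a failure.

```python
def call_with_breaker(breaker, func, fallback=None):
    """Run `func` through a circuit breaker.

    `fallback` is returned (or called, if callable) when the circuit is
    open, so callers can serve a cached result or enqueue the work.
    """
    if not breaker.can_request():
        # Circuit is open: skip the network call entirely.
        return fallback() if callable(fallback) else fallback
    try:
        result = func()
    except Exception:
        breaker.record_failure()
        raise
    breaker.record_success()
    return result
```

A call might look like call_with_breaker(breaker, lambda: request_with_backoff(url, headers, payload), fallback=cached_result). In a real integration you would probably count only 5xx responses and timeouts as failures, since a 400 says nothing about the API's health.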

Error Codes Reference

The API returns specific error codes that tell you exactly what went wrong. Handle each category differently:

400 Bad Request

Your request body is malformed. Check that you’re sending valid JSON with the required fields. Don’t retry these; fix the request.

401 Unauthorized

Your API key is invalid or expired. Check your key and regenerate if needed. Don’t retry.

403 Forbidden

Your plan doesn’t include the feature you’re trying to use, or your account is suspended. Check your account status on the dashboard.

413 Payload Too Large

Your text exceeds the maximum length for a single request. Split it into smaller chunks and process them separately.

429 Too Many Requests

Rate limited. Wait for the Retry-After period and try again.

500 Internal Server Error

Something went wrong on our end. Retry with exponential backoff. If it persists for more than 5 minutes, check the status page.

503 Service Unavailable

The API is temporarily down for maintenance. Retry after the Retry-After period.
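
The categories above can be condensed into a single lookup. This is a sketch of one possible mapping, with hypothetical constant and function names; it follows the reference list above and treats the other 5xx codes (such as 502 and 504) like 500.

```python
OK = "ok"
RETRY_AFTER = "wait_retry_after"  # sleep for the Retry-After value, then retry
BACKOFF = "backoff"               # retry with exponential backoff and jitter
FAIL = "fail"                     # do not retry; fix the request or account


def classify_status(status_code):
    """Map an HTTP status code to the handling strategy described above."""
    if 200 <= status_code < 300:
        return OK
    if status_code in (429, 503):
        return RETRY_AFTER
    if status_code >= 500:
        return BACKOFF
    # 400, 401, 403, 413 and anything else: retrying will not help.
    return FAIL
```

If a 429 or 503 arrives without a Retry-After header, falling back to exponential backoff is a reasonable default.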

Request Validation

Validate your requests before sending them. This prevents wasting API calls on requests that will fail with 400 errors.

def validate_request(text, tone=None):
    if not text or not isinstance(text, str):
        raise ValueError("Text must be a non-empty string")

    if len(text) > 50000:
        raise ValueError("Text exceeds maximum length (50,000 chars)")

    if len(text.strip()) < 50:
        raise ValueError("Text too short for meaningful humanization")

    valid_tones = ["professional", "casual", "conversational", "academic"]
    if tone and tone not in valid_tones:
        raise ValueError(f"Invalid tone. Choose from: {valid_tones}")

    return True

Catching these errors locally is instant. Sending them to the API adds latency and counts against your rate limit.

Logging Best Practices

Log every API interaction. At minimum, capture: request timestamp, response status code, response time, word count processed, and confidence score. This data helps you debug issues, track costs, and optimize performance.

import logging
import time

logger = logging.getLogger("humanizer")

def log_api_call(response, start_time, word_count):
    duration = time.time() - start_time
    try:
        # Error responses may not carry a JSON body
        confidence = response.json().get('confidence_score', 'N/A')
    except ValueError:
        confidence = 'N/A'
    logger.info(
        f"status={response.status_code} "
        f"duration={duration:.2f}s "
        f"words={word_count} "
        f"confidence={confidence}"
    )

In production, send these logs to a monitoring system like Datadog, CloudWatch, or your existing observability stack. Set up alerts for elevated error rates or response times above your threshold.

Monitoring and Alerting

Track these metrics over time: request success rate (should be above 99%), p95 response time (should be under 3 seconds for single requests), daily word count (to predict when you'll hit your quota), and error rate by type (to identify patterns).

Set up alerts for: success rate dropping below 98%, p95 response time exceeding 5 seconds, month-to-date word count reaching 80% of your monthly quota, and any spike in 500 errors.

These alerts catch problems before they impact your users. A spike in latency might mean you need to add caching. A spike in errors might mean the API is having issues. Either way, you'll know before your content pipeline stalls.
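
If you don't have a monitoring system yet, the thresholds above are easy to check from your own logs. The sketch below assumes you have collected status codes and response times for a recent window and know your month-to-date word count; all function names and alert labels are hypothetical. It does not cover the spike in 500 errors, which needs a baseline to compare against.

```python
def success_rate(status_codes):
    """Fraction of calls that returned a 2xx status."""
    if not status_codes:
        return 1.0
    ok = sum(1 for code in status_codes if 200 <= code < 300)
    return ok / len(status_codes)


def p95(durations):
    """95th percentile by the nearest-rank method."""
    if not durations:
        return 0.0
    ordered = sorted(durations)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def check_alerts(status_codes, durations, words_used, monthly_quota):
    """Return the names of the alerts that fire for this window."""
    alerts = []
    if success_rate(status_codes) < 0.98:
        alerts.append("success_rate_below_98_percent")
    if p95(durations) > 5.0:
        alerts.append("p95_latency_above_5s")
    if words_used >= 0.8 * monthly_quota:
        alerts.append("word_quota_above_80_percent")
    return alerts
```

Running check_alerts on a schedule, for example every few minutes over the last hour of calls, gives a basic version of the alerting described above.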

SDK vs. Raw HTTP

The Humanizer API provides official SDKs for Python and Node.js. These SDKs handle authentication, retries, rate limiting, and error parsing automatically. For most teams, using the SDK is the right choice.

Use raw HTTP when you need maximum control over the request lifecycle, when you're working in a language without an official SDK, or when the SDK doesn't support a feature you need (like custom retry logic or specific timeout configurations).

The SDK handles 90% of use cases correctly out of the box. Start there. Switch to raw HTTP only when you hit a limitation.

Putting It All Together

A production-ready integration combines validation, circuit breaking, exponential backoff, logging, and monitoring. It handles every error gracefully, never loses data, and gives you visibility into what's happening.

Start simple: basic request validation and retry logic. Add the circuit breaker and monitoring as your volume grows. The patterns in this guide scale from 100 requests per day to 100,000.

Check the full API documentation for endpoint details, and explore the features page for capabilities you might want to integrate.

Get your free API key to start building. 10,000 words per month, no credit card required. That's enough to test your integration end-to-end before scaling up.

Backoff patterns by client

Every HTTP client should respect 429 responses. The right pattern is to wait for the value in the Retry-After header when it is present, and to fall back to exponential backoff when it is not.

Python

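import time
import requests

def humanize_with_retry(text, tone, attempts=4):
    # URL and HEADERS are your endpoint and auth headers, defined elsewhere
    for i in range(attempts):
        r = requests.post(URL, headers=HEADERS, json={'text': text, 'tone': tone}, timeout=30)
        if r.status_code == 200:
            return r.json()
        if r.status_code != 429:
            r.raise_for_status()  # surface non-429 errors immediately
        # Prefer the server's Retry-After; fall back to exponential backoff
        wait = int(r.headers.get('Retry-After', 2 ** i))
        time.sleep(wait)
    raise RuntimeError('rate limit persisted')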

Node.js

async function humanizeWithRetry(text, tone, attempts = 4) {
  for (let i = 0; i < attempts; i++) {
    const r = await fetch(URL, { method: 'POST', headers, body: JSON.stringify({ text, tone }) });
    if (r.ok) return r.json();
    if (r.status !== 429) throw new Error(`HTTP ${r.status}`);
    const wait = parseInt(r.headers.get('Retry-After'), 10) || Math.pow(2, i);
    await new Promise(res => setTimeout(res, wait * 1000));
  }
  throw new Error('rate limit persisted');
}

Frequently asked questions

What rate limit applies to me?

Free tier: 10 requests/minute. Starter: 60 RPM. Pro: 300 RPM. Enterprise: custom. See pricing.

Does the batch endpoint share the same limit?

Yes - a batch counts as one request. At 60 RPM × 100 items per batch, that is 6,000 humanizations/minute via batch.

What if my burst exceeds the per-minute cap?

The 429 response includes Retry-After. Most teams find that bursting up to 2x the cap briefly is fine - sustained over-limit triggers temporary blocks. For consistent burst patterns, upgrade your plan.

How do I distribute load across multiple API keys?

Each key has its own rate-limit pool. For higher concurrency, generate multiple keys and round-robin (Pro plans typically allow 5+ keys). Beyond that, talk to sales about Enterprise dedicated capacity.
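
Round-robin rotation takes only a few lines. This is a minimal sketch with a hypothetical KeyRotator class; how the key is attached to a request (header name and scheme) is not covered in this guide, so the example only hands out the key itself.

```python
import itertools


class KeyRotator:
    """Hand out API keys in round-robin order."""

    def __init__(self, api_keys):
        keys = list(api_keys)
        if not keys:
            raise ValueError("at least one API key is required")
        self._cycle = itertools.cycle(keys)

    def next_key(self):
        """Return the next key in the rotation."""
        return next(self._cycle)


rotator = KeyRotator(["key-a", "key-b", "key-c"])  # placeholder values
print([rotator.next_key() for _ in range(4)])
```

Since each key has its own pool, pairing every key with its own client-side limiter and backoff state keeps one throttled key from stalling the others.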

Are there per-IP limits even on authenticated requests?

Yes - to prevent key-sharing abuse, we cap requests per IP at 10x the plan rate limit. For server-to-server traffic from a single egress IP, this isn't an issue. For end-user proxying, consider pooling keys per region.

What happens if I never back off?

Continuing to send requests while ignoring 429 responses triggers progressive blocks: 5 min, 30 min, 2 hours. Repeated violations can suspend the API key. Always implement backoff.

Avoid these mistakes

Don't poll waiting for capacity - sleep for Retry-After, not Retry-After / 10.
Don't retry on 4xx errors other than 429 - retry only 429 and 5xx. Retrying 400s wastes attempts.
Don't skip the jitter - pure exponential backoff causes thundering-herd retries. Add 10-20% random jitter.
Don't skip response logging - the request_id in v2 responses helps support trace failed calls.
