We Compare AI

Rate Limit

Infrastructure
Simple Definition

Caps set by AI API providers on how many requests a user can send, or how many tokens a user can process, per minute, hour, or day.

Full Explanation

Rate limits exist to prevent abuse, manage server capacity, and ensure fair access. They're typically measured in RPM (requests per minute), RPD (requests per day), and TPM (tokens per minute). Enterprise tiers typically have higher limits. A request that exceeds a limit is rejected with an HTTP 429 (Too Many Requests) error. Common strategies for handling limits include queuing requests, batching them, and retrying with exponential backoff.
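As a rough illustration of exponential backoff, here is a minimal Python sketch. It is not tied to any provider's SDK: RateLimitError, backoff_delays, call_with_backoff, and the send callable are hypothetical names made up for this example, and the base delay, cap, and retry count are arbitrary assumptions. The wait before each retry doubles up to a cap, with random jitter so that many clients do not retry at the same moment.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""


def backoff_delays(max_retries, base=1.0, cap=30.0):
    """Return the un-jittered wait before each retry: base * 2**attempt, capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(send, max_retries=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call send(); on RateLimitError, wait and retry up to max_retries times."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return send()
        except RateLimitError:
            # Full jitter: sleep a random time between 0 and the current delay.
            sleep(random.uniform(0, delay))
    # Final attempt: if this one is rate limited too, the error propagates.
    return send()
```

In practice, send would wrap the real API call and raise RateLimitError when the response status is 429. If the provider's response says how long to wait, that value is usually a better guide than a computed delay.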

Last verified: 2026-03-30