CKEYLIMIT: per-tenant rate limits that stop the runaway before it starts
A runaway agent or a noisy tenant can torch a budget in minutes. CKEYLIMIT sets per-tenant requests-per-minute and tokens-per-minute ceilings, enforced locally before the spend happens.
Budget tracking tells you about overspend; rate limiting prevents it. A looping agent or a misbehaving integration can burn a month's budget in an afternoon, and a ceiling discovered on the invoice is a ceiling that did nothing. CKEYLIMIT sets the wall in advance — per-tenant requests-per-minute and tokens-per-minute limits — and enforces it locally, before the request reaches the provider.
The granularity is per tenant for the same reason spend is tracked per tenant: abuse is local to one tenant. One customer's runaway shouldn't be allowed to starve everyone else or drain the shared budget, and a per-tenant ceiling limits the blast radius to the tenant causing it. CKEYLIMIT SET configures the limits; GET reads them; DEL removes them.
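To make the per-tenant shape concrete, here is a minimal in-memory sketch of a limit table with set, get, and delete operations that mirror SET, GET, and DEL. It is an illustration only: the class names, method names, and tenant IDs are hypothetical, and it does not reproduce CKEYLIMIT's actual command syntax or storage.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TenantLimit:
    """Hypothetical record of one tenant's ceilings."""
    rpm: int  # requests per minute
    tpm: int  # tokens per minute


class LimitTable:
    """Illustrative stand-in for a per-tenant limit store (not the real API)."""

    def __init__(self) -> None:
        self._limits: Dict[str, TenantLimit] = {}

    def set(self, tenant: str, rpm: int, tpm: int) -> None:
        # Reject non-positive ceilings so a typo cannot silently block a tenant.
        if rpm <= 0 or tpm <= 0:
            raise ValueError("rpm and tpm must be positive")
        self._limits[tenant] = TenantLimit(rpm, tpm)

    def get(self, tenant: str) -> Optional[TenantLimit]:
        # None means no limit is configured for this tenant.
        return self._limits.get(tenant)

    def delete(self, tenant: str) -> bool:
        # True if a limit existed and was removed.
        return self._limits.pop(tenant, None) is not None


table = LimitTable()
table.set("tenant-a", rpm=60, tpm=10_000)
table.set("tenant-b", rpm=600, tpm=200_000)
```

Each tenant carries its own pair of ceilings, so tightening or removing one tenant's limit leaves every other tenant untouched.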
The wall is enforced before the invoice, not discovered on it.
Because the wall is enforced at the cache, in front of the upstream provider, the runaway hits the limit at microsecond cost instead of metered cost: the blocked request never becomes a billed token. Pair it with CBUDGET's visibility and you have both halves: see the spend as it happens, and cap the source before it spends more.
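The check-before-spend ordering can be sketched as a local gate that sits in front of the upstream call. This is a rough illustration under stated assumptions: it uses a simple fixed 60-second window, takes a caller-supplied token estimate, and every name in it (`TenantGate`, `guarded_call`, the `upstream` callable) is hypothetical. The source does not say which windowing algorithm CKEYLIMIT uses.

```python
import time
from typing import Callable, Dict, List, Optional


class TenantGate:
    """Fixed-window RPM/TPM gate, one window per tenant (illustrative only)."""

    WINDOW_SECONDS = 60.0

    def __init__(self, rpm: int, tpm: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock
        # tenant -> [window_start, requests_in_window, tokens_in_window]
        self._windows: Dict[str, List[float]] = {}

    def allow(self, tenant: str, tokens: int) -> bool:
        now = self.clock()
        window = self._windows.get(tenant)
        # Start a fresh window on first sight or once the old one has expired.
        if window is None or now - window[0] >= self.WINDOW_SECONDS:
            window = [now, 0, 0]
            self._windows[tenant] = window
        # Refuse if this request would exceed either ceiling.
        if window[1] + 1 > self.rpm or window[2] + tokens > self.tpm:
            return False
        window[1] += 1
        window[2] += tokens
        return True


def guarded_call(gate: TenantGate, tenant: str, tokens: int,
                 upstream: Callable[[], str]) -> Optional[str]:
    # The limit check runs locally; a refused request never reaches upstream.
    if not gate.allow(tenant, tokens):
        return None
    return upstream()


billed: List[str] = []  # records every call that actually reached upstream


def fake_upstream() -> str:
    billed.append("call")
    return "ok"


gate = TenantGate(rpm=2, tpm=1_000)
results = [guarded_call(gate, "tenant-a", 10, fake_upstream) for _ in range(3)]
# results == ["ok", "ok", None]; only two calls were billed
```

The third request is refused by a dictionary lookup and a comparison, and `fake_upstream` is never invoked for it, which is the point: the blocked request costs local compute, not provider tokens.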
The bottom line
A budget without enforcement is a wish. CKEYLIMIT is the enforcement — the wall that's hit before the invoice is written, not discovered after.