C.W.K.
Stream
Lesson 06 of 10 · published

Rate limiting and backpressure

~12 min · production, rate-limits

You are rate limited; so are your users

Provider rate limits sit at three layers: RPM (requests per minute), TPM (tokens per minute), and concurrency. Your application should respect all three, recover gracefully when it hits one, and propagate backpressure to users so that requests do not pile up.
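
To make the three layers concrete, here is a minimal sketch of a pre-flight check that reports which layer a request would exceed. The names `ProviderLimits`, `WindowUsage`, and `binding_limit` are hypothetical, and real providers may count windows differently (sliding rather than per-minute, for example).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProviderLimits:
    # Hypothetical container for the three layers described above.
    rpm: int              # requests per minute
    tpm: int              # tokens per minute
    max_concurrency: int  # simultaneous in-flight requests


@dataclass
class WindowUsage:
    requests: int = 0   # requests sent in the current minute
    tokens: int = 0     # tokens sent in the current minute
    in_flight: int = 0  # requests currently awaiting a response


def binding_limit(limits: ProviderLimits, usage: WindowUsage,
                  request_tokens: int) -> Optional[str]:
    """Return the first layer this request would exceed, or None if it fits."""
    if usage.in_flight + 1 > limits.max_concurrency:
        return "concurrency"
    if usage.requests + 1 > limits.rpm:
        return "rpm"
    if usage.tokens + request_tokens > limits.tpm:
        return "tpm"
    return None
```

A request with a large prompt can be blocked by TPM long before RPM is close, which is why the layers are worth tracking separately.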

Tactics

  • Smoothing: put a token bucket or leaky bucket on egress so your own bursts do not trigger 429s.
  • Retry with jitter: exponential backoff with randomization and a cap on attempts.
  • Per-user limits: protect against runaway clients, and surface the limit in the UI.
  • Backpressure: when downstream is hot, return 429 to the caller; do not queue silently.
  • Tier-aware routing: send heavy users to a higher-tier API key and light users to a shared pool.
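
The retry tactic above can be sketched as follows. This assumes the "full jitter" variant of exponential backoff (a uniform random delay below an exponentially growing ceiling); `RateLimited`, `backoff_delay`, and `call_with_retry` are hypothetical names, and the base, cap, and attempt count are illustrative defaults rather than recommended values.

```python
import asyncio
import random


class RateLimited(Exception):
    """Hypothetical error a client raises when the provider answers 429."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng=random.random) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


async def call_with_retry(call, max_attempts: int = 5, sleep=asyncio.sleep):
    """Await call(), retrying on RateLimited up to max_attempts times in total."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # attempt cap reached: surface the failure to the caller
            await sleep(backoff_delay(attempt))
```

The randomization matters because clients that all back off by the same amount tend to retry in lockstep and hit the limit again together.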

What to expose to users

  • A useful 429 ("slow down, try again in 30 seconds").
  • A quota dashboard for paying customers.
  • UI affordances when a user is close to the limit (disabled button, banner).
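
As one way to produce a useful 429, the sketch below keeps a per-user fixed-window counter and returns a status, headers, and body that a web handler could pass through. `PerUserLimiter` is a hypothetical name and the fixed window is a deliberately simple stand-in for whatever limiter you run. `Retry-After` is a standard HTTP header; `X-RateLimit-Remaining` is a common convention rather than a standard, and a UI could use it to decide when to show a banner or disable a button.

```python
import math
from time import monotonic


class PerUserLimiter:
    """Fixed-window request counter per user (illustrative, not production-grade)."""

    def __init__(self, limit: int, window_sec: float = 60.0, clock=monotonic):
        self.limit, self.window_sec, self.clock = limit, window_sec, clock
        self.windows = {}  # user_id -> (window start time, request count)

    def check(self, user_id: str):
        """Return (status, headers, body) for one incoming request."""
        now = self.clock()
        start, count = self.windows.get(user_id, (now, 0))
        if now - start >= self.window_sec:
            start, count = now, 0  # window expired: start a fresh one
        if count >= self.limit:
            # Tell the caller exactly how long to wait instead of a bare error.
            retry_after = max(1, math.ceil(start + self.window_sec - now))
            headers = {"Retry-After": str(retry_after)}
            body = {
                "error": "rate_limited",
                "message": f"Slow down, try again in {retry_after} seconds.",
            }
            return 429, headers, body
        self.windows[user_id] = (start, count + 1)
        remaining = self.limit - (count + 1)
        return 200, {"X-RateLimit-Remaining": str(remaining)}, {}
```

Rejecting immediately with a concrete wait time is the backpressure signal: the caller learns to slow down instead of having its requests sit in a hidden queue.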

Code

Token bucket on egress·python
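import asyncio
from time import monotonic


class TokenBucket:
    """Async token bucket: refills at rate_per_sec, holds at most capacity tokens."""

    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate, self.capacity = rate_per_sec, capacity
        self.tokens = float(capacity)  # start full, so an initial burst up to capacity is allowed
        self.last = monotonic()
        self.lock = asyncio.Lock()

    async def take(self, n: int = 1) -> None:
        if n > self.capacity:
            # The bucket never holds more than capacity, so this would wait forever.
            raise ValueError("n exceeds bucket capacity")
        # The lock is held while sleeping, so waiters are served one at a time.
        async with self.lock:
            while True:
                # Refill for the time elapsed since the last check, capped at capacity.
                now = monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= n:
                    self.tokens -= n
                    return
                # Sleep just long enough for the missing tokens to accumulate.
                await asyncio.sleep((n - self.tokens) / self.rate)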

Exercise

Add a token-bucket smoother to one client. Stress-test it by firing requests at 10× the provider's RPM. Verify that no request gets a 429 from the provider, and that the client sees steady latency instead of bursts of failures.
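
One possible starting point for the exercise is a simulation that never touches a real provider. Everything here is an assumption for illustration: `VirtualClock` replaces wall-clock time so the run finishes instantly, `FakeProvider` models the limit as a sliding 60-second window, and the bucket is a simplified copy that takes an injected clock.

```python
import asyncio
from collections import deque


class VirtualClock:
    """Simulated time: sleeping advances the clock instead of waiting."""
    def __init__(self):
        self.t = 0.0

    def now(self) -> float:
        return self.t

    async def sleep(self, seconds: float) -> None:
        self.t += seconds
        await asyncio.sleep(0)  # yield so other tasks can run


class TokenBucket:
    def __init__(self, rate_per_sec: float, capacity: int, clock: VirtualClock):
        self.rate, self.capacity, self.clock = rate_per_sec, capacity, clock
        self.tokens = float(capacity)
        self.last = clock.now()
        self.lock = asyncio.Lock()

    async def take(self) -> None:
        async with self.lock:
            while True:
                now = self.clock.now()
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
                self.last = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                await self.clock.sleep((1 - self.tokens) / self.rate)


class FakeProvider:
    """Answers 429 once `rpm` requests already sit in the last 60 seconds."""
    def __init__(self, rpm: int, clock: VirtualClock):
        self.rpm, self.clock, self.sent = rpm, clock, deque()

    def request(self) -> int:
        now = self.clock.now()
        while self.sent and now - self.sent[0] >= 60.0:
            self.sent.popleft()  # drop requests that left the window
        if len(self.sent) >= self.rpm:
            return 429
        self.sent.append(now)
        return 200


async def stress(rpm: int = 60, multiplier: int = 10, smoothed: bool = True):
    """Fire rpm * multiplier requests at once; return the list of status codes."""
    clock = VirtualClock()
    provider = FakeProvider(rpm, clock)
    bucket = TokenBucket(rate_per_sec=rpm / 60.0, capacity=1, clock=clock)

    async def one_call() -> int:
        if smoothed:
            await bucket.take()
        return provider.request()

    return await asyncio.gather(*(one_call() for _ in range(rpm * multiplier)))
```

Comparing `asyncio.run(stress(smoothed=True))` with `asyncio.run(stress(smoothed=False))` shows the difference the smoother makes. Capacity is set to 1 in this sketch because a full bucket's initial burst adds to the steady refill; with a capacity near the provider's RPM, the first minute could still exceed the window.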
