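Pagination: Why Cursor Beats Offset, and What You Will Run Into

"Offset pagination is the natural first pattern. It works well until the dataset grows past a few thousand rows, and as long as nobody writes to the table between page requests. After that it breaks in two specific ways that cursor pagination does not."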

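Why you need to paginate

Any API that returns a list eventually returns too much. A GET /users with 50,000 users is a hostile response: the client downloads megabytes, the server scans the whole table, and the UI freezes while rendering. Pagination is the protocol for "give me some, and I will ask for more."

Three patterns dominate, with very different tradeoffs.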

Pattern 1: Offset/Limit (or Page/Page-Size)

The natural first design: GET /users?offset=100&limit=20 means "skip the first 100 and give me 20." It maps directly to SQL: SELECT ... LIMIT 20 OFFSET 100.

Pros: trivial to implement, a natural fit for a "jump to page 5" UI, and it pairs well with a total-count display.

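Cons (both serious):

Unstable under writes. Offsets count rows from the start of the result set, so any write ahead of your position shifts everything after it. If a row is inserted near the top between the request for page 1 (offset=0) and the request for page 2 (offset=20), the row that was at index 19 moves to index 20 and page 2 shows it a second time. If a row is deleted instead, the row that was at index 20 moves to index 19 and is never shown. On an active dataset this happens constantly.
O(N) at large offsets. SELECT ... LIMIT 20 OFFSET 1000000 makes the database walk past 1,000,000 rows before it returns anything. An index cannot jump straight to the offset, because the offset is a count to be computed, not a value to be looked up. Latency grows with the offset, and deep pages get noticeably slow once offsets reach the thousands.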
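
The write-instability problem can be reproduced without a database. The following is a minimal sketch that uses a plain Python list as a stand-in for a sorted table; `offset_page`, `make_rows`, and the row names are hypothetical and exist only for this illustration.

```python
def offset_page(rows, offset, limit):
    """Mimic LIMIT/OFFSET: skip `offset` rows, then take `limit`."""
    return rows[offset:offset + limit]

def make_rows():
    # Hypothetical dataset: 60 rows, already in sort order.
    return [f"row{i:02d}" for i in range(60)]

# Case 1: an insert ahead of the page boundary causes a repeat.
rows = make_rows()
page1 = offset_page(rows, offset=0, limit=20)
rows.insert(5, "new-row")                      # write between the two requests
page2 = offset_page(rows, offset=20, limit=20)
repeated = set(page1) & set(page2)
print("seen twice:", repeated)                 # seen twice: {'row19'}

# Case 2: a delete ahead of the page boundary causes a skip.
rows = make_rows()
page1 = offset_page(rows, offset=0, limit=20)
del rows[5]                                    # write between the two requests
page2 = offset_page(rows, offset=20, limit=20)
never_seen = "row20" not in page1 + page2
print("row20 skipped:", never_seen)            # row20 skipped: True
```

In both cases the client did nothing wrong. The offset simply pointed at a different row by the time the second request arrived.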

Pattern 2: Cursor Pagination

The client passes a cursor that identifies the last item it saw, and the server returns the N items right after that cursor. GET /users?cursor=usr_8x3kPq&limit=20 means "give me the next 20 users in sort order after this one."

Pros (these fix both offset problems):

Stable under writes. The cursor encodes a specific position in the sort order, not an index. Inserts and deletes do not shift what the cursor means.
O(log N) regardless of position. The server uses an indexed predicate (WHERE id > 'usr_8x3kPq' ORDER BY id LIMIT 20), which is an index range scan. It runs at the same speed whether the cursor sits at position 1 or position 1,000,000.

Cons: no jumping to an arbitrary page (no "page 5 of 20" UI); the cursor should be opaque (clients must not construct it); and it requires a stable sort order (usually created_at with id as a tiebreaker).
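
To make the stability claim concrete, here is a small sketch using the standard-library `sqlite3` module and an in-memory table. The table layout, the `cursor_page` helper, and the use of a raw integer id as the cursor are assumptions made for brevity; a real API would sort by its own key and wrap the position in an opaque string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
# Even ids only, so a row can later be inserted in the middle of the order.
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(2, 122, 2)],
)

def cursor_page(conn, after_id, limit):
    """Return up to `limit` ids that sort strictly after `after_id`."""
    sql = "SELECT id FROM users WHERE id > ? ORDER BY id LIMIT ?"
    return [row[0] for row in conn.execute(sql, (after_id, limit))]

page1 = cursor_page(conn, after_id=0, limit=20)          # ids 2..40

# A write lands inside the range that page 1 already covered.
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (11, "late"))

page2 = cursor_page(conn, after_id=page1[-1], limit=20)  # ids 42..80

print(page1[-1], page2[0])        # 40 42
print(set(page1) & set(page2))    # set()
```

The insert at id 11 does not move the boundary, because the second query asks for "rows after id 40" rather than "rows after the first 20".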

Pattern 3: Page Token (Google / AWS style)

Like a cursor, but explicitly opaque: the server returns a nextPageToken and the client sends it back on the next request. The server can encode whatever it wants in it (sort position, a filter snapshot, an anti-tampering signature). Google Cloud APIs and AWS use this.

Pros: same as cursor, plus the server can change the encoding without breaking clients.

Cons: same as cursor, plus the opacity makes debugging harder (you cannot read a token and see where you are).
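
One way such a token might be built is sketched below: the pagination state is serialized as JSON, signed with an HMAC, and base64-encoded. This is an illustration only, not how Google or AWS actually encode their tokens. `SECRET_KEY`, `encode_page_token`, `decode_page_token`, and the state fields are all hypothetical names.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-only-secret"  # assumption: a real service loads this from config

def encode_page_token(state: dict) -> str:
    """Serialize pagination state and prepend an HMAC so clients cannot edit it."""
    payload = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(signature + payload).decode()

def decode_page_token(token: str) -> dict:
    """Verify the signature and return the state, or raise ValueError."""
    try:
        raw = base64.urlsafe_b64decode(token.encode())
    except ValueError as exc:  # binascii.Error is a subclass of ValueError
        raise ValueError("malformed page token") from exc
    signature, payload = raw[:32], raw[32:]  # SHA-256 digests are 32 bytes
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("page token failed signature check")
    return json.loads(payload)

token = encode_page_token({"after_id": "usr_8x3kPq", "filter": {"active": True}})
state = decode_page_token(token)
print(state["after_id"])  # usr_8x3kPq
```

Because clients only ever echo the token back, the server is free to change what goes into `state` later without a breaking API change.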

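Default to cursor pagination for new APIs. Offset feels simpler, but it breaks in two common cases: active datasets (unstable under inserts and deletes) and large datasets (performance). Cursor handles both. Reserve offset for cases you control end to end, such as an admin tool over a small table, where neither cost applies.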

Returning pagination metadata

Next-page information can live in three places:

Response body. {"items": [...], "next_cursor": "abc", "has_more": true}. The most common choice; explicit; clients always see it.
Link header (RFC 8288). Link: </users?cursor=abc>; rel="next". A HATEOAS-flavored approach. GitHub does this.
Both. Belt and suspenders: the body for SDK convenience, the Link header for tooling that parses standard headers.

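Total count is a separate question. If you need it ("showing 1-20 of 4,500"), return it in the body as {"total": 4500} or in an X-Total-Count: 4500 header. But computing a total count on every request is expensive on large tables, so many APIs return only a "more available" boolean and reserve the total for an explicit count endpoint.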
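
The "both" option can be sketched as a single helper that produces the body and the Link header from the same inputs, so the two cannot drift apart. `build_page_response` and its parameters are hypothetical names for illustration; the sketch is framework-agnostic and returns plain dicts.

```python
from urllib.parse import urlencode

def build_page_response(items, next_cursor, limit, base_path="/users"):
    """Return (body, headers) carrying the same next-page info in both places."""
    has_more = next_cursor is not None
    body = {"items": items, "next_cursor": next_cursor, "has_more": has_more}
    headers = {}
    if has_more:
        query = urlencode({"cursor": next_cursor, "limit": limit})
        # RFC 8288 format: <target-uri>; rel="relation-type"
        headers["Link"] = f'<{base_path}?{query}>; rel="next"'
    return body, headers

body, headers = build_page_response(["a", "b"], "usr_8x3kPq", limit=20)
print(headers["Link"])  # </users?cursor=usr_8x3kPq&limit=20>; rel="next"

# On the last page there is no cursor, so no Link header is emitted.
last_body, last_headers = build_page_response([], None, limit=20)
```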

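Pagination reality in cwkPippa

The cwkPippa session list endpoint currently uses offset (?offset=0&limit=50), because the dataset is small (Dad has maybe 500 conversations across all four brains). Once that number passes a few thousand, cursor pagination becomes attractive, especially for the council list, which accumulates one round at a time. The switch is known future work; until then, offset pagination ships fine. This is a case where deferring is the right call: the cost is low today, and there is a clear migration path for when the cost flips.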

Code

Three pagination patterns side by side·bash

# Offset pagination: works on small datasets, breaks at scale
curl 'https://api.example.com/users?offset=0&limit=20'
# {
#   "items": [...],
#   "total": 4500,
#   "offset": 0,
#   "limit": 20
# }

# Cursor pagination: stable and fast at any scale
curl 'https://api.example.com/users?limit=20'
# {
#   "items": [...],
#   "next_cursor": "usr_8x3kPq",
#   "has_more": true
# }

# Follow the next page
curl 'https://api.example.com/users?cursor=usr_8x3kPq&limit=20'
# {
#   "items": [...],
#   "next_cursor": "usr_LmN9Op",
#   "has_more": true
# }

# Link header alternative (RFC 8288, GitHub style)
curl -i 'https://api.example.com/users?limit=20'
# HTTP/1.1 200 OK
# Link: </users?cursor=usr_8x3kPq&limit=20>; rel="next"
# Content-Type: application/json

Cursor pagination: fetch limit+1 to detect 'has more'·python

# FastAPI: cursor pagination implementation
from fastapi import FastAPI, Query
from sqlalchemy import and_, or_, select  # hypothetical ORM setup

# User (the model), db (an async session), encode_cursor and decode_cursor
# are assumed to be defined elsewhere in the application.

app = FastAPI()

@app.get('/users')
async def list_users(
    limit: int = Query(20, ge=1, le=100),
    cursor: str | None = Query(None, description='Opaque cursor from the previous page'),
):
    # Stable sort order: created_at with id as the tiebreaker
    query = select(User).order_by(User.created_at, User.id)

    if cursor:
        # The cursor encodes (created_at, id) of the last row seen
        last_created_at, last_id = decode_cursor(cursor)
        # Row-wise "greater than". Comparing two plain Python tuples would not
        # build a SQL expression, so the predicate is spelled out explicitly.
        query = query.where(
            or_(
                User.created_at > last_created_at,
                and_(User.created_at == last_created_at, User.id > last_id),
            )
        )

    # Fetch one extra row to detect whether a next page exists
    rows = await db.execute(query.limit(limit + 1))
    items = list(rows.scalars())
    has_more = len(items) > limit
    items = items[:limit]  # drop the probe row

    next_cursor = None
    if has_more and items:
        last = items[-1]
        next_cursor = encode_cursor((last.created_at, last.id))

    return {
        'items': [u.to_dict() for u in items],
        'next_cursor': next_cursor,
        'has_more': has_more,
    }
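
The handler above calls `encode_cursor` and `decode_cursor` without defining them. A minimal sketch of what they might look like follows, assuming `created_at` is a `datetime` and `id` is a string; it uses base64 over JSON, which is opaque enough for a demo but not tamper-proof.

```python
import base64
import json
from datetime import datetime, timezone

def encode_cursor(position: tuple[datetime, str]) -> str:
    """Pack (created_at, id) into a URL-safe string."""
    created_at, row_id = position
    # datetime is not JSON-serializable, so store it as an ISO 8601 string.
    raw = json.dumps([created_at.isoformat(), row_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> tuple[datetime, str]:
    """Reverse encode_cursor."""
    created_at, row_id = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return datetime.fromisoformat(created_at), row_id

position = (datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc), "usr_8x3kPq")
cursor = encode_cursor(position)
print(decode_cursor(cursor) == position)  # True
```

A real handler would also catch decoding errors from a malformed cursor and turn them into a 400 response rather than letting them surface as a server error.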

Client: a paginate_all generator that never loads everything·python

# Client: walk every page
import httpx

def paginate_all(url: str, params: dict | None = None):
    params = dict(params or {})  # copy so the caller's dict is not mutated
    while True:
        resp = httpx.get(url, params=params)
        resp.raise_for_status()
        data = resp.json()
        for item in data['items']:
            yield item
        if not data.get('has_more'):
            break
        params['cursor'] = data['next_cursor']

# Walk every user without ever holding the full list in memory
for user in paginate_all('https://api.example.com/users', {'limit': 100}):
    print(user['id'])

Exercise

Build a FastAPI endpoint that lists 10,000 rows from a database (SQLite plus a quick faker script), with offset and cursor pagination side by side: GET /items-offset?offset=N&limit=20 and GET /items-cursor?cursor=X&limit=20. Benchmark each with time curl ... at offsets 0, 100, 1000, and 10000. Then, while a paginated walk is in progress, insert 100 new rows into the middle of the sort order and observe which paginator skips or repeats items (offset does; cursor does not).

Hint

For cursor encoding, base64(json.dumps([created_at, id])) is enough for a demo. Offset should get slower as the offset grows, while cursor stays flat. The instability demo is the more dramatic of the two: start a paginator (limit=10, iterating slowly with a sleep between requests), then INSERT 100 rows from another shell. Watch offset miscount; watch cursor stay consistent.
