Long conversations: compaction strategies

~18 min · conversation, compaction, long-context


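When the conversation outgrows the context

Even with a 1M-token window, a very long conversation eventually starts to feel "full": attention degrades, cost rises, and latency suffers. Compaction is the deliberate process of replacing older turns with a summary so that the active context stays lean.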
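
One way to make the process deliberate is to trigger compaction from a token budget instead of waiting for the window to fill. The sketch below is illustrative only: `estimate_tokens`, the four-characters-per-token heuristic, and the 0.5 threshold are assumptions, not values from this lesson. A real system would likely count tokens with the provider's tokenizer.

```python
def estimate_tokens(messages):
    """Rough token estimate: about 4 characters per token (assumed heuristic)."""
    return sum(len(m["content"]) for m in messages) // 4


def should_compact(messages, context_window=1_000_000, threshold=0.5):
    """Return True once estimated usage crosses a fraction of the window.

    The 0.5 threshold is an arbitrary illustration; tune it against your own
    quality, cost, and latency measurements.
    """
    return estimate_tokens(messages) >= context_window * threshold


history = [{"role": "user", "content": "x" * 400}] * 10  # about 1,000 tokens
print(should_compact(history, context_window=1_500))  # True
print(should_compact(history, context_window=4_000))  # False
```

Compacting well before the hard limit leaves room for the summary itself and for the next few turns.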

Three compaction strategies

Sliding window: keep the last N turns verbatim and drop everything earlier. Cheap, but loses context.
Summary buffer: replace older turns with a running summary. Keeps the gist, loses the verbatim text.
Hybrid: pinned system prompt + summary of older turns + last N turns verbatim. The default for serious systems.
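
The first two strategies can be sketched in a few lines. This is a minimal illustration, assuming messages are dicts with `role` and `content` keys; `summarize` is a hypothetical caller-supplied function (typically one model call), and `truncate` is only a stand-in so the example runs without a model.

```python
def sliding_window(messages, keep_recent=6):
    """Keep the last keep_recent turns verbatim and drop everything older."""
    if keep_recent <= 0:
        return []
    return messages[-keep_recent:]


def summary_buffer(summary, new_messages, summarize):
    """Fold new turns into a running summary and return the updated summary.

    `summarize` is a caller-supplied function (hypothetical; typically one
    model call) that takes text and returns shorter text.
    """
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in new_messages)
    return summarize(f"{summary}\n{transcript}".strip())


def truncate(text):
    """Stand-in summarizer for demonstration: keeps the first 60 characters."""
    return text[:60]


turns = [{"role": "user", "content": f"message {i}"} for i in range(10)]
print(len(sliding_window(turns, keep_recent=3)))  # 3
print(summary_buffer("", turns[:2], truncate))
```

The sliding window never calls a model, which is why it is cheap. The summary buffer pays one summarization call per update in exchange for keeping the gist.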

What to keep verbatim

The current task and its intermediate results.
Tool calls in the active loop.
Anything the user has just referenced ("as you said earlier").

What to summarize

Resolved sub-tasks.
Background context that will not be quoted again.
Tool results that are no longer relevant.
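
The keep-versus-summarize rules above could be applied mechanically if each turn carried a little metadata. The sketch below assumes two hypothetical fields that the lesson does not define: `task_id` (which sub-task a turn belongs to) and `pinned` (set when the user has just referred back to a turn). How those fields get populated is left open.

```python
def partition(messages, active_task, keep_recent=2):
    """Split messages into (to_summarize, to_keep), preserving order.

    Assumed, hypothetical metadata on each message dict:
      "task_id": the sub-task this turn belongs to
      "pinned":  True if the user just referred back to this turn
    """
    cutoff = len(messages) - keep_recent
    to_summarize, to_keep = [], []
    for i, m in enumerate(messages):
        keep = (
            i >= cutoff                         # one of the most recent turns
            or m.get("task_id") == active_task  # current task and its results
            or m.get("pinned", False)           # just referenced by the user
        )
        (to_keep if keep else to_summarize).append(m)
    return to_summarize, to_keep


log = [
    {"role": "user", "content": "Set up the repo", "task_id": "setup"},
    {"role": "assistant", "content": "Repo is ready", "task_id": "setup"},
    {"role": "user", "content": "Use Postgres 16", "task_id": "setup", "pinned": True},
    {"role": "user", "content": "Now write the migration", "task_id": "migrate"},
    {"role": "assistant", "content": "Draft migration v1", "task_id": "migrate"},
]
old, kept = partition(log, active_task="migrate", keep_recent=1)
print([m["content"] for m in old])   # ['Set up the repo', 'Repo is ready']
print([m["content"] for m in kept])  # ['Use Postgres 16', 'Now write the migration', 'Draft migration v1']
```

Resolved sub-tasks fall into the first list because their `task_id` no longer matches the active task. A pinned turn kept out of its original neighborhood may need a short note about where it came from.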

Code

Hybrid compaction·python

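def compact(messages, system, keep_recent: int = 6, summarize_older=True):
    # `system` is left untouched here; the caller is assumed to keep it pinned outside `messages`.
    if len(messages) <= keep_recent:
        return messages
    # Split by index so keep_recent=0 works (messages[-0:] would return the whole list).
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    if summarize_older and older:
        # summarize_messages is an assumed helper (e.g. one model call) that returns a string.
        summary = summarize_messages(older)
        return [{"role": "user", "content": f"[earlier conversation summary]\n{summary}"}] + recent
    # No summary requested: fall back to a plain sliding window.
    return recent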
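
`compact` calls `summarize_messages`, which this lesson leaves undefined. A placeholder like the one below is enough to wire things up and test the slicing logic. It is an assumption for illustration, not the lesson's summarizer: it truncates each turn instead of calling a model, and `max_chars_per_turn` is a made-up parameter.

```python
def summarize_messages(messages, max_chars_per_turn=80):
    """Placeholder summarizer: one truncated line per turn.

    This is NOT a real summary. In practice this would be a model call with
    a summarization prompt; that call is deliberately left out here.
    """
    lines = []
    for m in messages:
        text = " ".join(m["content"].split())  # collapse whitespace and newlines
        if len(text) > max_chars_per_turn:
            text = text[:max_chars_per_turn].rstrip() + "..."
        lines.append(f"- {m['role']}: {text}")
    return "\n".join(lines)


older = [
    {"role": "user", "content": "Please rename the config file."},
    {"role": "assistant", "content": "Done.\nRenamed to settings.toml."},
]
print(summarize_messages(older))
# - user: Please rename the config file.
# - assistant: Done. Renamed to settings.toml.
```

Swapping in a real model call later only changes this one function, since the compaction logic just expects a string back.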

External links

Anthropic — Long-running conversations

Exercise

Implement hybrid compaction at turn 30 of a long conversation log. Compare the model's performance on follow-up questions with and without compaction, and note any degradation.
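
A possible starting point for the comparison is a small harness that asks each follow-up twice, once against the full log and once against the compacted one. Everything here is a sketch under assumptions: `ask` and `compact_fn` are hypothetical caller-supplied functions, and `fake_ask` is a keyword lookup standing in for a model so the example runs offline.

```python
def compare_followups(log, questions, ask, compact_fn, compact_at=30):
    """Ask each follow-up against the full log and the compacted log.

    `ask(messages, question)` is a hypothetical function that sends the
    history plus one question to a model and returns its answer.
    `compact_fn(messages)` returns the compacted version of a history.
    """
    head, tail = log[:compact_at], log[compact_at:]
    compacted = compact_fn(head) + tail  # compaction happens at turn compact_at
    rows = []
    for q in questions:
        rows.append({
            "question": q,
            "full": ask(log, q),
            "compacted": ask(compacted, q),
            "full_turns": len(log),
            "compacted_turns": len(compacted),
        })
    return rows


def fake_ask(messages, question):
    """Stand-in for a model call: 'yes' if the question text appears in history."""
    return "yes" if any(question in m["content"] for m in messages) else "no"


log = [{"role": "user", "content": f"fact-{i:02d}"} for i in range(40)]
rows = compare_followups(log, ["fact-03", "fact-35"], fake_ask, lambda ms: ms[-6:])
for row in rows:
    print(row["question"], row["full"], row["compacted"])
# fact-03 yes no
# fact-35 yes yes
```

Rows where the two answers differ are the degradation to note. With a real model, the answers would need a grading step rather than a string comparison.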
