Summarization · memory compression · drift

~12 min · summarization, memory, compression

When summaries beat raw history

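In a long-running chat, raw history grows without bound, and so does cost. Periodic summarization replaces older turns with a condensed assistant note that preserves the facts the model needs and drops the conversational fluff. Done well, it keeps quality high and token counts low.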
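To make the cost argument concrete, here is a back-of-the-envelope sketch. It assumes every turn costs the same number of tokens and that each request resends the whole context. The function names and the numbers (100 tokens per turn, a 300-token summary note, 8 verbatim turns) are illustrative assumptions, not measurements, and the cost of the summarization calls themselves is left out.

```python
def raw_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens when every request resends the full raw history."""
    # Request k carries k turns, so the total is (1 + 2 + ... + turns) * turn size.
    return tokens_per_turn * turns * (turns + 1) // 2


def summarized_input_tokens(turns: int, tokens_per_turn: int,
                            keep_verbatim: int, summary_tokens: int) -> int:
    """Total input tokens when everything older than the last
    `keep_verbatim` turns is replaced by one fixed-size summary note."""
    total = 0
    for k in range(1, turns + 1):
        verbatim = min(k, keep_verbatim)
        note = summary_tokens if k > keep_verbatim else 0
        total += verbatim * tokens_per_turn + note
    return total


if __name__ == "__main__":
    print(raw_input_tokens(50, 100))
    print(summarized_input_tokens(50, 100, keep_verbatim=8, summary_tokens=300))
```

Under these assumptions the raw total grows quadratically with the number of turns, while the summarized total grows linearly once the verbatim window is full.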

What to preserve

A good summary preserves user-stated facts ('I work in pharma'), explicit preferences ('always reply in Spanish'), unresolved tasks, and conclusions already reached. A bad summary preserves greetings, banter, and the model's own apologies. Hand-craft the summarization prompt so that it calls out what to keep.
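One way to hand-craft such a prompt is to spell out the keep and drop lists explicitly. The sketch below is a minimal illustration: `build_summary_prompt`, the `KEEP` and `DROP` lists, the prompt wording, and the `<conversation>` wrapper are all assumptions made for this example, and the message format is assumed to be dicts with `role` and `content` keys.

```python
# Categories taken from the paragraph above; the exact wording is an assumption.
KEEP = [
    "facts the user stated about themselves",
    "explicit preferences",
    "unresolved tasks",
    "conclusions already reached",
]
DROP = ["greetings", "banter", "the assistant's own apologies"]


def build_summary_prompt(turns: list[dict]) -> str:
    """Render a summarization prompt that names what to keep and what to drop."""
    transcript = "\n".join(f"{t['role']}: {t['content']}" for t in turns)
    keep = "\n".join(f"- {item}" for item in KEEP)
    drop = "\n".join(f"- {item}" for item in DROP)
    return (
        "Summarize the conversation below as a short note.\n"
        f"Always keep:\n{keep}\n"
        f"Leave out:\n{drop}\n"
        "Do not add anything that was not said.\n\n"
        f"<conversation>\n{transcript}\n</conversation>"
    )


if __name__ == "__main__":
    demo = [
        {"role": "user", "content": "Hi! I work in pharma."},
        {"role": "assistant", "content": "Hello! How can I help?"},
    ]
    print(build_summary_prompt(demo))
```

Keeping the lists as data rather than burying them in prose makes it easy to adjust them when a probe question shows that something important was dropped.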

Drift is real

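Every summarization pass loses fidelity. Run too many and the model's effective memory degrades into a vague impression. Summarize in tiers instead of applying one flat compression: keep the last N turns verbatim, summarize the middle tier lightly, and summarize the oldest tier heavily.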
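A toy model shows how quickly repeated passes compound. It assumes, purely for illustration, that each pass independently keeps any given fact with a fixed probability. Real summarizers do not behave this uniformly, so `expected_retention` and the 0.9 keep rate are hypothetical.

```python
def expected_retention(passes: int, keep_rate: float) -> float:
    """Expected fraction of facts left after `passes` rounds of re-summarization.

    Toy assumption: every pass independently keeps each fact with
    probability `keep_rate`.
    """
    if passes < 0 or not 0.0 <= keep_rate <= 1.0:
        raise ValueError("passes must be >= 0 and keep_rate must be in [0, 1]")
    return keep_rate ** passes


if __name__ == "__main__":
    for passes in (1, 5, 10):
        print(passes, round(expected_retention(passes, 0.9), 3))
```

Even at a 90% keep rate, ten passes leave only about a third of the original facts in this model. Turns kept verbatim have gone through zero passes, which is why the most recent tier stays out of the compressor.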

Principle: summarization is lossy compression. Decide what is worth keeping before you call the compressor.

Code

Tiered summarization · python

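VERBATIM = 8     # most recent turns, kept word for word
SUMMARIZED = 24  # turns just before those, summarized lightly


def summarize(turns: list[dict], depth: str) -> str:
    """Placeholder (an assumption, not a real API): replace with a model call."""
    raise NotImplementedError


def tiered(history: list[dict]) -> list[dict]:
    # Short conversations fit entirely inside the verbatim window.
    if len(history) <= VERBATIM:
        return history
    recent = history[-VERBATIM:]
    middle = history[-(VERBATIM + SUMMARIZED):-VERBATIM]
    older = history[:-(VERBATIM + SUMMARIZED)]  # empty until history outgrows both tiers

    middle_summary = summarize(middle, depth="light") if middle else ""
    older_summary = summarize(older, depth="heavy") if older else ""

    # Oldest note first, so the rebuilt context stays in chronological order.
    notes = []
    if older_summary:
        notes.append({"role": "user", "content": f"<early_summary depth='heavy'>{older_summary}</early_summary>"})
    if middle_summary:
        notes.append({"role": "user", "content": f"<recent_summary depth='light'>{middle_summary}</recent_summary>"})
    return notes + recent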
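The negative-index slices in `tiered` are easy to misread, so here is a small standalone sketch that isolates them. `split_tiers` is a hypothetical helper written for this illustration. It uses the same slice expressions and makes no model calls, so the tier sizes can be checked directly.

```python
def split_tiers(history: list, verbatim: int = 8, summarized: int = 24):
    """Split history into (older, middle, recent) with the same slices as tiered()."""
    if verbatim <= 0 or summarized < 0:
        raise ValueError("verbatim must be > 0 and summarized must be >= 0")
    recent = history[-verbatim:]
    middle = history[-(verbatim + summarized):-verbatim]
    older = history[:-(verbatim + summarized)]
    return older, middle, recent


if __name__ == "__main__":
    for n in (5, 20, 50):
        older, middle, recent = split_tiers(list(range(n)))
        # The three tiers always partition the history, in order.
        assert older + middle + recent == list(range(n))
        print(n, len(older), len(middle), len(recent))
```

Python clamps out-of-range slice bounds, so for short histories the older and middle tiers simply come back empty rather than raising an error.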

Exercise

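Add tiered summarization to a multi-turn chat. Run the same 50-turn conversation with and without it, then compare total input tokens and the quality of the answers to probe questions that depend on early context.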

Hint

If a probe question fails after summarization, the summarization prompt is dropping a fact that needed to be preserved.
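A quick first check, before running full probe questions, is to look for the must-keep facts in the summary text itself. The sketch below assumes a crude case-insensitive substring match. `missing_facts` and the sample strings are made up for illustration, and a paraphrased fact will be reported as missing even though it survived.

```python
def missing_facts(summary: str, required: list[str]) -> list[str]:
    """Return the required facts that do not appear in the summary.

    Uses a case-insensitive substring match, so paraphrases count as missing.
    """
    lowered = summary.lower()
    return [fact for fact in required if fact.lower() not in lowered]


if __name__ == "__main__":
    summary = "User works in pharma. Open task: draft the report."
    print(missing_facts(summary, ["pharma", "Spanish"]))
```

Anything this returns is a candidate for the keep list in the summarization prompt. The probe questions remain the real test.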
