Tree-of-Thought and Self-Consistency

~16 min · reasoning, tot, self-consistency

A single reasoning chain can be wrong

Sample one CoT and trust it, and you have staked the answer on a single draw from a stochastic process. Two low-effort improvements: draw several samples and reconcile them (self-consistency), or branch the reasoning into a tree and prune it (Tree-of-Thought).

Self-consistency

Generate N reasoning chains at temperature > 0. Take the answer that shows up most often. On tasks with a small answer space (math, classification) this gives a measurable accuracy boost.
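To get a feel for why voting helps, here is a minimal sketch under strong simplifying assumptions: the answer is binary, every chain is correct with the same probability p, and chains are independent. Real samples from one model are correlated, so treat the numbers as an optimistic intuition, not a prediction. The function name `majority_correct_prob` is made up for this illustration.

```python
from math import comb


def majority_correct_prob(p: float, n: int) -> float:
    """Probability that more than half of n samples are correct.

    Assumes independent samples, each correct with probability p,
    and a binary answer space (so a wrong majority is the only failure).
    """
    if n % 2 == 0:
        raise ValueError("use an odd n so the vote cannot tie")
    need = n // 2 + 1  # smallest number of correct samples that wins the vote
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


# Compare a single sample against 5- and 9-sample votes at p = 0.6.
for n in (1, 5, 9):
    print(n, round(majority_correct_prob(0.6, n), 4))
```

If p is below 0.5 the same formula shows voting making things worse, which is why self-consistency only pays off when single chains are right more often than not.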

Tree-of-Thought (ToT)

At each reasoning step, generate multiple candidates, score them, and prune. It costs more than self-consistency, but it can backtrack mid-reasoning, which makes it more flexible. Useful for planning, multi-step search, and complex agent loops.
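The generate, score, prune loop can be sketched as a small beam search. This is one simple variant of ToT, not the canonical algorithm: it keeps the best few partial chains per step rather than doing explicit depth-first backtracking. The `propose` and `score` callables are assumptions; in practice each would wrap a model call, and here they are toy stand-ins so the block runs offline.

```python
from typing import Callable, List


def tree_of_thought(
    root: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    depth: int = 3,
    beam_width: int = 2,
) -> str:
    """Expand every kept state, score the children, keep the top beam_width."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break  # nothing left to expand
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]  # prune everything else
    return max(frontier, key=score)


# Toy task: build a 3-digit string from the digits 1-3 whose digits sum to 6.
def propose_digits(state: str) -> List[str]:
    return [state + d for d in "123"]


def score_sum_to_six(state: str) -> float:
    return -abs(6 - sum(int(c) for c in state))


best = tree_of_thought("", propose_digits, score_sum_to_six, depth=3, beam_width=2)
print(best)
```

In this toy run the top-scoring state after two steps ("33") cannot be completed well, and the runner-up branch ("32") produces the final answer. Keeping more than one branch alive is what lets the search recover from a locally attractive dead end.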

Cost reality

N=5 self-consistency costs about 5x. Tree-of-Thought costs more. Use them only where the accuracy gain is worth the cost: typically high-stakes one-shot tasks, not interactive chat.
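A rough way to compare the two is to count model calls. The sketch below assumes a beam-search style ToT where every candidate costs one generation call plus one scoring call, and where the beam is full from the second step onward; both function names are hypothetical. Call counts ignore token lengths, so they are only a proxy for actual spend.

```python
def self_consistency_calls(n_samples: int) -> int:
    """One generation call per sampled chain."""
    return n_samples


def tot_beam_calls(depth: int, beam_width: int, branching: int) -> int:
    """Calls for a beam-search ToT under the assumptions stated above.

    Step 1 expands the single root; each later step expands beam_width states.
    Every expanded state yields `branching` candidates, and each candidate
    costs one generation call and one scoring call.
    """
    expanded_states = 1 + beam_width * (depth - 1)
    candidates = expanded_states * branching
    return 2 * candidates


print(self_consistency_calls(5))
print(tot_beam_calls(depth=3, beam_width=2, branching=3))
```

Even a shallow tree with a narrow beam lands well above a 5-sample vote, which is the main reason to reserve ToT for tasks where a wrong answer is expensive.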

Code

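Self-consistency vote · python

import re
from collections import Counter

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
prompt = "<your question>. End with a line 'Answer: <value>'."  # placeholder

def extract_final_answer(text):
    # Hypothetical helper: return the last "Answer: ..." value, or None if absent.
    matches = re.findall(r"Answer:\s*(.+)", text)
    return matches[-1].strip() if matches else None

answers = []
for _ in range(7):
    out = client.messages.create(
        model="claude-opus-4-7",
        temperature=0.8,  # must be > 0 so the chains actually differ
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    ).content[0].text
    answers.append(extract_final_answer(out))
# Majority vote; assumes at least one chain produced a parseable answer.
final = Counter(a for a in answers if a is not None).most_common(1)[0][0]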

Exercise

On a small set of hard, classifiable tasks, compare 1-sample, 5-sample-vote, and 9-sample-vote (temperature 0.8). Plot accuracy against cost. Pick your operating point.
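One possible starting harness for this comparison is sketched below. It assumes you supply `sample_answer`, a function that makes one model call at temperature 0.8 and returns the extracted label; here a deterministic fake sampler stands in so the block runs without an API key. All names are illustrative, and cost is measured simply as the number of calls.

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Dict, List, Tuple


def evaluate_vote(
    dataset: List[Tuple[str, str]],
    sample_answer: Callable[[str], str],
    n_samples: int,
) -> Dict[str, float]:
    """Accuracy and call count for an n_samples majority vote over a dataset."""
    correct = 0
    for question, gold in dataset:
        votes = [sample_answer(question) for _ in range(n_samples)]
        winner = Counter(votes).most_common(1)[0][0]
        correct += int(winner == gold)
    return {
        "n_samples": n_samples,
        "accuracy": correct / len(dataset),
        "calls": n_samples * len(dataset),
    }


# Fake sampler: cycles through A, B, B regardless of the question.
def make_fake_sampler() -> Callable[[str], str]:
    answers = cycle(["A", "B", "B"])
    return lambda question: next(answers)


dataset = [("q1", "B"), ("q2", "B")]  # toy (question, gold label) pairs
results = [evaluate_vote(dataset, make_fake_sampler(), n) for n in (1, 5, 9)]
for row in results:
    print(row)
```

Swap the fake sampler for a real model call, then plot `accuracy` against `calls` to see where the curve flattens.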
