Base vs Instruct 모델

두 가지 출발점

모든 오픈 웨이트 모델 패밀리는 보통 파인튜닝 출발점이 두 가지야.

Base 모델

거대한 텍스트 코퍼스에서 다음 토큰 예측만으로 학습된 모델. 텍스트 완성은 아름답게 하지만 자연스럽게 지시 따르거나 대화 못 해. 예: meta-llama/Llama-3.1-8B, mistralai/Mistral-7B-v0.3, Qwen/Qwen3-7B. 빈 캔버스 같은 거.

Instruct / chat 모델

Base 모델을 SFT랑 preference 최적화(RLHF, DPO, ORPO)로 추가 학습해서 지시 따르고, 해로운 요청 거절하고, 대화하게 만든 거. 예: meta-llama/Llama-3.1-8B-Instruct, mistralai/Mistral-7B-Instruct-v0.3.

어느 쪽 파인튜닝해?

출발	적합한 경우	트레이드오프
Instruct	대부분 프로젝트. 대화 능력 + 네 전문 영역.	쉽고, chat 능력 보존, 입문자 추천.
Base	최대 컨트롤, 특이 출력 포맷, 비-chat 작업(분류/추출).	어렵고, 데이터 더 필요, 조심 안 하면 alignment 잃을 수 있어.

기본 추천

특별한 이유 없으면 Instruct로 시작해. 모델 제작자가 이미 한 alignment 작업 다 상속받아 — 안전 필터, 시스템 프롬프트 인지, 멀티턴 일관성 — 그 위에 네 전문 영역만 추가하면 돼. Instruct에서 도메인 특화 chat 모델로 가는 길은 Base에서 가는 길보다 훨씬 짧아.

Code

Loading either flavor with the same call·python

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Base model — blank canvas
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Instruct model — already chat-tuned
chat = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The chat-tuned tokenizer ships a chat_template; the Base does NOT.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
print(tok.chat_template[:200])  # Jinja template that formats messages

Exercise

모델 패밀리 하나 골라(Llama 3.1, Mistral, Qwen 3, Gemma 3 — 자유). Hub에서 Base랑 Instruct 모델 카드 둘 다 읽어. 한 단락 결정 작성: 네가 가장 가능성 높은 파인튜닝 use case에 대해 어느 쪽에서 출발하고 왜? Instruct 학습의 구체적 요소 최소 하나 인용.