The GPT lineage: GPT-1 to GPT-5

OpenAI's GPT series defined the scaling paradigm. Reading the lineage in order is the fastest route through the history of modern AI.

Model	Year	Params	Context (tokens)	Key change
GPT-1	2018	117M	512	Pretraining + fine-tuning recipe
GPT-2	2019	1.5B	1,024	Coherent multi-paragraph generation
GPT-3	2020	175B	2,048	In-context learning at scale
GPT-3.5 / ChatGPT	2022	~175B (similar)	4k → 16k	RLHF + chat format made the model accessible
GPT-4	2023	~1.8T (MoE, estimated)	8k → 128k	Multimodal with vision input, large quality jump
GPT-4o	2024	n/a (native multimodal)	128k	Native audio + image, lower inference cost
GPT-4.1	April 2025	Undisclosed	1M	1M context, coding focus
GPT-5	August 2025	~1.7T total (estimated)	400k	Real-time router across fast / thinking / Pro modes
GPT-OSS-120b	2025	117B (5.1B active)	—	Open-weight MoE, fits on a single H100
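
To make the scaling jumps in the table concrete, here is a minimal sketch that computes the parameter growth between consecutive generations. It uses only the three counts the table gives without a "~" (GPT-1, GPT-2, GPT-3). The helper name `growth_factors` is made up for this illustration.

```python
# Parameter counts copied from the table above (exact figures only,
# estimated rows such as GPT-4 and GPT-5 are deliberately left out).
params = {
    "GPT-1": 117e6,
    "GPT-2": 1.5e9,
    "GPT-3": 175e9,
}


def growth_factors(counts):
    """Return (older, newer, ratio) for each consecutive pair of models."""
    names = list(counts)  # dicts preserve insertion order
    return [(a, b, counts[b] / counts[a]) for a, b in zip(names, names[1:])]


factors = growth_factors(params)
for older, newer, ratio in factors:
    print(f"{older} -> {newer}: about {ratio:.0f}x more parameters")
```

Each step is roughly one to two orders of magnitude, which is the pattern the phrase "scaling paradigm" refers to.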

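GPT-5 introduces a real-time router that dispatches each query to one of several reasoning modes (fast, thinking, or Pro). Its total context is 400K tokens (272K input + 128K output). The open-weight variant GPT-OSS-120b is released under Apache 2.0 and fits on a single H100 GPU.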
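
The 400K figure is a sum of two separate limits, which matters when budgeting a request. The sketch below checks a request against both. It assumes "K" means 1,000 tokens and that the input and output limits apply independently. The function name `fits_budget` is hypothetical and not part of any OpenAI SDK.

```python
# Limits taken from the paragraph above. Assumption: "K" = 1,000 tokens.
GPT5_MAX_INPUT_TOKENS = 272_000
GPT5_MAX_OUTPUT_TOKENS = 128_000
GPT5_TOTAL_CONTEXT = GPT5_MAX_INPUT_TOKENS + GPT5_MAX_OUTPUT_TOKENS


def fits_budget(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Check a request against the input and output limits separately.

    Assumption: unused output budget cannot be spent on a longer prompt,
    so a 300K-token prompt is rejected even though 300K < 400K.
    """
    return (
        0 <= prompt_tokens <= GPT5_MAX_INPUT_TOKENS
        and 0 <= requested_output_tokens <= GPT5_MAX_OUTPUT_TOKENS
    )


print(GPT5_TOTAL_CONTEXT)
print(fits_budget(200_000, 8_000))
print(fits_budget(300_000, 1_000))
```

The third call is the case worth remembering: a prompt can be under the headline 400K total and still exceed the input limit.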

Code

Reading a model card systematically (Python)

# When you encounter a new model, extract these in order:
fields = [
    "release date",
    "parameter count (total / active for MoE)",
    "architecture shape (d_model, n_layers, n_heads)",
    "vocabulary size and tokenizer",
    "context window (input + output)",
    "training data: source, scale, cutoff date",
    "post-training stack (SFT? DPO? GRPO? CAI?)",
    "license (commercial use? attribution?)",
    "modalities (text only? vision? audio?)",
    "stated benchmarks and known weaknesses",
]
# This is the universal template that lets you place any new model
# into the lineage without getting lost in marketing claims.
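
As a possible next step, the template can be turned into a small record type so that gaps in a model card become visible. The sketch below covers only a subset of the fields and fills in GPT-OSS-120b using values stated in this lesson. Fields the lesson does not give are left as `None` rather than guessed. The `ModelCard` class and its method names are illustrative, not an existing library.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ModelCard:
    """A subset of the template fields above (hypothetical structure)."""
    name: str
    release_date: Optional[str] = None
    total_params: Optional[float] = None
    active_params: Optional[float] = None  # differs from total only for MoE
    context_window: Optional[int] = None
    license: Optional[str] = None
    modalities: Optional[str] = None

    def active_fraction(self) -> Optional[float]:
        """Share of parameters used per token, if both counts are known."""
        if self.total_params is None or self.active_params is None:
            return None
        return self.active_params / self.total_params

    def missing(self) -> list:
        """Names of fields that still need to be looked up."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Values below come from the table and paragraph in this lesson.
gpt_oss = ModelCard(
    name="GPT-OSS-120b",
    release_date="2025",
    total_params=117e9,
    active_params=5.1e9,
    license="Apache 2.0",
)

print(f"active fraction: {gpt_oss.active_fraction():.1%}")
print("still to look up:", gpt_oss.missing())
```

Listing the missing fields explicitly is the point of the exercise: the lesson gives no context window or modalities for this model, so the record says so instead of filling the gap with a guess.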
