Text generation

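Generation = next-token prediction in a loop

A causal language model does exactly one thing: predict the next token from everything it has been given so far. generate() wraps that single step in a loop: predict a token, append it, feed the longer sequence back in, and repeat until max_length or an end-of-text token is reached. Loading the generator is the same one-liner you've seen throughout the track (GemmaCausalLM.from_preset(...); the Code section uses GPT2CausalLM the same way). Pass in a prompt string and you get a completed string back.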
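To make the loop concrete, here's a minimal sketch in plain Python. It is not how generate() is implemented in KerasHub. The lookup table, predict_next, and the "<eos>" marker are hypothetical stand-ins for a real model, which would return a probability distribution rather than a single fixed token.

```python
# Toy stand-in for a causal LM. A real model returns a probability
# distribution; this hypothetical table has exactly one continuation per
# token so the loop itself is easy to follow.
NEXT_TOKEN = {
    "the": "future",
    "future": "of",
    "of": "ai",
    "ai": "<eos>",
}
END_TOKEN = "<eos>"  # hypothetical end-of-text marker


def predict_next(tokens):
    """Hypothetical one-step predictor: looks only at the last token."""
    return NEXT_TOKEN.get(tokens[-1], END_TOKEN)


def generate(prompt_tokens, max_length):
    """Predict, append, feed the longer sequence back in, repeat."""
    tokens = list(prompt_tokens)
    while len(tokens) < max_length:
        next_token = predict_next(tokens)
        if next_token == END_TOKEN:
            break  # stop at the end-of-text token
        tokens.append(next_token)
    return tokens


print(generate(["the"], max_length=10))  # ends because <eos> was predicted
print(generate(["the"], max_length=3))   # ends because max_length was hit
```

The two calls show the two exit conditions: the first stops when the end-of-text token comes up, the second gets cut off by max_length.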

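Personality lives in the sampler

Here's the part everyone underestimates: the model's raw output at each step is a probability distribution over the entire vocab. How you pick from that distribution (the sampler) changes everything. temperature flattens or sharpens the distribution: near 0 you always grab the highest-probability token (deterministic, repetitive, safe); near 1.0 you sample from the distribution roughly as the model produced it, so unlikely tokens get a real chance (creative, occasionally rambling). top_k restricts the choice to the k most likely candidates; top_p (nucleus sampling) keeps the smallest set of top candidates whose cumulative probability reaches p. As in Code, you can pass a sampler to compile() or just use generate()'s defaults, but knowing these knobs exist is what separates a usable demo from a frustrating one. In production, temperature=0.7 with top_p=0.9 is a common starting point.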
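A rough sketch of what those three knobs do to a distribution, in plain Python with a made-up 4-token vocab. The function names and logits here are hypothetical and are not the KerasHub sampler implementations. Note that this sketch needs temperature > 0, since it divides by it.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]  # requires temperature > 0
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Keep the k most likely tokens, zero the rest, renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]


def sample(probs, rng):
    """Draw one token index according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocab
sharp = softmax_with_temperature(logits, temperature=0.1)
flat = softmax_with_temperature(logits, temperature=1.0)
print([round(p, 3) for p in sharp])  # almost all mass on token 0
print([round(p, 3) for p in flat])   # mass spread across tokens
print(top_k_filter(flat, k=2))       # only tokens 0 and 1 survive
print(top_p_filter(flat, p=0.9))     # token 3 is dropped
print(sample(top_p_filter(flat, p=0.9), random.Random(0)))
```

With these logits, temperature 0.1 pushes nearly everything onto the top token, while temperature 1.0 leaves the other tokens a real share. The filters then decide which of those tokens are even eligible before the draw happens.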

Code

Text generation + sampler tuning · python

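import keras_hub

# Load GPT-2 for text generation
gpt2 = keras_hub.models.GPT2CausalLM.from_preset("gpt2_base_en")

# Generate with the default sampler
output = gpt2.generate(
    "The future of AI is",
    max_length=100,
)
print(output)

# Swap the sampler: top_k keeps only the 50 most likely tokens,
# and temperature=0.7 sharpens the distribution a little
gpt2.compile(sampler=keras_hub.samplers.TopKSampler(k=50, temperature=0.7))
print(gpt2.generate("The future of AI is", max_length=100))

# Nucleus sampling alternative: keep the top tokens covering 90% of the probability mass
gpt2.compile(sampler=keras_hub.samplers.TopPSampler(p=0.9, temperature=0.7))
print(gpt2.generate("The future of AI is", max_length=100))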
