generate_content() 와 response 구조

10,000 번 쓸 호출 하나

client.models.generate_content(model=..., contents=..., config=...) 가 SDK 의 심장. contents 는 plain string (SDK 가 wrap), string list, Content 객체 list, 그 사이 어떤 거든 가능 — SDK 가 normalize.

GenerateContentConfig 로 설정

이전 lesson 의 모든 knob (system_instruction, temperature, max_output_tokens 등) 이 types.GenerateContentConfig 위에 살아. plain dict 도 pass 가능 — SDK 가 변환.

Response 가 실제로 뭔지

반환값은 GenerateContentResponse. 가장 자주 만질 field:

response.text — 모든 part 합친 text. 90% 케이스.
response.parts — raw list. non-text part (image, function call) 필요할 때.
response.function_calls — 모델이 tool 호출했을 때 채워짐.
response.parsed — Pydantic schema 로 JSON 모드 썼을 때 deserialize 된 객체.
response.candidates[0].finish_reason — stop reason.
response.usage_metadata — billing 용 토큰 카운트.

가장 먼저 체크할 것

Gemini 호출이 돌아오면 handler 의 첫 줄이 finish_reason 봐야 돼. STOP 아니면 response.text 신뢰 X — 비어있을 (필터됨) 수도, 잘릴 (max tokens 도달) 수도, recitation-차단 일 수도.

Code

Plain text in, text out·python

from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents='Why is the sky blue?',
)
print(response.text)

config 와 함께·python

response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents='Explain quantum entanglement to a curious 10-year-old.',
    config=types.GenerateContentConfig(
        system_instruction='You are a warm, accurate physics tutor.',
        max_output_tokens=400,
        temperature=0.5,
        top_p=0.95,
        top_k=40,
        seed=42,
    ),
)

# Config 는 dict 도 가능 — SDK 가 변환
response = client.models.generate_content(
    model='gemini-2.5-flash',
    contents='Hello',
    config={'temperature': 0.0, 'max_output_tokens': 50},
)

Response 옳게 읽기·python

candidate = response.candidates[0]
reason = candidate.finish_reason

if reason.name != 'STOP':
    # MAX_TOKENS, SAFETY, RECITATION, OTHER
    raise RuntimeError(f'Generation did not finish cleanly: {reason.name}')

text = response.text
usage = response.usage_metadata

print(f'Reply ({usage.total_token_count} tokens):')
print(text)
print(f'  prompt={usage.prompt_token_count}  '
      f'completion={usage.candidates_token_count}')

Exercise

작은 safe_generate(prompt: str) -> str 헬퍼 작성: (1) Flash 에 200 토큰 cap 으로 generate_content, (2) finish_reason 체크 후 STOP 아니면 custom GenerationError raise, (3) 성공 시 text 반환. 세 prompt 로 테스트: 평범한 거, MAX_TOKENS 발동시키는 거 (작은 limit 으로 강제), safety classifier 가 필터할 만한 거. 각 분기 검증.