PDF Understanding + Structured Output

~22 min · structured-outputs, pdf, input_file

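input_file accepts a PDF directly, with no separate chunking or OCR pipeline. Combine it with a Pydantic model as the structured-output schema (response_format in Chat Completions, text_format in client.responses.parse) and an invoice or report extractor fits in about 30 lines.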

One pipeline replaces three

The pre-input_file pattern: pdfplumber → text extraction → chunking → embedding → vector search → prompt assembly → JSON parsing. input_file plus response_format collapses all of that into a single API call.

Long document caveat

input_file processes the whole PDF in one pass, so very long documents (100+ pages) hit the context window limit. In that case chunking is still needed. For invoices, reports, contracts, and single-page forms, input_file is the canonical choice.
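
One way to handle the long-document case is to split the page numbers into ranges and send each range as its own PDF. The sketch below only computes the ranges. The function name `page_ranges` is hypothetical, and the 100-page default simply mirrors the rule of thumb above; the real limit depends on the model and on how dense each page is.

```python
def page_ranges(total_pages: int, max_pages: int = 100) -> list[tuple[int, int]]:
    """Split 1-based page numbers into inclusive (start, end) ranges.

    Each range covers at most max_pages pages. The default of 100 is an
    assumption taken from the rule of thumb in the text, not an API limit.
    """
    if total_pages < 0 or max_pages < 1:
        raise ValueError("total_pages must be >= 0 and max_pages must be >= 1")
    return [
        (start, min(start + max_pages - 1, total_pages))
        for start in range(1, total_pages + 1, max_pages)
    ]


short_doc = page_ranges(3)    # fits in a single call
long_doc = page_ranges(250)   # needs several calls, one per range
print(short_doc)
print(long_doc)
```

Each range would then be written out as a separate PDF with a PDF library of your choice and extracted independently, with the partial results merged afterwards.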

Pydantic as the contract

class Invoice(BaseModel): vendor: str; invoice_number: str; total: float; currency: str; line_items: list[LineItem]. Call with response_format=Invoice and the model returns exactly that shape, so no defensive parsing of the structure is needed.
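
Written out in full, the contract might look like the sketch below. The `Invoice` fields come from the text; the `LineItem` fields (`description`, `quantity`, `unit_price`) are assumptions, since the lesson names the type without defining it. The sample JSON is made up to show validation working locally, without any API call.

```python
import json

from pydantic import BaseModel


class LineItem(BaseModel):
    # Hypothetical fields: the lesson references LineItem but does not define it.
    description: str
    quantity: float
    unit_price: float


class Invoice(BaseModel):
    vendor: str
    invoice_number: str
    total: float
    currency: str
    line_items: list[LineItem]


# Made-up payload in the shape the model is asked to return.
sample = json.loads("""
{
  "vendor": "Acme",
  "invoice_number": "INV-001",
  "total": 120.5,
  "currency": "USD",
  "line_items": [
    {"description": "Widget", "quantity": 2, "unit_price": 60.25}
  ]
}
""")

invoice = Invoice(**sample)  # nested dicts are coerced into LineItem objects
print(invoice.vendor, invoice.line_items[0].description)
```

The same class serves as both the schema sent to the API and the type of the parsed result, which is what makes it a contract.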

Code

Structured output with a raw JSON schema (no PDF yet)·python

import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

completion = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Return a JSON with name and age for 'Alice, 30'"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "strict": True,  # enforce the schema exactly
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
                "additionalProperties": False,
            }
        }
    }
)
# The message content is a JSON string; parse it into a dict.
person = json.loads(completion.choices[0].message.content)

Structured extraction with input_file and text_format·python

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()  # reads OPENAI_API_KEY from the environment

class InvoiceData(BaseModel):
    vendor: str
    total: float
    items: list[str]
    date: str

response = client.responses.parse(
    model="gpt-5.4",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "Extract invoice data from this PDF:"},
            # Placeholder ID: use the ID of a PDF you have already uploaded.
            {"type": "input_file", "file_id": "file-abc123"},
        ]
    }],
    text_format=InvoiceData,  # Pydantic model used as the output schema
)
invoice = response.output_parsed  # typed InvoiceData object
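
The example above assumes the PDF was already uploaded and has a file ID. As far as I understand the Responses API, an input_file part can also carry the file inline as a base64 data URL through `filename` and `file_data`; treat that shape as an assumption and check it against the current API reference. The helper name `pdf_content_part` is hypothetical, and the sketch only builds the content part without making any network call.

```python
import base64


def pdf_content_part(pdf_bytes: bytes, filename: str = "invoice.pdf") -> dict:
    """Build an input_file content part that embeds a PDF inline.

    Assumes the API accepts a data URL in `file_data`; verify against the
    current API reference before relying on it.
    """
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "type": "input_file",
        "filename": filename,
        "file_data": f"data:application/pdf;base64,{encoded}",
    }


# Placeholder bytes; in practice read them with open("invoice.pdf", "rb").read().
part = pdf_content_part(b"%PDF-1.4 placeholder")
print(part["type"], part["filename"])
```

The returned dict would take the place of the `{"type": "input_file", "file_id": ...}` entry in the `content` list.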

Exercise

Pick a PDF invoice (or generate one). Use input_file + response_format to extract {vendor, invoice_number, total, currency, line_items[]}. Run it on 5 different PDFs and measure which field fails most often.
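
For the measurement step, a small scoring harness is enough. The sketch below assumes you have hand-labeled the expected values for each PDF and converted each extraction result to a plain dict; the function name `field_failures` and the sample data are made up for illustration. It uses exact equality, so you may want a tolerance for `total` or a looser comparison for `line_items`.

```python
from collections import Counter

FIELDS = ["vendor", "invoice_number", "total", "currency", "line_items"]


def field_failures(expected: list[dict], extracted: list[dict]) -> Counter:
    """Count, per field, how many documents had a mismatched or missing value."""
    failures: Counter = Counter()
    for want, got in zip(expected, extracted):
        for field in FIELDS:
            if got.get(field) != want.get(field):
                failures[field] += 1
    return failures


# Made-up labels and extraction results for two documents.
expected = [
    {"vendor": "Acme", "invoice_number": "A-1", "total": 10.0, "currency": "USD", "line_items": []},
    {"vendor": "Globex", "invoice_number": "G-7", "total": 99.9, "currency": "EUR", "line_items": []},
]
extracted = [
    {"vendor": "Acme", "invoice_number": "A-1", "total": 11.0, "currency": "USD", "line_items": []},
    {"vendor": "Globex", "invoice_number": "G-7", "total": 9.99, "currency": "USD", "line_items": []},
]

failures = field_failures(expected, extracted)
print(failures.most_common())  # fields sorted from most to least failures
```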
