Regex and Format Validation

Format failures are bugs disguised as quality issues

If the prompt asks for JSON and the model returns markdown-wrapped JSON, the downstream pipeline crashes. That is not a quality failure; it is a format-compliance failure, and it deserves its own metric.
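To make the failure mode concrete, here is a minimal sketch using only the standard library. The payload and the helper name `parses_as_json` are illustrative assumptions, not part of any pipeline described here. The fence marker is built programmatically so the example stays readable inside a code block.

```python
import json

# Three backticks, built at runtime so this listing contains no literal fence.
FENCE = "`" * 3

# Hypothetical model outputs: the same JSON, bare and markdown-wrapped.
raw = '{"title": "Inception", "year": 2010}'
wrapped = f"{FENCE}json\n{raw}\n{FENCE}"

def parses_as_json(text):
    """Return True if a strict JSON parser accepts the text unchanged."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(parses_as_json(raw))
print(parses_as_json(wrapped))
```

The content of both strings is identical; only the wrapper differs, which is why the failure belongs to a format metric and not a quality metric.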

What format validation catches

JSON parses with exactly the expected keys and types.
Code blocks are well-formed, and the declared language matches the content.
Citations follow the required pattern (numbered, bracketed, accompanied by a source).
Lists have the right cardinality (e.g., exactly 5 bullets).
Output stays within the length constraint (e.g., max 100 words).
Required tokens appear ("FINAL ANSWER:", "Step 1:", etc.).
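Several of the checks above reduce to a few lines of standard-library Python. The following is a rough sketch; the function names, the bullet characters recognised, and the whitespace-based word count are all assumptions chosen for illustration, and a real eval would tune them to its own output conventions.

```python
import re

def check_bullet_count(text, expected):
    """True if the text has exactly `expected` lines starting with -, * or •."""
    bullets = [ln for ln in text.splitlines() if re.match(r"^\s*[-*•]\s+\S", ln)]
    return len(bullets) == expected

def check_max_words(text, limit):
    """True if the whitespace-separated word count is within `limit`."""
    return len(text.split()) <= limit

def check_required_token(text, token):
    """True if a required marker such as 'FINAL ANSWER:' appears verbatim."""
    return token in text

def check_bracketed_citation(text):
    """True if at least one numbered, bracketed citation like [3] appears."""
    return re.search(r"\[\d+\]", text) is not None

# Hypothetical model output used only to exercise the checks.
sample = "- fast\n- cheap\n- brittle\nFINAL ANSWER: regex first [1]"
print(check_bullet_count(sample, 3))
print(check_max_words(sample, 100))
print(check_required_token(sample, "FINAL ANSWER:"))
print(check_bracketed_citation(sample))
```

Each check returns a plain boolean, so the results can be aggregated into per-check pass rates.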

Three strictness levels

Regex match: a pattern check. Fast, brittle.
Schema validation: Pydantic / JSON Schema / Zod. Catches type errors and missing fields.
Semantic validation: the output passes the schema, but does the content satisfy its invariants? (e.g., every citation ID points to an actual source)

Principle: Format compliance is an axis separate from quality. Track it independently. A model that writes beautiful prose but breaks the JSON schema 10% of the time is broken.
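The citation example for semantic validation could look roughly like this. It is a sketch under assumptions: citations are written as `[n]`, sources are keyed by integer ID, and the names `cited_ids` and `dangling_citations` are hypothetical.

```python
import re

def cited_ids(text):
    """Collect every numeric ID that appears as a bracketed citation, e.g. [2]."""
    return {int(m) for m in re.findall(r"\[(\d+)\]", text)}

def dangling_citations(text, sources):
    """Return, sorted, the cited IDs that have no matching entry in `sources`."""
    return sorted(cited_ids(text) - set(sources))

# Hypothetical source table and answer; [3] is well-formed but points nowhere.
sources = {1: "latency report", 2: "cost report"}
answer = "Latency dropped [1], but cost rose [3]."

print(dangling_citations(answer, sources))
```

A regex check alone would accept this answer, because `[3]` matches the citation pattern. Only the lookup against the source table reveals the broken invariant.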

Modern providers offer structured-output modes

OpenAI's response_format=json_schema, Anthropic's tool use, Google's controlled generation. These are designed to return schema-valid output. Use them when the task allows; the format-failure rate drops to nearly 0, and the eval can then focus on semantic correctness.

Code

JSON schema validation with pydantic·python

from pydantic import BaseModel, ValidationError, conint, conlist
from typing import Literal

class MovieRecommendation(BaseModel):
    title: str
    year: conint(ge=1900, le=2030)
    genre: Literal["action", "comedy", "drama", "sci-fi", "thriller"]
    score: float
    reasons: conlist(str, min_length=1, max_length=5)

def validate_recommendation(output_str):
    try:
        rec = MovieRecommendation.model_validate_json(output_str)
        return True, rec
    except ValidationError as e:
        return False, e.errors()

ok, _ = validate_recommendation('{"title":"Inception","year":2010,"genre":"sci-fi","score":8.8,"reasons":["mind-bending","great cast"]}')
assert ok

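Format rate as its own metric·python

def format_compliance_rate(outputs, validator):
    # validator(output) must return a tuple whose first element is a pass/fail bool.
    total = len(outputs)
    if total == 0:
        raise ValueError("no outputs to score")  # avoid dividing by zero
    passed = sum(1 for o in outputs if validator(o)[0])
    return passed / total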

# Track this per release. A drop from 99% to 92% means the model started
# wrapping JSON in ```json fences again — fix the prompt or add a parser.
rate = format_compliance_rate(eval_outputs, validate_recommendation)
print(f"format compliance: {rate:.1%}")
assert rate >= 0.98, "format-compliance regression"
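If the fix chosen is "add a parser", one possible shape is a lenient loader that strips a single surrounding markdown fence before parsing. This is a sketch, not a recommended implementation: `parse_json_lenient` is a hypothetical helper, and it only handles one fence wrapping the whole output.

```python
import json
import re

# Three backticks, built at runtime so this listing contains no literal fence.
FENCE = "`" * 3

# One optional language tag after the opening fence, then the payload, then the closing fence.
_FENCED = re.compile(
    rf"^\s*{FENCE}[a-zA-Z0-9_-]*\s*\n(.*?)\n\s*{FENCE}\s*$",
    re.DOTALL,
)

def parse_json_lenient(text):
    """Parse JSON, tolerating one markdown fence around the entire output."""
    match = _FENCED.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)  # still raises json.JSONDecodeError on bad JSON

wrapped = f'{FENCE}json\n{{"symbol": "ACME", "price": 12.5}}\n{FENCE}'
print(parse_json_lenient(wrapped))
print(parse_json_lenient('{"symbol": "ACME", "price": 12.5}'))
```

It is probably worth reporting the strict compliance rate alongside the lenient one, so the parser does not hide a regression in the model's raw output.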

Regex assertion in promptfoo YAML·yaml

tests:
  - vars:
      query: 'Give me the latest stock price in JSON.'
    assert:
      - type: is-json
      - type: javascript
        value: |
          output.symbol && typeof output.price === 'number'
      - type: regex
        value: '"timestamp":\s*"\d{4}-\d{2}-\d{2}'
