Swap에 break하는 것
- Tool-call format — argument parsing, role 이름, parallel-call shape.
- JSON enforcement 모드 — strict schema vs JSON 모드 vs tool-as-schema.
- System prompt placement — top-level vs message vs system_instruction.
- Token counting — 카운트 어긋남; cost forecast 틀림.
- Refusal calibration — 한 곳에 refused, 다른 곳에 accept된 같은 input.
- Reasoning interface — extended thinking vs reasoning_effort vs thinking_budget.
- Multimodal input format — base64 vs URL vs file id.
Pre-migration checklist
- Golden set을 양 provider에 돌려.
- Cost와 latency 비교, quality만 X.
- 필요한 prompt-side 변경 식별 (tag style, response_format).
- 새 provider에 eval suite end-to-end test.
- Cutover 동안 parallel run plan; day one에 atomically switch X.