Surface Plausibility vs. Actual Correctness

피파 one-line summary: Surface plausibility (convincing at a glance) ≠ semantic correctness (factually accurate). A clock with 13 numbers, a fake chemistry diagram, a hand with six fingers: all are plausibility traps.

Mental model: A movie set for a hospital looks perfectly convincing on camera — white walls, beeping monitors, doctors in scrubs — but walk behind the wall and you'll find plywood, tape, and exposed wiring. It looks like a hospital from the intended angle, but it isn't one. Image generators build movie sets, not real buildings. Their outputs are optimized to look plausible, not to be correct.

The Plausibility Trap

Diffusion models are trained with essentially one objective: produce images that could plausibly belong to the training data distribution. In effect, the training signal answers "Could this image exist as a real photograph or artwork?" and not "Is this image factually, anatomically, physically, or logically correct?"
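
To make the objective concrete, here is one common formulation, assuming the simplified noise-prediction loss used in DDPM-style training (this lesson does not name a specific loss, so treat it as an illustration). The symbols are the usual ones: x_0 is a training image, ε is Gaussian noise, t is a timestep, and ε_θ is the network's noise prediction.

```latex
L_{\text{simple}}(\theta)
  = \mathbb{E}_{x_0,\,\epsilon,\,t}
    \left[ \left\lVert \epsilon - \epsilon_\theta(x_t, t) \right\rVert^2 \right],
\qquad
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon
```

Nothing in this loss counts fingers, checks spelling, or validates geography. It only rewards matching the statistics of the training images, which is why plausibility and correctness can come apart.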

This creates a dangerous gap:

A clock face that looks beautiful — but has 13 numbers on it
A chemistry diagram that looks professional — but the molecular structure is nonsense
A map that looks authentic — but the geography is fictional
A book cover that looks publishable — but the text is gibberish
A person who looks photorealistic — but has six fingers, asymmetric ears, or a collar that defies physics
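
The clock example above shows what a correctness check looks like, as opposed to a plausibility judgment. The sketch below is a minimal illustration, assuming the numerals have already been read off the dial by a reviewer or by an OCR step that is not shown. The function name `check_clock_numerals` and its input format are hypothetical.

```python
def check_clock_numerals(numerals):
    """Return a list of problems found in the numerals read off a clock face.

    `numerals` is a hypothetical input: the integers a reviewer (or an OCR
    step, not shown here) read from the dial, in any order.
    """
    problems = []
    expected = set(range(1, 13))  # a standard dial shows 1 through 12
    seen = set(numerals)

    if len(numerals) != 12:
        problems.append(f"expected 12 numerals, found {len(numerals)}")

    missing = sorted(expected - seen)
    if missing:
        problems.append(f"missing numerals: {missing}")

    unexpected = sorted(seen - expected)
    if unexpected:
        problems.append(f"unexpected numerals: {unexpected}")

    duplicates = sorted({n for n in numerals if numerals.count(n) > 1})
    if duplicates:
        problems.append(f"duplicated numerals: {duplicates}")

    return problems


# A dial that "looks beautiful" but carries 13 numbers fails the check.
print(check_clock_numerals(list(range(1, 14))))
```

A check like this asks whether the content is right. No aesthetic quality in the image can make it pass.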

Why First-Glance Approval Is Dangerous

Our visual system processes images hierarchically: we take in the gist first (scene, mood, composition), then details (objects, faces), then fine structure (text, fingers, symmetry). AI images are strongest at the gist level. They pass the "thumbnail test": in a quick scroll through social media, they look great. But zoom in, pause, and inspect carefully, and the cracks appear.
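
One way to see why the thumbnail test is so forgiving is a toy experiment in which a small defect that is obvious at full resolution nearly vanishes once the image is shrunk. The sketch below uses plain Python lists as a grayscale image and block averaging as the downscaler. It illustrates the averaging effect only and is not a model of human vision. The helper names are hypothetical.

```python
def downsample(image, factor):
    """Block-average a 2D grayscale image (a list of rows) by `factor`."""
    height, width = len(image), len(image[0])
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(top, min(top + factor, height))
                for x in range(left, min(left + factor, width))
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def max_abs_diff(a, b):
    """Largest per-pixel difference between two same-sized images."""
    return max(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


# A flat gray 64x64 "correct" image, and a copy with a 2x2 bright defect.
correct = [[128] * 64 for _ in range(64)]
flawed = [row[:] for row in correct]
for y in (30, 31):
    for x in (30, 31):
        flawed[y][x] = 255

full_res_gap = max_abs_diff(correct, flawed)
thumb_gap = max_abs_diff(downsample(correct, 16), downsample(flawed, 16))
print(full_res_gap, thumb_gap)
```

At full resolution the defect differs from its surroundings by 127 gray levels. In the 4x4 thumbnail the same defect shifts one pixel by less than 2 levels, which is easy to miss.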

Examples of the Gap

Semantic vs. Perceptual Correctness

It helps to distinguish two types of "correct":

Perceptually correct: Looks right to a fast human glance. Colors, textures, composition, lighting all work. Most AI images achieve this.
Semantically correct: The content is factually, structurally, and logically right. Hands have five fingers, text is spelled correctly, and the physics makes sense. AI images often fail here.

The gap between these two levels is where most AI image failures live. The image passes the eye test but fails the brain test.

Key Takeaways

Models optimize for plausibility (looks real), not correctness (is real).
AI images pass the thumbnail test but often fail close inspection.
The gap between perceptual and semantic correctness is where failures hide.
Always inspect AI images at full resolution before professional use.
Develop a systematic "zoom-in checklist" to catch common errors.
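
A "zoom-in checklist" can be as simple as a fixed list of questions plus a rule that nothing ships until every question has been answered. The sketch below is one possible shape for it. The check names and wording are hypothetical, drawn from the failure modes in this lesson, and should be adapted to your own use case.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Check:
    name: str      # short identifier used as the key in review results
    question: str  # what the reviewer verifies at full resolution


# Hypothetical checklist built from the failure modes named in this lesson.
ZOOM_IN_CHECKLIST: List[Check] = [
    Check("hands", "Does every visible hand have five fingers?"),
    Check("text", "Is all visible text legible and spelled correctly?"),
    Check("symmetry", "Do paired items (earrings, windows) match or align?"),
    Check("physics", "Do clothing, shadows, and objects behave physically?"),
    Check("facts", "Are diagrams, maps, and labels factually right?"),
]


def review(results: Dict[str, bool]) -> dict:
    """Summarize a review; approve only if every check was run and passed."""
    failed = [c.name for c in ZOOM_IN_CHECKLIST if results.get(c.name) is False]
    unchecked = [c.name for c in ZOOM_IN_CHECKLIST if c.name not in results]
    return {
        "approved": not failed and not unchecked,
        "failed": failed,
        "unchecked": unchecked,
    }


# A first-glance approval that skipped most checks is not an approval.
print(review({"hands": False, "text": True}))
```

The design choice worth noting is that an unchecked item blocks approval just like a failed one. That rule is what separates a systematic inspection from a first-glance judgment.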

Code

Surface Plausibility          vs.    Actual Correctness
─────────────────────────────────────────────────────────
"Looks like a real building"          Windows don't align between floors
"Looks like a real restaurant"        Menu text is garbled nonsense
"Looks like a real person"            Earrings differ between ears
"Looks like a real textbook"          Diagram labels are meaningless
"Looks like a real map"               Country shapes are invented
"Looks like a real product photo"     Label text is wrong
