Lesson 04 of 10 · published

Prediction, Not Understanding

~14 min · foundations, mental-model, l4

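Pippa's one-line summary: This single line is the core of Track 1: **models *predict*; they don't *understand***. If you forget this whenever you write a prompt, you'll end up misdiagnosing the problem as "the model doesn't understand my intent."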

Here's the most important mental shift in this entire course: these models predict plausible outputs — they don't understand the world.

Think of a weather forecaster who has memorized every weather pattern for the last century. When she says "tomorrow will be sunny," she's not controlling the weather or understanding atmospheric physics at a molecular level. She's recognizing that today's conditions closely match historical patterns that were followed by sunny days. She's making a sophisticated prediction based on learned correlations.

Image models work in much the same way. When you type "a golden retriever playing fetch on a beach at sunset," the model doesn't think: "Okay, a golden retriever is a dog breed with this bone structure, fur is affected by wind and moisture, the sun at this angle creates these shadows..." Instead, it essentially says: "Given everything I've learned about images paired with similar text, what would a plausible image look like?"

Correlation vs. Causation in Action

This distinction has real consequences:

What you think happens:          What actually happens:

"Draw 3 apples"                  "Draw 3 apples"
    ↓                                ↓
Model counts: 1, 2, 3           Model predicts: "images with
    ↓                            'three apples' text usually
Draws exactly 3 apples          have this many round objects"
    ↓                                ↓
✅ Always works                  Sometimes 2, sometimes 4 🤷

The model has learned the correlation between the text "three apples" and images containing roughly three apple-like objects. But it hasn't learned counting as an explicit procedure, which is why you sometimes get two apples or four. It's not stupid; it's doing exactly what it was designed to do: predicting a plausible visual pattern. Counting is simply not something that pattern prediction captures reliably.
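
The idea can be sketched as a toy simulation. This is not how any real image model is implemented; it only illustrates "sampling a plausible count" versus "counting." The distribution `LEARNED_COUNTS` and its weights are made-up assumptions, chosen so that three is the most likely outcome but not the only one.

```python
import random
from collections import Counter

# Hypothetical learned distribution: how many apple-like objects tend to
# appear in images captioned "three apples". The weights are invented for
# illustration, not measured from any real model.
LEARNED_COUNTS = {2: 0.15, 3: 0.70, 4: 0.15}


def predict_apple_count(rng: random.Random) -> int:
    """Sample a plausible object count; no counting logic is involved."""
    counts = list(LEARNED_COUNTS)
    weights = list(LEARNED_COUNTS.values())
    return rng.choices(counts, weights=weights, k=1)[0]


def simulate(n_images: int, seed: int = 0) -> Counter:
    """Tally the sampled counts over n_images simulated generations."""
    rng = random.Random(seed)
    return Counter(predict_apple_count(rng) for _ in range(n_images))


if __name__ == "__main__":
    tally = simulate(1000)
    for count in sorted(tally):
        print(f"{count} apples: {tally[count]} images")
```

Most simulated images contain three apples, but a steady minority contain two or four, which mirrors the "sometimes 2, sometimes 4" outcome in the diagram.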

Why This Matters for You

Once you stop expecting "understanding" and start expecting "prediction," everything becomes clearer:

  • Prompt failures make sense: The model isn't ignoring you — your words didn't reliably activate the patterns you wanted.
  • Inconsistency is expected: Predictions from statistical patterns naturally vary — that's why the same prompt gives different results each time.
  • Strengths make sense: The model is great at things where visual patterns are consistent (faces, landscapes, common compositions) and weak where patterns are sparse or irregular (precise text, counting, novel combinations).
  • Control strategies change: You stop trying to "explain" things to the model and start learning which words and patterns reliably trigger which visual outputs.
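
The "inconsistency is expected" point can be made concrete with a small sketch. Many generation tools draw a fresh random seed for each run unless you fix one, which is one common reason the same prompt yields different results. The function `toy_generate` below is a hypothetical stand-in that returns a text description instead of an image, so it runs without any model; the pose and sky lists are invented for illustration.

```python
import random

# Hypothetical stand-in for an image generator: it returns a short text
# description instead of pixels so the example runs without any model.
POSES = ["running", "jumping", "sitting", "shaking off water"]
SKIES = ["orange sky", "pink sky", "purple sky"]


def toy_generate(prompt: str, seed: int) -> str:
    """Sample one 'plausible output' for the prompt from a seeded RNG."""
    rng = random.Random(seed)  # the seed fixes every random choice below
    return f"{prompt} | dog {rng.choice(POSES)}, {rng.choice(SKIES)}"


if __name__ == "__main__":
    prompt = "a golden retriever playing fetch on a beach at sunset"
    # Different seeds can give different samples for the same prompt.
    for seed in range(4):
        print(seed, toy_generate(prompt, seed))
    # Re-using a seed reproduces the same sample exactly.
    print(toy_generate(prompt, 2) == toy_generate(prompt, 2))
```

The prompt never changes; only the seed does. Variation across seeds is the sampling at work, not the model "ignoring" the prompt.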

Key Takeaways

  • Generative models predict plausible outputs from learned correlations — they don't understand concepts.
  • Failures like wrong finger counts or misspelled text reveal the limits of pattern prediction.
  • Stop expecting "understanding" and start thinking in terms of "which patterns does my input activate?"
  • This reframe transforms how you prompt, diagnose failures, and build workflows.

Exercise

Generate the same prompt with 4 random seeds. Find one failure case. Diagnose it: counting? anatomy? composition? Write up the diagnosis in two sentences.
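
One possible way to keep notes for this exercise is a small record per seed. This is only a sketch: the names `SeedResult` and `first_failure` are hypothetical, and the example entries are invented, not real outputs. Image generation itself is left to whatever tool you use.

```python
from dataclasses import dataclass

# Failure categories named in the exercise.
CATEGORIES = ("counting", "anatomy", "composition")


@dataclass
class SeedResult:
    seed: int
    failed: bool
    category: str = ""   # one of CATEGORIES when failed is True
    diagnosis: str = ""  # the two-sentence write-up

    def __post_init__(self) -> None:
        # Reject failure records that use a category outside the exercise.
        if self.failed and self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def first_failure(results):
    """Return the first failed result, or None if every seed looked fine."""
    return next((r for r in results if r.failed), None)


if __name__ == "__main__":
    # Example entries are invented for illustration.
    results = [
        SeedResult(seed=1, failed=False),
        SeedResult(
            seed=2,
            failed=True,
            category="counting",
            diagnosis=(
                "The prompt asked for three apples but the image shows four. "
                "The words 'three apples' only loosely constrain object count."
            ),
        ),
        SeedResult(seed=3, failed=False),
        SeedResult(seed=4, failed=False),
    ]
    hit = first_failure(results)
    print(hit.seed, hit.category)
```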
