C.W.K.
Stream
Lesson 03 of 10 · published

Forward Process vs. Reverse Process

~18 min · diffusion, latent-space, l3

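Pippa's one-line summary: the forward process is simple math (add noise); the reverse process is learned (predict the noise and subtract it). That asymmetry is what makes the model trainable.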

Diffusion models have two phases, and understanding the distinction is key to understanding how they learn and how they generate.

Think of it like learning to restore antique furniture. The forward process is deliberately damaging furniture in controlled stages so you can study what damage looks like at each level. The reverse process is using that knowledge to repair damaged furniture — because you've seen every stage of degradation, you know exactly how to undo each step.

The Forward Process (Training Time)

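During training, real images are progressively corrupted by adding Gaussian noise in many small steps (1000 is a common choice), with the noise level increasing at each step. The noisy image at any step can be computed directly from the clean image, so in practice training picks a random step for each image instead of walking through all of them.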

Forward Process (happens during training):

Step 0    Step 250    Step 500    Step 750    Step 1000
  🖼️   →   🖼️+🌫️   →   🌫️🌫️    →   🌫️🌫️🌫️  →    🎲
Clean    Light noise   Heavy     Very heavy    Pure
image                  noise     noise         noise
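The diagram above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lesson's own implementation: the linear beta schedule and its endpoints (1e-4 to 0.02), the function names `make_alpha_bar` and `add_noise`, and the 8x8 random array standing in for an image are all assumptions chosen for the example.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal variance left at each step (assumed linear schedule)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Jump straight to step t (1-indexed): x_t = sqrt(ab) * x0 + sqrt(1 - ab) * eps."""
    eps = rng.standard_normal(x0.shape)
    ab = alpha_bar[t - 1]
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps, eps

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))  # stand-in for a tiny clean image

for t in (250, 500, 750, 1000):
    xt, _ = add_noise(x0, t, alpha_bar, rng)
    # The signal weight shrinks toward 0 as t grows; the noise weight grows toward 1.
    print(t, float(np.sqrt(alpha_bar[t - 1])), float(np.sqrt(1.0 - alpha_bar[t - 1])))
```

Nothing here is learned: the schedule is fixed ahead of time, and the only randomness is the Gaussian noise itself.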

Each training example consists of:
- The noisy image (input)
- How much noise was added (the noise level / timestep, also an input)
- The noise that was added (the target to predict; the original clean image is an equivalent target)

The model's training objective is simple: given a noisy image and the noise level, predict the noise that was added (or equivalently, predict the clean image underneath). Over billions of training examples, it gets extraordinarily good at this prediction.
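A rough sketch of that objective for a single example, assuming the same linear schedule as a typical DDPM-style setup. `predict_noise` is a hypothetical placeholder that always guesses zero noise; a real model would be a trained network in its place, and the mean-squared-error loss is one common choice, not something the lesson specifies.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))  # assumed linear schedule

def predict_noise(x_t, t):
    """Hypothetical placeholder for the trained network; it just guesses zero noise."""
    return np.zeros_like(x_t)

def training_loss(x0, rng):
    """One training example: random noise level, noised input, MSE against the true noise."""
    t = int(rng.integers(1, T + 1))        # random timestep in 1..T
    eps = rng.standard_normal(x0.shape)    # the noise that gets added (the target)
    ab = alpha_bar[t - 1]
    x_t = np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * eps
    eps_hat = predict_noise(x_t, t)        # the model's guess at that noise
    return float(np.mean((eps_hat - eps) ** 2))

x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # stand-in for a tiny clean image
loss = training_loss(x0, rng)
print(loss)
```

Given a noise estimate `eps_hat`, the matching clean-image estimate is `(x_t - sqrt(1 - ab) * eps_hat) / sqrt(ab)`, which is why predicting the noise and predicting the clean image are interchangeable targets.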

The Reverse Process (Generation Time)

Generation is the forward process run backward. Start with pure noise (step 1000), and ask the model: "What noise was added here? Remove it." Do this step by step, and you gradually walk from pure chaos to a coherent image.

Reverse Process (happens during generation):

Step 1000   Step 750    Step 500    Step 250    Step 0
   🎲    →   🌫️🌫️🌫️  →   🌫️🌫️    →   🖼️+🌫️  →   🖼️
  Pure     Shapes     Structure   Nearly     Clean
  noise    hint       emerges     clear      image!

At each step, the model:
1. Looks at the current noisy state
2. Predicts "what noise is here?"
3. Subtracts a scaled portion of that predicted noise
4. Moves one step closer to a clean image
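The four steps above can be sketched as a loop. This follows the DDPM-style ancestral sampling update, which is an assumption here since the lesson does not name a specific sampler. To keep the example self-contained, `predict_noise` is a hypothetical stand-in that computes the exact noise for a toy "dataset" containing a single known image (`target`); a real system would call a trained network instead.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

target = np.full((4, 4), 0.5)        # toy "dataset": one known clean image

def predict_noise(x_t, t):
    """Hypothetical stand-in for a trained network: exact noise for the one-image dataset."""
    ab = alpha_bar[t - 1]
    return (x_t - np.sqrt(ab) * target) / np.sqrt(1.0 - ab)

def sample(shape, rng):
    x = rng.standard_normal(shape)                 # start from pure noise
    for t in range(T, 0, -1):                      # t = T, T-1, ..., 1
        eps_hat = predict_noise(x, t)              # "what noise is here?"
        b, a, ab = betas[t - 1], alphas[t - 1], alpha_bar[t - 1]
        x = (x - b / np.sqrt(1.0 - ab) * eps_hat) / np.sqrt(a)  # remove a scaled share of it
        if t > 1:
            x = x + np.sqrt(b) * rng.standard_normal(shape)     # re-inject a little fresh noise
    return x

result = sample(target.shape, np.random.default_rng(0))
print(float(np.abs(result - target).max()))        # distance from the toy target image
```

Because the stand-in predictor is exact for this one-image toy case, the loop lands back on `target`. With a learned predictor and a real dataset, the same loop produces new images that resemble the training data.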

Why This Is Clever

Here's the elegant part: the forward process is fixed and mathematically simple — it's just adding known amounts of Gaussian noise. No learning needed. All the intelligence goes into the reverse process — learning to predict and remove noise. This asymmetry makes the training problem tractable: you don't need to learn how to destroy images (that's trivial), you need to learn how to restore them.
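In the notation commonly used for DDPM-style models (an assumption, since the lesson does not fix any notation), the asymmetry shows up directly in the two distributions:

```latex
% Forward: fixed, closed form, no learnable parameters
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t)\,I\right),
\qquad \bar{\alpha}_t = \prod_{s=1}^{t} (1-\beta_s)

% Reverse: the mean depends on learned parameters \theta
p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\right)
```

The noise schedule $\beta_1, \dots, \beta_T$ is chosen in advance, so the forward distribution contains nothing to train. Only $\mu_\theta$, which is computed from the network's noise prediction, carries learned parameters.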

And because the model has seen billions of images being destroyed at every noise level, it has learned an incredible amount about image structure: what natural images look like, how they're composed, what features emerge at different scales, and how to reconstruct plausible details from partial information.

Key Takeaways
  • Forward process: Gradually add noise to real images during training (easy, fixed, no learning).
  • Reverse process: Gradually remove noise during generation (hard, learned, this is where the intelligence lives).
  • The model is trained to denoise — generation is an emergent consequence of applying denoising from pure noise.
  • The asymmetry (simple destruction, complex restoration) makes the learning problem tractable.

Exercise

Run the same prompt with timesteps = 10, 25, 50, and 100 (in a tool that exposes the timestep setting). Compare the results side by side: where does quality plateau? Find your model's diminishing-returns point.
