C.W.K.
Stream
Lesson 08 of 10 · published

Diffusion vs. Autoregressive vs. Flow-Based Approaches

~14 min · diffusion, latent-space, l8


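Pippa's one-line summary: The field is moving from diffusion to flow matching. FLUX and SD 3.5 are built on flow matching. From a user's perspective there is almost no difference, but flow matching is more efficient and more stable.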

Diffusion isn't the only way to generate images — it's just the one that currently dominates. Let's zoom out and see the landscape of approaches, so you understand why diffusion won and what alternatives exist.

Think of three different ways to create a painting:

  • Diffusion: Start with a canvas of random splatter, then carefully clean and refine the whole canvas simultaneously until a painting emerges. (Sculptor approach)
  • Autoregressive: Paint one pixel (or patch) at a time, left-to-right, top-to-bottom, each one informed by all previous pixels. (Typewriter approach)
  • Flow-based: Smoothly morph a random blob into the final painting through a continuous, learned transformation — like a time-lapse of a painting appearing in one smooth motion. (Morphing approach)

Diffusion Models

How: Learn to reverse a noise-adding process. Generate by iteratively denoising from random noise.

Examples: Stable Diffusion 1.5/XL, DALL-E 3, Imagen 3

Strengths: High quality, good diversity, well-understood training, strong ecosystem of tools (ControlNet, LoRA, inpainting)

Weaknesses: Slow sampling (each image needs many sequential denoising steps), and the backbone architecture limits the achievable quality ceiling
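
To make "reverse a noise-adding process" concrete, here is a minimal 1-D sketch rather than a real image model. It assumes a linear noise schedule and a toy dataset where every sample equals one number, so the "trained network" can be replaced by an exact formula (`oracle_noise_prediction`, a hypothetical stand-in). All function names are illustrative, and the sampling loop uses the deterministic DDIM-style update.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear beta schedule.
    alpha_bars[0] == 1.0 means no noise has been added yet."""
    alpha_bars = [1.0]
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        alpha_bars.append(alpha_bars[-1] * (1.0 - beta))
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: jump straight to noise level t in closed form."""
    eps = rng.gauss(0.0, 1.0)
    a = alpha_bars[t]
    return math.sqrt(a) * x0 + math.sqrt(1.0 - a) * eps, eps

def oracle_noise_prediction(x_t, t, alpha_bars, data_value):
    """ASSUMPTION: stands in for a trained denoiser. It is exact only
    because every 'image' in this toy dataset equals data_value."""
    a = alpha_bars[t]
    return (x_t - math.sqrt(a) * data_value) / math.sqrt(1.0 - a)

def sample(num_steps, data_value, rng):
    """Reverse process: start from noise and denoise step by step."""
    alpha_bars = make_alpha_bars(num_steps)
    x = rng.gauss(0.0, 1.0)  # start from (approximately) pure noise
    for t in range(num_steps, 0, -1):
        a_t, a_prev = alpha_bars[t], alpha_bars[t - 1]
        eps_hat = oracle_noise_prediction(x, t, alpha_bars, data_value)
        # Estimate the clean sample, then re-noise it to the previous level.
        x0_hat = (x - math.sqrt(1.0 - a_t) * eps_hat) / math.sqrt(a_t)
        x = math.sqrt(a_prev) * x0_hat + math.sqrt(1.0 - a_prev) * eps_hat
    return x

if __name__ == "__main__":
    print(sample(1000, data_value=2.0, rng=random.Random(0)))
```

In a real model the oracle is a neural network trained to predict `eps` from `x_t`, and one network call per step is what makes sampling slow.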

Autoregressive Models

How: Predict the next image token (or patch) based on all previous ones, similar to how language models predict the next word.

Examples: DALL-E 1, Parti, and some aspects of newer multimodal models (DALL-E 2 is only partially autoregressive: its image decoder is diffusion-based)

Strengths: Natural fit for combined text+image generation, can leverage scaling insights from language models

Weaknesses: Sequential generation is slow, can accumulate errors, historically lower image quality than diffusion
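
The next-token loop described above can be sketched in a few lines. This is a toy under stated assumptions: `toy_next_token_probs` is a hypothetical stand-in for a trained model (it simply prefers repeating the last token), and the "image" is a small grid of integer token IDs filled in raster order.

```python
import random

def toy_next_token_probs(prefix, vocab_size):
    """ASSUMPTION: stands in for a trained model's P(next | all previous).
    This toy version puts 90% of the mass on repeating the last token."""
    if not prefix:
        return [1.0 / vocab_size] * vocab_size
    probs = [0.1 / (vocab_size - 1)] * vocab_size
    probs[prefix[-1]] = 0.9
    return probs

def generate_image_tokens(height, width, vocab_size, rng):
    """Fill a height x width grid left-to-right, top-to-bottom.
    Every token is sampled conditioned on all tokens generated so far."""
    tokens = []
    for _ in range(height * width):
        probs = toy_next_token_probs(tokens, vocab_size)
        next_token = rng.choices(range(vocab_size), weights=probs, k=1)[0]
        tokens.append(next_token)
    # Reshape the flat sequence back into rows.
    return [tokens[r * width:(r + 1) * width] for r in range(height)]

if __name__ == "__main__":
    for row in generate_image_tokens(4, 8, vocab_size=4, rng=random.Random(0)):
        print(row)
```

The loop makes one model call per token, so generation cost grows with image size, and an early bad token influences everything sampled after it.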

Flow Matching / Flow-Based Models

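How: Learn a continuous transformation (flow) from a simple noise distribution to the image distribution. Instead of reversing a fixed noise-adding schedule, the model learns a velocity field that moves samples along a smooth, nearly straight path from noise to image. Sampling still follows that path in numerical steps, but it usually needs fewer of them.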

Examples: FLUX, Stable Diffusion 3.5 (MMDiT architecture trained with a rectified flow objective)

Strengths: More efficient sampling, cleaner theoretical foundation, better training stability, can produce high-quality results in fewer steps

Weaknesses: Newer approach, some techniques from the diffusion ecosystem (like standard CFG) needed adaptation
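
A minimal 1-D sketch of the flow idea, under the same toy assumption that every data sample equals one number. It uses the convention t=0 for noise and t=1 for data (some papers flip this), a straight-line path between the two, and a hypothetical `oracle_velocity` in place of a trained network. All names are illustrative.

```python
import random

def interpolate(noise, data, t):
    """Training pair on a straight path from noise (t=0) to data (t=1).
    Returns the point x_t and the constant velocity to predict there."""
    x_t = (1.0 - t) * noise + t * data
    target_velocity = data - noise
    return x_t, target_velocity

def oracle_velocity(x, t, data_value):
    """ASSUMPTION: stands in for a trained velocity network. It is exact
    only because all data equals data_value. Valid for t < 1."""
    return (data_value - x) / (1.0 - t)

def sample(num_steps, data_value, rng):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with Euler steps."""
    x = rng.gauss(0.0, 1.0)  # start from noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * oracle_velocity(x, t, data_value)
    return x

if __name__ == "__main__":
    for steps in (1, 4, 50):
        print(steps, sample(steps, data_value=2.0, rng=random.Random(0)))
```

Because the path in this toy is exactly straight, Euler integration reaches the data value (up to floating-point rounding) regardless of the step count. Real models learn paths that are only approximately straight, which is why they need more than one step but often fewer than classical diffusion samplers.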

Why Diffusion (and Flow Matching) Won

The short answer: quality and scalability. Diffusion-family approaches produce the best images, scale well with more compute and data, and have a rich ecosystem of control tools. Autoregressive approaches had their moment but couldn't match diffusion quality for standalone image generation (though they're making a comeback in multimodal/video contexts).

Image Generation Approach Timeline:

2021-2022: GANs dominant → Diffusion overtakes
2022-2023: Diffusion (U-Net) dominates → Stable Diffusion era
2023-2024: Diffusion Transformers (DiT) emerge → DALL-E 3, SD 3
2024-2025: Flow matching + DiT → FLUX, SD 3.5
2025-2026: Flow matching matures → FLUX.2, continued evolution

Key Takeaways

  • Diffusion (noise→image through denoising), autoregressive (token-by-token or patch-by-patch), and flow matching (smooth transformation) are the three main approaches.
  • Diffusion dominated 2022-2024; flow matching (its successor) is taking over in 2024-2026.
  • FLUX and SD 3.5 use flow matching — think of it as "diffusion 2.0" with faster, more stable generation.
  • For users, the practical behavior is similar across diffusion and flow matching — the differences matter more for researchers.

Exercise

Find one flow matching model (FLUX) and one classical diffusion model (SD 1.5). Run both with the same prompt and the same number of steps, then compare quality and speed. Which one is more useful for your workflow?
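
One way to keep the comparison fair is a small timing harness like the sketch below. It assumes you wrap each model in your own callable taking `(prompt, num_steps, seed)`. The name `compare_generators` and that signature are hypothetical, not part of any library, and the two lambdas are dummy placeholders to replace with real pipeline calls.

```python
import time

def compare_generators(generators, prompt, num_steps, seed=0):
    """Run each generator once with identical settings and time it.

    generators: dict mapping a label to a callable(prompt, num_steps, seed)
    that returns an image-like object (ASSUMPTION: you supply these).
    """
    results = {}
    for name, generate in generators.items():
        start = time.perf_counter()
        image = generate(prompt, num_steps, seed)
        elapsed = time.perf_counter() - start
        results[name] = {"seconds": elapsed, "image": image}
    return results

if __name__ == "__main__":
    # Dummy placeholders: swap in wrappers around your real models.
    dummy = {
        "flow-matching": lambda prompt, steps, seed: f"{prompt} ({steps} steps)",
        "classical-diffusion": lambda prompt, steps, seed: f"{prompt} ({steps} steps)",
    }
    for name, info in compare_generators(dummy, "a red fox", 20).items():
        print(name, round(info["seconds"], 4))
```

Timing tells you about speed only. Quality still has to be judged by eye, ideally over several prompts and seeds.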
