Pippa's one-line summary: Generation, Editing, Transformation, and Upscaling are four different jobs. Mix them up and you will waste time regenerating from scratch when an inpainting pass would have finished the job.
Not all AI image operations are the same. The differences are like those between building a house from scratch, renovating a room, converting a house into a restaurant, and adding a second floor: all of them are building work, but they are completely different operations.
The Four Modes
Generation: Text prompt ──▶ [Model] ──▶ New image from scratch
"Build a house"
Editing: Existing image + mask ──▶ [Model] ──▶ Modified region only
"Replace the kitchen countertop"
Transformation: Existing image + instruction ──▶ [Model] ──▶ Same structure, new style
"Convert this house to Mediterranean style"
Upscaling: Low-res image ──▶ [Model] ──▶ Higher-res image with added detail
"Add a second floor with matching architecture"
Generation (Text-to-Image)
The model creates an image from scratch, guided only by your text prompt (and its learned priors). This is the most "creative" mode — maximum freedom for the model, but also maximum unpredictability. You describe what you want; the model produces its interpretation.
Editing (Inpainting / Local Edits)
You provide an existing image and indicate a specific region to change. The model replaces only that region while keeping everything else intact. This is powerful because you can keep 95% of a great image and fix just the part that went wrong — a hand, a face, a background element.
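The "only the masked region changes" contract can be sketched with plain nested lists standing in for pixel grids. This is a minimal illustration, not how any particular inpainting model works internally: the `composite` function and the pixel values are hypothetical, and a real model also looks at the surrounding pixels so the new region blends in.

```python
def composite(original, generated, mask):
    """Take the generated pixel where mask is 1, keep the original where mask is 0."""
    return [
        [gen if m else orig for orig, gen, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, mask)
    ]


# Hypothetical 2x3 grayscale "images" (values are illustrative only).
original = [[10, 10, 10], [10, 10, 10]]
generated = [[99, 99, 99], [99, 99, 99]]
mask = [[0, 1, 0], [0, 0, 0]]  # only one pixel is marked for replacement

result = composite(original, generated, mask)
print(result)
```

Everything outside the mask comes back untouched, which is why editing is the right tool when most of the image is already good.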
Transformation (Image-to-Image / Style Transfer)
You provide an existing image as a structural guide, and the model generates a new image that follows the same basic composition but with a different style, mood, or treatment. Think: same scene, different artistic interpretation. The "strength" parameter controls how much the model deviates from the original — low strength preserves more of the input, high strength gives the model more creative freedom.
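One common way diffusion-based image-to-image tools interpret strength is as the fraction of the denoising schedule that is actually run on the input image. That is an assumption here, since the text above does not name a specific model or library, and `denoising_steps` is a hypothetical helper written for illustration.

```python
def denoising_steps(total_steps: int, strength: float) -> int:
    """Number of denoising steps applied to the noised input image.

    Assumes strength is a 0.0-1.0 fraction of the schedule:
    more steps means the model rewrites more of the original.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return min(int(total_steps * strength), total_steps)


for strength in (0.25, 0.5, 1.0):
    steps = denoising_steps(40, strength)
    print(f"strength={strength}: {steps} of 40 steps rewrite the input")
```

Under this assumption, a strength of 1.0 runs the whole schedule, so the input image has very little influence left and the result behaves almost like plain generation.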
Upscaling (Super-Resolution)
The model takes a low-resolution image and generates a higher-resolution version, inventing plausible fine details that weren't in the original (skin texture, fabric weave, leaf detail). This isn't just "making pixels bigger" (that's interpolation) — it's generating new detail that's consistent with the image content.
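To see why interpolation is not the same thing, here is a minimal sketch of the simplest kind, nearest-neighbor resizing, in plain Python. The function name and the tiny 2x2 "image" are illustrative assumptions. Every output pixel is a copy of an input pixel, so no new detail can appear.

```python
def nearest_neighbor_upscale(image, factor):
    """Repeat every pixel `factor` times horizontally and vertically."""
    upscaled = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(factor)]
        # Copy the widened row `factor` times to stretch the image vertically.
        upscaled.extend(list(wide_row) for _ in range(factor))
    return upscaled


small = [[0, 255], [255, 0]]  # hypothetical 2x2 grayscale image
big = nearest_neighbor_upscale(small, 2)

for row in big:
    print(row)
```

The output has four times as many pixels but exactly the same two values as the input. A super-resolution model differs precisely in that it produces pixel values that were never in the source.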
Why the Distinction Matters
Professional workflows almost always combine these modes:
- Generate a batch of images from a prompt
- Select the most promising one
- Edit (inpaint) any flawed regions
- Transform if you want a style change
- Upscale the final result to production resolution
Treating generation as a single step ("prompt → done") is the beginner trap. Treating it as a pipeline of complementary operations is what produces professional results.
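The workflow above can be sketched as a chain of small steps. Everything in this block is a hypothetical stand-in: the `Image` record, the `generate` stub, the scoring function, and the example prompt are all invented for illustration, and no real model is called. The point is only the shape of the pipeline, with transformation as an optional step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Image:
    """Stand-in for an image; `history` records which operations touched it."""
    name: str
    history: List[str] = field(default_factory=list)

    def then(self, step: str) -> "Image":
        return Image(self.name, self.history + [step])


def generate(prompt: str, count: int) -> List[Image]:
    """Stub for text-to-image: returns `count` labeled candidates."""
    return [Image(f"{prompt} #{i}", ["generate"]) for i in range(count)]


def run_workflow(
    prompt: str,
    count: int,
    score: Callable[[Image], float],
    flawed_regions: List[str],
    style: Optional[str] = None,
    scale: int = 2,
) -> Image:
    candidates = generate(prompt, count)            # 1. generate a batch
    best = max(candidates, key=score).then("select")  # 2. pick the best one
    for region in flawed_regions:                   # 3. inpaint each flaw
        best = best.then(f"edit:{region}")
    if style is not None:                           # 4. optional style change
        best = best.then(f"transform:{style}")
    return best.then(f"upscale:x{scale}")           # 5. upscale last


final = run_workflow(
    "lighthouse at dusk",
    count=4,
    score=lambda img: img.name.endswith("#2"),  # pretend candidate #2 looks best
    flawed_regions=["hand"],
)
print(final.name, final.history)
```

Upscaling sits at the end so that edits and style changes run on the smaller image, and only the finished result pays the cost of the resolution increase.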
Key Takeaways
- Generation creates from scratch. Editing modifies a region. Transformation changes style/treatment. Upscaling adds resolution and detail.
- Professional results almost always combine multiple modes — generate, select, edit, upscale.
- Don't expect one-shot perfection from generation alone. Think in workflows, not single prompts.