피파 한 줄 정리: Style transfer (look 바꿈)와 identity preservation (얼굴 유지)는 자주 *경쟁*해. 강한 style 적용 = 얼굴 변형. 두 task를 *분리*해서 처리하는 게 정석.
Mental model: Think of three different requests to a photo editor: (1) "Make this photo look like a watercolor" (style transfer). (2) "Put this person in a different scene but keep their face exactly the same" (identity preservation). (3) "Change only the jacket in this photo to a leather one" (local control via masking). These are all forms of controlled editing, but they pull in different directions — and often compete with each other.
Local Control Through Masking
Masking is the foundation of precise editing. By selecting specific regions, you control where changes happen:
- Foreground/background separation: Change the background while keeping the subject intact.
- Garment-specific edits: Change shirt color without affecting skin tone or hair.
- Object-level control: Add, remove, or replace individual objects in a scene.
- Layer-by-layer editing: Modern tools like FLORA now support splitting an image into semantic layers (using models like Qwen's layered editor), letting you edit each layer independently.
Style Transfer: Changing Look, Keeping Content
Style transfer takes the content of one image and renders it in the style of another. Classic examples:
- Your photograph → rendered as an oil painting
- Your sketch → rendered as a photorealistic image
- Your product photo → rendered in an anime style
The challenge is controlling how much style transfers. Too little and it looks like a filter. Too much and the content becomes unrecognizable. The sweet spot depends on your use case.
Identity Preservation: Keeping the Person, Changing Everything Else
This is the flip side of style transfer. Here, you want to keep a specific person's face and appearance identical while changing the scene, pose, lighting, or context. This is critical for:
- Brand campaigns with consistent AI characters
- Comic or storyboard series
- Social media content with a recurring persona
The Competition Between Style and Identity
When you ask a model to "render this person in a Pixar 3D animation style," you're asking it to simultaneously:
- Apply Pixar's visual language (smooth skin, exaggerated features, specific lighting)
- Preserve the person's identity (specific face shape, nose, eyes, skin tone)
These goals conflict. Pixar style requires exaggerated proportions — bigger eyes, smoother skin, rounder features. Identity preservation requires keeping the exact proportions. Something has to give. Different tools handle this tradeoff differently:
- Masking enables precise, region-specific editing — change what you want, keep what you don't.
- Style transfer changes the visual language; identity preservation keeps the person recognizable.
- These goals often compete — strong style application alters identity, and strict identity limits style.
- Best results often come from separating the tasks: generate in style first, then fix identity.