C.W.K.
Stream
Lesson 02 of 10 · published

Reference Images: Visual Anchors for Everything

~15 min · control, editing, l2

Pippa's one-line summary: A reference image gives the model a visual anchor on whatever dimension you need (face, pose, style, composition, wardrobe, or mood). Each tool handles its weight slider differently.

Mental model: A film director doesn't just tell the cinematographer "make it look moody." She shows a stack of reference stills: a frame from Blade Runner for the lighting, a Vermeer painting for the color palette, a fashion editorial for the pose. These references collapse an enormous space of possible interpretations down to a specific neighborhood of visual intent. That's exactly what reference images do for AI generators.

What Reference Images Anchor

A reference image can guide the model on many different dimensions, depending on the tool and technique:

  • Identity/Face: "Make this person look like the person in this photo" — face anchoring for character consistency.
  • Pose: "Use this body position" — structural guidance without copying appearance.
  • Style/Aesthetic: "Make it look like this painting" — transferring color palette, brushwork, mood.
  • Composition: "Arrange elements like this photo" — preserving layout while changing content.
  • Wardrobe/Props: "This character should wear this outfit" — product or costume reference.
  • Mood/Lighting: "Light it like this scene" — atmospheric guidance.

How Reference Systems Work (High Level)

Different platforms implement references differently, but the core idea is the same: the reference image is encoded into feature embeddings (usually by a vision encoder such as CLIP, often paired with a specialized adapter), and those features are injected into the generation process alongside the text prompt.

  ┌───────────────┐     ┌───────────────┐
  │  Text Prompt  │     │Reference Image│
  │ "a warrior in │     │  [photo.jpg]  │
  │  a forest"    │     │               │
  └───────┬───────┘     └───────┬───────┘
          │                     │
          ▼                     ▼
  ┌───────────────┐     ┌───────────────┐
  │ Text Encoder  │     │ Image Encoder │
  │ (CLIP / T5)   │     │ (CLIP / IP)   │
  └───────┬───────┘     └───────┬───────┘
          │                     │
          └────────┬────────────┘
                   ▼
          ┌───────────────┐
          │   Diffusion   │
          │   Process     │
          │ (cross-attn)  │
          └───────┬───────┘
                  ▼
          ┌───────────────┐
          │ Output Image  │
          │ (text-guided  │
          │  + ref-guided)│
          └───────────────┘
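
The injection step in the diagram can be sketched in a few lines of NumPy. This is a minimal sketch, assuming the IP-Adapter-style approach in which text tokens and reference-image tokens each get their own cross-attention pass and the image branch is scaled by a weight. The function names, the shapes, and the random arrays standing in for learned key/value projections are all illustrative assumptions; hosted platforms may inject references differently.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: q is (n, d); k and v are (m, d); result is (n, d)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def reference_guided_attention(q, text_kv, image_kv, scale=1.0):
    """Text cross-attention plus a separately weighted image cross-attention."""
    text_out = attention(q, *text_kv)
    image_out = attention(q, *image_kv)
    return text_out + scale * image_out

# Random stand-ins (assumption): real models use learned projections of encoder outputs.
rng = np.random.default_rng(0)
d = 8
q = rng.normal(size=(16, d))                                    # 16 latent positions (queries)
text_kv = (rng.normal(size=(5, d)), rng.normal(size=(5, d)))    # 5 text tokens
image_kv = (rng.normal(size=(4, d)), rng.normal(size=(4, d)))   # 4 reference-image tokens

text_only = reference_guided_attention(q, text_kv, image_kv, scale=0.0)
blended = reference_guided_attention(q, text_kv, image_kv, scale=0.6)
```

With `scale=0.0` the reference contributes nothing and the result is plain text-guided attention. Raising `scale` lets the reference features pull the output further from the text-only result, which is the role a reference weight slider plays.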

Platform-Specific Reference Systems

Major platforms each have their own flavor:

  • Midjourney V7 (--oref): Omni-Reference system with the --ow (Omni-Weight) parameter. As rough rules of thumb: weights of 0–50 carry over mainly the face; 200–400 balance identity and style; 500–1000 approach a near-exact copy. Consistency of up to 95% has been reported.
  • DALL-E (Gen_ID): Maintains character identity within a conversation thread, with reported consistency of 75–80%. It is tied to the session and does not transfer between conversations.
  • Stable Diffusion + IP-Adapter: Lightweight adapter modules that inject reference image features via cross-attention. Highly customizable, with adjustable influence strength.
  • Leonardo AI: Character reference sheets, with a reported 92% consistency across 50+ pose variations.
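
To make the Midjourney weight bands concrete, here is a minimal sketch that assembles a prompt string with `--oref` and `--ow` and labels the weight using the bands quoted in this lesson. The helper names `describe_weight` and `build_prompt` are hypothetical, the URL is a placeholder, and the 0–1000 range is simply the span the bands cover; check Midjourney's own documentation for the exact limits and defaults.

```python
def describe_weight(ow: int) -> str:
    """Map an --ow value to the behavior band quoted in this lesson."""
    if not 0 <= ow <= 1000:
        raise ValueError("ow must be between 0 and 1000")
    if ow <= 50:
        return "face only"
    if 200 <= ow <= 400:
        return "balanced identity + style"
    if ow >= 500:
        return "near-exact copy"
    # The lesson does not characterize 51-199 or 401-499.
    return "between bands"

def build_prompt(text: str, reference_url: str, ow: int) -> str:
    """Append the Omni-Reference image and its weight to a text prompt."""
    describe_weight(ow)  # reuse the range check; raises ValueError if out of range
    return f"{text} --oref {reference_url} --ow {ow}"

prompt = build_prompt("a warrior in a forest", "https://example.com/photo.jpg", 300)
band = describe_weight(300)
```

Here `prompt` is the string you would paste into Midjourney, and `band` is a reminder of what the chosen weight is expected to do. Treat the bands as starting points and adjust the weight after looking at the results.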

Key Takeaways
  • Reference images anchor identity, pose, style, composition, wardrobe, and mood.
  • They work by encoding visual features and injecting them alongside text into the diffusion process.
  • Different platforms implement references differently; each has strengths and limitations.
  • Good references are clear, well-lit, and high-resolution. Reference weight is a balance, not binary.

Exercise

Find or take a reference image of a person with a distinctive look. Generate that person in 3 different scenes (using Midjourney --oref, FLUX IP-Adapter, or a DALL-E session). Score the consistency from 1 to 10.
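
If you try more than one tool, a small tally keeps the scores comparable. This is a minimal sketch; the `summarize` helper is hypothetical and the numbers are placeholders to replace with your own ratings, not measured results.

```python
def summarize(scores: dict[str, list[int]]) -> dict[str, float]:
    """Average the 1-10 consistency scores recorded for each tool."""
    summary = {}
    for tool, values in scores.items():
        if not values or any(not 1 <= v <= 10 for v in values):
            raise ValueError(f"{tool}: need at least one score, each between 1 and 10")
        summary[tool] = round(sum(values) / len(values), 1)
    return summary

# Placeholder ratings (assumption): one score per generated scene, three scenes per tool.
my_scores = {
    "Midjourney --oref": [8, 7, 9],
    "DALL-E session": [6, 7, 5],
}
averages = summarize(my_scores)
```

Comparing the averages alongside the weight you used for each run makes it easier to see whether a change in consistency came from the tool or from the setting.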
