Lesson 04 of 10 · published

Descriptive Prompts vs. Keyword Stacks vs. Natural Language

~14 min · prompting, control, l4


Pippa's one-line summary: Keyword stacks belong to the SD 1.5 era. Feeding a keyword stack to FLUX *wastes* the language understanding of its T5-XXL encoder. Change your prompt style to fit the model.

Over the past few years, three distinct prompting styles have emerged, and which one works best depends heavily on which model you're using. This is one of the most practical lessons in the entire track — it can save you hours of frustration.

The Three Styles

1. Keyword Stacking (Tag-Based)

Comma-separated descriptors with no grammar. Born from the Stable Diffusion 1.5 / Danbooru era where the CLIP text encoder processed prompts more like search queries than sentences.

Keyword Stack Style

"portrait, woman, red hair, freckles, green eyes, soft lighting, bokeh, 85mm, professional photography, 8k, masterpiece, best quality"

When This Works

SD 1.5 models and fine-tunes trained on tag-based captions. These models were literally trained on comma-separated tags.

2. Descriptive Prompting

Short, structured sentences that describe the scene with moderate detail. A middle ground between keywords and natural language.

Descriptive Style

"Portrait of a woman with red hair and freckles, soft natural lighting, shallow depth of field, professional photography"

When This Works

Most modern models. This style is clear and efficient, and it doesn't spend tokens on grammar that the model might not need.

3. Natural Language (Conversational)

Full sentences that read like a scene description or photography brief. Takes advantage of models with advanced language understanding.

Natural Language Style

"A young woman with vibrant red hair and light freckles looks directly at the camera with a slight, knowing smile. She's lit by soft window light from the left side, creating gentle shadows. The background is softly blurred. Shot on an 85mm lens at f/1.8."

When This Works

FLUX, FLUX.2, and other models with powerful language encoders (T5-XXL, Mistral). These models understand syntax and relationships between concepts.
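To make the difference between the three styles concrete, here is a minimal sketch that renders one scene in each style. The `Scene` class, its field names, and the three helper functions are hypothetical and exist only for illustration. They are not part of any model's API, and real prompts usually benefit from hand editing.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    """Hypothetical container for the parts of a portrait prompt."""
    subject: str            # e.g. "woman"
    traits: list[str]       # e.g. ["red hair", "freckles"]
    lighting: str           # e.g. "soft window light"
    focal_length_mm: int    # e.g. 85
    aperture: float         # e.g. 1.8
    quality_tags: list[str] = field(default_factory=list)


def keyword_stack(scene: Scene) -> str:
    # Comma-separated tags with no grammar (SD 1.5 style).
    parts = [
        "portrait",
        scene.subject,
        *scene.traits,
        scene.lighting,
        f"{scene.focal_length_mm}mm",
        f"f/{scene.aperture}",
        *scene.quality_tags,
    ]
    return ", ".join(parts)


def descriptive(scene: Scene) -> str:
    # One short structured phrase with moderate detail.
    traits = " and ".join(scene.traits)
    return (
        f"Portrait of a {scene.subject} with {traits}, {scene.lighting}, "
        f"{scene.focal_length_mm}mm at f/{scene.aperture}"
    )


def natural_language(scene: Scene) -> str:
    # Full sentences that read like a photography brief.
    traits = " and ".join(scene.traits)
    return (
        f"A {scene.subject} with {traits} looks directly at the camera. "
        f"The scene is lit by {scene.lighting}. "
        f"Shot at {scene.focal_length_mm}mm and f/{scene.aperture}."
    )


if __name__ == "__main__":
    scene = Scene("woman", ["red hair", "freckles"], "soft window light",
                  85, 1.8, quality_tags=["masterpiece", "best quality"])
    for render in (keyword_stack, descriptive, natural_language):
        print(f"{render.__name__}: {render(scene)}")
```

Keeping the scene data separate from the rendering makes it easy to send the same content to different models in the style each one prefers.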

The Critical Insight: Model Architecture Determines Style

Model               | Text Encoder      | Best Prompt Style
SD 1.5 / fine-tunes | CLIP (77 tokens)  | Keyword stacking
SDXL                | CLIP + OpenCLIP   | Descriptive
SD 3.5              | CLIP + T5-XXL     | Descriptive or natural language
FLUX / FLUX.2       | T5-XXL / Mistral  | Natural language
Midjourney          | Proprietary       | Short descriptive + parameters
DALL-E 3            | GPT-based         | Natural language (auto-enhanced)
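The table can be kept as a small lookup in a prompt-building script. The sketch below is one possible encoding. The dictionary keys, the function name, and the fallback to "descriptive" for unknown models are assumptions made for illustration (the fallback follows the lesson's note that descriptive prompts suit most modern models).

```python
# Recommended style per model, copied from the lesson's table.
# Keys are lowercase so that lookups can ignore case.
RECOMMENDED_STYLE = {
    "sd 1.5": "keyword stacking",
    "sdxl": "descriptive",
    "sd 3.5": "descriptive or natural language",
    "flux": "natural language",
    "flux.2": "natural language",
    "midjourney": "short descriptive + parameters",
    "dall-e 3": "natural language",
}

# Assumed default for models that are not in the table.
DEFAULT_STYLE = "descriptive"


def recommended_style(model_name: str) -> str:
    """Return the recommended prompt style for a model name."""
    key = model_name.strip().lower()
    return RECOMMENDED_STYLE.get(key, DEFAULT_STYLE)


if __name__ == "__main__":
    for name in ("SD 1.5", "FLUX", "some-new-model"):
        print(f"{name}: {recommended_style(name)}")
```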

Prompt Anchoring

Regardless of style, concrete nouns and specific scene descriptions almost always matter more than abstract adjectives. This is called prompt anchoring.

❌ Abstract Adjective Soup

"beautiful, stunning, magnificent, breathtaking, incredible, gorgeous landscape"

✅ Concrete Anchoring

"A glacial lake reflecting snow-capped peaks, morning mist hovering over turquoise water, wildflowers in the foreground, Patagonia"

The abstract version gives the model almost nothing specific to work with — "beautiful" is too vague. The concrete version gives it actual visual targets: glacial lake, snow-capped peaks, turquoise water, wildflowers, Patagonia. Each noun anchors the image to specific learned patterns.
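A rough way to spot adjective soup before generating is to measure how much of a prompt consists of vague praise words. The sketch below is a crude heuristic, not a measure of image quality. The word list contains only the adjectives from the example above and is an assumption you would extend yourself.

```python
import re

# Hypothetical list of vague praise words, taken from the lesson's example.
ABSTRACT_WORDS = {
    "beautiful", "stunning", "magnificent",
    "breathtaking", "incredible", "gorgeous",
}


def abstract_ratio(prompt: str) -> float:
    """Return the share of words in the prompt that are vague praise words."""
    words = re.findall(r"[a-z]+", prompt.lower())
    if not words:
        return 0.0
    abstract_count = sum(1 for word in words if word in ABSTRACT_WORDS)
    return abstract_count / len(words)


if __name__ == "__main__":
    soup = ("beautiful, stunning, magnificent, breathtaking, "
            "incredible, gorgeous landscape")
    anchored = ("A glacial lake reflecting snow-capped peaks, morning mist "
                "hovering over turquoise water, wildflowers in the "
                "foreground, Patagonia")
    print(f"soup: {abstract_ratio(soup):.2f}")
    print(f"anchored: {abstract_ratio(anchored):.2f}")
```

A high ratio suggests replacing some adjectives with concrete nouns. A low ratio only means the listed words are absent, not that the prompt is well anchored.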

Key Takeaways
  • Three prompt styles exist: keyword stacking, descriptive, and natural language.
  • The right style depends on the model's text encoder. FLUX wants natural language; SD 1.5 wants keywords.
  • Concrete nouns anchor the image far more effectively than abstract adjectives.
  • Match your prompting style to your model — it's one of the highest-impact, lowest-effort improvements you can make.

Exercise

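Write the same prompt in all three styles: keyword stack, descriptive, and natural language. Generate each one with the model you use most. Then check whether that model's recommended style matches the style that actually produced the best output.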
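If you want to keep a record of the exercise, a small sketch like the following can compare your own ratings against the recommended style. The 1 to 5 rating scale and both function names are hypothetical. The ratings are your subjective judgment of the outputs, not anything a model reports.

```python
def best_style(ratings: dict[str, int]) -> str:
    """Return the style with the highest rating (first one wins on a tie)."""
    if not ratings:
        raise ValueError("ratings must contain at least one style")
    return max(ratings, key=ratings.get)


def matches_recommendation(ratings: dict[str, int], recommended: str) -> bool:
    """Check whether the best-rated style is the recommended one."""
    return best_style(ratings) == recommended


if __name__ == "__main__":
    # Example ratings on an assumed 1-5 scale, filled in by hand.
    my_ratings = {"keyword stack": 2, "descriptive": 4, "natural language": 5}
    print("best:", best_style(my_ratings))
    print("matches:", matches_recommendation(my_ratings, "natural language"))
```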
