C.W.K.
Stream
Lesson 05 of 10 · published

The Hidden Difficulty of Continuity

~16 min · video, temporal, l5

Pippa's one-line summary: Background objects disappear or shift position because the attention budget concentrates on the main subject. Keep clips short, backgrounds simple, and the camera static.

Mental model: Anybody can take one great photo. But filming a 30-second commercial where every frame looks professional, the lighting stays perfect, the product stays centered, and the model's hair doesn't shift — that's a whole different skill. The gap between "one beautiful frame" and "five seconds of beautiful frames" is enormous, and it's where most video generation struggles become visible.

Why Good Frames ≠ Good Video

Consider what must hold true across even a short 3-second clip at 24fps (72 frames):

  • Every frame must be individually high quality (no artifacts, good composition)
  • Adjacent frames must flow smoothly (no jumps, no flicker)
  • Distant frames must maintain identity (frame 1 and frame 72 show the same person, same clothes, same environment)
  • Motion must be physically plausible throughout
  • Scene elements must persist (a vase on the table in frame 1 must still be there in frame 72, in the same position if nobody moved it)

A single image only needs to satisfy internal spatial consistency. A video must satisfy spatial consistency × temporal consistency × motion coherence × object persistence × environmental stability. The requirements compound rather than add: a clip fails if any one of them breaks, so each additional requirement multiplies the difficulty.
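To make the multiplicative claim concrete, here is a toy calculation. It assumes, purely for illustration, that each requirement is satisfied independently with some probability; the 0.9 values and the function names are made up for this sketch, not measured from any model.

```python
from math import prod

def frame_count(duration_s: float, fps: int) -> int:
    """Number of frames in a clip of the given duration."""
    return round(duration_s * fps)

def joint_success(probabilities: list[float]) -> float:
    """Chance that every requirement holds at once, assuming independence."""
    return prod(probabilities)

# Hypothetical per-requirement success rates (illustrative only).
image_only = [0.9]                  # spatial consistency
video = [0.9, 0.9, 0.9, 0.9, 0.9]   # spatial, temporal, motion, persistence, environment

print(frame_count(3, 24))                    # 72
print(round(joint_success(image_only), 2))   # 0.9
print(round(joint_success(video), 2))        # 0.59
```

Under these assumed numbers, five requirements that each hold 90% of the time all hold together only about 59% of the time, which is the sense in which difficulty multiplies.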

Object Persistence

One of the subtlest but most noticeable continuity failures is object persistence. In a real video, a coffee cup on a desk stays there unless someone moves it. In AI video, the cup might:

  • Gradually fade or blur away
  • Shift position slightly between frames
  • Change shape or color
  • Disappear entirely when the camera looks away and reappear looking different when it returns

This happens because the model does not maintain an explicit 3D model of the scene. It generates frames from learned patterns, and small background elements receive weak attention signals, so their appearance and position drift over time.
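One way to check the failure modes listed above is to track a background object's position frame by frame. The sketch below is a minimal illustration, assuming grayscale frames stored as nested lists and a single bright object on a dark background; `centroid`, `persistence_report`, and `make_frame` are hypothetical helpers, and real footage would need a proper object tracker.

```python
def centroid(frame, threshold=128):
    """Return the (row, col) centroid of pixels brighter than threshold, or None."""
    points = [(r, c) for r, row in enumerate(frame)
              for c, value in enumerate(row) if value > threshold]
    if not points:
        return None  # the object is no longer visible
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)

def persistence_report(frames, threshold=128):
    """Distance of the object from its frame-0 position, per frame (None = missing)."""
    start = centroid(frames[0], threshold)
    if start is None:
        raise ValueError("object not visible in the first frame")
    report = []
    for frame in frames:
        pos = centroid(frame, threshold)
        if pos is None:
            report.append(None)
        else:
            report.append(((pos[0] - start[0]) ** 2 + (pos[1] - start[1]) ** 2) ** 0.5)
    return report

def make_frame(obj_col=None, size=5):
    """Synthetic test frame: dark background with one bright pixel in row 2."""
    frame = [[0] * size for _ in range(size)]
    if obj_col is not None:
        frame[2][obj_col] = 255
    return frame

# The object stays put, shifts one pixel, then disappears.
frames = [make_frame(1), make_frame(1), make_frame(2), make_frame(None)]
print(persistence_report(frames))  # [0.0, 0.0, 1.0, None]
```

A rising distance indicates positional drift and a `None` entry marks a frame where the object vanished. Shape and color changes would need additional measurements that this sketch does not attempt.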

The Attention Budget

A useful way to think about continuity is as an "attention budget." The model has a finite amount of attention to distribute across the scene. The main subject gets most of it. Secondary elements (background, props, environmental details) get less. The further from the center of attention, the more likely an element is to drift, change, or disappear.
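The budget idea can be illustrated with a softmax, the normalization that turns raw scores into shares summing to 1. This is an analogy only: the salience scores below are hypothetical, and the sketch is not a description of any specific video model's internals.

```python
from math import exp

def softmax(scores):
    """Convert raw scores into shares that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    weights = {name: exp(s - peak) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical salience scores (illustrative only).
salience = {"main subject": 3.0, "table": 1.0, "vase": 0.0}
budget = softmax(salience)

for name, share in budget.items():
    print(f"{name}: {share:.0%}")
```

With these assumed scores the main subject takes roughly 84% of the budget and the vase about 4%. Because the shares must sum to 1, raising the subject's score necessarily shrinks what is left for background elements.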

Key Takeaways
  • A good single frame is much easier to generate than a good 5-second clip.
  • Continuity requires spatial consistency × temporal consistency × motion coherence × object persistence × environmental stability, so difficulty multiplies.
  • Background elements are weakly constrained and prone to drift, change, or disappear.
  • Think of the model's attention as a budget — the main subject gets most of it, everything else gets less.

Exercise

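Generate a 5-second video that contains a small but distinct background object (for example, a vase on a desk). Step through it frame by frame. Does the vase persist? Does it drift? Does its shape change? Document what you observe about the model's attention budget.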
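A structured log can make the frame-by-frame review easier to summarize. The sketch below is one possible layout, not a prescribed one: `Observation` and `summarize` are hypothetical names, the frame indices assume a 24fps clip (120 frames), and the log entries are placeholders to be replaced with your own observations.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    frame: int                  # frame index being inspected
    visible: bool               # is the object still in the frame?
    moved: bool = False         # has it shifted from its starting position?
    shape_changed: bool = False # does its shape or color differ from frame 0?

def summarize(log):
    """Condense manual observations into answers to the exercise questions."""
    first_missing = next((o.frame for o in log if not o.visible), None)
    return {
        "frames_checked": len(log),
        "first_missing_frame": first_missing,
        "drift_frames": [o.frame for o in log if o.visible and o.moved],
        "shape_change_frames": [o.frame for o in log if o.visible and o.shape_changed],
    }

# Placeholder entries, not real results.
log = [
    Observation(0, True),
    Observation(40, True, moved=True),
    Observation(80, True, shape_changed=True),
    Observation(119, False),
]
print(summarize(log))
```

Checking a handful of evenly spaced frames is usually enough to see whether the object drifts, changes, or disappears, and at roughly which point in the clip.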
