GradientTape basics: your first gradient

~13 min · autodiff, gradient-tape, backprop

Where the deep learning magic actually happens

Autodiff is the engine of training. During the forward pass, tf.GradientTape records every op that involves a watched tensor onto the tape. When you call tape.gradient(target, source), it plays the tape backward, applies the chain rule automatically, and returns the gradient.

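The basic rule: a trainable tf.Variable is watched automatically. Plain tensors and tf.constant are not; to get a gradient for one, you have to call tape.watch(c) explicitly. That's the default because Variables are the trainable parameters, which are almost always what you want gradients for.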
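To see the watch rule in action, you can ask the tape what it actually watched. The sketch below is only an illustration: the names w, frozen, and c are made up for this example, and it assumes TensorFlow 2.x running eagerly.

```python
import tensorflow as tf

# Hypothetical names, chosen only for this illustration.
w = tf.Variable(2.0)                        # trainable by default -> auto-watched
frozen = tf.Variable(5.0, trainable=False)  # non-trainable -> not auto-watched
c = tf.constant(3.0)                        # plain tensor -> not auto-watched

with tf.GradientTape() as tape:
    y = w * frozen * c

# The tape reports which variables it watched during the forward pass.
watched = tape.watched_variables()

# One gradient call, several sources: unwatched sources come back as None.
grad_w, grad_frozen, grad_c = tape.gradient(y, [w, frozen, c])

print(len(watched))   # 1 (only w)
print(grad_w)         # dy/dw = frozen * c = 15.0
print(grad_frozen)    # None
print(grad_c)         # None
```

If you do want a gradient for frozen or c, calling tape.watch on it inside the with block opts it in.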

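If tape.gradient(loss, weight) returns None, the tape found no differentiable path from that variable to loss. Usually the variable was never used in the forward pass, its layer is frozen, there's a tf.stop_gradient somewhere on the path, or the model code has a bug.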
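Two of those causes are easy to reproduce in isolation. This is a minimal sketch, not a full debugging recipe; the variable names w_used, w_unused, and w_blocked are hypothetical and exist only for this example.

```python
import tensorflow as tf

# Hypothetical variables, named after the role each one plays below.
w_used = tf.Variable(2.0)      # participates in the loss normally
w_unused = tf.Variable(4.0)    # never touched in the forward pass
w_blocked = tf.Variable(3.0)   # used, but only behind tf.stop_gradient

with tf.GradientTape() as tape:
    blocked_term = tf.stop_gradient(w_blocked ** 2)  # cuts the backward path
    loss = w_used ** 2 + blocked_term

g_used, g_unused, g_blocked = tape.gradient(
    loss, [w_used, w_unused, w_blocked]
)

print(g_used)     # d(loss)/d(w_used) = 2 * 2.0 = 4.0
print(g_unused)   # None: no path from w_unused to loss
print(g_blocked)  # None: the path is cut by tf.stop_gradient
```

When a real model hands back None, checking each suspect variable this way (is it used at all, and is anything on its path blocking gradients?) tends to narrow things down quickly.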

Code

First gradient·python

import tensorflow as tf

# d(x²)/dx = 2x
x = tf.Variable(3.0)

with tf.GradientTape() as tape:
    y = x ** 2

dy_dx = tape.gradient(y, x)
print(dy_dx)   # 6.0 (= 2 * 3)

Multiple variables at once·python

import tensorflow as tf

x = tf.Variable(2.0)
y = tf.Variable(3.0)

with tf.GradientTape() as tape:
    z = x * x + y * y    # z = x² + y²

# Pass a dict or list
grads = tape.gradient(z, {'x': x, 'y': y})
print(grads['x'])    # dz/dx = 2x = 4.0
print(grads['y'])    # dz/dy = 2y = 6.0

Watching constants explicitly·python

import tensorflow as tf

c = tf.constant(3.0)

with tf.GradientTape() as tape:
    y = c ** 2
print(tape.gradient(y, c))   # None — constant not watched

with tf.GradientTape() as tape:
    tape.watch(c)            # explicit
    y = c ** 2
print(tape.gradient(y, c))   # 6.0

External links

Gradients and autodiff

Exercise

Use GradientTape to compute the gradient of f(x) = sin(x²) at x = 1.0. Work it out by hand with the chain rule and check that the answer is 2 * cos(1) ≈ 1.0806.
