Computational Graph and grad_fn

Every op creates a node

When you apply an op to a tensor with requires_grad=True, PyTorch creates a graph node that records two things: which op produced this tensor (its grad_fn), and which inputs went into that op (so backward can walk back to them).

You can inspect this directly:

Leaf tensor: one you created directly (e.g. nn.Parameter, torch.tensor(..., requires_grad=True)). Its grad_fn is None and is_leaf is True. This is where gradients land.
Intermediate tensor: the output of an op. Its grad_fn points to a backward function (AddBackward0, MulBackward0, etc.), and it is not a leaf. By default it does not store its own .grad; only leaves do.
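As a small illustration of the leaf/intermediate distinction, the sketch below asks autograd to keep the gradient of an intermediate tensor by calling retain_grad() on it before backward. The values and variable names are chosen only for this example.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)   # leaf
z = x * 3                                    # intermediate (non-leaf), z = 6
z.retain_grad()                              # ask autograd to keep z.grad too
loss = z ** 2                                # 36
loss.backward()

print(x.is_leaf, z.is_leaf)   # True False
print(x.grad)                 # tensor(36.)  dloss/dx = 2z * 3
print(z.grad)                 # tensor(12.)  dloss/dz = 2z
```

Without the retain_grad() call, z.grad would stay None after backward and only x.grad would be populated.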

Walking the graph backward

When you call loss.backward(), autograd starts at loss and follows the grad_fn links through every connected node. Each node knows how to compute its local Jacobian (more precisely, a vector-Jacobian product), and the chain rule composes these into the final gradient for every leaf.
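To make the grad_fn links concrete, here is a minimal sketch that walks the graph by hand using each node's next_functions attribute. The helper walk is a hypothetical name introduced for this example, not part of PyTorch. It only lists the nodes; it does not compute any gradients.

```python
import torch

def walk(fn, depth=0, out=None):
    """Collect (depth, node name) pairs by following next_functions from fn."""
    if out is None:
        out = []
    if fn is None:          # inputs that do not require grad show up as None
        return out
    out.append((depth, type(fn).__name__))
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1, out)
    return out

x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)
loss = (w * x + b) ** 2

nodes = walk(loss.grad_fn)
for depth, name in nodes:
    print("  " * depth + name)
```

The leaves appear at the ends of the walk as AccumulateGrad nodes, which are the nodes that write into each leaf's .grad.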

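One subtlety: by default the graph (more precisely, the intermediate values it saved for backward) is freed right after backward to save memory. If you need to call backward again on the same graph, pass retain_graph=True. You usually don't need to.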
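A short sketch of what the default behavior looks like in practice: calling backward a second time on the same graph, without retain_graph=True, raises a RuntimeError. The flag second_call_failed is only there for illustration.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3

y.backward()                 # first pass: the saved values are freed afterwards

second_call_failed = False
try:
    y.backward()             # same graph, nothing retained
except RuntimeError as err:
    second_call_failed = True
    print("second backward failed:", type(err).__name__)

print(x.grad)                # tensor(12.), from the first pass only
```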

Code

graph inspection·python

import torch

x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

z = w * x          # MulBackward0
y = z + b          # AddBackward0
loss = y ** 2      # PowBackward0

# Leaves
print(x.is_leaf, x.grad_fn)   # True None
print(w.is_leaf, w.grad_fn)   # True None

# Intermediates
print(z.is_leaf, z.grad_fn)   # False <MulBackward0>
print(y.is_leaf, y.grad_fn)   # False <AddBackward0>
print(loss.grad_fn)           # <PowBackward0>

backward through a small, linear-regression-like graph·python

import torch

x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

z = w * x          # 6
y = z + b          # 7
loss = y ** 2      # 49
loss.backward()

# Gradients via chain rule
# dloss/dy = 2y       = 14
# dy/dz   = 1         → dloss/dz = 14
# dy/db   = 1         → dloss/db = 14
# dz/dw   = x = 2     → dloss/dw = 28
# dz/dx   = w = 3     → dloss/dx = 42
print(x.grad, w.grad, b.grad)
# tensor(42.) tensor(28.) tensor(14.)

retain_graph: when you need backward twice·python

import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3

# First backward — frees graph by default
y.backward()
print(x.grad)         # tensor(12.)

# A second y.backward() here would raise RuntimeError: the graph was freed.
# Rebuilding it first (y = x ** 3) works, because that is a fresh graph.

# If you wanted to backward TWICE on the SAME graph:
x.grad = None
y = x ** 3
y.backward(retain_graph=True)
y.backward()          # second pass on the same retained graph
print(x.grad)         # tensor(24.) (gradients accumulate: 12 + 12)
