PyTorch Mental Model

The four PyTorch primitives you actually use

Tensor: an n-dimensional array that lives on a device and carries a dtype, a shape, and optional gradient tracking. The atomic unit.
Module: a stateful piece of a network that holds parameters and defines a forward pass. Modules compose through attributes (self.conv = nn.Conv2d(...)), and state_dict() and parameters() discover them automatically.
Optimizer: knows a list of parameters and updates them from their gradients.
Autograd engine: builds a computation graph during the forward pass, then walks it backward to compute gradients.
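The four primitives above can be seen together in a few lines. This is a minimal sketch, not a canonical recipe: the tensor values, the nn.Linear(3, 1) layer, and the learning rate are arbitrary choices made for illustration.

```python
import torch
import torch.nn as nn

# Tensor: an n-dimensional array with a dtype, a shape, and optional gradient tracking.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Autograd: the forward computation records a graph; backward() walks it in reverse.
y = (x ** 2).sum()
y.backward()
grad_x = x.grad.clone()  # d/dx sum(x^2) = 2x

# Module: holds parameters and defines a forward pass.
layer = nn.Linear(3, 1)  # illustrative layer, sizes chosen arbitrarily
param_shapes = {name: tuple(p.shape) for name, p in layer.named_parameters()}

# Optimizer: knows the parameter list and updates it from each parameter's .grad.
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
before = layer.weight.detach().clone()
loss = layer(x.detach()).sum()
loss.backward()
opt.step()
```

Here the gradient of the loss with respect to layer.weight is just the input, so one SGD step moves the weight by 0.1 times [1, 2, 3] in the opposite direction.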

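Almost every PyTorch program you read follows the same loop: define a Module, pass tensors through it, compute a scalar loss, call .backward(), and call opt.step() (after clearing stale gradients with opt.zero_grad()). Once you internalize that loop, the rest is library knowledge.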
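That loop can be sketched end to end on a toy problem. The data (y = 3x + 1), the single nn.Linear model, the SGD learning rate, and the step count are all assumptions chosen so the example runs quickly on CPU.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy regression data: y = 3x + 1, no noise.
X = torch.linspace(-1, 1, 64).unsqueeze(1)  # shape (64, 1)
Y = 3 * X + 1

model = nn.Linear(1, 1)                            # 1. define a Module
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

losses = []
for step in range(200):
    opt.zero_grad()          # clear gradients left over from the previous step
    pred = model(X)          # 2. pass tensors through the Module
    loss = loss_fn(pred, Y)  # 3. compute a scalar loss
    loss.backward()          # 4. autograd fills .grad on every parameter
    opt.step()               # 5. the optimizer updates the parameters
    losses.append(loss.item())
```

Swapping in a larger model, a different loss, or another optimizer changes individual lines, but the five steps inside the loop stay the same.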

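Tip: PyTorch is essentially NumPy-style arrays plus automatic differentiation, with a thin layer of conventions on top. The framework is small and the ecosystem (torchvision, transformers, lightning) is huge. Learn the core deeply, and pick up ecosystem pieces as you need them.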

Modules compose like Lego

A Module can contain other Modules. state_dict() recursively flattens every parameter, and to(device) recursively moves them. The hierarchy is the API: as long as you assign submodules as attributes, you never have to register parameters by hand.
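A small sketch of what "assign as an attribute" means in practice. The Inner and Outer classes are hypothetical names made up for this illustration. The sketch also shows the one common pitfall: submodules kept in a plain Python list are not registered, which is why the lesson's Net example uses nn.ModuleList.

```python
import torch
import torch.nn as nn

class Inner(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)

class Outer(nn.Module):
    def __init__(self):
        super().__init__()
        self.inner = Inner()             # registered: assigned as an attribute
        self.head = nn.Linear(4, 2)      # registered
        self.hidden = [nn.Linear(4, 4)]  # plain list: NOT registered

outer = Outer()

# state_dict() flattens the hierarchy into dotted names.
keys = list(outer.state_dict().keys())

# parameters() walks the same hierarchy; the layer in the plain list is skipped.
n_params = sum(p.numel() for p in outer.parameters())
```

The keys come out as inner.fc.weight, inner.fc.bias, head.weight, and head.bias, and the parameter count covers only those four tensors. Calls such as to(device) and optimizers built from parameters() would miss the layer in self.hidden for the same reason.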

torch.compile details for 2026

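Wrap any Module with torch.compile(model) and PyTorch traces the forward pass into a fused graph that runs significantly faster on modern accelerators. The first call is slow because it triggers compilation; later calls are often 1.3-3x faster, depending on the model and hardware. On CUDA it is usually a clear win, while on CPU/MPS the gain is smaller.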

Principle: PyTorch is the lingua franca of deep learning in 2026. Reading and writing fluent PyTorch is the single highest-leverage skill in this quest.

Code

Modules compose, parameters are auto-discovered·python

import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)
    def forward(self, x):
        return self.norm(torch.relu(self.fc(x)))

class Net(nn.Module):
    def __init__(self, dim, depth):
        super().__init__()
        self.blocks = nn.ModuleList([Block(dim) for _ in range(depth)])
        self.head = nn.Linear(dim, 10)
    def forward(self, x):
        for blk in self.blocks:
            x = x + blk(x)            # residual
        return self.head(x)

m = Net(dim=128, depth=4)
print(sum(p.numel() for p in m.parameters()))
m.to("cuda" if torch.cuda.is_available() else "cpu")

torch.compile for a typical 1.3-3x speedup·python

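import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(128, 10).to(device)   # stand-in for any nn.Module
compiled = torch.compile(model, mode="reduce-overhead")
x = torch.randn(32, 128, device=device)
# Use compiled exactly like model: same API. The first call triggers compilation.
out = compiled(x)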
