Artificial Neuron

Three steps and a name

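An artificial neuron is the simplest unit of a neural network, and it's far less mysterious than the name suggests. Given an input vector x, it does three things: takes the dot product with a learned weight vector w, adds a learned bias b, and applies a non-linear activation f. That's it. Most of the rest of deep learning comes down to "stack a lot of them and let backprop sort it out."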

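As a formula: y = f(w · x + b). In PyTorch terms: an nn.Linear(in_features, 1) with a single output, followed by an activation. The biological analogy (dendrite, synapse, axon) is loose; the actual unit is just a learned linear projection (affine, once you count the bias) plus a non-linearity.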
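To make the formula concrete, here's a minimal sketch in plain Python (no PyTorch), so the arithmetic stays visible. The helper names `relu` and `neuron` and all the numbers are made up for illustration; they don't come from a trained model or from any library.

```python
def relu(z: float) -> float:
    """ReLU activation: keep positive values, clamp negatives to 0."""
    return max(0.0, z)


def neuron(x, w, b, f=relu):
    """Compute y = f(w . x + b) for a single input vector x."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x))  # dot product w . x
    return f(z + b)


# Illustrative values only (hypothetical, chosen so the sums are exact).
x = [1.0, 2.0]

# 0.5*1.0 + 0.25*2.0 - 0.5 = 0.5, and relu(0.5) = 0.5
y_on = neuron(x, w=[0.5, 0.25], b=-0.5)

# 0.5*1.0 - 1.0*2.0 + 0.25 = -1.25, and relu(-1.25) = 0.0
y_off = neuron(x, w=[0.5, -1.0], b=0.25)

print(y_on, y_off)  # 0.5 0.0
```

Same input, different weights: one neuron fires and the other stays at zero. Training is just the process of finding the w and b that make those outputs useful.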

Tip: if you can read "matrix multiply, add bias, apply non-linearity" without flinching, you can read nearly any neural network architecture. Every layer in this quest is a variation on those three steps.

Why the activation matters

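Without a non-linear activation, every layer is a linear function of the previous one, so a stack of layers collapses into a single linear function. You can't solve XOR with 1 linear layer, and you can't solve it with 20. Add ReLU and one hidden layer is enough: two hidden units feeding a linear output will do it. (A single ReLU neuron on its own still can't, because its decision boundary is still a straight line.) That's the most important reason activations exist: they make depth mean something.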
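Both halves of that claim can be checked in a few lines. This is a sketch under stated assumptions: the layer sizes are arbitrary, and the XOR weights are picked by hand for illustration (not learned, and not the only solution). float64 is used in the first part only so the equality check isn't muddied by rounding.

```python
import torch

torch.manual_seed(0)

# Part 1: two linear layers with no activation collapse into one.
x = torch.randn(5, 4, dtype=torch.float64)       # [B=5, in=4]
W1 = torch.randn(8, 4, dtype=torch.float64)      # layer 1 weight: [8, 4]
b1 = torch.randn(8, dtype=torch.float64)         # layer 1 bias:   [8]
W2 = torch.randn(1, 8, dtype=torch.float64)      # layer 2 weight: [1, 8]
b2 = torch.randn(1, dtype=torch.float64)         # layer 2 bias:   [1]

stacked = (x @ W1.T + b1) @ W2.T + b2            # [5, 1], two layers
W = W2 @ W1                                      # [1, 4], merged weight
b = W2 @ b1 + b2                                 # [1],    merged bias
single = x @ W.T + b                             # [5, 1], one layer
collapsed = torch.allclose(stacked, single)
print(collapsed)                                 # True

# Part 2: one hidden ReLU layer solves XOR (hand-picked weights).
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # [4, 2]
W_h = torch.tensor([[1., 1.], [1., 1.]])         # [hidden=2, in=2]
b_h = torch.tensor([0., -1.])                    # [hidden=2]
w_o = torch.tensor([[1., -2.]])                  # [out=1, hidden=2]

h = torch.relu(X @ W_h.T + b_h)                  # [4, 2]
y = (h @ w_o.T).squeeze(1)                       # [4]
print(y.tolist())                                # [0.0, 1.0, 1.0, 0.0]
```

The hidden units compute relu(x1 + x2) and relu(x1 + x2 - 1); subtracting twice the second from the first cancels the (1, 1) case. Take the relu out and the whole thing is one linear map again, and no linear map fits XOR.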

Reading shapes

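Get in the habit of writing the shape next to every variable in the forward pass. x: [B, in_features], w: [out_features, in_features], y: [B, out_features]. That weight layout is the one nn.Linear uses; it computes x @ w.T + b. Most first-month PyTorch bugs are shape mismatches, and most of those disappear the moment you write the shapes in comments.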

Principle: shapes are the contract between layers. Read them, predict them, assert them. A model that trains usually has a forward pass whose shapes line up; a model that crashes usually doesn't.
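Here's what "read, predict, assert" can look like in practice. It's a small sketch, not a required pattern: the sizes B=3, in_features=4, out_features=2 are arbitrary, and the variable names are just for this example.

```python
import torch
import torch.nn as nn

B, in_features, out_features = 3, 4, 2           # arbitrary example sizes

layer = nn.Linear(in_features, out_features)
x = torch.randn(B, in_features)                  # [B, in_features]

# Predict the parameter shapes, then assert them.
assert layer.weight.shape == (out_features, in_features)
assert layer.bias.shape == (out_features,)

y = torch.relu(layer(x))                         # [B, out_features]
assert y.shape == (B, out_features)

# The layer is x @ weight.T + bias, so the hand-written version must agree.
manual = x @ layer.weight.T + layer.bias         # [B, out_features]
matches = torch.allclose(manual, layer(x), atol=1e-6)

# Break the contract on purpose: one input feature too many.
bad_x = torch.randn(B, in_features + 1)          # [B, in_features + 1]
try:
    layer(bad_x)
    mismatch_caught = False
except RuntimeError:                             # shape mismatch in the matmul
    mismatch_caught = True

print(matches, mismatch_caught)                  # True True
```

The asserts cost nothing at this scale, and they fail at the line where the contract breaks instead of three layers later.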

Code

A single neuron, by hand and with PyTorch (Python)

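import torch
import torch.nn as nn

x = torch.randn(3, 4)                      # [B=3, in=4]

# By hand: the weight is stored as [in, out], so x @ w needs no transpose.
w = torch.randn(4, 1, requires_grad=True)  # [in=4, out=1]
b = torch.zeros(1, requires_grad=True)     # [out=1], broadcast over the batch
y_manual = torch.relu(x @ w + b)           # [B, 1]

# With PyTorch: nn.Linear stores its weight as [out, in] and computes x @ weight.T + bias.
neuron = nn.Linear(in_features=4, out_features=1)
y_pytorch = torch.relu(neuron(x))          # [B, 1]

# Same shapes, different values: the two neurons have independent random weights.
print(y_manual.shape, y_pytorch.shape)     # torch.Size([3, 1]) torch.Size([3, 1])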
