The classic CNN lineage

Five architectures, 15 years, one through-line

LeNet-5 (1998): Yann LeCun's digit-recognition network. Two convolutional layers, two fully-connected layers, sigmoid-style (tanh) activations. It worked, but the hardware of the era could not scale it.

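AlexNet (2012): five convolutional layers, three fully-connected layers, ReLU activations, dropout, and GPU training. It won ImageNet (ILSVRC 2012) by a huge margin, 15.3% top-5 error against 26.2% for the runner-up, and lit the fuse for deep learning.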

VGG (2014): depth through stacks of small (3×3) convolutions. Simple, regular, and easy to read. Parameter-heavy (VGG-16 has about 138M parameters), with little inductive bias beyond 'deeper is better'.
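To make the 3×3 stacking argument concrete, here is a small sketch comparing two stacked 3×3 convolutions with a single 5×5 convolution. Both cover a 5×5 receptive field, but the stack uses fewer weights and adds an extra nonlinearity. The channel width `C = 64` and the bias-free layers are illustrative assumptions, not VGG's exact configuration.

```python
import torch
import torch.nn as nn


def n_params(module):
    """Count all learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


C = 64  # illustrative channel width (assumption)

# Two 3x3 convolutions in sequence: effective receptive field of 5x5.
stacked = nn.Sequential(
    nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False),
    nn.ReLU(inplace=True),
    nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False),
    nn.ReLU(inplace=True),
)

# One 5x5 convolution: same receptive field in a single layer.
single = nn.Conv2d(C, C, kernel_size=5, padding=2, bias=False)

stacked_params = n_params(stacked)  # 2 * 9 * C * C  = 73728
single_params = n_params(single)    # 25 * C * C     = 102400

x = torch.randn(1, C, 32, 32)
print(stacked_params, single_params)
print(stacked(x).shape, single(x).shape)  # both keep the 32x32 spatial size
```

The same arithmetic gives 27·C² for three stacked 3×3 layers against 49·C² for one 7×7 layer, so the saving grows with the receptive field.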

GoogLeNet / Inception (2014): multi-scale processing inside a single layer (parallel 1×1, 3×3, and 5×5 branches). It popularized the 1×1 convolution, an idea borrowed from Network in Network, for cheap channel mixing.
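A simplified sketch of the parallel-branch idea might look like the following. The class name `InceptionBlock` and its argument names are hypothetical, the channel widths are illustrative, and the auxiliary classifiers and other details of the real GoogLeNet are left out.

```python
import torch
import torch.nn as nn


class InceptionBlock(nn.Module):
    """Simplified Inception-style block: four parallel branches, concatenated on channels."""

    def __init__(self, in_ch, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
        super().__init__()
        # Branch 1: plain 1x1 convolution (pure channel mixing).
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_ch, c1, kernel_size=1), nn.ReLU(inplace=True),
        )
        # Branch 2: 1x1 reduces channels before the more expensive 3x3.
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c3_reduce, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c3_reduce, c3, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        )
        # Branch 3: 1x1 reduces channels before the 5x5.
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, c5_reduce, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c5_reduce, c5, kernel_size=5, padding=2), nn.ReLU(inplace=True),
        )
        # Branch 4: 3x3 max pool (stride 1) followed by a 1x1 projection.
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        branches = [self.branch1(x), self.branch3(x), self.branch5(x), self.branch_pool(x)]
        return torch.cat(branches, dim=1)  # spatial size unchanged, channels summed


# Illustrative widths: 64 + 128 + 32 + 32 = 256 output channels.
block = InceptionBlock(192, c1=64, c3_reduce=96, c3=128, c5_reduce=16, c5=32, pool_proj=32)
out = block(torch.randn(1, 192, 28, 28))
print(out.shape)
```

Every branch pads so that the spatial size is preserved, which is what allows the outputs to be concatenated along the channel dimension.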

ResNet (2015): residual connections made networks of 100+ layers trainable. Still a common backbone in 2026; the design idea has held up well.
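A minimal sketch of a residual block, assuming equal input and output channels and stride 1 (the class name `ResidualBlock` is hypothetical, and real ResNets add downsampling and bottleneck variants). The point is the `out + identity` line: the layers only have to learn a correction to the input, and gradients can flow through the addition unchanged.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Two 3x3 conv + BatchNorm layers with an identity skip connection."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                       # the skip path carries the input as-is
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)   # residual branch is added to the input


block = ResidualBlock(16)
x = torch.rand(2, 16, 8, 8)
y = block(x)
print(y.shape)  # same shape as the input
```

If the residual branch outputs zero (for example, when the last BatchNorm scale is zero), the block reduces to `relu(x)`, which is why stacking many such blocks does not make the network harder to optimize than a shallower one.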

EfficientNet (2019), ConvNeXt (2022): EfficientNet scales width, depth, and resolution together in a careful, balanced way; ConvNeXt ports design choices from transformer practice (LayerNorm, GELU, larger kernels).

Tip: if you read one CNN paper end to end, make it ResNet. The residual connection is the most copied design idea of the past decade, and you will see it in nearly every modern architecture.

Why it matters in 2026

You probably won't train a CNN from scratch; you'll use a pretrained backbone from torchvision or timm. But knowing the lineage tells you which backbone to pick: ResNet50 for general use, ConvNeXt-Tiny as a strong modern baseline, EfficientNet for mobile/edge, and ViT-B/16 if you want a transformer.

Principle: in 2026, most CNN architecture decisions come down to 'pick the right pretrained backbone for the task and fine-tune it'. Knowing why each backbone was designed the way it was lets you pick competently rather than at random.

Code

Pretrained backbones via torchvision (Python)

import torch.nn as nn
import torchvision.models as tvm
from torchvision.models import (
    ResNet50_Weights, ConvNeXt_Tiny_Weights, EfficientNet_B0_Weights,
)

resnet = tvm.resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
convnext = tvm.convnext_tiny(weights=ConvNeXt_Tiny_Weights.IMAGENET1K_V1)
effnet = tvm.efficientnet_b0(weights=EfficientNet_B0_Weights.IMAGENET1K_V1)

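# Swap the final classification layer for one sized to an N-class task.
def replace_head(model, n_classes):
    if hasattr(model, "fc"):
        # ResNet-style models expose the head as `model.fc`.
        model.fc = nn.Linear(model.fc.in_features, n_classes)
    elif isinstance(getattr(model, "classifier", None), nn.Sequential):
        # ConvNeXt and EfficientNet keep an nn.Sequential whose last layer is the Linear head.
        model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, n_classes)
    else:
        # Fail loudly instead of silently returning the model unchanged.
        raise ValueError(f"Unsupported head layout: {type(model).__name__}")
    return model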
