Vision Tasks in Practice

The four task shapes

Image classification: one label per image. Output: a vector of class logits. Loss: cross-entropy.
Object detection: multiple bounding boxes, each with a class. Output: a list of (box, class, score). Models: YOLO family, DETR, Faster R-CNN.
Semantic segmentation: one class label per pixel. Output: a class mask the same size as the input. Models: U-Net, DeepLab, SegFormer.
Instance segmentation: one pixel mask per object instance. Output: a mask for each detected object. Models: Mask R-CNN, SAM, YOLO-seg.

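Each task has a canonical metric: Top-K accuracy for classification, mAP (mean Average Precision, where a detection counts as correct only if its IoU with a ground-truth box clears a threshold) for detection, mIoU for segmentation, and panoptic quality for the joint semantic-plus-instance problem.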
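Two of these metrics are small enough to write out by hand. The sketch below assumes boxes are given as (x1, y1, x2, y2) corners and logits as plain Python lists; the function names `box_iou` and `top_k_accuracy` are illustrative, not taken from any library. In practice you would normally rely on an established evaluation tool rather than this.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top_k_accuracy(logits, labels, k=5):
    """Fraction of samples whose true label is among the k highest logits."""
    hits = 0
    for row, label in zip(logits, labels):
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)
```

A detection is typically treated as a true positive when `box_iou(pred, truth)` is at or above the chosen threshold; mAP then averages precision over recall levels and classes on top of that matching step.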

Tip: don't reinvent any of this. torchvision.models.detection, ultralytics, SegFormer, and SAM each offer an API that fits the task in roughly 50 lines. Pick one based on your latency and accuracy budget and run it.
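One way to make "pick by budget" concrete is to time each candidate on your own hardware and keep the most accurate one that fits. This is a minimal sketch: `median_latency_ms` and `pick_model` are hypothetical helpers, and the candidate tuples are whatever (name, latency, accuracy) numbers you measured yourself. For GPU models you would also need to synchronize the device before reading the clock.

```python
import statistics
import time


def median_latency_ms(fn, n_warmup=3, n_runs=20):
    """Median wall-clock time of fn() in milliseconds, after a few warm-up calls."""
    for _ in range(n_warmup):
        fn()
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def pick_model(candidates, budget_ms):
    """candidates: list of (name, latency_ms, accuracy) tuples you measured.

    Returns the name of the most accurate candidate within the latency
    budget, or None if nothing fits.
    """
    within = [c for c in candidates if c[1] <= budget_ms]
    return max(within, key=lambda c: c[2])[0] if within else None
```

Here `fn` would be a zero-argument callable wrapping one inference call on a representative input.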

The lifecycle: ship, not just train

A vision model in production needs: domain-appropriate data labelling (CVAT, Roboflow), augmentation tuned to your own distribution, evaluation with per-class metrics (not just the aggregate), and drift monitoring after deployment. The model is a small slice of the pipeline.
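To see why the aggregate alone can mislead, here is a small sketch that computes per-class precision and recall from paired label lists. The `per_class_report` helper and the "ok"/"defect" labels are made-up toy data for illustration, not results from a real model.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return {class: (precision, recall)} from paired label lists."""
    tp, pred_count, true_count = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            tp[t] += 1
    report = {}
    for c in sorted(set(true_count) | set(pred_count)):
        precision = tp[c] / pred_count[c] if pred_count[c] else 0.0
        recall = tp[c] / true_count[c] if true_count[c] else 0.0
        report[c] = (precision, recall)
    return report


# Toy, imbalanced example: 8 "ok" images and 2 "defect" images.
y_true = ["ok"] * 8 + ["defect"] * 2
y_pred = ["ok"] * 8 + ["ok", "defect"]   # one defect is missed

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = per_class_report(y_true, y_pred)
print(f"accuracy={accuracy:.2f}")
for name, (precision, recall) in report.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy case overall accuracy is 0.90 while recall on the rare "defect" class is only 0.50, which is the kind of gap an aggregate number hides.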

The 2026 default for a new vision project

Foundation model backbone (DINOv2, SAM, CLIP) → small task-specific head → small labeled set → ship. For application work, the era of training a vision model from scratch on a million images is mostly over.
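The "frozen backbone plus small head" recipe can be sketched without any deep learning framework, assuming the embeddings have already been extracted by a backbone of your choice. The vectors below are made-up two-dimensional stand-ins, and the head is a nearest class-mean (prototype) classifier, swapped in here for a trained linear layer to keep the example dependency-free. All function names are illustrative.

```python
import math


def fit_prototypes(embeddings, labels):
    """Average the embeddings of each class into one prototype vector."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict(prototypes, vec):
    """Return the class whose prototype is most similar to vec."""
    return max(prototypes, key=lambda c: cosine(prototypes[c], vec))


# Stand-in embeddings: in practice these come from the frozen backbone.
train_vecs = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
train_labels = ["cat", "cat", "dog", "dog"]
prototypes = fit_prototypes(train_vecs, train_labels)
print(predict(prototypes, [0.8, 0.1]))
```

The point of the design is that only the head sees your labels: the backbone stays untouched, so a small labeled set can be enough to get a usable baseline.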

Principle: vision tasks are now mostly a labeling problem, a foundation-model selection problem, and a metric problem. The 'modeling' part keeps shrinking.

Code

Object detection in three lines (YOLO via Ultralytics)·python

from ultralytics import YOLO

model = YOLO("yolov8n.pt")          # nano model, ~6MB
results = model("image.jpg", conf=0.25)
for r in results:
    print(r.boxes.xyxy, r.boxes.cls, r.boxes.conf)

Promptable segmentation with SAM (class-agnostic masks from a point prompt)·python

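import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# set_image expects an HxWx3 uint8 RGB array; OpenCV loads BGR, so convert.
image = cv2.cvtColor(cv2.imread("image.jpg"), cv2.COLOR_BGR2RGB)

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)
predictor.set_image(image)
# Prompts must be NumPy arrays: one (x, y) point, label 1 = foreground.
masks, scores, _ = predictor.predict(point_coords=np.array([[300, 200]]),
                                      point_labels=np.array([1]))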
