Lightweight Deployment

~30 min · deployment, fastapi

You probably don't need Kubernetes

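For most classical ML use cases, a single FastAPI server behind a load balancer is enough. The model is a few hundred MB, latency is a few milliseconds, and the team is two engineers. Reach for heavier infrastructure only when traffic, latency, or compliance demands it.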

Minimal surface area

An HTTP endpoint that takes a raw row and returns a probability, a decision, and the model version.
A health endpoint for the load balancer.
A version endpoint that returns model metadata.
Structured logging of every prediction (input, output, version, latency).
A circuit breaker that falls back to rule-based logic when the model fails.
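
The circuit breaker in the last item can be sketched in plain Python. This is a minimal illustration, not a production implementation: the `CircuitBreaker` class, its `max_failures` and `reset_after` parameters, and the `rule_based_score` rule with its `days_inactive` field are all hypothetical names chosen for the example.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; retries after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock          # injectable so the breaker can be tested without sleeping
        self.failures = 0
        self.opened_at = None       # None means the breaker is closed

    def is_open(self):
        if self.opened_at is None:
            return False
        # Once the cool-down has elapsed, let one trial call through (half-open).
        return self.clock() - self.opened_at < self.reset_after

    def call(self, primary, fallback, *args):
        if self.is_open():
            return fallback(*args)
        try:
            result = primary(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback(*args)
        # Success: close the breaker and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def rule_based_score(row):
    # Hypothetical rule: treat customers inactive for 60+ days as likely churners.
    prob = 0.9 if row.get("days_inactive", 0) >= 60 else 0.1
    return {"prob": prob, "label": int(prob >= 0.5), "version": "rules-fallback"}
```

In a server, the model call would be wrapped as something like `breaker.call(model_score, rule_based_score, payload)`, where `model_score` is whatever function runs the pipeline. Returning a distinct `version` value for the fallback keeps fallback responses identifiable in the prediction logs.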

The deployment ritual

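Shadow-deploy first: route 10% of traffic to the new model and log its predictions, but keep using the old model for decisions. Compare the two prediction distributions. Promote only after the shadow run passes. Always keep the previous artifact one click away for rollback.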
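
The shadow step described above might look roughly like the sketch below. It assumes each model is a callable that returns a dict with a `prob` key. The names `score_with_shadow`, `summarize`, and `shadow_fraction` are hypothetical, and comparing means is only the crudest possible distribution check.

```python
import random


def score_with_shadow(row, old_model, new_model, log, shadow_fraction=0.10, rng=random.random):
    """Serve the old model's decision; also score a sample of traffic with the new model."""
    decision = old_model(row)
    if rng() < shadow_fraction:
        try:
            shadow = new_model(row)
            log({"row": row, "old_prob": decision["prob"], "shadow_prob": shadow["prob"]})
        except Exception as exc:
            # A shadow failure must never affect the live response.
            log({"row": row, "shadow_error": repr(exc)})
    return decision


def summarize(records):
    """Compare mean predicted probability between the old and shadow models."""
    pairs = [(r["old_prob"], r["shadow_prob"]) for r in records if "shadow_prob" in r]
    if not pairs:
        return {"n": 0}
    n = len(pairs)
    return {
        "n": n,
        "mean_old": sum(o for o, _ in pairs) / n,
        "mean_shadow": sum(s for _, s in pairs) / n,
    }
```

The caller always receives the old model's output, so a bad candidate can only show up in the logs. A real comparison would likely look at more than the mean, for example the label disagreement rate or a histogram of both score distributions.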

Code

Minimal FastAPI server for an sklearn pipeline · python

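import json

import joblib
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the fitted pipeline and its metadata once, at startup.
pipe = joblib.load("artifacts/churn_v1.joblib")
with open("artifacts/churn_v1.json") as f:
    meta = json.load(f)  # expects "version", "features", "threshold"

class Row(BaseModel):
    payload: dict  # raw feature values keyed by column name

@app.get("/health")
def health():
    return {"ok": True, "version": meta["version"]}

@app.post("/score")
def score(row: Row):
    # Select the trained feature columns in order; a missing column raises KeyError.
    X = pd.DataFrame([row.payload])[meta["features"]]
    p = float(pipe.predict_proba(X)[0, 1])  # probability of the class at index 1
    return {"prob": p, "label": int(p >= meta["threshold"]), "version": meta["version"]}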
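
The server above has no version endpoint, although the minimal surface area calls for one. A sketch of what it could return is shown below, assuming the metadata file holds the same `version`, `features`, and `threshold` keys that `/score` reads. `version_payload` is a hypothetical helper, kept free of FastAPI so it can be checked on its own.

```python
def version_payload(meta: dict) -> dict:
    """Build the response body for a model-metadata endpoint."""
    return {
        "version": meta["version"],
        "features": list(meta["features"]),
        "threshold": meta["threshold"],
    }


# With the FastAPI app above, this could be registered as:
#
#   @app.get("/version")
#   def version():
#       return version_payload(meta)
```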

Structured prediction logging · python

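import datetime
import json

def log_prediction(payload, prob, label, version, latency_ms):
    # Emit one JSON object per line on stdout so a log collector can parse it.
    print(json.dumps({
        # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "version": version,
        "latency_ms": latency_ms,
        "prob": prob,
        "label": label,
        "payload": payload,
    }))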
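
The logger takes a `latency_ms` argument but the snippet does not show where that number comes from. One way to measure it is sketched below. `timed_score` is a hypothetical wrapper, and it assumes the scoring function returns a dict with `prob`, `label`, and `version` keys, as `/score` does.

```python
import time


def timed_score(score_fn, payload, log_fn):
    """Run score_fn on payload, measure wall-clock latency, and log the prediction."""
    start = time.perf_counter()
    result = score_fn(payload)
    latency_ms = (time.perf_counter() - start) * 1000.0
    # log_fn is expected to have the same signature as log_prediction above.
    log_fn(payload, result["prob"], result["label"], result["version"], latency_ms)
    return result
```

`time.perf_counter()` is used rather than wall-clock time because it is monotonic and intended for measuring short durations.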

Exercise

Wrap a saved pipeline in a 30-line FastAPI app. Hit /score with a raw row using curl. Verify that the response includes the probability, the label, and the model version. Add structured logging for every request.
