Vertex AI and Cloud Deployment

Managed ML without managing infrastructure

Vertex AI is Google Cloud's managed ML platform. Instead of operating your own Kubernetes cluster and GPU instances, you describe what you want and Vertex handles the infrastructure: training jobs, serving endpoints, autoscaling, logging, and versioning.

The three main capabilities you will use:

Custom training job: runs your training script on managed GPU/TPU machines and tracks experiments
Model registry: upload a SavedModel, version it, and link its lineage
Online endpoint: deploys a registered model behind an autoscaling REST endpoint in a single call
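
As a rough illustration of what the training script behind a custom training job might look like at its entry point, the sketch below parses the `--epochs` and `--batch-size` flags that this section's listing passes to `trainer/task.py`. The function name `parse_args`, the `--model-dir` flag, and the default values are assumptions made for illustration. `AIP_MODEL_DIR` is an environment variable Vertex AI sets for custom training when an output directory is configured; the local fallback path is hypothetical.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the flags a hypothetical trainer/task.py might accept."""
    parser = argparse.ArgumentParser(description="Sketch of a training entry point")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=128)
    # Prefer the output location Vertex AI provides; fall back to a local
    # directory (assumed name) when running outside Vertex.
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("AIP_MODEL_DIR", "./saved_model"),
    )
    return parser.parse_args(argv)


# Same flag format as the args list passed to the custom job.
args = parse_args(["--epochs=20", "--batch-size=256"])
print(args.epochs, args.batch_size, args.model_dir)
```

The real script would build and train the model after parsing, then export a SavedModel to `args.model_dir` so it can be uploaded to the registry.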

The serving container is prebuilt: us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-{version}. Match it exactly to the TensorFlow version used for training; a mismatch between training and serving versions is a common cause of "incompatible SavedModel" errors.
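
One way to keep the two versions aligned is to derive the serving image name from the TensorFlow version string used in training. The sketch below follows the naming pattern quoted above. The helper name `serving_image_uri` and the `:latest` tag are assumptions, and the function only formats a string: it does not check that an image for that version is actually published.

```python
def serving_image_uri(tf_version: str) -> str:
    """Build a prebuilt CPU serving image URI from a TF version like '2.12.0'."""
    parts = tf_version.split(".")
    if len(parts) < 2 or parts[0] != "2":
        raise ValueError(f"expected a TensorFlow 2.x version, got {tf_version!r}")
    minor = parts[1]
    # Pattern from the text: tf2-cpu.2-{version}; the ':latest' tag is assumed.
    return f"us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-{minor}:latest"


# In a training environment you would pass tf.__version__ here.
print(serving_image_uri("2.12.0"))
```

Logging this value at the end of training, next to the exported SavedModel, makes it easy to pass the same image to the upload step later.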

Code

Vertex AI: train, register, deploy · python

from google.cloud import aiplatform

aiplatform.init(
    project='my-gcp-project',
    location='us-central1',
    staging_bucket='gs://my-bucket',
)

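# 1. Custom training job
# Machine and accelerator settings are arguments of from_local_script();
# CustomJob.run() does not accept them.
job = aiplatform.CustomJob.from_local_script(
    display_name='my-tf-training',
    script_path='trainer/task.py',
    # This is a CPU image; swap in the matching GPU training image so the
    # script can actually use the T4 requested below.
    container_uri='us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-12:latest',
    requirements=['tensorflow-datasets'],
    args=['--epochs=20', '--batch-size=256'],
    machine_type='n1-standard-8',
    accelerator_type='NVIDIA_TESLA_T4',
    accelerator_count=1,
)
job.run()  # blocks until the job finishes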

# 2. Upload SavedModel to Model Registry
model = aiplatform.Model.upload(
    display_name='flower-classifier-v2',
    artifact_uri='gs://my-bucket/models/flower_classifier/saved_model/',
    serving_container_image_uri=(
        'us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-12:latest'
    ),
    sync=True,
)

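# 3. Deploy to autoscaling endpoint
endpoint = aiplatform.Endpoint.create(display_name='flower-endpoint')
# deploy() returns None; keep using the endpoint object for predictions.
endpoint.deploy(
    model=model,
    machine_type='n1-standard-4',
    min_replica_count=1,
    max_replica_count=3,  # illustrative bound; autoscaling needs max > min
    traffic_percentage=100,
)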

# 4. Online prediction
# The key must match the input name in the SavedModel's serving signature.
instances = [{"dense_input": [0.1, 0.2, 0.3, 0.4]}]
prediction = endpoint.predict(instances=instances)
print(prediction.predictions)
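
Because the endpoint is a REST endpoint, the same prediction can be requested without the SDK. The sketch below only builds the URL and JSON body for the Vertex AI `predict` method and sends nothing. The helper name `predict_request` and the endpoint ID `1234567890` are hypothetical, and a real call would also need an OAuth bearer token in the `Authorization` header.

```python
import json


def predict_request(project: str, location: str, endpoint_id: str, instances: list):
    """Return (url, body) for an online prediction call to a Vertex AI endpoint."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        f"endpoints/{endpoint_id}:predict"
    )
    # The predict method expects the inputs under an "instances" key.
    body = json.dumps({"instances": instances})
    return url, body


url, body = predict_request(
    project="my-gcp-project",
    location="us-central1",
    endpoint_id="1234567890",  # hypothetical endpoint ID
    instances=[{"dense_input": [0.1, 0.2, 0.3, 0.4]}],
)
print(url)
print(body)
```

The response carries the model outputs under a `predictions` key, which is what `prediction.predictions` exposes in the SDK call above.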
