C.W.K.
Stream
Lesson 04 of 06 · published

Edge Deployment — ExecuTorch, CoreML, MLX

~14 min · executorch, coreml, mlx, edge

Three paths off the GPU

Server inference is only one deployment target; mobile and edge are very different. The PyTorch ecosystem has dedicated tools for them:

  • ExecuTorch: PyTorch's mobile / edge runtime. Targets iOS, Android, and microcontrollers. Successor to the older PyTorch Mobile.
  • CoreML: Apple's on-device ML framework. Maximum performance on iOS / macOS. Models are converted from PyTorch through the coremltools bridge.
  • MLX: Apple's native ML framework for Apple Silicon, built around the unified memory architecture. The right choice when you want to squeeze the last bit of performance out of a Mac or iPhone.

The flow

The modern path is the same for ExecuTorch and CoreML: torch.export → backend-specific lowering. ExecuTorch lowers to a .pte file that you ship with your app; CoreML lowers to a .mlpackage.

MLX gives you two options: convert the PyTorch weights to the MLX format (works for many architectures), or rewrite the model directly in MLX (best performance, but it costs porting effort).

Code

ExecuTorch: export for mobile·python
# pip install executorch
import torch
from executorch.exir import to_edge

class TinyMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(10, 4)

    def forward(self, x):
        return self.fc(x)

model = TinyMLP().eval()
example = torch.randn(1, 10)

# 1. Export with torch.export
exported = torch.export.export(model, (example,))

# 2. Lower to ExecuTorch's edge IR
edge = to_edge(exported)

# 3. Optimize and serialize
et_program = edge.to_executorch()
with open('/tmp/tiny.pte', 'wb') as f:
    f.write(et_program.buffer)

# .pte ships with your iOS / Android app
CoreML — Apple device deployment·python
# pip install coremltools
import torch
import coremltools as ct

class TinyMLP(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(10, 4)

    def forward(self, x):
        return self.fc(x)

model = TinyMLP().eval()
example = torch.randn(1, 10)

# Trace the model: ct.convert accepts the TorchScript module returned by
# torch.jit.trace (recent coremltools releases can also take a
# torch.export ExportedProgram)
traced = torch.jit.trace(model, example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=example.shape, name='x')],
    convert_to='mlprogram',                # modern MLProgram format
    minimum_deployment_target=ct.target.macOS14,
)
mlmodel.save('/tmp/tiny.mlpackage')

# Drop the .mlpackage into Xcode and you have a CoreML model
MLX: native Apple Silicon, two paths·python
# pip install mlx mlx-lm
import mlx.core as mx
import mlx.nn as nn

# Path 1: re-implement in MLX (best performance)
class MLXMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 4)
    def __call__(self, x):
        return self.fc(x)

model = MLXMLP()
x = mx.random.normal((1, 10))
y = model(x)                                  # lazy: records the computation, nothing runs yet
mx.eval(y)                                     # force evaluation
print(y.shape)                                 # (1, 4)

# Path 2: load PyTorch weights into MLX
# Many community projects (mlx_lm) support direct loading of HF checkpoints.
# from mlx_lm import load
# model, tokenizer = load("mlx-community/Llama-3.2-3B-Instruct-4bit")
Choosing a deployment target by hardware·python
# A quick decision table:
#
# Target                | Recommended path
# ----------------------|----------------------------------------------
# iOS / iPadOS          | CoreML (best Apple integration) or ExecuTorch
# Android               | ExecuTorch (with XNNPACK / Vulkan delegate)
# macOS (Apple Silicon) | MLX (native) or CoreML
# Linux server (GPU)    | torch.compile + bf16, or vLLM for LLMs
# Linux server (CPU)    | torch.compile, ONNX Runtime, or OpenVINO
# NVIDIA Jetson / edge  | TensorRT (via ONNX export)
# Browser               | ONNX Runtime Web, transformers.js
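
If you want the table above in a form a build script can query, one option is a plain dictionary lookup. This is only a sketch: the keys (`"ios"`, `"linux-gpu"`, and so on) and the `recommend` function are hypothetical names made up for illustration, and the values simply restate the toolchains from the table.

```python
# Hypothetical lookup mirroring the decision table.
# Keys and function name are illustrative, not part of any library.
RECOMMENDED_PATHS = {
    "ios": ("CoreML", "ExecuTorch"),
    "android": ("ExecuTorch",),
    "macos": ("MLX", "CoreML"),
    "linux-gpu": ("torch.compile + bf16", "vLLM"),
    "linux-cpu": ("torch.compile", "ONNX Runtime", "OpenVINO"),
    "jetson": ("TensorRT",),
    "browser": ("ONNX Runtime Web", "transformers.js"),
}


def recommend(target: str) -> tuple:
    """Return the candidate toolchains for a target, first choice first."""
    key = target.strip().lower()
    if key not in RECOMMENDED_PATHS:
        known = ", ".join(sorted(RECOMMENDED_PATHS))
        raise ValueError(f"unknown target {target!r}; expected one of: {known}")
    return RECOMMENDED_PATHS[key]


if __name__ == "__main__":
    for name in ("iOS", "android", "linux-cpu"):
        print(name, "->", " or ".join(recommend(name)))
```

Keeping the first entry as the default choice matches how the table reads, but the ordering is a judgment call and worth revisiting per project.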

Exercise

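Take the TinyMLP model and export it three ways: ExecuTorch (.pte), CoreML (.mlpackage), and (if you are on Apple Silicon) a re-implementation in MLX. Compare the file sizes and verify that each one produces the same output for the same input. The exercise covers the whole surface of "getting a model out of PyTorch".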
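
For the comparison step, a couple of small framework-independent helpers may be handy. The sketch below uses only the standard library and makes two assumptions: each runtime's output has been flattened to a plain list of floats (for example with `.flatten().tolist()`), and an absolute tolerance of `1e-4` is acceptable, which may be too strict if a backend runs in fp16. The helper names are hypothetical. Note that a .pte is a single file, while a .mlpackage is a directory bundle, so its size has to be summed over its contents.

```python
import os
from typing import Sequence


def artifact_size_bytes(path: str) -> int:
    """On-disk size of an exported artifact (single file or directory bundle)."""
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    # Directory bundle (e.g. a .mlpackage): sum every file inside it.
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat outputs."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(a: Sequence[float], b: Sequence[float], atol: float = 1e-4) -> bool:
    """True when two flat outputs agree within an absolute tolerance."""
    return max_abs_diff(a, b) <= atol
```

A typical use would be to call `artifact_size_bytes('/tmp/tiny.pte')` and `artifact_size_bytes('/tmp/tiny.mlpackage')`, then pass the PyTorch reference output and each runtime's output to `outputs_match`. For the MLX re-implementation the weights must first be copied over from the PyTorch model, otherwise the outputs will differ simply because the two models were initialized independently.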
