Architectures mlx-lm already understands

The supported list, in brief

mlx-lm dispatches to the right code path based on the architecture name in the model's config.json. As of 2026-05 (mlx-lm 0.31.3), the package ships more than 100 architecture implementations: every mainstream open-weight LLM family you'd expect, plus a long tail of variants and forks.

You don't need to memorize this list. You need to know two things: how to check whether a given Hugging Face model will just work with mlx-lm, and what to do when the answer is no.

The biggest families you'll actually use

llama family: llama, llama3, llama4_text. Meta's open-weight line, and the de facto baseline architecture that many derivatives reuse. If a model claims to be "Llama-shaped", mlx-lm will probably load it.
qwen family: qwen, qwen2, qwen2_vl, qwen3, qwen3_vl, qwen3_moe, qwen3_next. Alibaba's competitive open-weight line; very actively supported.
mistral family: mistral, mistral3, mixtral. Mistral AI's models, including the MoE variants.
phi family: phi, phi3, phi3small, phimoe, phixtral. Microsoft's small-but-strong line.
gemma family: Google's open-weight line, in several sizes.
deepseek: frontier-quality reasoning models.
mamba / mamba2 / ssm / rwkv7: state-space and RNN-style alternatives to the transformer. A smaller community, but supported.

How to check before you download

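For any Hugging Face repo that has a config.json, the model_type field tells you the architecture name. If that name exists in mlx-lm's models/ directory, the model loads. The blocks in the Code section are the inspector: the first looks inside mlx_lm.models and lists every architecture mlx-lm ships, and the second reads a candidate model's model_type so you can check it against that list.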
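The check described above can also be done for a single name, without listing the whole directory. This is a minimal sketch, assuming each architecture lives in its own module under mlx_lm.models; has_submodule and is_supported are hypothetical helper names, not part of mlx-lm, and the sketch ignores any name aliasing the real loader may apply, so a False here is a hint rather than proof.

```python
import importlib.util


def has_submodule(package: str, name: str) -> bool:
    """Return True if `package.name` can be located as a module."""
    try:
        return importlib.util.find_spec(f"{package}.{name}") is not None
    except (ImportError, ValueError):
        # The parent package is missing or failed to import,
        # or the dotted name is not a valid module path.
        return False


def is_supported(model_type: str) -> bool:
    # Assumption: one module per architecture under mlx_lm.models.
    # Hypothetical helper, not an mlx-lm API.
    return has_submodule("mlx_lm.models", model_type)
```

With mlx-lm installed, is_supported("llama") answers the question for one model_type. Keeping the package name as a parameter in has_submodule makes the lookup easy to try against any installed package.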

What to do when an architecture isn't supported yet

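Check ml-explore/mlx-lm for a recent issue or PR. New mainstream models usually get a PR within days.
Look for an MLX-format conversion in the mlx-community org. Sometimes the community has already adapted the architecture name to a supported one.
Wait, or contribute. mlx-lm's release cadence is fast (lesson 6 of foundations); a missing architecture rarely stays missing for long.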
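The mlx-community lookup from the second step can be scripted. This is a rough sketch, assuming huggingface_hub is installed and that searching the org by the model's short name is good enough to surface conversions; short_name and find_mlx_conversions are hypothetical helper names, and the search is a plain text match, so an empty result does not prove that no conversion exists.

```python
def short_name(repo_id: str) -> str:
    """'org/Model-Name' -> 'Model-Name' (the part worth searching for)."""
    return repo_id.rsplit("/", 1)[-1]


def find_mlx_conversions(repo_id: str, limit: int = 10) -> list[str]:
    """List mlx-community repos whose name matches the model's short name."""
    # Imported lazily so short_name() works without huggingface_hub installed.
    from huggingface_hub import HfApi

    models = HfApi().list_models(
        author="mlx-community",       # restrict the search to the org
        search=short_name(repo_id),   # plain text match on the repo name
        limit=limit,
    )
    return [m.id for m in models]
```

Calling find_mlx_conversions with the original repo id returns up to limit candidate repo ids; run each candidate through the model_arch helper from the Code section to see which model_type the conversion actually declares.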

Code

List every architecture mlx-lm currently ships·python

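import os
from importlib.metadata import version

import mlx_lm.models as m

ARCH_DIR = os.path.dirname(m.__file__)
# Helper modules that live next to the architectures but are not architectures.
NON_ARCH = {"base", "cache", "switch_layers", "rope_utils"}

# One .py file per architecture; skip private modules and the helpers above.
archs = sorted(
    name
    for name, ext in map(os.path.splitext, os.listdir(ARCH_DIR))
    if ext == ".py" and not name.startswith("_") and name not in NON_ARCH
)

# Read the installed version from package metadata.
print(f"mlx-lm supports {len(archs)} architectures (as of {version('mlx-lm')}):")
for a in archs:
    print(f"  - {a}")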

# Verified count (2026-05-03, mlx-lm 0.31.3): 114 architectures

Check a Hugging Face model's config.json before downloading·python

# Read the architecture name from a model's config.json on Hugging Face
# without downloading the weights. Uses the public HF API.
from huggingface_hub import hf_hub_download
import json

def model_arch(repo_id):
    path = hf_hub_download(repo_id=repo_id, filename="config.json")
    with open(path) as f:
        cfg = json.load(f)
    return cfg.get("model_type"), cfg.get("architectures", [])

print(model_arch("mlx-community/Llama-3.2-1B-Instruct-4bit"))
# → ('llama', ['LlamaForCausalLM'])

print(model_arch("mlx-community/Mistral-7B-Instruct-v0.3-4bit"))
# → ('mistral', ['MistralForCausalLM'])

Exercise

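Run the architecture-listing block. Then pick three Hugging Face models you've seen recommended elsewhere, mainstream or niche, and use the model_arch helper to check each model_type against the supported list. For each one, predict whether load() will succeed before you actually try it. The point of the exercise is to retire the question "will this work on MLX?" to a 30-second answer.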