Hands-on: The Full TFLite Conversion Pipeline

A pipeline you would actually ship

The complete production-ready TFLite workflow: train a transfer-learning classifier, convert it to several TFLite variants, and verify the accuracy of each one.

The pattern: train in float32 precision, save a SavedModel, then generate three TFLite variants (baseline / dynamic-range / int8). Pick the smallest variant that still meets your accuracy bar on the target deployment.
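The selection rule in that pattern could be sketched roughly as follows. The function name `pick_variant`, the `max_drop` tolerance, and the example numbers are illustrative assumptions, not measured results.

```python
def pick_variant(sizes, accuracies, baseline="baseline", max_drop=0.01):
    """Return the smallest variant whose accuracy is within max_drop of baseline.

    sizes:      {variant name: model size in bytes}
    accuracies: {variant name: accuracy in [0, 1] measured on the target device}
    """
    floor = accuracies[baseline] - max_drop
    eligible = [name for name in sizes if accuracies[name] >= floor]
    return min(eligible, key=lambda name: sizes[name])


# Hypothetical numbers, for illustration only.
sizes = {"baseline": 14_000_000, "dynamic": 3_800_000, "int8": 3_500_000}
accuracies = {"baseline": 0.950, "dynamic": 0.948, "int8": 0.930}
print(pick_variant(sizes, accuracies))  # dynamic
```

With these made-up numbers the int8 variant is the smallest but loses two points of accuracy, so the rule falls back to the dynamic-range variant. Loosening `max_drop` to 0.05 would select int8 instead.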

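Why this matters for a product: a quantized MobileNetV2 classifier shrinks from ~14 MB (float32) to ~3.5 MB (int8) and can run roughly 3× faster on CPU, though the exact speedup depends on the device. In a mobile app, a smaller binary means faster downloads and less storage use, and it can decide whether the app requires Wi-Fi or can ship over cellular. These are real business constraints, not academic numbers.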
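The size figures follow from the number of bytes stored per weight. A rough back-of-the-envelope check, assuming about 3.5 million parameters (the commonly quoted figure for the full ImageNet MobileNetV2) and ignoring file-format overhead and quantization scales; a headless model like the one built in the code below has fewer parameters, so its files will be smaller:

```python
PARAMS = 3_500_000  # assumed approximate MobileNetV2 parameter count


def weights_mb(num_params, bytes_per_weight):
    """Approximate weight storage in megabytes (10**6 bytes)."""
    return num_params * bytes_per_weight / 1e6


float32_mb = weights_mb(PARAMS, 4)  # float32 stores 4 bytes per weight
int8_mb = weights_mb(PARAMS, 1)     # int8 stores 1 byte per weight
print(f"float32: ~{float32_mb:.1f} MB, int8: ~{int8_mb:.1f} MB, "
      f"ratio: {float32_mb / int8_mb:.0f}x")
```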

Code

Full pipeline — train, convert, compare·python

import tensorflow as tf
import numpy as np

# 1. Build a transfer-learning classifier
IMG_SIZE = 224
base = tf.keras.applications.MobileNetV2(
    input_shape=(IMG_SIZE, IMG_SIZE, 3),
    include_top=False, weights='imagenet')
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(5, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# ... model.fit(train_ds, validation_data=val_ds, epochs=10)
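# Writes a SavedModel directory with tf.keras in TF <= 2.15. On Keras 3
# (TF 2.16+), use model.export('flower_model/saved_model/') instead.
model.save('flower_model/saved_model/')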

# 2. Convert to three TFLite variants
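def representative_dataset():
    # Placeholder calibration data so the snippet runs standalone.
    # For real int8 accuracy, yield a few hundred real training images,
    # preprocessed exactly as during training; random noise calibrates poorly.
    for _ in range(200):
        yield [np.random.rand(1, IMG_SIZE, IMG_SIZE, 3).astype(np.float32)]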

variants = {}

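# Baseline: float32 weights and activations, no quantization
c = tf.lite.TFLiteConverter.from_saved_model('flower_model/saved_model/')
variants['baseline'] = c.convert()

# Dynamic-range: weights stored as int8, activations stay float
c = tf.lite.TFLiteConverter.from_saved_model('flower_model/saved_model/')
c.optimizations = [tf.lite.Optimize.DEFAULT]
variants['dynamic'] = c.convert()

# int8: weights and activations quantized using the calibration data.
# Input/output tensors stay float32; ops without an int8 kernel fall back to float.
c = tf.lite.TFLiteConverter.from_saved_model('flower_model/saved_model/')
c.optimizations = [tf.lite.Optimize.DEFAULT]
c.representative_dataset = representative_dataset
variants['int8'] = c.convert()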

# Compare sizes
for name, model_bytes in variants.items():
    print(f"{name:12s}: {len(model_bytes) / 1024:.1f} KB")
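To ship a variant, its bytes need to be written to a `.tflite` file. A minimal sketch, assuming `variants` is a dict mapping names to serialized models like the one built in the pipeline; the helper name `save_variants` and the default output directory are hypothetical:

```python
from pathlib import Path


def save_variants(variants, out_dir="tflite_models"):
    """Write each serialized model to <out_dir>/<name>.tflite; return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {}
    for name, model_bytes in variants.items():
        path = out / f"{name}.tflite"
        path.write_bytes(model_bytes)  # converter output is already a bytes object
        paths[name] = path
    return paths
```

Calling `save_variants(variants)` after the conversion step would produce `baseline.tflite`, `dynamic.tflite`, and `int8.tflite` in the chosen directory.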

Verifying accuracy per variant·python

import tensorflow as tf
import numpy as np

def run_tflite(model_bytes, test_images):
    """Run each image through a TFLite model that takes float32 input."""
    interpreter = tf.lite.Interpreter(model_content=model_bytes)
    interpreter.allocate_tensors()
    input_idx  = interpreter.get_input_details()[0]['index']
    output_idx = interpreter.get_output_details()[0]['index']

    results = []
    for img in test_images:
        interpreter.set_tensor(input_idx, img[None].astype(np.float32))
        interpreter.invoke()
        results.append(interpreter.get_tensor(output_idx)[0])
    return np.array(results)

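# Compare baseline vs quantized: the two should agree on almost every image.
# `variants` comes from the pipeline above; `test_images` is assumed to hold
# preprocessed float32 images of shape (224, 224, 3).
# Agreement is only a proxy: score each variant against true labels as well.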
baseline_preds = run_tflite(variants['baseline'], test_images)
int8_preds     = run_tflite(variants['int8'],      test_images)
agreement = np.mean(np.argmax(baseline_preds, 1) == np.argmax(int8_preds, 1))
print(f"baseline vs int8 agreement: {agreement*100:.2f}%")
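Agreement only shows that two variants behave alike; it does not show that either one is correct. If ground-truth labels are available, top-1 accuracy per variant could be computed along these lines. `top1_accuracy` is a hypothetical helper, and the arrays below are made-up stand-ins for real predictions and labels:

```python
import numpy as np


def top1_accuracy(preds, labels):
    """Fraction of rows whose highest-scoring class matches the true label."""
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    return float(np.mean(np.argmax(preds, axis=1) == labels))


# Made-up scores for 4 images and 3 classes, for illustration only.
preds = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.array([0, 1, 2, 1])
print(top1_accuracy(preds, labels))  # 0.75
```

With the pipeline above, this would be called as `top1_accuracy(run_tflite(variants['int8'], test_images), test_labels)`, where `test_labels` is assumed to hold integer class ids matching `test_images`.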
