프로젝트: 커스텀 Instruction Follower

~28 min · project, structured-output, json, instruction-follower

Level 0관찰자

0 XP0/43 lessons0/11 achievements

0/120 XP to next level120 XP to go0% complete

목표

제품 리뷰 분석 도구용 구조화 JSON을 일관되게 출력하는 작은 모델 학습. 모델이 첫 시도에 매번 옳은 필드 가진 파싱 가능한 JSON 생성해야 해.

레시피

베이스 모델: Llama 3.1 8B Instruct (instruction following 잘함).
학습 데이터: 500~1,000 제품 리뷰랑 손으로 만든 JSON 출력 쌍, 긍정/부정/혼합/풍자/매우 짧은/매우 긴 입력 커버.
메서드: QLoRA, r=16, all-linear, 3 epoch.
기대 결과: 프롬프트만으로 ~90% vs 99%+ JSON 포맷 준수.

'Adversarial' 다이어트

견고성의 단일 최대 결정 요인 — 학습 데이터에 'adversarial' 예제 포함: 모호한 리뷰, 매우 짧은 거, 풍자 포함, 여러 감정 혼합, 예상 외 포맷. 모델은 학습 중 실제 변동을 봐야만 실제 변동에 견고해져.

Code

Training data shape for the JSON-output instruction follower·json

{"messages": [
  {"role": "system", "content": "Analyze product reviews. Always respond in JSON with: sentiment (positive|negative|mixed), confidence (0..1), key_points (list), summary (one sentence)."},
  {"role": "user", "content": "This laptop is amazing! Great battery life and the screen is gorgeous. Only downside is the keyboard could be better."},
  {"role": "assistant", "content": "{\"sentiment\": \"positive\", \"confidence\": 0.85, \"key_points\": [\"excellent battery life\", \"gorgeous screen\", \"mediocre keyboard\"], \"summary\": \"Highly positive review praising battery and display, with minor keyboard criticism.\"}"}
]}
{"messages": [
  {"role": "system", "content": "Analyze product reviews. Always respond in JSON with: sentiment (positive|negative|mixed), confidence (0..1), key_points (list), summary (one sentence)."},
  {"role": "user", "content": "meh"},
  {"role": "assistant", "content": "{\"sentiment\": \"mixed\", \"confidence\": 0.4, \"key_points\": [\"insufficient detail\"], \"summary\": \"Review too short to determine clear sentiment.\"}"}
]}

External links

Exercise

회사에서 구조화 출력 작업 하나 골라. 200예제 큐레이션(happy-path 섞고 최소 30% adversarial). 레시피 따라 Llama 3.1 8B QLoRA 파인튜닝. 50 held-out 리뷰에서 전후 JSON 파싱 가능성 측정. Lift 리포트.

Progress

Progress is local-only — sign in to sync across devices.

← 🏋️ 학습 테크닉퀴즈 · 5 questions Next →프로젝트: 도메인 전문가

이 페이지에서 버그를 발견하셨거나 피드백이 있으세요?문제 신고

🔔 답글 알림 (로그인 필요)

로그인 — 댓글을 남기려면 로그인해 주세요.

아직 댓글이 없어요. 첫 댓글을 남겨보세요.