Human Approval and Audit Trail

Two operational habits make security real: a human in the loop at the dangerous edges, and an append-only audit log. Neither is glamorous; together they mark the line between an agent you can ship and one you cannot.

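Approval follows the pattern from the side-effect lesson: split an irreversible write into propose_X and execute_X, and have the host get the user's explicit OK in between. MCP makes this machine-checkable through tool annotations (destructiveHint, readOnlyHint); a well-built host shows a different confirmation UI for tools whose annotations claim to be destructive. The protocol does not enforce annotation honesty, but a host can, and across servers reputation does the enforcing.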
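The propose/execute split can be sketched without any MCP machinery. This is a minimal illustration under stated assumptions, not the lesson's implementation: `Proposal`, `PROPOSALS`, `propose_refund`, `approve`, and this `execute_refund` are hypothetical names, the store is in-memory, and approval is recorded in server state by the host rather than passed as a tool argument.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Proposal:
    order_id: str
    amount_cents: int
    approved: bool = False


# Hypothetical in-memory store; a real server would persist proposals.
PROPOSALS: dict[str, Proposal] = {}


def propose_refund(order_id: str, amount_cents: int) -> str:
    """Record the intended write and return its id; nothing is changed yet."""
    proposal_id = secrets.token_hex(8)
    PROPOSALS[proposal_id] = Proposal(order_id, amount_cents)
    return proposal_id


def approve(proposal_id: str) -> None:
    """Assumed to be called by the host only after the user explicitly confirms."""
    PROPOSALS[proposal_id].approved = True


def execute_refund(proposal_id: str) -> str:
    """Perform the write only for a known, approved proposal, and only once."""
    proposal = PROPOSALS.get(proposal_id)
    if proposal is None:
        raise KeyError(f"unknown proposal {proposal_id}")
    if not proposal.approved:
        raise PermissionError("proposal has not been approved by the user")
    del PROPOSALS[proposal_id]  # single use: replaying the same id fails
    # The real payment call would go here.
    return f"refunded {proposal.amount_cents} cents on order {proposal.order_id}"
```

Executing before `approve` raises `PermissionError`, and executing the same proposal twice raises `KeyError`, so a model that replays an id cannot repeat the irreversible step.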

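The audit log is an append-only, structured record of every tool call: which client, which tool, which arguments, which result, when, and who (if known). It lives on the server side; the client cannot rewrite it. When something goes wrong six months later ('did we really refund $400 on that order?'), the audit log is the answer, and the only answer. Write JSON lines to disk, one file per day, and ship them through your existing log infrastructure: the cost is small and the structural value is huge.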
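One way to lay out such a record, one JSON line per tool call in a per-day file, is sketched here. The function name `write_audit_record`, the field names, and the `audit-YYYY-MM-DD.jsonl` file naming are assumptions made for illustration; the lesson does not prescribe them.

```python
import json
import pathlib
import time
from datetime import datetime, timezone


def write_audit_record(log_dir, *, client, tool, args, result, user=None, now=None):
    """Append one JSON line describing a tool call to that day's file (UTC)."""
    ts = time.time() if now is None else now
    day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
    path = pathlib.Path(log_dir) / f"audit-{day}.jsonl"
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "ts": ts,          # when (Unix seconds)
        "client": client,  # which client
        "tool": tool,      # which tool
        "args": args,      # which arguments
        "result": result,  # which result
        "user": user,      # who, if known
    }
    # Mode "a" only ever adds to the end of the file from this process.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, separators=(",", ":")) + "\n")
    return path
```

Opening in append mode keeps this code from overwriting earlier records, but it does not stop another process from editing the file; that part of "append-only" has to come from file permissions or from the log pipeline the files are shipped to.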

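The audit log is also where you notice abuse. A spike on a particular tool, argument patterns that suggest scraping, an OAuth token used from a suspicious IP: all of these show up in the log first. Treat the log as a security telescope, not as compliance paperwork.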
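Spotting a spike on one tool can be as simple as counting records per event per minute. The sketch below assumes each log line is a JSON object with a numeric `ts` (Unix seconds) and an `event` name; `calls_per_minute`, `spikes`, and the fixed threshold are hypothetical choices, not a recommended detection rule.

```python
import json
from collections import Counter


def calls_per_minute(lines):
    """Count audit records per (event, minute) from JSON-lines text."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        rec = json.loads(line)
        minute = int(rec["ts"] // 60)
        counts[(rec["event"], minute)] += 1
    return counts


def spikes(lines, threshold):
    """Return sorted (event, minute, count) tuples whose count exceeds threshold."""
    return sorted(
        (event, minute, n)
        for (event, minute), n in calls_per_minute(lines).items()
        if n > threshold
    )
```

Because it takes any iterable of lines, it can be pointed at an open log file directly, for example `spikes(open(path, encoding="utf-8"), threshold=100)`.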

Code

Approval-gated execute tool·python

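from mcp.types import TextContent  # MCP Python SDK; app, stripe, current_user are defined elsewhere

@app.tool(annotations={"destructiveHint": True})
async def execute_refund(proposal_id: str, approved_by_user: bool):
    # Refuse to run unless the host passes along the user's explicit approval.
    if not approved_by_user:
        return [TextContent(type="text",
                  text="Refund execution requires explicit user approval (approved_by_user=true).")]
    # Log the attempt before the side effect, so a crash mid-refund still leaves a trace.
    audit("execute_refund", proposal_id=proposal_id, user=current_user())
    receipt = await stripe.refund(...)
    # Log the outcome with the same proposal_id so the two records can be joined.
    audit("execute_refund.done", proposal_id=proposal_id, receipt_id=receipt.id)
    return [TextContent(type="text", text=f"Refunded. Receipt: {receipt.id}")]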

Append-only audit log helper·python

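import json, time, pathlib

LOG = pathlib.Path("/var/log/mcp-server/audit.jsonl")

def audit(event: str, **fields):
    # One compact JSON object per line; "ts" is Unix time in seconds.
    rec = {"ts": time.time(), "event": event, **fields}
    # Mode "a" appends to the end and never rewrites earlier records.
    with LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(rec, separators=(',', ':')) + "\n")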
