Chunked Transfer Encoding: How HTTP/1.1 Streams Without Knowing the Size

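"HTTP/1.1 needs either to know how long the response body is (Content-Length) or to know that the body ends when the connection closes (HTTP/1.0 style). Chunked transfer is the third option: a self-delimiting body of unknown final size. It is what makes streaming possible without tying the end of the body to the end of the connection."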

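The length problem

HTTP/1.1 has to know where one response ends before it can start reading the next one on the same connection (keep-alive reuse). There are two classic ways to mark the end of a body:

Content-Length. The header carries the exact byte count. The client reads N bytes and knows the response is complete.
Connection: close. The server closes the connection, and that ends the body. This gives up keep-alive.

Neither works when the body is generated on the fly and its final size is unknown until the end. AI streaming, log tailing, and large query results all have to start writing before the total is known. That is the problem Transfer-Encoding: chunked solves.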
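The three ways of delimiting a body can be sketched as a small decision function. This is a simplified illustration, not a full HTTP parser: the function name body_framing and its return labels are made up here, and special cases such as HEAD responses or 204/304 status codes are ignored.

```python
def body_framing(headers: dict[str, str]) -> str:
    """Return how a client would find the end of a response body (simplified)."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}

    # Transfer-Encoding takes precedence: if the final coding is chunked,
    # the body is self-delimiting and any Content-Length is ignored.
    codings = [c.strip().lower()
               for c in normalized.get("transfer-encoding", "").split(",")
               if c.strip()]
    if codings and codings[-1] == "chunked":
        return "chunked"

    # Otherwise an explicit byte count tells the client when to stop reading.
    if "content-length" in normalized:
        return "content-length"

    # With neither, the body runs until the server closes the connection.
    return "close"


print(body_framing({"Content-Length": "42"}))
print(body_framing({"Transfer-Encoding": "chunked"}))
print(body_framing({"Connection": "close"}))
```

Only the "close" case costs you the connection; the other two leave it reusable for the next request.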

Chunked format

Each chunk is prefixed with its size in hex, followed by a CRLF, then the chunk bytes, then another CRLF. The end is signaled by a chunk of size 0:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Transfer-Encoding: chunked

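27\r\nevent: message\ndata: {"content":"Hi"}\n\n\r\n
2B\r\nevent: message\ndata: {"content":"아빠"}\n\n\r\n
0\r\n\r\n

27 hex = 39 bytes of chunk content. 2B hex = 43 bytes: the size counts bytes, not characters, and 아빠 takes 6 bytes in UTF-8. The 0 followed by a final CRLF marks the end. Most clients and servers handle all of this transparently: you yield from an async generator, and the HTTP library frames each piece as a chunk.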
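To make the framing rule concrete, here is a minimal encoder sketch. The helper names encode_chunk and encode_body are invented for illustration; real servers do this for you, and this version ignores chunk extensions and trailers.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one piece of body data as a single HTTP/1.1 chunk."""
    if not data:
        # A zero-length chunk is the end-of-body marker, so never emit one here.
        raise ValueError("empty data would be read as the last chunk")
    # Size line: byte length in hex (no 0x prefix), then CRLF, data, CRLF.
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


# End-of-body marker: a size-0 chunk followed by an empty line (no trailers).
LAST_CHUNK = b"0\r\n\r\n"


def encode_body(pieces) -> bytes:
    """Encode an iterable of byte strings as a complete chunked body."""
    return b"".join(encode_chunk(p) for p in pieces if p) + LAST_CHUNK


# The size is a byte count: two Hangul syllables are 6 bytes in UTF-8.
print(encode_body([b"Hello", "아빠".encode("utf-8")]))
```

Skipping empty pieces matters: an accidental empty yield must not be written as a size-0 chunk, or the client would treat the body as finished.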

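What chunked enables

SSE. The body is a stream of events whose count and sizes are unknown.
Large file downloads. The server can start sending before it has computed the total size.
Real-time logs. The server tails a file and writes a chunk whenever a new line arrives.
Pipelined query results. The database streams rows; the server writes each row as a chunk.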

HTTP/2 and HTTP/3 do not expose chunked encoding explicitly, because they have their own binary framing, but the semantics are the same: a streaming body of unknown total length.

Chunked transfer is the substrate that lets "streaming over HTTP" work without violating HTTP's request/response model. SSE rides on it, file downloads ride on it, AI token streams ride on it. Most of the time you never write chunk-encoded bytes by hand: you yield from an async generator, and the HTTP library or framework does the framing. Knowing it is there explains why streaming over HTTP/1.1 works at all.

Gotcha

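1. Trailers. A chunked response can carry header fields after the body, declared up front with a Trailer: header. This is useful for an HMAC signature over streamed content. Intermediaries rarely support trailers, so use them with care.

2. Buffering breaks streaming. This is the same gotcha as with SSE: a proxy that buffers chunked responses collects everything before forwarding it, which defeats streaming. Always set X-Accel-Buffering: no (honored by nginx) or the equivalent for your proxy on streaming responses.

3. Do not set Content-Length on a chunked response. The two are mutually exclusive. Some old servers send both, and some old clients get confused.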
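The trailer mechanism from gotcha 1 can be sketched as follows: the signature is only known once the whole body has been streamed, so it goes after the last chunk. The field name X-Body-HMAC and the function name are hypothetical, and the response's leading headers would also need to announce it with Trailer: X-Body-HMAC.

```python
import hashlib
import hmac


def chunked_with_hmac_trailer(parts, key: bytes) -> bytes:
    """Frame parts as chunks and append an HMAC-SHA256 of the body as a trailer."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    out = bytearray()
    for part in parts:
        if not part:
            continue  # a zero-length chunk would end the body early
        mac.update(part)  # sign the body incrementally, as it is streamed
        out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    # Last chunk, then the trailer field, then the blank line ending the message.
    out += b"0\r\n"
    out += b"X-Body-HMAC: " + mac.hexdigest().encode("ascii") + b"\r\n"
    out += b"\r\n"
    return bytes(out)


wire = chunked_with_hmac_trailer([b"Hello", b", world"], key=b"secret")
print(wire.decode("ascii"))
```

A receiver that understands trailers can recompute the HMAC over the decoded body and compare it with the trailer value; one that does not will simply ignore or drop the field.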

Chunked in practice in cwkPippa

cwkPippa's FastAPI/Starlette StreamingResponse gets chunked transfer automatically. Every yield from the async generator becomes a chunk on the wire. The healing layer depends on this: the JSONL log is written before a chunk is yielded to the response, so a durable record exists even if the client disconnects mid-stream. HTTP/1.1 chunked encoding is also why Uvicorn (with no nginx in front) works well for SSE: the chunked framing is the streaming.
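The log-before-yield ordering described above might look roughly like this. It is a sketch under assumptions, not cwkPippa's actual code: logged_stream, fake_tokens, and the {"content": ...} record shape are invented for illustration.

```python
import asyncio
import json
import tempfile
from pathlib import Path


async def logged_stream(source, log_path: Path):
    """Yield each piece from `source`, appending it to a JSONL log first."""
    with log_path.open("a", encoding="utf-8") as log:
        async for piece in source:
            # Write and flush before yielding: if the client disconnects while
            # this piece is in flight, its record is already in the log file.
            log.write(json.dumps({"content": piece}, ensure_ascii=False) + "\n")
            log.flush()
            yield piece


async def fake_tokens():
    # Stand-in for a model's token stream (hypothetical data).
    for token in ["Hi", " ", "there"]:
        yield token


async def demo():
    log_path = Path(tempfile.mkdtemp()) / "stream.jsonl"
    received = [piece async for piece in logged_stream(fake_tokens(), log_path)]
    lines = log_path.read_text(encoding="utf-8").splitlines()
    logged = [json.loads(line)["content"] for line in lines]
    return received, logged


received, logged = asyncio.run(demo())
print(received == logged)
```

In a FastAPI app, a generator like logged_stream(...) would be what gets passed to StreamingResponse, so every chunk that reaches the wire has a log line written ahead of it.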

Code

Chunked encoding on the wire: hex size + CRLF + content + CRLF·http

# Chunked response on the wire (CRLFs and hex sizes shown literally)
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

13\r\n
The first part body\r\n
C\r\n
 second part\r\n
0\r\n
\r\n

# Decodes to the body: 'The first part body second part'
# 13 hex = 19 bytes; C hex = 12 bytes (including the leading space); 0 marks the end.
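Going the other way, a decoder only has to read a size line, take that many bytes, skip the CRLF, and repeat until it sees size 0. The sketch below assumes the whole body is already in memory and does not parse trailers; decode_chunked is an illustrative name, not a library function.

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode a complete chunked body into plain bytes (trailers not parsed)."""
    body = bytearray()
    pos = 0
    while True:
        # The size line ends at the next CRLF; ignore any extension after ';'.
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";")[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk; trailer fields, if any, would follow here
        body += raw[pos:pos + size]
        # Each chunk's data must be followed by its own CRLF.
        if raw[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data not terminated by CRLF")
        pos += size + 2
    return bytes(body)


wire = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
print(decode_chunked(wire))
```

Because the size is read first, the chunk data itself may contain CRLF sequences without confusing the parser.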

FastAPI: StreamingResponse + async generator yielding chunked bytes·python

# FastAPI: chunked encoding is automatic when you use StreamingResponse
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
import asyncio

app = FastAPI()

async def streamed_content():
    # Each yield becomes one chunk on the wire (after the server frames it)
    for i in range(10):
        yield f'chunk {i}\n'
        await asyncio.sleep(0.5)

@app.get('/stream')
async def stream():
    # A streamed body has no known length, so no Content-Length is sent and
    # the response goes out with Transfer-Encoding: chunked over HTTP/1.1
    return StreamingResponse(streamed_content(), media_type='text/plain')

# curl -N (no buffering) lets you watch it chunk by chunk:
# curl -N http://localhost:8000/stream
# chunk 0   (appears immediately)
# chunk 1   (appears after about 0.5s)
# ... and so on

Client: httpx.stream + iter_text shows chunks as they arrive·python

# Client side: httpx and curl both handle chunked transparently
import httpx

with httpx.stream('GET', 'http://localhost:8000/stream') as resp:
    print('headers:', dict(resp.headers))  # includes transfer-encoding: chunked
    for chunk in resp.iter_text():
        print(f'received chunk: {chunk!r}')
# received chunk: 'chunk 0\n'
# (0.5s)
# received chunk: 'chunk 1\n'
# ...

# curl: -N disables curl's own output buffering
# curl -N http://localhost:8000/stream

Exercise

Build a FastAPI endpoint /count that uses StreamingResponse to stream the numbers 1 through 100, one per second. Test it three ways: (1) curl -N http://localhost:8000/count to watch chunks arrive in real time, (2) curl http://localhost:8000/count (without -N) to see how curl buffers, and (3) Python httpx.stream with iter_text to consume the stream programmatically. Then add a print of dict(resp.headers) and verify that you see Transfer-Encoding: chunked.

Hint

Without -N, curl waits to print until it has enough output to fill its buffer, which hides the streaming. With -N you see each chunk arrive at the sleep interval. iter_text on httpx.stream yields decoded text as soon as bytes arrive. The Transfer-Encoding header is set automatically by the Starlette/Uvicorn stack when the response body is an iterator or generator and no Content-Length can be computed.
