Quiz · 4 questions

📐 Embeddings and Position

From token IDs to ordered dense vectors

01 In a Transformer with vocab=128,000 and d_model=4,096, what is the shape of the embedding matrix?

Hint

Rows index over vocabulary; columns index over hidden dimensions.
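
To make the hint concrete, here is a minimal sketch of an embedding lookup in plain Python. The helper names `make_embedding` and `embed` are hypothetical, and the sizes are deliberately tiny toy values rather than the ones in the question.

```python
import random

def make_embedding(vocab_size, d_model, seed=0):
    """Toy embedding table: one row per token ID, one column per hidden dimension."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
            for _ in range(vocab_size)]

def embed(table, token_ids):
    """Embedding lookup is just row selection by token ID."""
    return [table[t] for t in token_ids]

# Toy sizes for illustration only (assumed, not taken from the question).
table = make_embedding(vocab_size=10, d_model=4)
vectors = embed(table, [3, 1, 3])

# A sequence of 3 token IDs becomes 3 vectors of length d_model.
shape = (len(vectors), len(vectors[0]))
print(shape)
```

The same token ID always selects the same row, so the first and third vectors here are identical: the lookup by itself carries no information about where a token sits in the sequence.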

02 Which positional encoding scheme do Llama 3, Mistral, and Qwen all use?

Hint

It's applied as a rotation inside attention rather than as an addition at the input.
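
The idea of "a rotation inside attention" can be sketched as follows. This is an illustrative toy, not any particular model's implementation: the function name `rotate`, the base of 10000, and the pairing of consecutive dimensions are assumptions (some libraries pair dimensions differently).

```python
import math

def rotate(vec, pos, base=10000.0):
    """Rotate each consecutive pair of dimensions by a position-dependent angle."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # lower frequency for later pairs
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 2.0, -0.5, 0.3]   # toy query vector
k = [0.4, -1.0, 2.0, 1.5]   # toy key vector

# Same relative offset (3) at two very different absolute positions.
score_a = dot(rotate(q, 5), rotate(k, 2))
score_b = dot(rotate(q, 105), rotate(k, 102))
same_score = math.isclose(score_a, score_b, abs_tol=1e-9)
print(same_score)
```

Because both query and key are rotated before the dot product, the attention score depends only on the offset between the two positions, not on where the pair sits in the sequence.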

03 Why is self-attention without positional encoding permutation-equivariant?

Hint

Look at where position would have to appear in the dot-product formula.
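
A small numerical check can make the property tangible before you reason about the formula. The sketch below assumes a single head with identity projections (Q = K = V = X) to keep it short; learned projections act on each row independently, so they would not change the outcome.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Single-head attention with Q = K = V = X and no position signal."""
    d = len(X[0])
    out = []
    for q in X:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in X]
        w = softmax(scores)
        out.append([sum(w[j] * X[j][c] for j in range(len(X))) for c in range(d)])
    return out

X = [[1.0, 0.0], [0.0, 2.0], [0.5, -1.0]]   # three toy token vectors
perm = [2, 0, 1]                             # reorder the tokens
X_perm = [X[i] for i in perm]

Y = self_attention(X)
Y_perm = self_attention(X_perm)

# Equivariance: permuting the input rows permutes the output rows the same way.
equivariant = all(
    math.isclose(Y_perm[r][c], Y[perm[r]][c], abs_tol=1e-9)
    for r in range(len(X)) for c in range(len(X[0]))
)
print(equivariant)
```

Each output row is a weighted sum over the whole set of rows, with weights computed from vector contents alone, so shuffling the tokens only shuffles the outputs.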

04 What technique did Llama 3 use to extend its context window from 8K to 128K?

Hint

It's a clever rescaling of RoPE's frequencies, not a brand-new architecture.
