multi head attention

Literature Review

[Paper Review] Attention Is All You Need

Paper link: https://arxiv.org/abs/1706.03762
Attention Is All You Need (arxiv.org): The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new
1. Introduction
1.1 Why this paper was chosen
"Attention Is All You Need" proposed the Transformer model ..

AlienCoder
Posts tagged 'multi head attention'