Attention is all you need

A Vaswani, N Shazeer, N Parmar… - Advances in neural …, 2017 - proceedings.neurips.cc
The dominant sequence transduction models are based on complex recurrent
or convolutional neural networks in an encoder and decoder configuration. The best
performing such models also connect the encoder and decoder through an attention
mechanism. We propose a novel, simple network architecture based solely on an attention
mechanism, dispensing with recurrence and convolutions entirely. Experiments on two
machine translation tasks show these models to be superior in quality while being more …

[PDF] Attention is all you need

A Vaswani, N Shazeer, N Parmar… - Advances in neural …, 2017 - data.polarishare.com
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best performing
models also connect the encoder and decoder through an attention mechanism. We
propose a new simple network architecture, the Transformer, based solely on attention
mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two
machine translation tasks show these models to be superior in quality while being more …