NIPS 2017 paper by Vlad Niculae and Mathieu Blondel.
Vlad comes on to tell us about his paper. Attentions are often computed in neural networks using a softmax operator, which maps scalar outputs from a model into a pro…
Home
Feed
Search
Library
Download