NLP




Visualizing Adaptive Sparse Attention Models

A visual description of the adaptive sparse attention technique.



Visualizing Banded Sparse Matrices

A visual description of banded sparse matrices, a useful and underused form of sparsity.



The Annotated Transformer

In this post I present an "annotated" version of the paper "Attention Is All You Need" in the form of a line-by-line implementation. I have reordered and deleted some sections from the original paper and added comments throughout.