Articles tagged with #NLP


A course taught at Cornell Tech where students build up the infrastructure behind vector autodifferentiation.

Visualizing Adaptive Sparse Attention Models

A visual description of the adaptive sparse attention technique.

Visualizing Banded Sparse Matrices

A visual description of banded sparse matrices, a really useful and underused form of sparsity.

The Annotated Transformer

In this post I present an "annotated" version of the paper "Attention Is All You Need" in the form of a line-by-line implementation. I have reordered and deleted some sections from the original paper and added comments throughout.