Part 2 of a series of notebooks for teaching ML to early college students in an 8-week summer lab session. Covers Pandas.
Part 1 of a series of notebooks for teaching ML to early college students in an 8-week summer lab session.
Literate blog serving as a tutorial for most of the series of Dex posts.
A visual description of the adaptive sparse attention technique.
A visual description of banded sparse matrices, a really useful and underused form of sparsity.
In this post I present an "annotated" version of the paper in the form of a line-by-line implementation. I have reordered and deleted some sections from the original paper and added comments throughout.