Research
Based: Simple Linear Attention Language Models Balance the Recall-Throughput Tradeoff
Simran Arora, Sabri Eyuboglu, Michael Zhang, Aman Timalsina, Silas Alberti, Dylan Zinsley, James Zou, Atri Rudra, Christopher Ré
arXiv, 2024
Zoology: Measuring and Improving Recall in Efficient Language Models
Simran Arora, Sabri Eyuboglu, Aman Timalsina, Isys Johnson, Michael Poli, James Zou, Atri Rudra, Christopher Ré
arXiv, 2024
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Albert Gu*, Tri Dao*
arXiv, 2024
Mamba-3B-SlimPJ: State-space models rivaling the best Transformer architecture
Albert Gu*, Tri Dao*
Cartesia Blog, 2023
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu*, Tri Dao*
arXiv, 2023
How to Train Your HiPPO: State Space Models with Generalized Orthogonal Basis Projections
Albert Gu, Isys Johnson, Aman Timalsina, Atri Rudra, Christopher Ré
NeurIPS, 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces
Eric Nguyen, Karan Goel, Albert Gu, Gordon W. Downs, Preey Shah, Tri Dao, Stephen A. Baccus, Christopher Ré
NeurIPS, 2022
It's Raw! Audio Generation with State-Space Models
Karan Goel, Albert Gu, Chris Donahue, Christopher Ré
ICML, 2022
Efficiently Modeling Long Sequences with Structured State Spaces
Albert Gu, Karan Goel, Christopher Ré
ICLR, 2022
Domino: Discovering Systematic Errors with Cross-Modal Embeddings
Sabri Eyuboglu, Maya Varma, Khaled Saab, Jean-Benoit Delbrouck, Christopher Lee-Messer, Jared Dunnmon, James Zou, Christopher Ré
ICLR, 2022
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
Karan Goel, Albert Gu, Yixuan Li, Christopher Ré
ICLR, 2021
HiPPO: Recurrent Memory with Optimal Polynomial Projections
Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, Christopher Ré
NeurIPS, 2020