Friday Apr 12, 2024
arXiv preprint - Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
In this episode, we discuss Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention by Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal. The paper presents a method for scaling Transformer-based Large Language Models to extremely long inputs while keeping memory and computation bounded. The technique, called Infini-attention, incorporates a compressive memory into the standard attention mechanism and combines masked local attention with long-term linear attention within a single Transformer block. The authors demonstrate the method on long-context language modeling benchmarks, a passkey retrieval task at one million sequence length, and a book summarization task at 500K length, while supporting fast streaming inference and adding only a small, bounded number of memory parameters.
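For listeners who want a concrete picture of the mechanism, here is a minimal single-head sketch in plain Python. It follows our reading of the paper: each segment gets ordinary causal softmax attention, a fixed-size memory matrix is read with a linear-attention lookup, the two results are mixed by a sigmoid gate, and the memory is then updated with the segment's keys and values. The function names, the scalar `beta` gate, the use of nested lists, and the choice to return zeros when the memory is still empty are all our own assumptions for illustration, not the authors' code.

```python
import math

def elu_plus_one(x):
    # Positive feature map used for the linear-attention memory: ELU(x) + 1.
    return x + 1.0 if x > 0 else math.exp(x)

def local_attention(q, k, v):
    # Causal softmax dot-product attention within a single segment.
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(weights[j] * v[j][c] for j in range(i + 1)) / total
                    for c in range(len(v[0]))])
    return out

def infini_attention_segment(q, k, v, memory, z, beta=0.0):
    """Process one segment. Returns (output, new_memory, new_z).

    memory is a d_k x d_v matrix and z a length-d_k vector; both keep the
    same size no matter how many segments have been processed.
    """
    d_k, d_v = len(k[0]), len(v[0])
    sq = [[elu_plus_one(x) for x in row] for row in q]
    sk = [[elu_plus_one(x) for x in row] for row in k]

    # 1) Read from the memory built from earlier segments.
    a_mem = []
    for row in sq:
        denom = sum(a * b for a, b in zip(row, z))
        num = [sum(row[i] * memory[i][c] for i in range(d_k))
               for c in range(d_v)]
        # Assumption: an empty memory (denom == 0) contributes zeros.
        a_mem.append([n / denom if denom > 0 else 0.0 for n in num])

    # 2) Local attention over the current segment only.
    a_dot = local_attention(q, k, v)

    # 3) Mix the two with a sigmoid gate (learned in the paper; a plain
    #    scalar argument here).
    g = 1.0 / (1.0 + math.exp(-beta))
    out = [[g * m + (1.0 - g) * l for m, l in zip(rm, rl)]
           for rm, rl in zip(a_mem, a_dot)]

    # 4) Write this segment into memory: M += sigma(K)^T V, z += sum sigma(K).
    n = len(k)
    new_memory = [[memory[i][c] + sum(sk[t][i] * v[t][c] for t in range(n))
                   for c in range(d_v)] for i in range(d_k)]
    new_z = [z[i] + sum(sk[t][i] for t in range(n)) for i in range(d_k)]
    return out, new_memory, new_z

if __name__ == "__main__":
    # Toy data: two segments of two tokens each, d_k = 2, d_v = 3.
    q = [[0.1, 0.2], [0.3, -0.4]]
    k = [[0.5, -0.1], [0.2, 0.7]]
    v = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
    memory = [[0.0] * 3 for _ in range(2)]
    z = [0.0] * 2
    for _ in range(2):
        out, memory, z = infini_attention_segment(q, k, v, memory, z)
        print(out)
```

The point to notice is that `memory` and `z` never grow: however many segments are streamed through, the state carried forward stays at `d_k x d_v` plus `d_k` numbers, which is the sense in which memory is bounded. The paper also describes a delta-rule variant of the memory update, which this sketch leaves out.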