Thursday Sep 28, 2023
arXiv Preprint - DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
In this episode we discuss DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models by Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Leon Song, Samyam Rajbhandari, and Yuxiong He. DeepSpeed-Ulysses is a method for efficient, scalable training of large language models on very long sequences. It addresses the limitations of existing sequence parallelism approaches by partitioning the input along the sequence dimension across devices and using an efficient all-to-all collective communication for the attention computation. In the authors' experiments, DeepSpeed-Ulysses trains 2.5 times faster than the existing baseline method while handling sequences four times longer, a result the authors present as significant for generative AI and AI for science.
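To make the all-to-all step more concrete, here is a minimal single-process sketch in plain Python. It is not the DeepSpeed implementation: the function name `all_to_all_seq_to_head`, the toy sizes, and the list-of-lists layout are all assumptions made for illustration. It only shows the data movement as we understand it from the paper: each worker starts with a slice of the tokens for every attention head, and after the exchange each worker holds every token for its own subset of heads, so attention for those heads can be computed locally.

```python
def all_to_all_seq_to_head(shards, num_heads):
    """Simulate the reshard on one process (hypothetical helper, not a DeepSpeed API).

    shards[r] is the list of token rows held by rank r; each row holds one
    value per attention head. Returns out[r]: every token, in sequence order,
    restricted to the group of heads assigned to rank r.
    """
    world = len(shards)
    assert num_heads % world == 0, "heads must divide evenly across ranks"
    heads_per_rank = num_heads // world
    out = [[] for _ in range(world)]
    # Walk source ranks in order so the gathered tokens stay in sequence order.
    for src in range(world):
        for row in shards[src]:
            for dst in range(world):
                lo = dst * heads_per_rank
                # Rank `src` "sends" rank `dst` the slice for dst's head group.
                out[dst].append(row[lo:lo + heads_per_rank])
    return out


# Toy setup (assumed sizes): 4 tokens, 4 heads, 2 ranks.
seq_len, num_heads, world = 4, 4, 2
# Label every entry with (token index, head index) so the movement is visible.
full = [[(t, h) for h in range(num_heads)] for t in range(seq_len)]
per_rank = seq_len // world
# Before the exchange: each rank holds a contiguous slice of the sequence.
shards = [full[r * per_rank:(r + 1) * per_rank] for r in range(world)]
# After the exchange: each rank holds the whole sequence for half of the heads.
resharded = all_to_all_seq_to_head(shards, num_heads)
```

In this toy run, rank 0 ends up with all four tokens for heads 0 and 1, and rank 1 with all four tokens for heads 2 and 3. In a real multi-GPU setting the exchange would be a collective operation across devices rather than a loop, and a second all-to-all would restore the sequence-partitioned layout after attention.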