Tuesday Sep 26, 2023
arXiv Preprint - PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training
In this episode we discuss PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training by Dawei Zhu, Nan Yang, Liang Wang, Yifan Song, Wenhao Wu, Furu Wei, Sujian Li. The paper presents PoSE, a training method for adapting large language models to longer context windows. It addresses the challenge of extending a pre-trained model's context window without degrading its performance. The method simulates long inputs within a fixed context window by manipulating position indices, which reduces memory and time overhead while maintaining performance.
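To make the idea of manipulated position indices concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `skipwise_position_ids`, its parameters, and the uniform random sampling of chunk boundaries and skip offsets are all hypothetical choices made for this example. It splits a short training window into chunks and shifts each chunk's position indices by a random skip, so the indices seen during training spread across a much longer target window.

```python
import random


def skipwise_position_ids(train_len, target_len, num_chunks=2, rng=None):
    """Return train_len position indices spread over [0, target_len).

    Hypothetical sketch: the window is cut into num_chunks contiguous
    chunks, and every chunk is shifted by its own skip offset. Offsets
    are sampled uniformly here, which is an assumption of this example.
    """
    if not (1 <= num_chunks <= train_len <= target_len):
        raise ValueError("need 1 <= num_chunks <= train_len <= target_len")
    rng = rng or random.Random()

    # Pick chunk boundaries inside the training window.
    cuts = sorted(rng.sample(range(1, train_len), num_chunks - 1))
    bounds = [0] + cuts + [train_len]

    # Sorting the offsets keeps positions strictly increasing across chunks.
    slack = target_len - train_len
    offsets = sorted(rng.randint(0, slack) for _ in range(num_chunks))

    position_ids = []
    for i in range(num_chunks):
        for p in range(bounds[i], bounds[i + 1]):
            # Positions stay consecutive within a chunk, jump between chunks.
            position_ids.append(p + offsets[i])
    return position_ids


# Example: a 2,048-token window standing in for a 16,384-token context.
ids = skipwise_position_ids(2048, 16384, num_chunks=2, rng=random.Random(0))
```

In a sketch like this, the model still processes only `train_len` tokens per example, so memory and compute stay tied to the short window, while the position indices it sees can range up to `target_len - 1`.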