Friday Mar 15, 2024

arXiv preprint - Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

In this episode, we discuss Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking by Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, and Noah D. Goodman. The paper presents Quiet-STaR, a generalization of the Self-Taught Reasoner (STaR) in which a language model learns to generate a rationale at each token to explain future text, improving its predictions. The approach addresses three challenges: the computational cost of generating continuations, the fact that the model does not initially know how to produce or use internal thoughts, and the need to predict beyond the next individual token. To handle these, the authors propose a tokenwise parallel sampling algorithm, learnable tokens that mark the start and end of a thought, and an extended teacher-forcing technique. After continued pretraining on internet text, the model shows improved zero-shot performance on reasoning benchmarks and lower perplexity on difficult tokens, without task-specific fine-tuning. The authors present this as a step toward language models that learn to reason in a more general and scalable way.
