Friday Sep 15, 2023
arXiv Preprint - Neurons in Large Language Models: Dead, N-gram, Positional
In this episode, we discuss Neurons in Large Language Models: Dead, N-gram, Positional by Elena Voita, Javier Ferrando, and Christoforos Nalmpantis. The authors analyze the OPT family of language models, focusing on how neurons in the feedforward blocks activate. They find that many neurons in the early part of the network are inactive, or "dead," and that the active neurons in this region act primarily as token and n-gram detectors. The authors also identify positional neurons, which activate depending on a token's position rather than on the text itself.
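For listeners who want a concrete picture of what "dead" means here, the following is a minimal sketch rather than the authors' code. It assumes you have already collected the feedforward activations of one layer as a matrix with one row per token and one column per neuron, and it assumes a ReLU-style activation so that "never activates" can be read as "never goes above zero." The function name `dead_neuron_fraction` and the toy data are hypothetical and used only for illustration.

```python
import numpy as np

def dead_neuron_fraction(activations):
    """Return the fraction of neurons that never activate.

    activations: array of shape (num_tokens, num_neurons) holding the
    post-activation values of one feedforward layer (assumed ReLU-like,
    so an inactive neuron outputs exactly 0).
    """
    acts = np.asarray(activations)
    # A neuron counts as active if it is positive for at least one token.
    ever_active = (acts > 0).any(axis=0)
    # Dead neurons are those that were never active on this data.
    return float((~ever_active).mean())

# Toy data (made up): 3 tokens, 4 neurons. Neurons 0 and 2 never fire.
toy = np.array([
    [0.0, 1.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.3],
    [0.0, 0.7, 0.0, 0.0],
])
frac = dead_neuron_fraction(toy)
print(frac)
```

Note that whether a neuron looks dead depends on the data it is measured on, so a count like this is only meaningful over a large and varied set of tokens.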