Friday Sep 15, 2023

arXiv Preprint - Neurons in Large Language Models: Dead, N-gram, Positional

In this episode, we discuss "Neurons in Large Language Models: Dead, N-gram, Positional" by Elena Voita, Javier Ferrando, and Christoforos Nalmpantis. The authors analyze the OPT family of language models, focusing on when neurons in the feedforward blocks activate. They find many inactive "dead" neurons in the early part of the network, and that the active neurons in this region primarily act as token and n-gram detectors. They also identify positional neurons, which activate based on position rather than textual content.
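
For listeners who want a concrete picture of what a "dead" neuron means, here is a minimal sketch, not the authors' code. It assumes you have already collected feedforward activations after a ReLU into an array of shape (tokens, neurons). The function name find_dead_neurons and the toy data are hypothetical, made up for illustration only.

```python
import numpy as np

def find_dead_neurons(activations):
    """Return indices of neurons that never activate.

    activations: array of shape (num_tokens, num_neurons), assumed to be
    post-ReLU values, so "active" means strictly greater than zero.
    """
    # A neuron is treated as dead if it is not positive on any token.
    ever_active = (activations > 0).any(axis=0)
    return np.flatnonzero(~ever_active)

# Toy data (invented for illustration): 4 tokens, 5 neurons.
toy = np.array([
    [0.0, 1.2, 0.0, 0.0, 0.3],
    [0.0, 0.0, 0.0, 0.7, 0.0],
    [0.0, 0.4, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.1, 0.0],
])

dead = find_dead_neurons(toy)
dead_fraction = len(dead) / toy.shape[1]
print(dead, dead_fraction)
```

In practice the result depends on how much and what kind of text the activations were collected on: a neuron that looks dead on a small sample may fire on data it has not seen yet.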
