Wednesday Mar 06, 2024

arXiv preprint - Asymmetry in Low-Rank Adapters of Foundation Models

In this episode, we discuss Asymmetry in Low-Rank Adapters of Foundation Models by Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon. The paper analyzes Low-Rank Adaptation (LoRA) and reveals an asymmetry between the two matrices, B and A, whose product forms the low-rank update to a network's weights. The authors find that fine-tuning B matters more than fine-tuning A, to the extent that an untrained A can suffice. Training only B improves parameter efficiency and yields better generalization bounds. The authors validate this experimentally on models including RoBERTa and BART-Large, and share resources on GitHub.
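
To make the asymmetry concrete, here is a minimal sketch of a LoRA-style update W + BA in which A is left untrained and only B would be updated. It is an illustration, not the authors' code: the function names, the Gaussian initialization of A, and the example sizes (768 by 768 layer, rank 8) are all assumptions chosen for readability.

```python
import random


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def make_adapter(d_out, d_in, rank, seed=0):
    """Build a (B, A) pair for a low-rank update B @ A (hypothetical helper)."""
    rng = random.Random(seed)
    # A (rank x d_in): random and frozen. The Gaussian scale is an assumption.
    A = [[rng.gauss(0.0, 1.0 / rank ** 0.5) for _ in range(d_in)] for _ in range(rank)]
    # B (d_out x rank): starts at zero, so the adapter is initially a no-op.
    # In this setup B is the only matrix that training would update.
    B = [[0.0] * rank for _ in range(d_out)]
    return B, A


def adapted_weight(W, B, A):
    """Return W + B @ A, the effective weight after adaptation."""
    delta = matmul(B, A)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]


def trainable_params(d_out, d_in, rank, train_A):
    """Count trainable adapter parameters with or without training A."""
    return d_out * rank + (rank * d_in if train_A else 0)


if __name__ == "__main__":
    # Illustrative sizes only; not taken from the paper.
    print("train B and A:", trainable_params(768, 768, 8, train_A=True))
    print("train B only: ", trainable_params(768, 768, 8, train_A=False))
```

For a square layer, freezing A halves the number of trainable adapter parameters at a given rank, which is one way to read the parameter-efficiency claim above.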
