
Researchers examine how Low-Rank Adaptation (LoRA) stacks up against full finetuning, finding that it saves memory and better preserves the base model's capabilities but falls short of full finetuning on target-domain tasks.
In a recent study titled "LoRA Learns Less and Forgets Less," researchers compared Low-Rank Adaptation (LoRA) with full finetuning of large language models. LoRA is parameter-efficient: it freezes the base weights and trains only low-rank perturbations to selected weight matrices, which significantly reduces memory usage. That efficiency comes with a trade-off. LoRA learns less in the target domain than full finetuning, but it also forgets less of the base model's performance on tasks outside that domain.
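To make the memory argument concrete, here is a minimal sketch in plain Python of a low-rank perturbation to a single weight matrix. It assumes the common LoRA formulation, in which a frozen weight `W` is augmented by a scaled product `B @ A`. The function names, the `alpha / rank` scaling, and the 4096-wide layer are illustrative assumptions, not details taken from the study.

```python
def full_trainable_params(d_out, d_in):
    """Trainable parameters when the whole d_out x d_in matrix is finetuned."""
    return d_out * d_in


def lora_trainable_params(d_out, d_in, rank):
    """Trainable parameters for LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_effective_weight(W, B, A, alpha, rank):
    """Return W + (alpha / rank) * B @ A. W stays frozen; only A and B train."""
    scale = alpha / rank
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


# Hypothetical layer size and rank, chosen only for illustration.
d, rank = 4096, 16
full = full_trainable_params(d, d)
lora = lora_trainable_params(d, d, rank)
print(f"full: {full}, LoRA: {lora}, fraction: {lora / full:.4%}")
```

Under these assumed sizes the adapter trains well under 1% of the parameters in that matrix, which is where the memory saving comes from. The same low-rank constraint limits how far the weights can move away from the base model.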
Performance Comparison: In standard low-rank settings, LoRA underperforms full finetuning in both programming and mathematics domains.
Catastrophic Forgetting: LoRA better maintains the base model's performance on tasks outside the target domain compared to full finetuning.
Diverse Generations: Models finetuned with LoRA produce more diverse text generations than fully finetuned ones, which matters for creative and varied outputs.
Programming Domain:
Mathematics Domain:

Choose Appropriate Rank:
Monitor Base Model Performance:
Combine with Regularization Techniques:
Experiment with Data Regimes:
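The recommendation to monitor base model performance can be sketched as a simple before-and-after comparison on benchmarks outside the target domain. The helper name `forgetting_report`, the benchmark names, the scores, and the tolerance are all hypothetical placeholders, not results from the study.

```python
def forgetting_report(base_scores, tuned_scores, tolerance=0.01):
    """Compare per-benchmark scores before and after finetuning.

    Returns a dict mapping benchmark name to (delta, regressed), where
    regressed is True when the score dropped by more than `tolerance`.
    """
    report = {}
    for name, base in base_scores.items():
        delta = tuned_scores[name] - base
        report[name] = (delta, delta < -tolerance)
    return report


# Placeholder numbers for illustration only; not measurements.
base_scores = {"out_of_domain_a": 0.70, "out_of_domain_b": 0.55}
tuned_scores = {"out_of_domain_a": 0.62, "out_of_domain_b": 0.55}

for name, (delta, regressed) in forgetting_report(base_scores, tuned_scores).items():
    status = "REGRESSED" if regressed else "ok"
    print(f"{name}: {delta:+.3f} {status}")
```

Running a check like this at each checkpoint, for both a LoRA run and a full finetuning run, gives a rough view of how much each method forgets as training proceeds.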
While LoRA is a powerful tool for parameter-efficient finetuning, it typically gives up some performance on the target task relative to full finetuning. In return, it better preserves base model performance and output diversity, which makes it a valuable technique, especially when combined with other regularization methods. Researchers and practitioners should weigh these factors when choosing between LoRA and full finetuning for their specific use cases.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
20 May 2024