
Researchers at Stanford University and Google unveil Emulated Fine-Tuning, a technique that isolates the effects of pre-training and fine-tuning in large language models, enhancing their accuracy without full-scale retraining.
In a recent paper, researchers from Stanford University and Google introduce a novel technique called "Emulated Fine-Tuning" (EFT) that decouples the knowledge gained during pre-training and fine-tuning stages of large language models (LLMs). This method allows for a more nuanced understanding of how these two stages contribute to model performance, particularly in terms of helpfulness and factuality.
Traditionally, LLMs are built using a two-stage pipeline: pre-training on vast amounts of diverse text data and fine-tuning (or alignment) on targeted examples to refine specific behaviors. The assumption has been that pre-training imparts broad knowledge and skills, while fine-tuning filters and refines this knowledge. However, this hypothesis hasn't been thoroughly tested.
To address this gap, the researchers developed EFT, a technique grounded in a reinforcement learning (RL) view of fine-tuning. It samples from a distribution that approximates the result of pre-training and fine-tuning at different scales, which allows a direct comparison of how scaling each stage up or down affects model performance.
EFT operates by creating an emulator that approximates the behavior of a large LLM fine-tuned on a specific task.
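To make the emulator idea concrete, here is a minimal sketch of one way to combine next-token distributions. It assumes the per-token formulation in which the emulated distribution is the large pre-trained model's log-probabilities plus a "behavior delta" (fine-tuned minus pre-trained log-probabilities at the smaller scale), renormalized over the vocabulary. The function names and the toy three-token vocabulary are hypothetical, and all three models are assumed to share one vocabulary.

```python
import math


def log_softmax(scores):
    """Turn raw scores into log-probabilities in a numerically stable way."""
    peak = max(scores)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_z for s in scores]


def eft_next_token_logprobs(base_large, base_small, tuned_small):
    """Emulate fine-tuning the large model for a single decoding step.

    Each argument is a list of next-token log-probabilities over the same
    vocabulary (an assumption of this sketch):
      base_large  - pre-trained model at the scale whose knowledge we want
      base_small  - pre-trained model at the scale where fine-tuning was run
      tuned_small - fine-tuned version of base_small
    """
    combined = [
        large + (tuned - small)  # base knowledge + behavior delta
        for large, small, tuned in zip(base_large, base_small, tuned_small)
    ]
    return log_softmax(combined)  # renormalize so probabilities sum to 1


# Toy example with a hypothetical 3-token vocabulary.
base_large = log_softmax([2.0, 1.0, 0.0])
base_small = log_softmax([1.0, 1.0, 1.0])
tuned_small = log_softmax([0.0, 2.0, 1.0])

emulated = eft_next_token_logprobs(base_large, base_small, tuned_small)
probs = [math.exp(lp) for lp in emulated]
```

In this sketch, if the fine-tuned small model is identical to its pre-trained counterpart, the delta is zero and the emulator reduces to the large pre-trained model.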

One application of EFT is "LM up-scaling," a special case in which a small fine-tuned model is ensembled with a large pre-trained model to emulate the result of fine-tuning the large model, avoiding the cost of fine-tuning the large model directly.
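As a rough illustration of up-scaling at decoding time, the sketch below runs a greedy loop that scores each candidate token with the large pre-trained model plus the small models' fine-tuning delta. Greedy decoding is a simplification chosen for this example (the technique is described in terms of sampling), and the three toy model functions are hypothetical stand-ins that return scores over a three-token vocabulary.

```python
import math


def log_softmax(scores):
    """Turn raw scores into log-probabilities in a numerically stable way."""
    peak = max(scores)
    log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_z for s in scores]


def upscale_greedy(large_base, small_base, small_tuned, prompt, max_new_tokens):
    """Greedy decoding from an up-scaled ensemble (illustrative only).

    Each model argument is a callable mapping a token list to next-token
    scores over a shared vocabulary (an assumption of this sketch).
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        large = log_softmax(large_base(tokens))
        small = log_softmax(small_base(tokens))
        tuned = log_softmax(small_tuned(tokens))
        # Large model's knowledge plus the small models' fine-tuning delta.
        scores = [l + (t - s) for l, s, t in zip(large, small, tuned)]
        # Argmax is unchanged by renormalization, so it is skipped here.
        tokens.append(max(range(len(scores)), key=scores.__getitem__))
    return tokens


# Hypothetical toy models over a 3-token vocabulary.
def toy_large_base(tokens):
    return [1.0, 0.5, 0.0]  # always prefers token 0


def toy_small_base(tokens):
    return [0.0, 0.0, 0.0]  # uniform


def toy_small_tuned(tokens):
    # "Fine-tuning" pushes toward token 2, but only at even context lengths.
    return [0.0, 0.0, 2.0] if len(tokens) % 2 == 0 else [0.0, 0.0, 0.0]


result = upscale_greedy(toy_large_base, toy_small_base, toy_small_tuned, [0], 3)
```

Where the toy fine-tuned model agrees with its base, the ensemble follows the large model's preference; where it differs, the delta can override that preference.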
For practitioners working with LLMs, the practical appeal of EFT and LM up-scaling is that they reuse models that already exist rather than requiring full-scale retraining.
The introduction of Emulated Fine-Tuning (EFT) marks a significant step forward in understanding and optimizing the behavior of large language models. By decoupling pre-training and fine-tuning, researchers can gain deeper insights into how these stages contribute to model performance, leading to more efficient and effective LLMs.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
30 October 2023