
Researchers have developed Contrastive Preference Optimization, a technique that enhances moderate-sized language models for machine translation, bridging the performance gap with larger models without requiring extensive computational resources.
In a recent study, researchers from Johns Hopkins University and Microsoft introduced Contrastive Preference Optimization (CPO), an approach that significantly improves the performance of moderate-sized large language models (LLMs) in machine translation (MT). The paper, titled "Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation," was accepted at ICML 2024. It addresses the gap between smaller LLMs on one side and state-of-the-art conventional encoder-decoder models and larger-scale LLMs like GPT-4 on the other.
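The key innovation lies in shifting from supervised fine-tuning (SFT) to CPO. SFT, the standard approach for training LLMs on specific tasks, teaches the model to imitate a dataset of human-generated reference translations. However, this method has limitations: the model's quality is capped by the quality of its references, which can be flawed even when written by humans, and SFT gives the model no signal for rejecting a translation that is adequate but not ideal.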
CPO addresses these issues by training models to avoid generating suboptimal translations. This is achieved through a contrastive learning framework where the model learns from pairs of translations, one better than the other. The goal is to push the model towards generating higher-quality translations by penalizing it for producing lower-quality ones.
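As a minimal sketch of how such pairs might be assembled, the snippet below picks the highest- and lowest-scored candidate translations for one source sentence. The `Candidate` type, the `build_preference_pair` helper, and the example scores are hypothetical and used only for illustration; in practice the scores would come from some translation-quality metric.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    text: str
    score: float  # higher is better; assumed to come from some quality metric


def build_preference_pair(candidates: List[Candidate]) -> Tuple[Candidate, Candidate]:
    """Return (preferred, dispreferred) from scored candidates for one source sentence."""
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a pair")
    preferred = max(candidates, key=lambda c: c.score)
    dispreferred = min(candidates, key=lambda c: c.score)
    if preferred.score == dispreferred.score:
        # Tied scores carry no preference signal, so refuse to build a pair.
        raise ValueError("candidates are tied; no preference signal")
    return preferred, dispreferred


# Hypothetical candidates and scores for a single source sentence.
candidates = [
    Candidate("The cat sits on the mat.", 0.91),
    Candidate("The cat is sitting on mat.", 0.74),
    Candidate("Cat sit mat.", 0.32),
]
preferred, dispreferred = build_preference_pair(candidates)
print(preferred.text)
print(dispreferred.text)
```

Rejecting tied candidates is a design choice in this sketch: a pair only helps training when one side is measurably better than the other.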

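In the paper's formulation, CPO operates by training on these pairs with a loss that combines two terms: a preference term that raises the likelihood of the better translation relative to the worse one, and a negative log-likelihood term on the better translation.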
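One way to picture a contrastive objective of this kind for a single pair is a log-sigmoid penalty on the scaled log-probability margin between the two translations, plus a negative log-likelihood term on the preferred one. The sketch below works on plain floats rather than model outputs; the function name `cpo_style_loss`, the `beta` default, and the example log-probabilities are assumptions for illustration, not values from the paper.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def cpo_style_loss(logp_preferred: float, logp_dispreferred: float, beta: float = 0.1) -> float:
    """Loss for one pair, given sequence log-probabilities under the model being trained.

    beta scales the margin; 0.1 is an illustrative default, not a recommended setting.
    """
    # Preference term: small when the preferred translation is much more likely.
    prefer_term = -log_sigmoid(beta * (logp_preferred - logp_dispreferred))
    # Likelihood term: keeps the model anchored to the preferred translation.
    nll_term = -logp_preferred
    return prefer_term + nll_term


# Hypothetical log-probabilities: the loss is lower when the model already
# favours the better translation than when it favours the worse one.
good = cpo_style_loss(logp_preferred=-5.0, logp_dispreferred=-9.0)
bad = cpo_style_loss(logp_preferred=-9.0, logp_dispreferred=-5.0)
print(good < bad)
```

In a real training loop the two log-probabilities would be summed token log-probabilities from the model, and the loss would be averaged over a batch of pairs.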
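For practitioners in machine translation and natural language processing (NLP), the main advantage of CPO is that it improves a moderate-sized model's translation quality without requiring extensive computational resources.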
The introduction of Contrastive Preference Optimization marks a significant step forward in improving the performance of moderate-sized LLMs in machine translation. By addressing the limitations of SFT and leveraging contrastive learning, CPO shows that moderate-sized models can close much of the gap with far larger systems, making it a valuable addition to the NLP toolkit.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
24 January 2024