
A study reveals that DeBERTa, a model usually used to understand text rather than generate it, can unexpectedly produce text through a straightforward inference-time tweak, challenging the dominance of generative models in this area.
In a surprising twist, a recent study by David Samuel demonstrates that masked language models (MLMs) like BERT can perform generative in-context learning tasks. This capability is typically associated with causal language models (CLMs) such as GPT. By employing an embarrassingly simple inference technique, Samuel shows that DeBERTa, a popular MLM, can generate text without any additional training or architectural modifications. The findings suggest that the field's focus on CLMs for in-context learning might be overly narrow, as both architectures have distinct strengths and complementary capabilities.
Samuel's technique involves a straightforward modification during inference.
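As a rough illustration of how a masked model can be made to generate text, here is a minimal Python sketch of one possible loop: append a single `[MASK]` to the prompt, let the model fill it, keep the predicted token, and repeat. Everything in it is an assumption made for illustration. `generate_with_mlm`, `make_toy_predictor`, and the bigram table are hypothetical names, and the toy predictor is a lookup table standing in for a real model such as DeBERTa. It should not be read as Samuel's exact procedure.

```python
from typing import Callable, Dict, List

MASK = "[MASK]"
EOS = "[SEP]"  # assumed end-of-sequence marker for this sketch

# Hypothetical stand-in for a masked language model: given a token sequence
# and the index of a [MASK] slot, return the predicted token for that slot.
MaskPredictor = Callable[[List[str], int], str]


def generate_with_mlm(
    prompt: List[str],
    predict_mask: MaskPredictor,
    max_new_tokens: int = 10,
) -> List[str]:
    """Generate left to right by repeatedly filling one appended mask."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        # Put a single mask after everything generated so far.
        masked_input = tokens + [MASK]
        next_token = predict_mask(masked_input, len(tokens))
        if next_token == EOS:
            break
        # Keep the prediction; the next iteration masks the following slot.
        tokens.append(next_token)
    return tokens


def make_toy_predictor(bigrams: Dict[str, str]) -> MaskPredictor:
    """Build a toy predictor that only looks at the token before the mask."""
    def predict(tokens: List[str], mask_index: int) -> str:
        previous = tokens[mask_index - 1] if mask_index > 0 else ""
        return bigrams.get(previous, EOS)
    return predict


if __name__ == "__main__":
    toy = make_toy_predictor({"the": "cat", "cat": "sat", "sat": "down"})
    print(generate_with_mlm(["the"], toy, max_new_tokens=5))
```

With a real masked model, `predict_mask` would run a forward pass and pick a token from the output distribution at the masked position. Note that no weights or layers change in this loop, which is consistent with the article's point that the method needs no extra training or architectural modification.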
The evaluation of this technique on DeBERTa revealed several interesting findings.

David Samuel's research opens up new avenues for leveraging masked language models in generative tasks. With a simple but effective inference technique, the study shows that an MLM such as DeBERTa can match or even surpass CLMs in certain domains. This work highlights the importance of exploring diverse model architectures and techniques to advance the field of natural language processing.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
13 June 2024