
Cappy, a tiny yet powerful pre-trained scorer, boosts the efficiency and versatility of colossal language models like T0 and FLAN, making advanced NLP tasks more manageable without sacrificing performance.
March 14, 2024
By Yun Zhu and Lijuan Liu, Software Engineers, Google Research
Large language models (LLMs) have revolutionized natural language processing (NLP) by unifying various tasks within an instruction-following framework. Models like T0, FLAN, and OPT-IML have demonstrated impressive task-wise generalization capabilities, but their massive size, ranging from several billion to hundreds of billions of parameters, presents significant operational challenges. Enter Cappy, a small pre-trained scorer model that not only enhances the performance of these large multi-task LLMs but also outperforms them on complex tasks.
Cappy introduces a novel approach: a smaller, more efficient model scores and refines the outputs of larger LLMs. This method addresses the computational overhead of deploying massive models while maintaining or even improving performance.
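To make the score-and-refine idea concrete, here is a minimal Python sketch of picking the best of several candidate responses with a separate scorer. It is an illustration under stated assumptions, not Cappy's actual implementation: `toy_scorer` is a hypothetical word-overlap stand-in for a trained scorer, and `pick_best`, the `Scorer` type, and the convention that higher scores are better are all names and choices made up for this example.

```python
from typing import Callable, Sequence

# A scorer maps (instruction, candidate response) to a number; higher is better.
# This signature is an assumption made for the sketch.
Scorer = Callable[[str, str], float]


def toy_scorer(instruction: str, response: str) -> float:
    """Hypothetical stand-in for a trained scorer (NOT Cappy).

    Returns the fraction of distinct instruction words that also
    appear in the response, so the result lies in [0, 1].
    """
    instruction_words = set(instruction.lower().split())
    response_words = set(response.lower().split())
    if not instruction_words:
        return 0.0
    return len(instruction_words & response_words) / len(instruction_words)


def pick_best(instruction: str, candidates: Sequence[str], scorer: Scorer) -> str:
    """Return the candidate that the scorer rates highest for this instruction."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    return max(candidates, key=lambda candidate: scorer(instruction, candidate))
```

In this setup the large model would only need to produce the candidate list. For example, `pick_best("name the capital of france", ["i like trains", "paris is the capital of france"], toy_scorer)` selects the second candidate, and swapping `toy_scorer` for a real trained scorer would leave `pick_best` unchanged.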
Cappy is significantly smaller than its counterparts.

Cappy was evaluated against several state-of-the-art multi-task LLMs.
Cappy represents a significant step forward in the field of multi-task language modeling. By introducing a small pre-trained scorer, it addresses the computational challenges associated with large models while enhancing their performance. This innovation not only improves the efficiency of deploying LLMs but also opens up new possibilities for further research and application.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.