
Meta’s new tool, TestGen-LLM, uses large language models to automatically improve unit tests, addressing coverage gaps and code hallucinations, as detailed in a paper by Nadia Alshahwan and colleagues.
Meta has introduced TestGen-LLM, a tool that uses large language models (LLMs) to automatically improve existing unit tests. It aims to address coverage gaps in test suites and the risk of hallucinated code in LLM-generated tests. The paper, authored by Nadia Alshahwan and colleagues, gives a detailed account of TestGen-LLM's deployment at Meta, specifically during test-a-thons for the Instagram and Facebook platforms.
TestGen-LLM introduces an automated process to enhance human-written unit tests using LLMs. The key technical advancements include:
For software engineers and practitioners, this tool offers several benefits:
TestGen-LLM operates in a multi-step process:
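As a rough illustration of how a multi-step process with assurances of improvement could be structured, the sketch below assumes that each LLM-proposed test is kept only if it builds, passes on every repeated run, and covers lines the existing suite does not. The `Candidate` type, its fields, and `filter_candidates` are hypothetical names chosen for this example. They are not Meta's API, and the build, run, and coverage data are supplied as plain values rather than measured.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Candidate:
    """A hypothetical LLM-proposed test case (illustrative only)."""
    name: str
    builds: bool                    # did the test build?
    run_results: Tuple[bool, ...]   # outcome of each repeated run (True = pass)
    covered_lines: FrozenSet[int]   # source lines exercised by this test


def passes_reliably(candidate: Candidate) -> bool:
    # Require at least one run, and every run must pass, to screen out flaky tests.
    return len(candidate.run_results) > 0 and all(candidate.run_results)


def filter_candidates(candidates: Iterable[Candidate],
                      baseline_covered: Iterable[int]) -> List[Candidate]:
    """Keep only candidates that build, pass reliably, and add coverage."""
    covered = set(baseline_covered)
    kept: List[Candidate] = []
    for candidate in candidates:
        if not candidate.builds:
            continue  # assumed step 1: discard tests that do not build
        if not passes_reliably(candidate):
            continue  # assumed step 2: discard failing or flaky tests
        new_lines = candidate.covered_lines - covered
        if not new_lines:
            continue  # assumed step 3: discard tests that add no coverage
        kept.append(candidate)
        covered |= new_lines  # later candidates must beat the updated baseline
    return kept


if __name__ == "__main__":
    proposals = [
        Candidate("t_broken", False, (), frozenset({10})),
        Candidate("t_flaky", True, (True, False, True), frozenset({11})),
        Candidate("t_redundant", True, (True, True), frozenset({1, 2})),
        Candidate("t_useful", True, (True, True, True), frozenset({2, 3})),
    ]
    survivors = filter_candidates(proposals, baseline_covered={1, 2})
    print([c.name for c in survivors])  # prints ['t_useful']
```

Ordering the checks from cheapest to most expensive means a candidate that fails to build never reaches the repeated runs or the coverage measurement. In this sketch a surviving test is, by construction, a measurable improvement over the existing suite.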

The tool was evaluated on the Reels and Stories products for Instagram, with notable results:
During Meta's Instagram and Facebook test-a-thons, TestGen-LLM was applied to a significant portion of the codebase:
The authors state that this is the first report on the industrial-scale deployment of LLM-generated code with such assurances of improvement. The deployment at Meta shows the tool working in a real-world setting and offers useful insights for other organizations considering similar approaches.
TestGen-LLM is a significant step forward in automated test generation: it uses large language models to improve unit tests while maintaining high quality standards. For software engineers, it offers a practical way to improve test coverage and code quality with minimal manual effort.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
16 February 2024