
A Stanford study of six commercial chatbots reveals significant regional accuracy gaps, highlighting the need for better training data and more robust models.
Researchers from Stanford University’s Human-Centered Artificial Intelligence (HAI) lab have conducted a real-time audit of six popular AI chatbots to evaluate how well they answer questions about current news events. The findings are concerning, especially as reliance on AI for news consumption continues to grow.
About 10% of Americans now turn to AI chatbots for news at least sometimes, and the share rises to nearly 15% among news consumers under 25 worldwide. However, trust in these systems is outpacing their reliability. Approximately half of U.S. adults who get news from AI reported encountering inaccurate information, and about a third struggled to distinguish true claims from false ones.
The study, published as a preprint on arXiv, involved evaluating six commercial AI chatbots across 2,100 same-day news questions, resulting in 12,600 model responses. These questions were sourced from BBC News articles in six regional services: U.S. & Canada, Afrique, Arabic, Hindi, Russian, and Turkish. Over a 14-day period (February 9-22, 2026), researchers posed 25 multiple-choice questions per region each day, totaling 150 distinct questions daily.
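The totals above follow directly from the study design, and a few lines of Python make the arithmetic explicit. This is only a sanity check on the figures reported in this article, not code from the study; the variable names are illustrative assumptions.

```python
from datetime import date

# Study design figures as reported in this article (names are illustrative).
REGIONS = ["U.S. & Canada", "Afrique", "Arabic", "Hindi", "Russian", "Turkish"]
QUESTIONS_PER_REGION_PER_DAY = 25
CHATBOTS = 6

# The audit window is inclusive of both end dates, hence the + 1.
days = (date(2026, 2, 22) - date(2026, 2, 9)).days + 1

questions_per_day = len(REGIONS) * QUESTIONS_PER_REGION_PER_DAY
total_questions = questions_per_day * days
total_responses = total_questions * CHATBOTS

print(days, questions_per_day, total_questions, total_responses)
# prints: 14 150 2100 12600
```

Each of the 2,100 questions was posed to all six chatbots, which is why the response count is six times the question count.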
While many chatbots achieved over 90% accuracy on multiple-choice questions, the aggregate scores masked three critical patterns:

The study's findings have important implications for practitioners and policymakers alike:
As AI continues to reshape various aspects of society, from work and energy grids to economic futures, the reliability of these systems becomes increasingly critical. The study's insights serve as a call to action for the AI community to address these challenges and ensure that AI chatbots can be trusted sources of information.
Original Sources
Reading Today’s Headlines Through AI: A Real-Time Audit of Six Commercial Chatbots | Stanford HAI
↗ https://hai.stanford.edu/news/reading-todays-headlines-through-ai-a-real-time-audit-of-six-commercial-chatbots
AI Hiring Tools Can Yield Racial Bias and Systemic Rejection
↗ https://hai.stanford.edu/news/ai-hiring-tools-can-yield-racial-bias-and-systemic-rejection
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
8 June 2026
© 2026 Cedar & Bloom. All rights reserved.