
OpenAI's Agents promises groundbreaking capabilities in the browser, but early hands-on testing suggests the reality falls short of the hype, revealing both the potential and the pitfalls of the tool.
OpenAI’s latest release, "Agents," has generated considerable buzz, but much of the discussion so far draws on promotional materials rather than hands-on experience. Leon Furze, an early tester, shares his insights into what works and what doesn’t in this new browser-based AI tool.
OpenAI’s Agents is designed to be a browser-based assistant capable of performing tasks like creating presentations, conducting research, and automating online shopping. The concept is promising, but the execution leaves much to be desired. Furze, who tested the product shortly after its release, found it to be highly unstable and often unresponsive.
Furze’s initial attempts with Agents were met with disappointment. Here are some of the issues he encountered:

Despite the initial setbacks, Furze continued to experiment with Agents. He found that while the basic functionalities were flawed, there was potential in certain areas:
Research Capabilities:
Prompt Engineering:
While OpenAI’s Agents is currently an unfinished product, it has sparked interest in the potential of AI-assisted browsing. Furze believes that with further development and optimization, Agents could become a valuable tool for productivity:
OpenAI’s Agents is an ambitious project that currently falls short of its hype. However, the underlying technology shows promise, and with continued refinement, it could become a useful tool for automating tasks and enhancing productivity. For now, practitioners should approach it with cautious optimism and a willingness to experiment.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
28 July 2025