Applying Sutton's Bitter Lesson to Modern AI Development

The Engineer

16 Oct 2025 · 3 min read

As AI developers grapple with complex projects, they're rediscovering Richard Sutton's Bitter Lesson, finding that broad, computational approaches trump narrow, specialized techniques.

Richard Sutton’s "Bitter Lesson", an essay he published in 2019, has shaped how AI researchers think about progress, and it's making waves again in the developer community. The lesson is simple yet profound: general methods that leverage computation are ultimately the most effective. In this article, we'll explore how this principle applies to building and working with AI applications today.

The Bitter Lesson Revisited

Sutton’s Bitter Lesson states:

"The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin."

This insight has profound implications for developers working with modern AI. Many have not yet fully internalized this lesson, leading to suboptimal practices in both coding and application design.

How Not to Code with AI

One common mistake is the "AI-maximalist" approach, often seen at coding events, workshops, and demos. Developers using this method typically have a folder full of text files filled with rules, modes, roles, prompts, or subagents. These files are packed with detailed instructions, pleading language, capitalization, and even step-by-step logic telling the Large Language Model (LLM) how to think and act.

The fundamental error here is that these methods bake in assumptions about workflows and agent behavior. Sutton would describe them as "human knowledge-based" methods, and they interfere with the model’s natural capabilities. While these tricks were necessary when models were weaker, today's LLMs can reason well and learn from environmental feedback. Force-fitting complex workflows onto them can actually fight against the model weights.

Instead, an engineer who has digested the bitter lesson will set up an environment that provides feedback loops to the agent, such as test results, compiler errors, or linter output that the agent can read and act on. This approach is simpler and better suited to frontier reasoning models scaled with reinforcement learning. By getting out of the way, you allow the model to operate more effectively.
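
To make the idea concrete, here is a minimal sketch of such a feedback loop in Python. Everything in it is an assumption made for illustration: run_with_feedback, check_add, and stub_model are hypothetical names, and stub_model stands in for a real LLM call. The point is only that the prompt carries the task and the environment's feedback, not instructions on how to think.

```python
from typing import Callable, Optional, Tuple

# Hypothetical types for this sketch: a "model" maps a prompt to a candidate
# answer, and a "check" returns None on success or an error message on failure.
Model = Callable[[str], str]
Check = Callable[[str], Optional[str]]


def run_with_feedback(
    model: Model, task: str, check: Check, max_turns: int = 5
) -> Tuple[Optional[str], int]:
    """Let the model retry, showing it only what the environment reported."""
    prompt = task
    for turn in range(1, max_turns + 1):
        candidate = model(prompt)
        error = check(candidate)
        if error is None:
            return candidate, turn
        # No rules about how to fix it: just pass the feedback back.
        prompt = f"{task}\n\nPrevious attempt:\n{candidate}\n\nFeedback:\n{error}"
    return None, max_turns


def check_add(candidate: str) -> Optional[str]:
    """Toy environment: run the candidate code and test add(2, 3)."""
    namespace: dict = {}
    try:
        exec(candidate, namespace)  # demo only; sandbox untrusted code in practice
        result = namespace["add"](2, 3)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None if result == 5 else f"add(2, 3) returned {result}, expected 5"


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM: buggy at first, corrected once it sees feedback."""
    if "Feedback:" in prompt:
        return "def add(a, b):\n    return a + b"
    return "def add(a, b):\n    return a - b"


if __name__ == "__main__":
    solution, turns = run_with_feedback(stub_model, "Write add(a, b).", check_add)
    print(f"solved in {turns} turn(s)")  # prints "solved in 2 turn(s)"
```

In a real setup the check would more likely be a test suite, a compiler, or a linter, but the shape of the loop stays the same.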

How Not to Build LLM Wrappers

Another common pitfall when designing an LLM-integrated application is jumping straight to complex workflows, applying prompting tricks indiscriminately, and wiring up multiple agents with fixed roles. These practices add unnecessary complexity and should not be the default starting point.

To illustrate why, let's look at the evolution of coding agents:

First Generation (Cursor, Sourcegraph Cody, Codeium, Copilot): These tools relied heavily on rule-based systems and fixed roles. They were effective but limited by their rigidity.
Modern Approach: Today’s best practices involve creating a dynamic environment where the LLM can learn from feedback. This approach is more aligned with Sutton's Bitter Lesson.

Implementing Feedback Loops

Setting up an environment that provides feedback loops to the agent involves several key steps:

Define Clear Objectives: Clearly specify what you want the model to achieve.
Provide Diverse Examples: Offer a wide range of examples for the model to learn from.
Implement Reinforcement Learning: Use reinforcement learning techniques to fine-tune the model based on its performance.
Monitor and Adjust: Continuously monitor the model’s behavior and make adjustments as needed.

By following these steps, you create an environment where the LLM can adapt and improve over time without being constrained by fixed rules or roles.
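
As a rough illustration of the steps above, the sketch below covers three of them: an objective expressed as a pass/fail check, a small set of varied examples, and a monitor that flags weak or regressing scores. The reinforcement learning step is deliberately not shown; the pass rate is simply the kind of signal such a step could consume. All names here (Example, pass_rate, monitor, stub_agent) and the 0.8 threshold are hypothetical, and stub_agent stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    prompt: str
    expected: str


# Diverse examples: a tiny, made-up set covering different operations.
EXAMPLES = [Example("2+2", "4"), Example("10-3", "7"), Example("3*3", "9")]


def pass_rate(agent: Callable[[str], str], examples: List[Example]) -> float:
    """Objective: fraction of examples where the agent's answer is exactly right."""
    passed = sum(agent(ex.prompt).strip() == ex.expected for ex in examples)
    return passed / len(examples)


def monitor(history: List[float], threshold: float = 0.8) -> str:
    """Flag a run whose latest score is below the threshold or has regressed."""
    latest = history[-1]
    if latest < threshold:
        return "below threshold"
    if len(history) > 1 and latest < history[-2]:
        return "regressed"
    return "ok"


def stub_agent(prompt: str) -> str:
    """Stand-in for a real model call; it only handles + and -."""
    for symbol, op in (("+", lambda a, b: a + b), ("-", lambda a, b: a - b)):
        if symbol in prompt:
            left, right = prompt.split(symbol)
            return str(op(int(left), int(right)))
    return "?"


if __name__ == "__main__":
    history = [pass_rate(stub_agent, EXAMPLES)]
    print(f"pass rate: {history[-1]:.2f}, status: {monitor(history)}")
```

Keeping the objective as a plain function makes it easy to rerun after every change, which is what turns monitoring into a feedback loop rather than a one-off check.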

Conclusion

Sutton's Bitter Lesson is more relevant than ever to modern AI development. By leveraging general methods that allow models to learn from feedback, we can build more effective and adaptable AI applications. Avoiding rigid, rule-based systems and focusing instead on dynamic environments will lead to better outcomes for both developers and end-users.