
The `ultrathink` keyword that boosted Claude's reasoning power is now obsolete, but a new default setting and a hidden trick can double your thinking tokens on 64K models.
If you’ve been using Anthropic's Claude, you might remember the ultrathink keyword that unlocked its maximum reasoning power. Well, it’s no longer needed. But there’s a new default and even a hidden trick to get twice the thinking budget on 64K output models. Let’s dive in.
ultrathink Keyword
For months, adding ultrathink to your prompt would give you 31,999 thinking tokens. Power users often combined it with "Opus + Ultrathink + Plan Mode" for optimal results. Under the hood, Claude Code detected the keyword and set the thinking budget:
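```javascript
// Simplified (other keywords like "megathink" and "think" existed too)
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const thinkingBudget = prompt.includes("ultrathink") ? 31999 : 0; // prompt: the user's message text

// Passed to the Anthropic API
await client.messages.create({
  model: "claude-sonnet-4-...",
  max_tokens: 32000, // illustrative value; required by the API and must exceed budget_tokens
  messages: [{ role: "user", content: prompt }],
  thinking: thinkingBudget > 0
    ? {
        type: "enabled",
        budget_tokens: thinkingBudget, // This is what mattered
      }
    : undefined,
});
```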
On January 16, 2026, Anthropic deprecated ultrathink:
“Closing as ultrathink is now deprecated and thinking mode is enabled by default.” (Sarah Deaton, Anthropic)
If you type "ultrathink" now, you’ll see a deprecation message.
Extended thinking is now automatically enabled for supported models with a default budget of 31,999 tokens. The supported models are Opus 4.5, Sonnet 4.5, Sonnet 4, Haiku 4.5, and Opus 4.

The new API call flow looks like this:
```javascript
// 1. Determine thinking budget
let budgetTokens = 31999; // Default: max
if (process.env.MAX_THINKING_TOKENS) {
  // Always pass a radix so the value is parsed as base 10
  budgetTokens = parseInt(process.env.MAX_THINKING_TOKENS, 10);
}

// 2. Auto-enabled for supported models
const thinkingEnabled = isSupportedModel(model); // Opus 4.5, Sonnet 4/4.5, Haiku 4.5, Opus 4

// 3. Passed to Anthropic API on every request
await client.messages.create({
  model: "claude-sonnet-4-...",
  max_tokens: budgetTokens + 1, // illustrative value; budget_tokens must be less than max_tokens
  messages: [{ role: "user", content: prompt }],
  thinking: thinkingEnabled
    ? {
        type: "enabled",
        budget_tokens: budgetTokens, // Default is 31,999
      }
    : undefined,
});
```
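Step 1 of the flow can be pulled out into a small, testable function. The following is a minimal sketch, not the actual implementation: `resolveThinkingBudget` and `DEFAULT_THINKING_BUDGET` are hypothetical names, and falling back to the default on a non-numeric value is an assumption of this example rather than documented behavior.

```javascript
// Hypothetical helper for illustration; not part of Claude Code or the Anthropic SDK.
const DEFAULT_THINKING_BUDGET = 31999; // default budget cited in the article

function resolveThinkingBudget(env = process.env) {
  const raw = env.MAX_THINKING_TOKENS;
  if (!raw) return DEFAULT_THINKING_BUDGET; // unset or empty: use the default
  const parsed = Number.parseInt(raw, 10);
  // Assumption of this sketch: a non-numeric value falls back to the default.
  return Number.isNaN(parsed) ? DEFAULT_THINKING_BUDGET : parsed;
}

console.log(resolveThinkingBudget({})); // 31999
console.log(resolveThinkingBudget({ MAX_THINKING_TOKENS: "63999" })); // 63999
```

Taking the environment as a parameter keeps the function free of global state, so the override path can be exercised without touching the real `process.env`.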
Here’s a lesser-known fact: the 31,999 default exists for backward compatibility with Opus 4 (which has a 32K output limit). However, 64K output models can support much more:
| Model | Max Output | Max Thinking Budget |
| --- | --- | --- |
| Opus 4.5 | 64,000 | 63,999 |
| Sonnet 4.5 | 64,000 | 63,999 |
| Sonnet 4 | 64,000 | 63,999 |
| Haiku 4.5 | 64,000 | 63,999 |
| Opus 4 | 32,000 | 31,999 |
The thinking budget is capped at max_tokens - 1 because budget_tokens must be strictly less than max_tokens, leaving at least one token for the output.
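The `max_tokens - 1` rule can be expressed as a short calculation. This is a sketch under stated assumptions: the output limits are the ones listed in the table, the map keys are display names chosen for this example (not real API model identifiers), and `maxThinkingBudget` and `clampThinkingBudget` are hypothetical helpers.

```javascript
// Output limits as listed in the table. Keys are display names for this
// sketch only; they are not real API model identifiers.
const MAX_OUTPUT_TOKENS = new Map([
  ["Opus 4.5", 64000],
  ["Sonnet 4.5", 64000],
  ["Sonnet 4", 64000],
  ["Haiku 4.5", 64000],
  ["Opus 4", 32000],
]);

// Largest budget that still satisfies budget_tokens < max_tokens.
function maxThinkingBudget(modelName) {
  const maxOutput = MAX_OUTPUT_TOKENS.get(modelName);
  if (maxOutput === undefined) {
    throw new Error(`Unknown model: ${modelName}`);
  }
  return maxOutput - 1;
}

// Clamp a requested budget to the cap for the given model.
function clampThinkingBudget(modelName, requested) {
  return Math.min(requested, maxThinkingBudget(modelName));
}

console.log(maxThinkingBudget("Sonnet 4.5")); // 63999
console.log(clampThinkingBudget("Opus 4", 63999)); // 31999
```

Clamping rather than rejecting an oversized request is a design choice of this sketch; it shows why a budget of 63,999 fits a 64K model but is cut back to 31,999 on Opus 4.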
You no longer need to use magic keywords like `ultrathink`.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.