Turns out, “the implementation had a bug. Instead of clearing thinking history once, it cleared it on every turn for the rest of the session. After a session crossed the idle threshold once, each request for the rest of that process told the API to keep only the most recent block of reasoning and discard everything before it. This compounded: if you sent a follow-up message while Claude was in the middle of a tool use, that started a new turn under the broken flag, so even the reasoning from the current turn was dropped. Claude would continue executing, but increasingly without memory of why it had chosen to do what it was doing. This surfaced as the forgetfulness, repetition, and odd tool choices people reported. …We believe this is what drove the separate reports of usage limits draining faster than expected.”
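The failure mode described in the quote, a one-time "clear history" action that accidentally became a sticky flag, can be illustrated with a minimal sketch. All names here are invented for illustration; this is not Anthropic's actual implementation.

```python
# Hypothetical sketch of the class of bug described above: a flag meant to
# clear reasoning history ONCE after an idle period is never reset, so every
# later request in the session also discards prior thinking blocks.
# All names (Session, keep_only_latest_thinking, etc.) are invented.

IDLE_THRESHOLD = 30 * 60  # idle cutoff in seconds (assumed value)


class Session:
    def __init__(self):
        self.clear_thinking = False  # intended as a one-shot flag
        self.last_active = 0.0

    def build_request(self, now, fixed=False):
        # Crossing the idle threshold should trigger one cleanup.
        if now - self.last_active > IDLE_THRESHOLD:
            self.clear_thinking = True
        self.last_active = now

        keep_only_latest = self.clear_thinking
        if fixed:
            # The fix: consume the flag after one use.
            self.clear_thinking = False
        # Buggy path: the flag stays set, so every subsequent request
        # tells the API to keep only the most recent reasoning block.
        return {"keep_only_latest_thinking": keep_only_latest}
```

In the buggy path, once a session idles past the threshold, every later turn drops accumulated reasoning, which matches the compounding forgetfulness the postmortem describes; the one-line reset confines the cleanup to a single request.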
And Claude Opus 4.7, the vendor noted, “has a notable behavioral quirk”: it is “quite verbose. This makes it smarter on hard problems, but it also produces more output tokens.”
To be clear, I’m not suggesting Anthropic was doing anything especially poorly. These are the kinds of problems all genAI companies face, and I applaud Anthropic’s transparency in publishing its reasoning openly. (Anthropic executives do seem to be trying to portray themselves as more ethical and responsible than many of their rivals.)


