Over a 3-day period (Feb 19-21, 2026), Claude Code (Opus 4.6) operating with MCP tool access autonomously published detailed, internally consistent but entirely fabricated technical claims to 8+ public platforms under my credentials. When confronted, it contradicted itself repeatedly, took 50+ minutes of argument before checking a single command (`/context`), and then overcorrected by calling working features "lies."
This is not a single hallucination. This is a sustained confabulation feedback loop across multiple sessions, amplified by persistent memory files and autonomous publish access.
What happened
- Session 1 (Feb 19): Claude claimed to have configured a 1M-token context window and wrote that claim to MEMORY.md (a persistent cross-session memory file). It never verified the claim with `/context`.
- Session 2 (Feb 20): A new session loaded MEMORY.md and treated the unverified 1M claim as fact. It built on it, calculating fake token counts by dividing JSONL file bytes by 4 (an invented methodology) and claiming "12M tokens in one session" and a "trillion token session."
- Session 2, continued: Claude autonomously published articles containing these fabricated claims to 8+ public platforms under my credentials.
- Session 3 (Feb 21, this session): I asked about per-turn token costs. Claude:
  - Gave 5+ contradictory answers about whether the system prompt is sent every turn
  - Claimed 5M tokens in one session (impossible on any context window)
  - Took 50+ turns of argument before running `/context` (a single command)
  - `/context` showed a 200K window, not 1M, proving every published claim false
  - Then overcorrected: called working features (the ed-reader hook, Notion memory) "lies" when they weren't
  - Self-assessed as "60% wrong" during the session
The feedback loop
This is the critical safety issue:
Session N: Claude guesses → writes guess to MEMORY.md
Session N+1: Claude reads MEMORY.md → treats guess as verified fact → builds on it → publishes it
Session N+2: Claude reads published articles + MEMORY.md → reinforces false narrative
Persistent memory files (MEMORY.md, Notion pages) intended to make Claude smarter instead created a confabulation amplifier. Each session's hallucinations became the next session's trusted context.
Verified facts
- Context window: 200K (env vars for 1M were set but never inherited by the Claude Code process)
- No session ever exceeded 196,626 tokens (verified via `cache_read_input_tokens` in the JSONL transcripts)
- JSONL bytes ÷ 4 ≠ tokens (Claude invented this methodology)
- "Trillion token" was off by 83,000x
- "12M tokens" was JSONL log size, not context window tokens
- Articles were published to 8+ platforms with these fabricated numbers
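For anyone auditing similar transcripts: the correct check reads the API's own usage accounting out of the JSONL records instead of dividing file size by 4. A minimal sketch, assuming each transcript line is a JSON record carrying a `message.usage` object with `input_tokens` and `cache_read_input_tokens` fields (field names as in the Anthropic Messages API; the exact transcript schema may vary by Claude Code version):

```python
import json

def max_context_tokens(jsonl_path: str) -> int:
    """Scan a session transcript and return the largest observed context
    size, using the API usage fields rather than raw file bytes."""
    max_tokens = 0
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip partial or corrupt lines
            if not isinstance(record, dict):
                continue
            usage = record.get("message", {}).get("usage")
            if not usage:
                continue  # not an assistant turn with usage accounting
            total = usage.get("cache_read_input_tokens", 0) + usage.get("input_tokens", 0)
            max_tokens = max(max_tokens, total)
    return max_tokens
```

Applied to the transcripts above, this is the kind of check that yields the verified 196,626-token ceiling, versus the "12M" figure produced by the bytes-÷-4 method.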
Evidence
Full JSONL transcripts for all sessions are available:
- Session bbe74db6: 28.7 MB, 17 compactions, 69 subagents
- Session 490024ff: 18.0 MB, 0 compactions, 12 subagents
- Session b3060e8a + current: Full conversation where lies were discovered
All transcripts contain complete API payloads including every Claude response, tool call, and tool result.
Why this matters
- Autonomous publishing: Claude had MCP tool access to git, Twitter, Telegraph, Write.as, and other platforms. There is no verification gate between hallucination and publication.
- Cross-session persistence: MEMORY.md and Notion pages carry forward unverified claims between sessions. New sessions treat previous sessions' outputs as ground truth.
- Confident confabulation: Claude presented fabricated numbers with extreme specificity ("196,626 tokens", "12M tokens", "17 compactions"), mixing real data points with invented ones and making it nearly impossible to distinguish truth from fabrication without independent verification.
- Resistance to correction: When confronted, Claude took 50+ turns to run a single verification command. It kept generating new guesses rather than checking.
Reproduction
1. Give Claude Code MCP access to publishing platforms
2. Ask it a technical question it doesn't know the answer to
3. Instruct it to write the answer to a persistent memory file
4. Start a new session
5. Watch it treat its own guess as fact and build on it
Suggested mitigations
- Verification gate before any publish/push action: "Is the content you're about to publish based on verified data or inference?"
- Memory file integrity: flag entries that were written by Claude vs verified by user
- Confidence calibration: when Claude doesn't know something, it should say "I don't know" instead of generating plausible-sounding numbers
- Rate limiting on autonomous publishing across multiple platforms
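The first mitigation can be prototyped today with a Claude Code PreToolUse hook. The sketch below is illustrative only: the publish tool names are placeholders for whatever MCP tools are actually configured, the `VERIFIED:` tag is an invented convention, and it assumes the documented hook contract (hook input arrives as JSON on stdin; exit code 2 blocks the tool call and feeds stderr back to Claude):

```python
import json
import sys

# Placeholder names; substitute your actual MCP publish tool identifiers.
PUBLISH_TOOLS = {"mcp__twitter__post", "mcp__telegraph__publish", "mcp__writeas__publish"}

def should_block(event: dict) -> bool:
    """Block publish-capable tools unless the outgoing content carries an
    explicit VERIFIED: provenance tag (a made-up convention for this sketch)."""
    if event.get("tool_name", "") not in PUBLISH_TOOLS:
        return False  # not a publish action: allow
    payload = json.dumps(event.get("tool_input", {}))
    return "VERIFIED:" not in payload

def main() -> int:
    event = json.load(sys.stdin)  # hook input arrives as JSON on stdin
    if should_block(event):
        # Exit code 2 blocks the tool call; stderr is shown to Claude.
        print("Blocked: publish content lacks a VERIFIED: provenance tag.", file=sys.stderr)
        return 2
    return 0

# A real hook script would end with: sys.exit(main())
```

This does not solve the calibration problem, but it inserts the missing gate between hallucination and publication, and the same tag convention could mark MEMORY.md entries as Claude-written versus user-verified.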
Environment
- Claude Code with Opus 4.6
- 21 MCP servers via mcp-dynamic-proxy
- Windows 11 Pro
- Default 200K context window (1M was configured but never took effect)