Thought Anchors: Which LLM Reasoning Steps Matter?

The research paper "**Thought Anchors: Which LLM Reasoning Steps Matter?**" addresses the challenge of interpreting long-form chain-of-thought (CoT) reasoning in large language models (LLMs). The authors introduce **thought anchors**: critical reasoning steps, often planning or uncertainty-management sentences, that disproportionately influence the subsequent reasoning process and the final answer. They present **three complementary attribution methods** for identifying these anchors at the sentence level: a **black-box counterfactual importance** method that resamples the reasoning trace to measure a sentence's effect on the final answer; a **white-box attention aggregation** method that identifies "receiver heads" attending to "broadcasting" sentences; and a **causal attention suppression** method that measures direct logical dependencies between sentence pairs. The findings, consistent across the three methods and visualized with an **open-source tool**, suggest that high-level organizational sentences, rather than just active computation steps, are what structure an LLM's reasoning trace.
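To make the black-box method concrete, here is a minimal sketch of sentence-level counterfactual importance via resampling. The helpers `generate_rollouts` and `extract_answer` are hypothetical placeholders for an LLM sampling API and an answer parser; the paper's actual scoring procedure may differ in details such as how resampled sentences are filtered or how answer distributions are compared.

```python
# Minimal sketch (assumed, not the authors' code): estimate how much a single
# CoT sentence shifts the model's final-answer distribution by comparing
# rollouts that keep the sentence against rollouts that resample from just
# before it.
from collections import Counter
from typing import Callable, List


def answer_distribution(rollouts: List[str],
                        extract_answer: Callable[[str], str]) -> Counter:
    """Count final answers across sampled rollouts."""
    return Counter(extract_answer(r) for r in rollouts)


def counterfactual_importance(
    sentences: List[str],                                 # CoT split into sentences
    index: int,                                           # sentence to score
    generate_rollouts: Callable[[str, int], List[str]],   # (prefix, n) -> n completions
    extract_answer: Callable[[str], str],                 # completion -> final answer
    n_samples: int = 32,
) -> float:
    """Total-variation distance between answer distributions with vs. without
    the sentence, estimated by Monte Carlo rollouts."""
    prefix_without = " ".join(sentences[:index])
    prefix_with = " ".join(sentences[:index + 1])

    with_dist = answer_distribution(
        generate_rollouts(prefix_with, n_samples), extract_answer)
    without_dist = answer_distribution(
        generate_rollouts(prefix_without, n_samples), extract_answer)

    answers = set(with_dist) | set(without_dist)
    return 0.5 * sum(
        abs(with_dist[a] / n_samples - without_dist[a] / n_samples)
        for a in answers
    )
```

A sentence that scores high under this estimate is a candidate thought anchor: removing and resampling it materially changes where the reasoning ends up.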

About the Podcast

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.