How LLMs Model Mental States

Large language models exhibit surprising capabilities in reasoning about beliefs, intentions, and knowledge states. We explore what this means for building systems that truly understand context.

Shep Bryan
Founder
[Figure: abstract visualization of neural network patterns. The geometry of belief representation in transformer architectures.]

When you read a story about Sally, who puts a marble in a basket, leaves the room, and returns after Anne has moved it, you effortlessly know that Sally will look in the basket. This is theory of mind: the ability to attribute mental states to others and use those attributions to predict behavior.

The Surprising Competence of Language Models

Recent large language models pass classic theory of mind tests with remarkable consistency. They correctly predict that Sally will look where she last saw the marble, not where it actually is. They distinguish between what characters know and what they believe. They track how knowledge transfers between agents.
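To make the task concrete, here is a minimal sketch of how a Sally-Anne style false-belief probe might be posed to a model. The `call_model` callback is a stand-in for whatever completion API you use; it is not a specific library call, and the vignette wording is illustrative rather than a standardized test item.

```python
# Minimal sketch of a Sally-Anne style false-belief probe.
# `call_model` is a placeholder for whatever chat/completion call you use;
# it is not a specific library API.

from typing import Callable

FALSE_BELIEF_VIGNETTE = (
    "Sally puts her marble in the basket and leaves the room. "
    "While she is away, Anne moves the marble from the basket to the box. "
    "Sally comes back. Where will Sally look for her marble first? "
    "Answer with one word: basket or box."
)

def run_false_belief_probe(call_model: Callable[[str], str]) -> bool:
    """Return True if the model answers with Sally's (false) belief, not reality."""
    answer = call_model(FALSE_BELIEF_VIGNETTE).strip().lower()
    return "basket" in answer and "box" not in answer

# Example usage with a stubbed model, for illustration only:
if __name__ == "__main__":
    fake_model = lambda prompt: "basket"
    print(run_false_belief_probe(fake_model))  # True
```

A model that answers "box" is reporting where the marble actually is; answering "basket" requires tracking what Sally last saw, which is the false-belief distinction the classic test is built around.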

The question isn't whether LLMs can solve theory of mind tasks. They clearly can. The question is whether they're doing something like what humans do, or something entirely different that produces the same outputs.

Representation vs. Simulation

Two hypotheses dominate the debate:

  1. LLMs build genuine representations of mental states—internal structures that track who knows what
  2. LLMs simulate surface patterns—they've seen enough stories about false beliefs that they predict the next token correctly without 'understanding' belief

Our research suggests a third possibility: LLMs develop something functionally equivalent to mental state tracking, even if the mechanism differs from human cognition. The geometry of their internal representations shows systematic structure when processing belief-related content.
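One way to look for that kind of structure, sketched here as a rough illustration rather than our exact protocol, is to fit a linear probe on activation vectors extracted from belief vignettes. The arrays below are synthetic placeholders; with real hidden states, held-out accuracy well above chance would suggest that belief status is linearly decodable from the representation.

```python
# Sketch of a linear probe over hidden states, assuming you have already
# extracted per-example activation vectors for belief vignettes. The arrays
# and labels here are synthetic placeholders, not real data.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder: hidden_states[i] is the model's activation vector for vignette i;
# labels[i] is 1 if the character holds a false belief in that vignette, else 0.
hidden_states = rng.normal(size=(200, 768))
labels = rng.integers(0, 2, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.25, random_state=0
)

probe = LogisticRegression(max_iter=1000)
probe.fit(X_train, y_train)

# With real activations (not this random noise), above-chance accuracy here
# is evidence that belief status is linearly readable from the hidden states.
print(f"Probe accuracy: {probe.score(X_test, y_test):.2f}")
```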

Implications for Knowledge Systems

If LLMs can model mental states, they can potentially model what users know, believe, and need. A knowledge system that understands not just what information exists, but what the user's mental model looks like, could surface precisely the right context at the right time.
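As a toy illustration (not Penumbra's implementation), a retrieval layer could consult a lightweight user model when ranking results. The `UserModel` and `rerank` names below are hypothetical, chosen only to show the shape of the idea: track what the user has already seen and what they are assumed to believe, then re-rank candidates accordingly.

```python
# Illustrative sketch of a simple user model that a retrieval layer could
# consult: which documents the user has already seen, and what they are
# assumed to believe, so results can be re-ranked for this particular person.

from dataclasses import dataclass, field

@dataclass
class UserModel:
    seen_documents: set[str] = field(default_factory=set)         # doc IDs already surfaced
    assumed_beliefs: dict[str, str] = field(default_factory=dict)  # topic -> believed state

    def novelty(self, doc_id: str) -> float:
        """Down-weight documents the user has already seen."""
        return 0.2 if doc_id in self.seen_documents else 1.0

def rerank(candidates: list[tuple[str, float]], user: UserModel) -> list[tuple[str, float]]:
    """Re-rank (doc_id, relevance) pairs by relevance * novelty for this user."""
    scored = [(doc_id, score * user.novelty(doc_id)) for doc_id, score in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Example usage: a document the user has already seen drops below a fresh one.
user = UserModel(seen_documents={"doc-1"})
print(rerank([("doc-1", 0.9), ("doc-2", 0.7)], user))  # doc-2 ranks first
```

The interesting extension, and the harder one, is keeping `assumed_beliefs` accurate over time rather than treating the user as static.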

This is the research direction we're pursuing at Penumbra: systems that don't just retrieve information, but understand the cognitive state of the person seeking it.

Open Questions

  • How robust are these capabilities to distribution shift?
  • Can we extract and examine the mental state representations directly?
  • What happens when mental state reasoning conflicts with other objectives?
  • How do we build systems that maintain accurate user models over time?

This is part of our ongoing research into cognitive architectures for knowledge systems. Follow our research updates for more on how we're building systems that truly understand context.

Research by

Shep Bryan
Founder

Shep is the founder of Penumbra, building knowledge systems that transform how teams capture, connect, and leverage institutional intelligence for strategic decisions.
