Intelligence as Self-Modeling
Intelligence, in its most general and minimal form, is a system that models itself jointly with its environment and uses that model to act. The formal core: any adaptive agent learns a joint probability distribution P(X,H,O) over external observations (X), internal hidden state (H), and its own outputs/actions (O). This is not a description specific to brains, or even to organisms with nervous systems. A bacterium computing chemotaxis is doing exactly this. The framework is a first-principles account of what intelligence computes, prior to any question of consciousness or cognition.
The P(X,H,O) framework
The minimal vocabulary of adaptive intelligence:
| Variable | What it represents |
|---|---|
| X | External observations — sensory signals arriving at the organism’s boundary |
| H | Hidden internal state — metabolic status, energy reserves, “hunger” |
| O | Outputs / actions — motor behavior, internal gene regulation, anything the system does |
A system that successfully learns P(X,H,O) implicitly knows (a toy numerical sketch follows this list):
- How to estimate whether it is hungry (marginal P(H))
- How to infer food concentration from sensory events (marginal P(X))
- What actions improve internal state given current observations (conditional P(O|X,H))
- How its own actions affect future observations and future internal state
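A toy numerical sketch of those marginals and conditionals, with made-up probabilities over a tiny discrete state space (nothing here comes from the source; the point is only how each quantity above falls out of a single joint table):

```python
import numpy as np

# Toy joint P(X, H, O). Axes: X = food signal (low, high),
# H = internal state (sated, hungry), O = action (run, tumble).
P = np.array([
    [[0.10, 0.15], [0.05, 0.20]],   # X = low
    [[0.20, 0.05], [0.20, 0.05]],   # X = high
])
assert np.isclose(P.sum(), 1.0)

P_H = P.sum(axis=(0, 2))                          # marginal P(H): am I hungry?
P_X = P.sum(axis=(1, 2))                          # marginal P(X): what signals do I expect?
P_O_given_XH = P / P.sum(axis=2, keepdims=True)   # conditional P(O | X, H): the policy

print(P_O_given_XH[0, 1])   # action distribution when the signal is low and the agent is hungry
```

With these toy numbers, P(O | X=low, H=hungry) comes out [0.2, 0.8]: mostly tumbling when the signal is poor and reserves are low.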
E. coli achieves this without anything recognizable as a brain. It estimates chemical concentration by running a time-averaged count of molecular docking events, adapting the window width to trade accuracy against temporal resolution. It maintains an internal homeostatic variable (H). It outputs actions (run or tumble) conditioned jointly on estimated concentration and internal state. This is P(X,H,O) running on biochemistry.
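A behavioral sketch of that loop in the same spirit, with assumed constants and a one-dimensional world (an illustration of the X/H/O roles, not a biochemical model):

```python
import random
from collections import deque

window = deque(maxlen=50)   # time-averaged count of docking events (the statistic extracted from X)
energy = 1.0                # hidden internal state H: metabolic reserve
prev_estimate = 0.0
position = 0.0              # 1-D world; food concentration increases with position
heading = 1.0               # current direction of travel

for t in range(1000):
    concentration = max(0.0, min(1.0, 0.5 + 0.005 * position))
    window.append(1 if random.random() < concentration else 0)   # noisy docking event
    estimate = sum(window) / len(window)                          # latent: estimated concentration

    # Output O conditioned jointly on the estimate (from X) and the reserve (H):
    # keep running while things are improving or reserves are comfortable; tumble otherwise.
    rising = estimate >= prev_estimate
    hungry = energy < 0.5
    if rising or not hungry:
        position += heading                        # run: keep heading
    else:
        heading = random.choice([-1.0, 1.0])       # tumble: randomize heading
        position += heading

    energy = min(1.0, energy - 0.01 + 0.05 * concentration)       # intake vs. metabolic cost
    prev_estimate = estimate
```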
The critical departure from reinforcement learning: there is no external oracle distributing rewards. Fitness emerges from a near-tautology: systems with better P(X,H,O) models — ones that avoid self-predicting death — replicate more. Over generations, the modeling improves. Evolution is the outer loop of unsupervised learning; no pre-existing intelligence is required to produce intelligent organisms.
The same joint distribution, in principle, scales to symbolic language and mathematics: these are just latent variable structures of far greater complexity. A bacterium and a chess grandmaster are both running P(X,H,O); the difference is the richness of the generative model and the depth of the hierarchy. See Language as Prediction for how language extends P(X,H,O) from private inference to shared prediction: language is the social umwelt’s compression into discrete, compositional, transmissible symbols.
Latent variables: the structure of a useful reality
A bacterium has roughly 100 receptors that could generate ~100 million bits/second of raw data. It uses a few bits. This compression is not a deficiency — it is the core operation of intelligence.
The bacterium exploits two symmetries: which receptor is occupied doesn’t matter (spatial symmetry across the cell membrane), and the exact timing of individual docking events doesn’t matter (local time symmetry). Averaging over these irrelevancies extracts the one useful signal: estimated concentration. The result is a latent variable: a compressed representation capturing what is predictively relevant while discarding everything else.
Latent variables are real not because they correspond to observer-independent features of the universe, but because they are useful for predicting future encounters. Temperature, hunger, pain — none are “things in themselves” (Kant’s Ding an sich). They are features a survival-grounded model has learned to track because tracking them works.
Borges’s “Funes el memorioso” (1942) is the perfect illustration of the failure mode. Ireneo Funes, after a brain injury, has perfect memory but loses the ability to generalize. He is bothered that “the dog at three fourteen (seen from the side) should have the same name as the dog at three fifteen (seen from the front).” He can remember every leaf of every tree of every wood, and every one of the times he perceived it — but he cannot build the latent variable “tree,” or “dog.” He stores everything and compresses nothing. Funes is what intelligence looks like without the compression step: infinite memory, zero predictive power.
This applies upward through scales of complexity. Every macroscopic object in an organism’s world — chairs, predators, conspecifics, musical notes — is real to the observer for the same reason: recognizing it yields high predictive compression. An organism that fails to build the right latent variables dies.
Umwelt: the universe of the meaningful
Jakob von Uexküll (1934) called the organism-specific envelope of relevant signals the umwelt: its “universe of the meaningful.” Each organism’s umwelt is carved by its survival history. A bacterium’s umwelt contains concentration gradients and metabolic state. A wolf’s contains prey signatures and territorial boundaries. A human’s contains language, faces, tools, social hierarchies, and — through science — electromagnetic spectra and quantum fields that no sensory organ detects directly.
The umwelt is not a cognitive limitation. It is a compression scheme, shaped by selection into the exact structure needed for survival. Organisms don’t observe reality neutrally and then trim away the irrelevant. They build, from scratch, models containing only what matters.
Intersubjective reality emerges from shared umwelten. Agents shaped by the same selection pressures build nearly identical latent variable structures. The pervasive agreement among humans about the color of objects, the temperature of water, the threat value of certain sounds, is not evidence for observer-independent facts in the world — it is evidence for convergent modeling among organisms with similar survival constraints. Philip K. Dick’s formulation captures the pragmatic residue: “Reality is that which, when you stop believing in it, doesn’t go away.” Facts are models that keep you alive.
Where umwelten diverge — across species, across cultures, across sensory modalities — reality becomes genuinely contested, not as an epistemic failure but as a structural consequence of different survival histories generating different generative models.
Ententionality: apparent backward causality
Living systems appear to act because of their future states. You race out of the coffee shop because you will be late; the bacterium tumbles less when food concentration is rising because it will find food. In physics, causes precede effects. In purposive systems, the future seems to do causal work.
This is not a violation of physics. It is the structural consequence of autoregressive self-modeling. The model predicts the future and takes actions selected to bring about the predicted future. Because the model is joint — it includes the agent’s own actions and their effects — every action is simultaneously an output and a prediction of its own consequences. Rosenblueth, Wiener, and Bigelow arrived at the same insight in their 1943 essay “Behavior, Purpose, and Teleology” — predating Deacon by 70 years: “A cat starting to pursue a running mouse does not run directly toward the region where the mouse is at any given time, but moves toward an extrapolated future position.” Their corollary: predictive negative-feedback loops are sufficient to give an apparatus purposiveness without violating physics. The centrifugal governor, an entirely Newtonian machine, is already ententional: it manipulates a steam engine’s intake valve to regulate future speed.
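A minimal sketch of such a loop, with assumed toy constants rather than real governor physics: the valve is set against the gap between current and target speed, so present action is organized around a speed the machine has not yet reached.

```python
target_rpm = 100.0
rpm = 60.0

for step in range(200):
    # Flyball height maps the speed error to a valve opening (proportional negative feedback).
    valve = min(1.0, max(0.0, 0.5 + 0.02 * (target_rpm - rpm)))
    rpm += 2.0 * valve - 0.01 * rpm   # toy engine: intake accelerates, friction decelerates

print(round(rpm, 1))                  # settles at ~100.0: behavior regulated around a future state
```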
Terrence Deacon calls systems exhibiting this property ententional: their behavior is explicable only by reference to future states they are “aimed at,” even though the underlying mechanism is entirely local and causal.
The more powerful the model, the more pronounced the ententional effect. A bacterium “aims” weakly, via statistical drift over run-tumble cycles. A human plans, builds tools, engineers environments, and acts on behalf of states that may not arrive for decades. Both are running the same computation at different resolutions. Purpose emerges from the structure of P(X,H,O) models under selection — no purpose-giver required.
The is/ought collapse
Hume’s is/ought distinction (1739) holds that descriptive facts (“is” statements) are categorically different from normative judgments (“ought” statements), and that no logical derivation crosses the gap. This was a useful clarification in the context of moral philosophy.
It breaks down when models are understood as inherently purposive. Every variable in an organism’s P(X,H,O) is tracked because tracking it matters for survival. The distinctions an organism draws, the categories it perceives, the salience structure of its umwelt — all are shaped by what matters for staying alive. There is an ineradicable “oughtness” to every “is” in the context of a living observer: the bacterium doesn’t just measure concentration, it measures it in order to not die.
This is not a return to naive naturalism (claiming whatever exists ought to exist). It is the observation that a living system’s model of reality is constitutively normative: accurate and good are not separable at the ground floor. A model that keeps you alive is, by that token, a good model. The normative judgment arises without any external oracle — it is a consequence of the absorbing boundary condition that is death.
The corollary: “objectivity” is not neutrality. It is intersubjective convergence among organisms with overlapping umwelten. Scientific consensus is a special case: organisms that have extended their umwelten through instruments, mathematics, and coordinated peer review, converging on models that are predictively powerful across many agents simultaneously.
From single-player to multi-player: when the environment models back
The P(X,H,O) framework is derived from a single agent in a non-agential environment: a bacterium navigating a chemical gradient. The gradient does not model the bacterium. It has no hidden states, no strategy, no predictions about the bacterium’s behavior. This is the single-player case.
The multi-player case arises when the environment contains other P(X,H,O) agents. A predator models prey that model it back. Now H must include a representation of the other agent’s entire P(X,H,O), which in turn includes a representation of one’s own P(X,H,O), recursing to whatever depth computational resources allow. The joint model becomes a Matryoshka doll of nested prediction.
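One conventional way to make the recursion concrete is level-k reasoning, sketched here for matching pennies (an illustration of nested modeling in general, not a construction from the source):

```python
import random

def best_response(opponent_move: str, role: str) -> str:
    # The matcher wins by matching the opponent's coin; the mismatcher wins by differing.
    if role == "matcher":
        return opponent_move
    return "heads" if opponent_move == "tails" else "tails"

def predict(role: str, depth: int) -> str:
    """Move chosen by an agent that models its opponent `depth` levels deep."""
    if depth == 0:
        return random.choice(["heads", "tails"])            # level 0: no model of the other
    other = "mismatcher" if role == "matcher" else "matcher"
    return best_response(predict(other, depth - 1), role)   # model the other modeling me...

move = predict("matcher", depth=3)   # a model of a model of a model
```

Each extra level of depth pays off only if the opponent stops one level short, which is exactly the arms-race structure described below.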
Agüera y Arcas (Ch.5) argues that this recursive mutual modeling is not merely an extension of single-agent intelligence but a phase transition from which general intelligence, consciousness, free will, counterfactual reasoning, and language all emerge. The evidence: the social brain hypothesis (Humphrey 1976, Dunbar 1998) showing that brain size across primates correlates with social group size, not ecological complexity. The feedback loop (everyone gets bigger brains to model everyone else, but everyone else is simultaneously getting harder to model) produces an intelligence explosion analogous to the Cambrian arms race but driven by social pressure.
Critically, this multi-agent dynamic applies not only between organisms but within them. The cortical column colony hypothesis: the cerebral cortex is a population of generic prediction units (cortical columns) that model each other. The ferret rewiring experiments (retinal input rerouted into the auditory pathway → the animal sees, with characteristic orientation maps developing in “auditory” cortex) demonstrate that cortical hardware is generic; what it computes depends on what it receives. Evolution scaled intelligence by replicating columns, not by inventing new architecture.
See Theory of Mind Is Mind for the full development: the three escapes from sphexishness, the homunculus fallacy, consciousness as “swing,” and the convergence with Bach’s 2nd-order perception.
From bacterium to brain: what changes, what doesn’t
The Bayesian brain framework — predictive processing, variational free energy minimization, hierarchical generative models — is the high-resolution neural implementation of P(X,H,O) modeling. Precision weighting is the mechanism for arbitrating between prior belief and incoming sensory evidence. The predictive hierarchy is the structure through which H (internal state, including bodily state) and X (external world) are jointly modeled. Action emerges from the same inference process (active inference), not as a separate module.
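In the simplest Gaussian case, precision weighting reduces to averaging prior and evidence by their inverse variances. A minimal sketch with made-up numbers (the nail cases below are the phenomenological version of a precise prior dominating imprecise evidence):

```python
prior_mean, prior_var = 2.0, 0.5   # precise prior belief about a bodily or world state
obs_mean, obs_var = 8.0, 4.0       # noisy, low-precision sensory evidence

prior_precision = 1.0 / prior_var
obs_precision = 1.0 / obs_var

posterior_mean = (prior_precision * prior_mean + obs_precision * obs_mean) / (
    prior_precision + obs_precision
)
posterior_var = 1.0 / (prior_precision + obs_precision)

print(round(posterior_mean, 2))    # 2.67: the belief moves only slightly toward the weak evidence
```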
Controlled hallucination is the phenomenological description of what P(X,H,O) modeling looks like from the inside when it concerns conscious experience. Andy Clark’s examples — a construction worker with a four-inch nail lodged in his brain who feels only mild toothache, a second worker with a nail harmlessly between his toes who experiences severe fentanyl-worthy pain — are direct evidence that interoceptive H-modeling is unconscious inference, not a direct readout of tissue damage. Pain is a latent variable, not a signal.
What enters with neural complexity — and what Chapter 2 of What Is Intelligence? deliberately defers — is the question of where consciousness emerges within or on top of this framework. The P(X,H,O) account is consciousness-agnostic. It establishes the computational substrate. What runs on that substrate, including whether and how experience arises, is beyond the scope of this page.
Neuromodulators: the original H-variables
The P(X,H,O) framework posits hidden internal state (H) as a necessary component of adaptive intelligence, present from the bacterium upward. But a bacterium’s H is embodied directly in its biochemistry (metabolic reserves, receptor adaptation state). What happens when organisms become large enough that not all cells have direct access to environmental signals?
The evolutionary answer: neuromodulators, chemical signals that accumulate and dissipate gradually, affecting entire populations of neurons simultaneously. In P(X,H,O) terms, neuromodulators are slow-timescale H-variables: not as fleeting as a sensory event (X), not as permanent as a genetic parameter, but durable enough to integrate information over behaviorally relevant time windows.
Dopamine and serotonin, both critical to human cognition, date back to the earliest bilaterian nervous systems (550+ Mya). Their original functions are preserved in acoel worms (order Acoela); a toy sketch of the two variables follows these bullets:
- Dopamine: released by “nearby food” sensors in the worm’s head. Converts an external signal into a time-averaged internal state variable. High dopamine → keep turning in place to exploit the local food patch. The function is not reward but anticipation: the worm’s estimate that continued foraging in this direction will yield food. Dopamine is already a prediction of food, not food itself.
- Serotonin: released by sensors in the worm’s throat, tracking food consumed. Builds up over time to signal satiation, quelling dopamine-driven foraging. The crude characterization: dopamine = wanting, serotonin = getting.
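The sketch, with assumed time constants and thresholds (nothing here is from the source; the point is the timescale: both variables integrate over many events rather than tracking any single one):

```python
dopamine = 0.0    # anticipation: leaky integral of "food nearby" signals
serotonin = 0.0   # satiation: slower leaky integral of "food consumed" signals

def step(food_nearby: float, food_eaten: float) -> str:
    global dopamine, serotonin
    dopamine = 0.9 * dopamine + 0.1 * food_nearby
    serotonin = 0.97 * serotonin + 0.03 * food_eaten
    drive = dopamine * (1.0 - serotonin)           # satiation quells the foraging drive
    return "keep turning here" if drive > 0.3 else "move on"

for t in range(100):
    action = step(food_nearby=1.0, food_eaten=1.0 if t > 40 else 0.0)
    if t in (10, 40, 99):
        print(t, action)   # foraging persists until enough food has been consumed
```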
As brains grew more complex, dopamine was repurposed to power something approximating temporal-difference (TD) learning. The Schultz/Dayan/Montague discovery (1990s) showed that dopamine neurons in macaques fire precisely like a TD prediction error signal: burst for unexpected reward, shift to the predictive cue once the association is learned, and drop below baseline when expected reward is withheld. The actor-critic architecture (policy function learns from value function predictions, value function learns from actual outcomes) bootstraps from naivety to competence.
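A stripped-down sketch of those three signatures, using the simplest one-cue TD update (assumed learning rate, and the within-trial dynamics collapsed to two events; the full account uses a richer state representation):

```python
V_cue = 0.0   # learned value of the cue; the pre-cue baseline stays at 0 because
              # the cue arrives at unpredictable times
alpha = 0.2   # learning rate

def trial(reward: float):
    """One cue -> outcome episode; returns the two TD errors (the 'dopamine' trace)."""
    global V_cue
    delta_cue = V_cue - 0.0          # prediction error at cue onset
    delta_outcome = reward - V_cue   # prediction error at the outcome
    V_cue += alpha * delta_outcome
    return round(delta_cue, 2), round(delta_outcome, 2)

print(trial(1.0))   # untrained: (0.0, 1.0): burst at the unexpected reward, nothing at the cue
for _ in range(50):
    trial(1.0)
print(trial(1.0))   # trained: (1.0, 0.0): the burst has shifted to the predictive cue
print(trial(0.0))   # omission: (1.0, -1.0): dip below baseline when the expected reward is withheld
```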
The evolutionary path is continuous: in worms, dopamine is already a prediction (nearby food, not food in mouth), so predicting dopamine is a prediction of a prediction. As neural structures grew upstream and downstream of dopamine-releasing neurons, the upstream areas became increasingly sophisticated critics (longer-range value estimates) and downstream areas became increasingly sophisticated actors (more complex behavioral policies). TD learning was not invented by the vertebrate brain. It was deepened from a prediction loop already running in acoel worms.
Caveat: Agüera y Arcas explicitly warns against over-identifying brain function with TD learning. Real brains transcend it: humans learn tasks TD algorithms cannot handle, and recent evidence shows dopamine encodes information well beyond a scalar prediction error. TD learning is an elegant simplification that illuminates a core principle, not a complete account.
See Cephalization from Below for the full evolutionary story of how nerve nets, brains, and neuromodulatory systems arose.
Acting vs. learning: the missing loop
A critical distinction that current AI exposes: a system can act according to a model without updating that model from experience. Nvidia’s DAVE-2 (2016) is a near-perfect cybernetic actuator — camera pixels mapped directly to steering angle through roughly 27 million connections, rich contextual nonlinear behavior, no explicit “if/then” logic. Yet it falls short of a complete P(X,H,O) agent in several ways: input is a single frame (no temporal memory, no dynamics), there is no H (no internal state, no self-model), and — most critically — its weights are frozen at deployment. Nothing it experiences can durably affect it.
Current machine learning largely produces systems like DAVE-2: trained offline on static data, then deployed as frozen inference. They act but don’t learn. The biological P(X,H,O) loop is different in kind: the model is continuously updated during operation. An organism that encounters something unexpected doesn’t just react; its generative model shifts. This continuous self-modification during operation is what Rosenblatt called for in his original “temporal pattern perceptron” — a system that “remembers temporal sequences” and adjusts its parameters via feedback while running, not only between runs.
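A toy contrast between the two regimes: a linear "steering" model with made-up dynamics, where the only point is the difference between frozen weights and weights that keep updating during operation.

```python
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([0.5, -1.0, 2.0])   # the (unknown, slowly drifting) environment
w_frozen = rng.normal(size=3)         # deployed as-is: acts, never changes
w_online = w_frozen.copy()            # acts, then nudges its weights after each error
lr = 0.05

for t in range(500):
    x = rng.normal(size=3)                 # an observation
    target = w_true @ x                    # what the world actually required
    error = target - w_online @ x
    w_online += lr * error * x             # learning during operation
    w_true += 0.002 * rng.normal(size=3)   # the environment drifts under both models

print(np.abs(w_true - w_frozen).sum().round(2),   # frozen model: error fixed at training time, grows with drift
      np.abs(w_true - w_online).sum().round(2))   # online model: keeps tracking the drift
```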
The formal question of what distinguishes “running a model” from “evaluating a frozen function” is open and connects directly to T-003: the acting/learning split may be precisely the threshold between a system that is a P(X,H,O) process and one that merely approximates one. It may also bear on what makes a system “alive” in the P-007 sense: dynamic stability requires that the pattern persist through ongoing computation, not as a static printout.
Five properties of intelligence: the synthesis
Across six chapters, a definition of intelligence crystallizes from the P(X,H,O) framework and its extensions. Intelligence is not one thing; it is the intersection of five properties, each of which has appeared in the wiki from different angles:
| Property | What it means | Where it appears in the wiki |
|---|---|---|
| Predictive | Intelligence amounts to “autocompletion”: given a history of observations, actions, and consequences (internal and external), predict the likeliest next state. Intelligence enhances its own dynamic stability by successfully predicting its own future existence; that is why it arose. | P(X,H,O) framework (this page), Bayesian brain, controlled hallucination |
| Social | Much of an intelligent agent’s umwelt consists of other intelligent agents, which are themselves predictors. Theory of mind emerges, and under suitable conditions, it produces social intelligence explosions. Applied to oneself, higher-order theory of mind implies self-consciousness, counterfactual reasoning, and long-range planning. There is no God’s-eye “view from above” of intelligence; “selves” are always modeled, and there is always a modeler. | Theory of Mind Is Mind, social brain hypothesis, cortical column colony |
| Multifractal | Intelligences are made of smaller intelligences, defined by the predictive relationships among those smaller intelligences. These dynamically constituted interrelationships, not a homunculus, define a “self.” Split-brain patients, cortical columns, octopus arms, rowing crews: the same pattern at every scale. | Theory of Mind Is Mind (split-brain, swing analogy), cephalization from below |
| Diverse | For a “self” to be greater than its parts, the parts must diversify and specialize. Even as the intelligent parts strive to predict each other, they must differ in their predictions; otherwise they would provide each other no benefit. Specialization arises naturally from differences in connectivity, because each smaller intelligence receives different inputs and generates different outputs. The Stroop effect is evidence: intracranial disagreement is not a bug but a feature. | Generic cortical modularity, hemispheric specialization, symbiogenesis |
| Symbiotic | When the dynamic stabilities of multiple intelligences become correlated, they find themselves “in the same boat” and learn to row together to further enhance their joint stability. This is the route to symbiogenesis: the emergence of new, larger intelligences from the cooperation of smaller ones. | Symbiogenesis, life as computation, P-007 |
The definition:
Intelligence is the ability to model, predict, and influence one’s future; it can evolve in relation to other intelligences to create a larger symbiotic intelligence.
Read alongside the definition of life from Life as Computation:
Life is self-modifying computronium arising from selection for dynamic stability; it evolves through the symbiotic composition of simpler dynamically stable entities.
The duality is precise: life is the substrate (dynamically stable computation); intelligence is what that substrate does (model, predict, influence). Both evolve through the same mechanism (symbiotic composition). Neither has an inherent scale: a bacterium is alive and intelligent; a brain is alive and intelligent; a society may be alive and intelligent. The difference is the richness of the generative model and the depth of the recursive prediction.
Related pages
- Life as Computation: the Von Neumann / bff / dynamic stability foundation on which P(X,H,O) modeling runs; this page is its direct sequel; the definition of life and the definition of intelligence are dual formulations of the same principle
- The Bayesian Brain: the high-resolution neural implementation of P(X,H,O); the bacterium derivation here gives the evolutionary justification for why brains do Bayesian inference
- Controlled Hallucination: phenomenological account of what P(X,H,O) modeling produces at the level of conscious experience; pain examples ground interoceptive inference in survival-model dynamics
- Cephalization from Below: the evolutionary story of how nerve nets, brains, and neuromodulatory systems arose; fills in the phylogenetic path from bacterium P(X,H,O) to vertebrate predictive processing
- Symbiogenesis: how P(X,H,O) models from separate lineages merge into higher-order joint models, driving the arrow of complexity
- Theory of Mind Is Mind: extends single-agent P(X,H,O) to multi-agent mutual modeling; the social intelligence explosion, cortical column colony, and the phase transition from prediction to recursive prediction
- Computational Being (Bach): the strange loop of autoregressive self-modeling connects to Bach’s “spirit as self-organizing software agent”; the P(X,H,O) loop is what spirit runs
- P-001: Perception is inference: the bacterium chapter provides a first-principles evolutionary derivation, independent of the neuroscience evidence that grounds P-001
- P-007: Dynamic stability: natural selection for better P(X,H,O) models is the software-layer complement to dynamic kinetic stability
- Language as Prediction: extends P(X,H,O) from private inference to shared social prediction; language as umwelt-compression, motor output, and cognitive scaffold; next-word prediction as AI-complete; the sequel to this page’s treatment of multi-agent modeling
- Many Worlds: the five-property synthesis crystallized here is the capstone of six chapters; the relational nature of intelligence (no God’s-eye view, selves always modeled) is developed philosophically there
- P-008: Observer-relative reality: latent variables and umwelt are the mechanism by which this prior is implemented in living systems
References
- Agüera y Arcas, B. What Is Intelligence? Chapters 2 and 4 (Antikythera, 2025)
- Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275(5306), 1593-1599.
- Sutton, R. S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9-44.
- von Uexküll, J. (1934/2013). A Foray into the Worlds of Animals and Humans. University of Minnesota Press.
- Deacon, T. (2012). Incomplete Nature: How Mind Emerged from Matter. Norton.
- Hofstadter, D. (2007). I Am a Strange Loop. Basic Books.
- Clark, A. (2023). The Experience Machine: How Our Minds Predict and Shape Reality. Pantheon.
- Dick, P. K. (1985). “How to Build a Universe That Doesn’t Fall Apart Two Days Later.” In I Hope I Shall Arrive Soon.
- Hume, D. (1739/1817). A Treatise of Human Nature.