The new LLMs are so sophisticated that they can speak, translate, and do many other things that often impel us to attribute intelligence to them, and to ask whether we might also attribute consciousness to them in the near future. This question leads naturally to the idea that a deep understanding of the computational side of thinking machines could also benefit our understanding of human consciousness. So, can consciousness be understood from an engineering perspective?
Under what conditions could a machine have what we call subjective experience—a phrase that probably best describes what it means to be conscious? Today, let’s talk about Michael Graziano’s Attention Schema Theory.
David Chalmers is the philosopher who coined a term without which no discussion of consciousness can proceed: the hard problem of consciousness. From this perspective, Graziano’s ideas answer Chalmers’s meta-problem of consciousness (why we believe we have subjective experience) rather than directly solving the hard problem itself.
Graziano responds to Chalmers:
“Chalmers has now also offered the term meta-problem, which refers to why we think we have a hard problem. Maybe we don’t. Maybe there’s no fundamentally inexplicable, nonphysical essence in us. Maybe our task as scientists is to explain why people tend to believe in a hard problem in the first place. The attention schema theory fits this second approach. When you boil down the theory to its essence, it is an explanation for how a biological machine falsely believes in a hard problem. When the machine accesses its attention schema—a simplified, cartoonish account of its own internal processes—it is informed that it contains a private, ghostly, inner property of consciousness.” (1, p. 92)
Attention Schema Theory
After many years of research, Graziano formulated the Attention Schema Theory. The central role belongs to attention—a special feature of the brain that focuses its limited resources on some object in order to process it in greater depth.
This ability evolved gradually, starting as overt attention: selecting an item of interest, filtering out other information, and triggering automatic responses. The brain structure responsible here is the tectum.
Animals like crabs and frogs exhibit this ability. A simple principle from how a crab’s eye works illustrates the idea well: signals compete, and the winning signal determines the animal’s movement. This computationally simple process offers insight into consciousness (1, p. 10).
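To make the principle concrete, here is a minimal winner-take-all sketch in Python. The signal labels and values are invented for illustration; only the principle itself, competing signals with a single winner driving behavior, comes from the passage above.

```python
# Winner-take-all selection: a toy model of overt attention, loosely
# inspired by the crab-eye example. Signal values are hypothetical.

def winner_take_all(signals: dict[str, float]) -> str:
    """Return the label of the strongest signal; all others are suppressed."""
    return max(signals, key=signals.get)

# Hypothetical input: salience detected in three regions of view.
signals = {"left": 0.2, "center": 0.9, "right": 0.4}

print(f"Attend to: {winner_take_all(signals)}")  # -> Attend to: center
# The winning signal alone determines the movement; the rest are filtered out.
```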
Which animals are conscious? The underlying brain structures began to emerge in reptiles, are probably present in birds, and are definitely present in mammals. Many animals, especially those with complex behavior like octopuses and mammals, likely have internal awareness of themselves and the world. But do they have subjective experience?
A major step came with the appearance of the cerebral cortex. Its emergence marked a huge shift:
“It’s not just a bigger processor stacked on top of smaller, earlier models like the tectum from the frog brain. It’s a fundamentally different kind of processor. It’s a machine honed to sift a vast amount of information and winnow it down to a small subset. That subset is processed in a deep, thorough manner and ultimately guides behavior.” (1, p. 30)
This is covert attention, in contrast to overt attention. Overt attention grasps an object directly, by physically orienting toward it; covert attention instead focuses internally, using the massive computational machinery of the cortex.
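The cortical trick of sifting a vast amount of information down to a small, deeply processed subset can also be sketched computationally. The following toy example is not Graziano’s model; it simply combines softmax weighting with top-k selection, the same flavor of mechanism used in the attention layers of modern neural networks, to winnow many inputs down to a few.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Normalize raw salience scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def covert_attention(items: list[str], scores: list[float], k: int = 2):
    """Winnow many inputs down to the k most salient, with their weights."""
    ranked = sorted(zip(items, softmax(scores)),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical items and salience scores, invented for illustration.
items = ["apple", "birdsong", "itch", "memory of lunch", "doorbell"]
scores = [2.0, 0.5, 0.1, 0.3, 3.0]

print(covert_attention(items, scores))
# Only the highest-weighted items survive for deep processing.
```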
However, attention—whether overt or covert—must be controlled to be effective. For this purpose, the brain has an internal model that monitors attention: the attention schema.
So the brain creates a schema that represents attention. This schema captures high-level properties and deeply processed information, while leaving out the non-essential physical features of the object of attention and saying nothing about the physical mechanics of attention itself.
The information within it provides a cartoonish account of how the highest levels of cortical attention take possession of items. There is no simple physical, roving eyeball, as in the case of overt attention. Instead, the cartoonish account describes an essence with no specific physical substance but with a location vaguely inside you, which can take temporary possession of items—of apples, sounds, thoughts—and drop others (1, p. 43).
And because the schema is detail-poor, it describes attention as something without physical properties: as a mental experience.
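As a toy illustration (my own sketch, not from the book), an attention schema can be modeled as a second, simplified representation layered on top of the attention machinery: it records what is attended to and roughly how vividly, but deliberately omits the mechanism.

```python
from dataclasses import dataclass

@dataclass
class AttentionState:
    """The full mechanistic state: target plus low-level machinery."""
    target: str
    neural_gain: float                  # hypothetical physical parameter
    competing_signals: dict[str, float]

@dataclass
class AttentionSchema:
    """The cartoonish self-model: what is attended, nothing about how."""
    aware_of: str
    vividness: str                      # coarse label, no physical quantities

def build_schema(state: AttentionState) -> AttentionSchema:
    """Compress the mechanistic state into a detail-poor description."""
    vividness = "vivid" if state.neural_gain > 0.5 else "faint"
    return AttentionSchema(aware_of=state.target, vividness=vividness)

state = AttentionState("apple", neural_gain=0.8,
                       competing_signals={"sound": 0.3, "itch": 0.1})
print(build_schema(state))
# AttentionSchema(aware_of='apple', vividness='vivid')
# A system reading only this schema finds awareness with no physical detail.
```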
Artificial Consciousness
If we can engineer consciousness in principle, what conditions must we meet to implement it?
“I think of the attention schema theory as an engineering answer to the alchemical mystery of consciousness. If you build a machine according to the theory, putting in the correct internal models and giving it cognitive and linguistic access to those models, then the machine will have the capabilities that you engineered into it.” (1, p. 119)
First of all, we need to create an internal model. An internal model consists of information, which we can measure objectively, and unlike a biological brain, an AI system lets us inspect and measure that information under the hood far more easily.
So let us begin to build, in practical terms, a conscious machine using the attention schema as a guide (a minimal sketch follows the list):
- The machine needs artificial attention.
- It needs an attention schema—an internal model that describes attention in a general way and informs the machine about subjective consciousness.
- The machine needs the right range of content—likely the hardest part—starting with sensory inputs like vision.
- It needs sophisticated search machinery that gives it cognitive and linguistic access to those internal models, so it can report on them.
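Below is a speculative skeleton of how these four pieces might fit together. Every class name and threshold is invented for illustration; the design simply wires up the earlier sketches (attention as selection, the schema as a detail-poor self-model) behind a query interface, echoing the “cognitive and linguistic access” from the Graziano quote above.

```python
class ConsciousMachineSketch:
    """A speculative toy combining the four ingredients listed above."""

    def __init__(self, sensory_content: dict[str, float]):
        self.content = sensory_content   # (3) internal models of the world
        self.focus = None                # (1) current target of attention
        self.schema = {}                 # (2) simplified model of attention

    def attend(self) -> None:
        """(1) Artificial attention: select the most salient content item."""
        self.focus = max(self.content, key=self.content.get)

    def update_schema(self) -> None:
        """(2) Attention schema: a cartoonish, non-mechanistic description."""
        self.schema = {"aware_of": self.focus,
                       "nature": "subjective and non-physical"}

    def query(self, question: str) -> str:
        """(4) Search / linguistic access: report on the internal models."""
        if question == "What are you aware of?":
            return f"I am aware of the {self.schema['aware_of']}."
        if question == "What is your awareness made of?":
            # The schema omits mechanism, so only a ghostly essence is reported.
            return f"It is {self.schema['nature']}; it has no describable parts."
        return "I don't know."

machine = ConsciousMachineSketch({"apple": 0.9, "sound": 0.3})
machine.attend()
machine.update_schema()
print(machine.query("What are you aware of?"))
print(machine.query("What is your awareness made of?"))
```

Asked what its awareness is made of, such a machine would insist, as we do, on something non-physical, because its only source of information about its own attention is the detail-poor schema.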
Graziano believes that by giving a machine these necessary parts, we may achieve artificial consciousness within the next decade.
Today’s large language models are often dismissed as “stochastic parrots.” Yet, at the same time, they are a compelling example of a complex system composed of very simple parts. We see many such systems in nature and society: insect colonies, the human genome, the brain. The brain ranks among the most complex systems we know, yet its basic elements, neurons, remain quite simple; they primarily transmit signals. When billions of neurons interconnect, they create highly sophisticated behaviors.
As Melanie Mitchell writes, describing ant colonies:
“Yet put half a million of them together and the group as a whole behaves as a hard-to-predict ‘superorganism’ with sophisticated, and sometimes frightening, ‘collective intelligence.’ More is different. Similar stories can be told for neurons in the brain, cells in the immune system, creativity and social movements in cities, and agents in market economies. The study of complexity has shown that when a system’s components have the right kind of interactions, its global behavior—the system’s capacity to process information, to make decisions, to evolve and learn—can be powerfully different from that of its individual components.” (2, p. 1)
From a computational perspective, this becomes a fascinating entry point to understanding how emergent intelligence might arise—not in spite of, but precisely because of, the simplicity of individual components interacting in complex ways.
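A classic way to see “more is different” in code is a toy cellular automaton. In the sketch below (Conway’s Game of Life, purely illustrative), each cell obeys one trivial local rule, yet a five-cell “glider” emerges that travels across the grid, although no individual cell encodes anything about movement.

```python
from collections import Counter

def step(live: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """One generation of Life: a dead cell with exactly 3 live neighbors
    is born; a live cell with 2 or 3 live neighbors survives."""
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}  # five cells, one rule

cells = glider
for _ in range(4):
    cells = step(cells)

# After 4 steps the whole pattern has moved one cell diagonally:
print(cells == {(x + 1, y + 1) for (x, y) in glider})  # -> True
```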
Bibliography:
1. Michael Graziano, Rethinking Consciousness: A Scientific Theory of Subjective Experience, 2021
2. Melanie Mitchell, How Can the Study of Complexity Transform Our Understanding of the World?, Big Questions Online, January 20, 2014
3. John R. Searle, Minds, Brains, and Programs, Behavioral and Brain Sciences, 3(3), 417–457, 1980
4. Mukund Sundararajan, Ankur Taly & Qiqi Yan, Gradient-Based Feature Attribution in Explainable AI: A Technical Review, 2024
5. Melanie Mitchell, Artificial Intelligence: A Guide for Thinking Humans, 2019
6. Douglas R. Hofstadter, Gödel, Escher, Bach: An Eternal Golden Braid, 1979
7. David J. Chalmers, Could a Large Language Model be Conscious?, Boston Review, 2023
8. David J. Chalmers, Reality+: Virtual Worlds and the Problems of Philosophy, 2022
9. Melanie Mitchell, Complexity: A Guided Tour, 2009
10. Margaret A. Boden, Mind as Machine: A History of Cognitive Science (2 vols.), 2006
Image: Ants create a bridge (illustration).
Credit: Image generated by ChatGPT (OpenAI)