Agency and agents
2,400 years ago, Aristotle observed something that broke with a doctrine his teacher Plato had inherited from Socrates: the belief that no one does wrong voluntarily — that if you know what is good, you do good. Aristotle looked around and said: no. People know what is good and still choose otherwise. He gave it a name: akrasia (ἀκρασία) — weakness of will. And with that he inaugurated a question that humanity still hasn't resolved: if I know what I should do, why don't I do it?
Aristotle also gave us the word that structures this chapter: agency. In philosophy, an agent is a being with the capacity to act intentionally in the world — to perceive its environment, deliberate, and execute actions with purpose. For Aristotle, agency was inseparable from freedom: acting voluntarily, with knowledge of the circumstances and for one's own reasons, is what distinguishes us from objects that simply react to external forces.
The intention-action gap
Modern psychology put numbers to what Aristotle saw. Decades of research on the intention-action gap demonstrate that your conscious intentions — those decisions you make with clarity and conviction — explain only between 18% and 23% of the variance in what you actually end up doing (Armitage & Conner, 2001; McEachan et al., 2011, 2016). The other ~80% of your behavior comes from automatic habits, environmental cues, emotional states, accumulated fatigue, and social context. Forces that operate beneath your consciousness.
A 2023 meta-analysis put it in brutal perspective: the successful translation of a positive intention into actual behavior is nearly at chance level (Rhodes et al., 2023). Even when an intervention makes a person want something much more, that large increase in intention barely moves the needle on whether they actually do it (d = 0.15; Rhodes & Dickau, 2012).
Why? Because your brain is not a coherent unit. It's a committee where multiple systems compete: the automatic system (fast, emotional, reactive), the deliberate system (slow, rational, exhaustible), biochemistry, social context, and — crucially — the design of the environment. Your intention to save money competes against a delivery app designed to make you buy with one tap. Your intention to eat well competes against a supermarket that puts candy at the checkout. Your intention to organize your finances competes against bank statements designed so you don't understand how much you're paying in fees.
Janet Polivy and C. Peter Herman at the University of Toronto formalized this as the False Hope Syndrome (2000, 2002): a cycle where you decide to change, you feel good about deciding (dopamine comes with the decision, not the action), you fail in execution, you blame yourself, and eventually you decide to change again. People repeat the same New Year's resolution an average of 10 times in their lifetime.
The self-help industry — worth billions of dollars a year — sells the wrong solution: more information, more motivation, more willpower. A study from the University of Washington found that among people who voluntarily enrolled in a self-help program, only 20% read the program's book in full and just 2-4% followed through on all of its suggestions. Reading about changing feels like changing. The brain doesn't distinguish well between planning an action and executing it.
The problem was never a lack of information. The problem is that human agency — that capacity to act with purpose — breaks down when the system it operates in is more complex than the mind can manage.
From thought to machine
Philosophy gave us the concept of agency. Computer science extrapolated it.
In 1956, a group of researchers gathered at Dartmouth College and coined the term artificial intelligence. It wasn't a technical achievement — it was a declaration of intent: to build machines that think. What followed were decades of prototypes that, seen today, look primitive but asked the right questions.
ELIZA (1964–1966), created by Joseph Weizenbaum at MIT, simulated a psychotherapist using pattern matching. It understood nothing — it simply returned the user's words reformulated as questions. But something unexpected happened: people who interacted with ELIZA attributed genuine understanding to it, confided real problems, became emotional. Weizenbaum was disturbed. They called it the ELIZA effect — the human tendency to project intelligence onto systems that merely respond.
ELIZA - 1966, MIT
> I feel tired all the time.
Why do you say you feel tired all the time?
> Because I feel like nothing I do is enough.
Does it concern you that nothing you do is enough?
> Yes, especially with my family.
Tell me more about your family.
> My mother always expects more from me.
Your mother expects more from you?
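The mechanism behind that transcript fits in a few lines. Below is a minimal, illustrative Python sketch of ELIZA-style pattern matching (not Weizenbaum's actual DOCTOR script, just the general idea): match a phrase, flip the pronouns, and hand the user's own words back as a question.

```python
import re

# Illustrative rules in the spirit of ELIZA's DOCTOR script (not the
# original patterns): match a phrase, reflect it back as a question.
RULES = [
    (r"i feel (.*)", "Why do you say you feel {0}?"),
    (r"because (.*)", "Does it concern you that {0}?"),
    (r"my (mother|father) (.*)", "Your {0} {1}?"),
]

# Swap first person for second person so the reflection reads naturally.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

def reflect(fragment: str) -> str:
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.split())

def respond(utterance: str) -> str:
    cleaned = utterance.lower().strip(" .!?")
    for pattern, template in RULES:
        match = re.match(pattern, cleaned)
        if match:
            return template.format(*(reflect(g) for g in match.groups()))
    return "Tell me more."  # fallback when no rule matches

print(respond("I feel tired all the time."))
# Why do you say you feel tired all the time?
print(respond("My mother always expects more from me."))
# Your mother always expects more from you?
```

No model of the world, no memory, no understanding: just string substitution. That is what people were confiding their problems to.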
In the decades that followed, AI advanced in closed domains — playing chess, classifying images, recommending products — but kept hitting the same wall: natural language. Understanding what a person says, with its ambiguity, implicit context, and a thousand ways of saying the same thing, was a problem no architecture could solve well. And without understanding language, a machine cannot receive open-ended instructions, reason about novel situations, or decide what to do when facing an unforeseen problem. It could execute rules — but it couldn't think about which rule to apply.
That changed with a single idea.
The Transformer revolution
On June 12, 2017, eight researchers from Google Brain and the University of Toronto published a paper with a title that read almost like a manifesto: "Attention Is All You Need" (Vaswani et al., 2017). Presented at NeurIPS 2017, the paper introduced the Transformer architecture — a model based entirely on attention mechanisms that eliminated the need for recurrent networks (RNNs) and convolutional networks (CNNs) for sequence processing.
The core idea was elegant: instead of processing a sentence word by word (as RNNs did), the Transformer looks at all words simultaneously and calculates the relevance of each one relative to the others. That operation — self-attention — enabled massively parallel model training and, for the first time, scaled language processing to previously unthinkable dimensions. The results were immediate. The Transformer architecture became the foundation of virtually every language model that followed.
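To make self-attention concrete, here is a minimal single-head scaled dot-product attention computation in plain NumPy. The projection matrices are random stand-ins for the learned weights of a real model; a production Transformer stacks many such heads and layers.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    X: (seq_len, d_model) token embeddings, one row per token.
    Wq, Wk, Wv: projection matrices (random here, learned in a real model).
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # every token scores every other token
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                            # weighted mix of all positions

# Toy example: a "sentence" of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)        # (4, 8)
```

Because every position is computed against every other position in one matrix multiplication, the whole sequence can be processed in parallel, which is what made training at scale practical.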
| Model | Date | Milestone |
|---|---|---|
| GPT-1 | June 2018 | OpenAI demonstrates that a pre-trained Transformer can solve diverse tasks. 117 million parameters |
| ChatGPT | November 2022 | GPT-3.5 fine-tuned with human feedback (RLHF). 100 million users in two months. AI leaves the lab |
| LLaMA | February 2023 | Meta publishes open weights. Democratizes access and sparks an explosion of derivative models |
| Claude 3.5 Sonnet | June 2024 | Anthropic ships the model that changed code writing: 49% on SWE-bench, outperforming larger and more expensive models |
| DeepSeek-R1 | January 2025 | Open-source reasoning model (MIT license) trained for ~$6M. Matches OpenAI o1. NVIDIA loses $589B in a single day |
| Gemini 3 Pro | November 2025 | Google launches its most powerful model at the time: 100% on AIME 2025, 76% on SWE-bench |
| Qwen 3.5 | February 2026 | Alibaba releases a native multimodal model with 397B parameters (17B active per inference), 1M token context, 201 languages. Open-weight and designed for agents |
What we're living today is a geopolitical race. The United States leads with high-performance proprietary models — OpenAI, Anthropic, Google — while China responds with a different strategy: open-source, efficient, and cheap models.
U.S. chip export restrictions, designed to slow China down, are having the opposite effect, forcing Chinese engineers to optimize every token, every operation, every training dollar.
From response to action: what makes an agent an agent
When ChatGPT arrived in late 2022, the world discovered what a language model could do. But for most of 2023, the majority of models still lacked function calling — the capability that turns a language model into an agent: the ability to call functions, which is to say, to use tools.
In essence, an agent is exactly that: a language model that runs in an environment with tools. An agent, driven by an LLM, receives the user's prompt, looks at the list of tools at its disposal, and decides which ones to use and how.
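Here is a sketch of what that looks like in practice. The client (`llm.chat`), the tool name, and the response shape are hypothetical stand-ins rather than any specific vendor's API; the tool definition follows the JSON Schema convention that current function-calling APIs generally share.

```python
import json

# Tool definition the model sees alongside the prompt. Names here
# (get_account_balance, llm.chat) are illustrative stand-ins.
TOOLS = [{
    "name": "get_account_balance",
    "description": "Return the current balance of the user's bank account.",
    "parameters": {
        "type": "object",
        "properties": {"account_id": {"type": "string"}},
        "required": ["account_id"],
    },
}]

def get_account_balance(account_id: str) -> dict:
    # Placeholder implementation; a real tool would call the bank's API.
    return {"account_id": account_id, "balance": 1240.57, "currency": "USD"}

def handle(prompt: str, llm) -> str:
    # 1. The model receives the prompt and the list of tools.
    reply = llm.chat(prompt, tools=TOOLS)
    # 2. If it decides a tool is needed, it returns a structured call,
    #    e.g. {"tool": "get_account_balance", "arguments": {"account_id": "123"}},
    #    instead of prose.
    if reply.get("tool") == "get_account_balance":
        result = get_account_balance(**reply["arguments"])
        # 3. The tool result goes back to the model, which writes the final answer.
        return llm.chat(prompt, tool_result=json.dumps(result))
    # 4. Otherwise it answered directly, text only.
    return reply["text"]
```

The key shift is in step 2: the model stops being limited to prose and starts emitting actions that the surrounding program executes on its behalf.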
Without function calling, an LLM can only respond with text. It can answer history questions, explain physics concepts, write a poem, or summarize a document — but everything it produces comes from what it learned during training, not from real-time data. You ask how much you spent this month and it can't know — it has no access to your bank. You ask about the weather and it responds with what seems right, not what is right now. And sometimes, with total confidence, it says things that are partially or completely false — what's known as hallucinations. It's a statistical model of language: it predicts the most probable next word; it doesn't verify facts. It can say anything, but it can't do anything.
The path toward function calling started with two pieces of research. In October 2022, Shunyu Yao et al. published ReAct, a method where the model alternates between reasoning and acting — for example, querying a Wikipedia API to verify a fact before responding, instead of making it up (published at ICLR 2023). In February 2023, Timo Schick et al. (Meta AI) published Toolformer, a model that learned on its own when it needed an external tool and how to invoke it with the correct arguments.
Today the word AI is everywhere, and the concept of agents floats around it. But most of what is sold as an "agent" isn't one. An email filter that moves messages to folders is not an agent — it's automation: fixed rules that follow a predefined script. If X happens, do Y. Always. Without variation.
An agent is something else. It receives a goal and decides how to achieve it. It reasons about the situation, chooses which tools to use, executes actions, observes the results, and adjusts its strategy if something fails.
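That loop can be written down in a dozen lines. The sketch below assumes a hypothetical `llm.decide` call that returns structured output and a plain dictionary of tools; real agent frameworks differ in the details, but the skeleton is the same: reason, act, observe, adjust, and stop when the goal is met or the step budget runs out.

```python
def run_agent(goal: str, llm, tools: dict, max_steps: int = 10) -> str:
    """Minimal agent loop: reason, act, observe, adjust (illustrative only)."""
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # The model reasons over the goal plus everything observed so far and
        # proposes the next action as structured output, e.g.
        # {"action": "search_web", "arguments": {...}} or {"action": "finish", ...}.
        decision = llm.decide(history, tool_specs=list(tools))
        if decision["action"] == "finish":
            return decision["answer"]
        tool = tools[decision["action"]]
        try:
            observation = tool(**decision["arguments"])
        except Exception as error:          # failures are fed back, not fatal
            observation = f"Tool failed: {error}"
        history.append(f"{decision['action']} -> {observation}")  # observe, then adjust
    return "Stopped: step budget exhausted before reaching the goal."
```

The fixed rules of an email filter never leave the script they were given; everything interesting about an agent happens inside this loop, where the next step depends on what the last one returned.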