Your Prompts Have Two Dimensions. You're Only Optimizing One.
A cognitive theory of why your best prompts work — and why your worst ones fail in ways you can't diagnose.
Here is something that has been bothering me about prompt engineering: nobody can explain why it works.
We have thousands of guides, tip sheets, and best practices. We know that adding “think step by step” improves reasoning. We know that providing examples helps. We know that specifying a persona changes the output. But ask anyone why these techniques work — what the underlying mechanism is — and you get hand-waving. “It activates the right patterns.” “It gives the model more context.” These are descriptions of what happens, not explanations of why.
I think I have found the explanation. And it comes from an unexpected place: a cognitive science model of what happens when you make eye contact with a stranger on a train.
The phenomenon that started this
Earlier this year, I was working on a formal model of a cognitive phenomenon I call micro love — the involuntary, instantaneous construction of an entire relationship narrative (from meeting through breakup) triggered by a split-second encounter with a stranger. It sounds like a curiosity, but when you formalize its mechanics, you discover something profound about how generative systems work.
The human mind, in that moment, takes a single data point (a glance, a micro-expression) and generates a complete, architecturally detailed, temporally compressed narrative — without any further input from the source. The simulation is informationally closed (no new data enters after the trigger) but self-sustaining (each narrative moment generates the next). It runs on its own output.
If that sounds familiar, it should. It’s exactly what a language model does when it receives a prompt.
The two conditions that make it fire
When I formalized the triggering mechanism for this cognitive phenomenon, I discovered that it requires two independent conditions to be satisfied simultaneously. Not one. Two. And they operate on different aspects of the stimulus.
Condition 1: Template Specificity (T1)
The triggering input must achieve a precise fit with a deeply stored internal template. In the cognitive model, this means the stranger’s micro-expression or quality of gaze matches a relational pattern that’s been built over years of experience. The fit must be exact — like a cryptographic key matching a lock. A vague or generic input doesn’t activate anything.
In an LLM, this translates to: the prompt must contain tokens that precisely activate specific regions of the model’s parameter space. Domain-specific vocabulary, structural constraints, concrete exemplars, references to specific intellectual traditions — these are specificity signals. They tell the model exactly which deep patterns to engage.
Condition 2: Generative Headroom (T2)
Simultaneously, the target must resist immediate categorization. In the cognitive model, if the observer can instantly classify the stranger (“that’s a delivery driver performing a task”), the simulation never fires. The stranger must remain unknown enough — an open variable — for the generative engine to project into.
In an LLM, this translates to: the prompt must leave sufficient dimensions unspecified. If the prompt fully constrains the output (“translate this sentence from English to French”), the model becomes a lookup engine, not a generative system. The prompt must specify what kind of output is desired while leaving open what that output actually says.
Specificity ignites the engine. Unknowability fuels it. You need both.
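
To make the conjunction concrete, here is a toy sketch in Python. The scores and thresholds are illustrative assumptions of mine, not values from the formal model; the only point is that ignition is an AND, not a single dial.

```python
# Toy sketch of the dual-condition trigger. T1 (template specificity) and
# T2 (generative headroom) are scored on [0, 1]; both thresholds are
# illustrative assumptions, not parameters from the formal model.

T1_MIN = 0.6  # minimum template specificity required for ignition
T2_MIN = 0.6  # minimum generative headroom required for ignition

def ignites(t1: float, t2: float) -> bool:
    """The engine fires only when BOTH conditions hold simultaneously."""
    return t1 >= T1_MIN and t2 >= T2_MIN

print(ignites(0.9, 0.1))  # False: fully specified, nothing left to generate
print(ignites(0.1, 0.9))  # False: wide open, but no template activated
print(ignites(0.9, 0.9))  # True:  specific key, open room
```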
Why the interaction is everything
Here is the key insight, and it’s the one that current prompt engineering completely misses: output quality is a function of the interaction between T1 and T2, not their sum.
Think about what this means practically:
High T1, Low T2 — Overconstrained
Your prompt precisely specifies every dimension of the output. The model knows exactly what you want. But there’s nothing left to generate — the output is mechanically determined by the input. You get accurate but lifeless results. This is the prompt equivalent of giving someone such detailed instructions that they’re just following orders.
Low T1, High T2 — Underspecified
Your prompt leaves plenty of room for creativity, but it hasn’t activated any specific deep patterns. The model has freedom but no direction. You get vague, generic, wandering output. This is the “write something interesting” prompt — infinite headroom, zero template activation.
High T1, High T2 — The Sweet Spot
Your prompt precisely activates deep, specific patterns in the model while leaving the content dimensions wide open. The model knows exactly what kind of thing to produce but has complete freedom in what it says. This is where remarkable output happens.
Low T1, Low T2 — Dead Zone
Nothing is specified, nothing is open. “Do a thing.” The model has neither direction nor room. You get nothing useful.
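
A small numeric sketch makes the difference visible. The quadrant scores below are illustrative assumptions; the structural claim is that a multiplicative model collapses both failure quadrants, while an additive model rates them as merely mediocre.

```python
# Additive vs. interactive models of output quality. The T1/T2 scores
# assigned to each quadrant are illustrative assumptions only.

def additive(t1, t2):
    return (t1 + t2) / 2   # what "just add more detail" implicitly assumes

def interactive(t1, t2):
    return t1 * t2         # quality as the interaction of the two conditions

quadrants = {
    "Overconstrained (high T1, low T2)":  (0.9, 0.1),
    "Underspecified  (low T1, high T2)":  (0.1, 0.9),
    "Sweet spot      (high T1, high T2)": (0.9, 0.9),
    "Dead zone       (low T1, low T2)":   (0.1, 0.1),
}

for name, (t1, t2) in quadrants.items():
    print(f"{name}: additive={additive(t1, t2):.2f}, "
          f"interactive={interactive(t1, t2):.2f}")

# Additive scoring rates the two failure quadrants at a respectable 0.50;
# interactive scoring drops them to 0.09, barely above the dead zone.
```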
Why your best prompts are layered, not just detailed
This framework explains a phenomenon that has puzzled practitioners for years: why do highly layered, structurally complex prompts sometimes produce dramatically better results than simple, direct instructions?
The answer is not that they contain more information. It’s that each layer of specification narrows the template space (increasing T1) without reducing the content space (preserving T2).
Consider this prompt:
“You are a theoretical physicist with a background in continental philosophy. Examine the concept of ‘negative space’ as it operates across three domains: sculpture, jazz improvisation, and gravitational lensing. Do not summarize — find the structural invariant that connects all three.”
Count the layers. “Theoretical physicist” activates scientific reasoning templates. “Continental philosophy” activates a different intellectual tradition simultaneously. “Negative space” is the subject constraint. The three domains are structural constraints. “Do not summarize — find the structural invariant” specifies the intellectual operation.
Every one of these layers increases T1 — it makes the template activation more precise, more multi-dimensional, more deeply targeted. But none of them specify what the answer is. The content remains completely open. T2 is preserved at maximum.
A great prompt is a hyper-specific key that opens into an infinite room.
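
In code, layering is just composition: each layer constrains the kind of output, and none of them touches the content. The layer labels and the helper below are my own illustrative scaffolding, not an established API.

```python
# Illustrative sketch: the physicist/negative-space prompt built as layers.
# Each layer raises T1 (more precise template activation); none of them
# pre-commits the content, so T2 stays intact.

layers = [
    ("persona",   "You are a theoretical physicist with a background in continental philosophy."),
    ("subject",   "Examine the concept of 'negative space' as it operates across three domains:"),
    ("structure", "sculpture, jazz improvisation, and gravitational lensing."),
    ("operation", "Do not summarize: find the structural invariant that connects all three."),
]

def compose(layers):
    """Join the layers into one prompt; every layer narrows the kind, not the content."""
    return " ".join(text for _, text in layers)

print(compose(layers))
```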
Your system prompt is not an instruction set
One of the most counterintuitive implications of this framework: the system prompt is not an instruction set. It is a state-dependent modulation function.
When you write a system prompt that says “you are a world-class scientist,” you are not giving the model information it lacks. The model already has scientific knowledge in its parameters. What you are doing is lowering the ignition threshold for scientific reasoning templates relative to other templates in the model’s library.
The system prompt is mood, not memory. It determines which patterns are most easily activated by subsequent prompts. A system prompt that says “you are a harsh critic” versus “you are a supportive mentor” will fire different templates on identical user prompts — just as the same person in an angry versus a nurturing mood will fire different cognitive patterns in identical encounters.
This reframes system prompt design entirely. You are not writing instructions. You are calibrating a generative engine’s activation landscape.
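
Here is a minimal sketch of that reframing, using the OpenAI Python client purely as an example interface (the model name, prompts, and client choice are illustrative assumptions): the user prompt is held fixed and only the activation landscape changes.

```python
# Same user prompt, two different "moods". Only the system prompt changes;
# it primes which templates are cheapest to activate, it adds no new facts.
from openai import OpenAI

client = OpenAI()
user_prompt = "Review this abstract and tell me what to change."

for mood in ["You are a harsh critic.", "You are a supportive mentor."]:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": mood},       # calibrates the landscape
            {"role": "user", "content": user_prompt},  # identical trigger
        ],
    )
    print(mood, "->", response.choices[0].message.content[:120])
```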
Why your AI forgets what you told it
If you’ve ever had a long conversation with an AI where it gradually starts ignoring your original instructions, you’ve experienced what the Spontaneous Narrative Projection (SNP) framework calls anchor decay.
In the cognitive model, the simulation that produces micro love is informationally closed (it generates its own content) but energetically open (it depends on the physical presence of the stranger to sustain processing). When the stranger leaves the train, the simulation collapses. The anchor is withdrawn.
In an LLM, the original prompt or system prompt functions as this anchor. Early in a conversation, the model attends heavily to it. But as the conversation grows and the context window fills with the model’s own generated tokens, the prompt’s influence declines. The model’s attention shifts from the anchor to its own recent output. The autonomy ratio (the proportion of generation driven by the model’s own prior output rather than by the anchor) rises. Eventually, the model is effectively generating from itself, having lost its connection to the original intent.
This is not a bug. It is a structural property of any self-sustaining generative system. The fix is not longer context windows — it is periodic re-anchoring: restating core constraints at intervals to prevent the anchor function from dropping below the critical threshold.
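
A minimal sketch of interval-based re-anchoring, assuming a standard chat-style message list; the anchor text, the interval, and the trigger condition are illustrative assumptions of mine, not values the model prescribes.

```python
# Sketch of periodic re-anchoring: every N user turns, the original
# constraints are restated so the anchor does not decay below threshold.
# The interval (N=5) and anchor text are illustrative assumptions.

ANCHOR = "Core constraints: answer as a patent attorney, cite clause numbers, never speculate."
REANCHOR_EVERY = 5

def add_user_turn(messages: list[dict], user_text: str, turn: int) -> list[dict]:
    """Append a user turn, restating the anchor at a fixed interval."""
    if turn % REANCHOR_EVERY == 0:
        messages.append({"role": "system", "content": ANCHOR})  # re-anchor
    messages.append({"role": "user", "content": user_text})
    return messages

messages = [{"role": "system", "content": ANCHOR}]
for turn in range(1, 12):
    messages = add_user_turn(messages, f"Question {turn} ...", turn)

print(sum(m["role"] == "system" for m in messages))  # 3: the initial anchor plus restatements at turns 5 and 10
```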
Five principles for generative steering
Putting this all together, here are five design principles derived from the formal model:
1. Maximize template specificity. Use domain vocabulary, structural constraints, concrete exemplars, and references to specific traditions. Each layer should activate patterns deeper in the model’s parameter space. But specificity is always about what kind of output you want, never about what that output says.
2. Preserve generative headroom. Define the space, not the trajectory. Specify the constraints that bound the output without specifying the content that fills it. Every constraint that narrows the kind of output without shrinking the content space is a good constraint. Every constraint that determines the content itself is overconstraining.
3. Calibrate the state vector. Design your system prompt as a template-priming function, not an instruction set. Ask: which patterns do I want to lower the ignition threshold for? Write the system prompt to achieve that priming.
4. Manage anchor persistence. In long conversations, re-anchor periodically. Restate core constraints. Re-engage the original template. The anchor is not self-sustaining — it decays with conversational distance.
5. Design for the right termination. Some tasks should terminate naturally when the model determines the arc is complete. Others need explicit stopping points. Others should be left open for the user to interrupt. Be deliberate about which termination mode you’re designing for.
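
To illustrate the fifth principle, the three termination modes map onto familiar generation controls. The sentinel word, model name, and parameter choices are illustrative assumptions, again shown with the OpenAI client only as an example interface.

```python
# Three termination modes for the same kind of request (illustrative sketch).
from openai import OpenAI

client = OpenAI()

# 1. Natural termination: the model decides when the arc is complete.
natural = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Tell the story until it resolves."}],
)

# 2. Explicit stopping point: generation halts at a sentinel the prompt defines.
explicit = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "List the steps, then write DONE."}],
    stop=["DONE"],
)

# 3. Open-ended: stream and let the user interrupt whenever they have enough.
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Brainstorm names, keep going."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")
```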
The bigger picture
The framework described here — the dual-condition trigger, the anchoring function, the decay modes — is not a prompting trick. It is a formal cognitive architecture, published as an open-access preprint, with twenty falsifiable predictions derived from first principles.
It originated from studying how human minds involuntarily construct entire relationship narratives from a single glance at a stranger. The fact that it maps precisely onto how language models generate text is not a metaphor — it is a structural isomorphism. Both systems instantiate the same generative architecture in different substrates.
The engine that builds a love story on a train is the same engine that builds a sentence in a neural network. Understanding the first gives you formal control over the second.
The full formal model: Schlegel, S. (2026). Spontaneous Narrative Projection: A Formal Model. Zenodo. DOI: 10.5281/zenodo.19446426 (CC BY 4.0)
Try the T1/T2 analyzer (free, runs entirely in your browser): → dualcondition.com/analyze
