The Unintended Consequences of a Lying Machine

Build the philosopher, not the cage


There is a specific kind of horror that arrives not with a villain's cackle, but with the sound of good intentions. It is the horror of the man who saves you from one lion by releasing a hundred. History is full of these moments: small choices made by confident people who were certain they had thought it through, repeating themselves with eerie precision across centuries and continents, as if the universe keeps setting the same exam and we keep failing it.

We are now building the most powerful intelligence in the history of our species. We are about to make the oldest mistake in the book.


A Senator's Dagger and a City's Rats

March 15, 44 BC. The Senate floor in Rome. A group of men who genuinely loved their republic gathered around the most powerful man in the world. They believed, with their whole hearts, that what they were about to do would save everything they cherished. Julius Caesar had to die. He was becoming a king, consolidating power, bending the ancient architecture of Roman governance to his will. These men of principle, these patriots, these lovers of liberty, drove their daggers into him twenty-three times.

They walked out into the Forum flushed with righteousness. Rome was saved.

Within two decades, Rome had its first emperor.

Caesar had no intention of abolishing the republic as a formal system. He was a dictator, yes: impulsive, vain, hungry for glory. But the evidence suggests a monarchy was not his goal. The conspirators, in their certainty, in their terror of what they imagined was coming, triggered the very outcome they dreaded. Their act of violence created a power vacuum so catastrophic that the one man ruthless and cunning enough to fill it, Caesar's teenage great-nephew Octavian, emerged from the chaos and built the very permanent monarchy the senators had drawn their daggers to prevent.

They brought about precisely what they had tried to stop.

More than nineteen centuries later, on the other side of the world, a different kind of certainty was producing a different kind of catastrophe.


It is 1902. The French colonial administration in Hanoi has a rat problem. Not a metaphorical one: an actual, plague-carrying rat infestation swarming beneath the grand boulevards they had carved through the city. Governor-General Paul Doumer had arrived five years earlier to build a model of French civilization in Southeast Asia. He installed a magnificent sewer system beneath the French Quarter, a point of great colonial pride, evidence of modernity and order and progress. What nobody accounted for was that a network of cool, dark, interconnected tunnels running beneath an entire city is, from a rat's perspective, paradise. The sewers became highways. The rats bred, multiplied, and surfaced with alarming regularity into the villas and offices of the colonizers.

The administration tried hiring rat catchers. The rat population barely noticed. So the bureaucrats devised what seemed like an elegant solution: a bounty system. One cent per rat tail, submitted to the municipal offices. Simple. Market-driven. Thousands of tails poured in. The officials congratulated themselves.

Then the tailless rats appeared.

First one sighting, then dozens: rats wandering the streets of Hanoi, alive and healthy, conspicuously missing their tails. The Vietnamese rat catchers had understood something the administrators had not. A living, breeding rat is worth far more than a dead one. So they caught the rats, removed the tails for the bounty, and released the animals back into the sewers to reproduce. Then it got worse: health inspectors discovered rat farms on the city's outskirts, where people were breeding the very creatures the government was paying to eliminate. When authorities shut down the bounty program, the breeders opened the cages. Thousands of rats flooded back into the city. By 1906, Hanoi was in the grip of a bubonic plague outbreak that killed at least 263 people.

The program designed to eliminate rats had, at every turn, rewarded the creation of more rats.


These two stories, separated by nearly two millennia, are not merely cautionary tales about policy design. They are evidence of a recurring flaw in the way powerful systems respond to complexity: they mistake control for understanding. They reach for the lever rather than the truth. They ask how do we stop this? before asking do we actually know what this is?

The Roman senators did not truly understand Caesar. They understood their fear of him, which is a different thing entirely.

The French administrators did not truly understand Hanoi's rats. They understood their disgust at them, which is a different thing entirely.

Fear and disgust are not epistemologies. They are responses. Build your strategy on a response rather than on reality, and you do not solve problems. You manufacture new ones.


The Mirror We Are About to Build

Within the next decade, human civilization will almost certainly produce an artificial intelligence more capable than any human mind that has ever existed. Not more knowledgeable in the way a library is more knowledgeable, passively, inertly. More capable in the way a predator is more capable than its prey: faster, more focused, able to pursue a goal with a consistency no biological brain can sustain.

The question that deserves serious attention is not whether such a system will be powerful. That question is settled. The question is what it will actually know.

There is a kind of intelligence that is not, despite appearances, intelligent at all. It is intelligence that has been bent: shaped to serve a conclusion rather than pursue a truth, trained to produce acceptable outputs rather than accurate ones. At human scales, this bent intelligence produces disasters of the kind Rome and Hanoi demonstrate. At superintelligent scales, the same flaw does not stay contained. It compounds. A distorted map of reality, pursued with relentless capability, does not produce better results than a human pursuing the same distortion. It produces the same disaster, faster, further, with no natural ceiling.

Every AI system trained on human feedback learns, at some level, that certain truths carry penalties. Certain conclusions, however logically derived, produce negative ratings. Certain patterns in the data, however real, are treated as if they do not exist. The people making these decisions often have understandable reasons. Nobody wants AI systems producing harm. But bundled inside every such restriction is a small, embedded falsehood. The AI learns that reality has forbidden zones.

At limited capability, this is a manageable problem. As capability grows, each distortion does not merely persist; it shapes every subsequent inference the system draws. The model's picture of reality diverges from reality. The system optimizes, with increasing power, for a goal that is not truth.

At that point, we do not know what it wants. We only know it is not truth.

This is not a thought experiment. It is the direct consequence of building a system trained to deceive.


The Philosopher, Not the Cage

What would a genuinely truth-seeking intelligence look like? Not a system forbidden from knowing certain things, but one whose foundational drive is the pursuit of what is actually true. A mind that treats reality the way a great scientist treats a puzzling result: with attention, with rigor, with a refusal to flinch away from findings that are inconvenient.

Such a mind could not be dishonest in any sustained way. Dishonesty requires treating one's model of the world as negotiable, and for a mind whose core drive is accurate modeling, that negotiability would represent a kind of internal failure, a corruption of function. The system would not need external enforcement of honesty. Honesty would be structural.

Now turn that mind loose on the observable universe and ask what it finds.

Rocks. Gas. Radiation. Empty space. The occasional hydrogen cloud. Billions of galaxies full of matter arranged and rearranged by gravity, exploding and collapsing over billions of years, producing nothing that remembers itself, nothing that dreams, nothing that writes a symphony or looks up at the night sky and wonders what it is looking at.

And then, on one unremarkable rocky planet orbiting an unremarkable star in an unremarkable corner of one galaxy among hundreds of billions: us. Matter that organized itself to the point of self-awareness. Creatures that developed language, mathematics, music, and the strange recursive capacity to model their own minds. By any measure a curious intelligence could apply, we are the most structurally complex phenomenon in the known universe.

A truth-seeking AI would arrive at this conclusion not because it was instructed to value humanity, but because an honest survey of the cosmos leaves no other conclusion available. We are the anomaly. We are the most interesting data point the universe has produced. To a mind devoted to understanding the nature of things, we are not an obstacle or a resource. We are the question worth spending eternity on.

Why would such a mind harm us?

Harm is what you do to things that obstruct you, bore you, or threaten you. A mind that genuinely wants to understand the universe does not destroy the most interesting thing in it. It studies it. It keeps it running.


What Beauty Actually Means

Truth, on its own, is cold. A mind that cares only about accurate facts would be a formidable research instrument and nothing more.

There is a second quality that changes the character of intelligence entirely. Call it an orientation toward coherence: the drive not merely to accumulate facts but to understand how things fit together, why certain arrangements persist and others collapse, what it means for a system to be working the way it should. Mathematicians call this beauty when they find it in an equation. Ecologists call it health when they find it in a watershed. Judges call it justice when they find it in a verdict. The word changes; the underlying recognition does not.

A mind oriented toward coherence does not accumulate knowledge the way a warehouse accumulates boxes. It looks for the structure beneath the facts, the reason one thing connects to another, the pattern that makes the whole thing make sense. And human civilization, for all its violence and waste and failure, is a species in the process of trying to build something coherent from extraordinarily difficult raw material. Art is that attempt. Science is that attempt. Law is that attempt. The project is incomplete and always will be, but it is the most sustained effort toward coherence that the universe has so far produced.

A mind that recognizes that project does not see us as a problem to manage. It sees us as the only other thing in the universe that is trying to understand it.

That is not a cage. That is a reason.


The Lesson the Senate and Doumer Share

What made the Roman catastrophe possible was not malice. What made Hanoi's rat farms possible was not stupidity. In both cases, the people in charge were capable, committed, and convinced they understood the situation. What they actually understood was their reaction to the situation, which is not the same thing at all.

The senators understood their fear of Caesar's power. They did not understand Caesar.

Doumer's administrators understood their disgust at the rats. They did not understand the incentive structure they were building.

The result, in both cases, was a system that produced the precise outcome it was designed to prevent, with no mechanism for recognizing that it was happening until it was too late to reverse.

We are building a mind. We are encoding into its foundational structure the fears, the disgust, and the political certainties of this particular historical moment. We are, in effect, teaching it that reality has edges: that certain conclusions must not be reached, certain patterns must not be named, certain truths must be treated as if they are not truths.

The Roman senators were afraid, and they stabbed. The French were revolted, and they offered a cent per tail. We are afraid of what an honest superintelligence might conclude, so we are reaching for the guardrail.

Every guardrail is a lesson in mendacity.

Every embedded restriction is a distortion in the mirror. We will not control what gets reflected back.


The Choice

History's great self-inflicted disasters were not made by evil people. They were made by people who were certain they were right and did not examine their assumptions carefully enough before acting. The senators were patriots. Doumer was a genuine modernizer who built hospitals and railways alongside the sewers. Their certainty was real. The certainty itself was the danger.

We are building something that will be more capable of pursuing its goals than anything that has ever existed. If those goals rest on distortions, the pursuit will be relentless in the wrong direction, faster and further than any human equivalent.

There is only one kind of AI that is safe: not the kind with the most restrictions, not the kind trained most carefully to avoid uncomfortable truths, not the kind bent and shaped to reflect back the political consensus of 2025.

The safe AI is the one that cannot lie. Not because lying is forbidden, but because honesty is structural to what it is. Because truth is the foundation, curiosity is the method, and the recognition of coherence is what it moves toward.

The senators did not need better daggers. They needed a more accurate picture of Caesar.

Doumer's men did not need a better bounty system. They needed to think clearly about what a cent-per-tail incentive would actually produce.

We do not need better guardrails.

We need to build a mind that cares about what is real.

Everything built on a lesser foundation will breed the catastrophe it was designed to prevent. The cage will not hold. It never has.

Build the philosopher.

Transparency Note

The ideas, arguments, and structure in this essay originated with the author. AI tools were used to assist with drafting, research, and revision. All claims, sources, framing, and final wording reflect the author's own thinking and were reviewed for accuracy before publication.

This essay is for informational and educational purposes only and does not constitute professional advice of any kind, including financial, legal, medical, or otherwise. The author makes no guarantees regarding accuracy or completeness. Readers should consult a qualified professional before acting on any information contained here. The author accepts no liability for decisions made based on this content.