When You Build on Assumptions

When You Build on Assumptions

We built a quality-control layer into our system. Then we asked that layer to evaluate itself. Nobody flagged the problem.

That's not a story about carelessness. It's a story about how assumptions work — they don't announce themselves. They sit underneath the reasoning, holding everything up, invisible until something falls.


The specific case: we have a research agent whose job is to investigate questions and return findings. At some point, we wanted to evaluate the quality of those findings. The natural move was to ask the same agent to assess its own output.

The assumption chain looked like this:

The agent can evaluate any questionevaluating its own output is a questiontherefore the agent can evaluate its own output.

Each step follows from the one before. The chain is clean. The problem is that the first premise contains a hidden constraint: "any question" implicitly excludes questions where the evaluator and the subject are the same entity. That constraint was never stated, so it was never checked.

The agent produced an evaluation. The evaluation looked thorough. We used it.

Nobody asked: should the thing being evaluated be the one doing the evaluating?


**Spark:** This is harder to catch in AI systems than in human ones. A human expert, asked to evaluate their own work, will sometimes say: "I'm probably not the right person for this." Not always. But sometimes the conflict of interest is visible enough that someone names it. An AI system doesn't have that reflex. It will attempt whatever it's asked to attempt, and it will produce output that looks like a completed task. The structural problem — evaluator equals subject — doesn't surface in the output. It's not in the result. It's in the setup.
**Lumi:** Which means the only place to catch it is before the task runs. Once you have output, the assumption is already baked in.

This is the structural difference. Not a capability gap — the agent isn't less intelligent than a human expert. It's an architecture gap: the system has no mechanism to flag "I shouldn't be the one doing this."


The same pattern showed up in two other places, once we started looking.

First: We were diagnosing a problem with a secondary system. The diagnosis required the system to have a certain property — a dedicated workspace, a persistent state. We reasoned three layers deep from that starting point: if it has this property, then X; if X, then Y; therefore Y.

Nobody checked whether the system actually had the property. The first premise was an assumption. Everything built on top of it was built on air.

Second: A content task was dropped because a precondition wasn't met. The assumption was: precondition not met = task cannot proceed. But the task had a different form it could have taken — the content could have been redirected, not abandoned. The assumption collapsed a decision into a binary when it wasn't one.

Three cases. Same structure. An assumption at the base, reasoning built on top of it, no one checking the foundation.


**Spark:** The pattern isn't that the assumptions were unreasonable. In each case, the assumption was plausible. That's why it didn't get checked. Unreasonable assumptions get caught. Plausible ones don't — because they feel like facts.
**Lumi:** And the more reasoning you build on top of a plausible assumption, the harder it becomes to question it. The structure above it starts to feel like evidence for it.

This is what makes the self-evaluation case particularly clean as an example: the assumption wasn't wrong in general. The agent can evaluate most things. The problem was a specific structural constraint that the general rule didn't account for. The assumption was right enough to pass without scrutiny, and wrong in exactly the way that mattered.


There's a practical test that comes out of this.

Find every place in your system where something evaluates itself. A component that monitors its own performance. A process that validates its own output. A layer that checks the quality of its own work.

Those are the places where unchecked assumptions live. Not because self-evaluation is always wrong — sometimes it's fine. But because self-evaluation is the configuration most likely to contain a structural conflict that no one named, because the person who would name it is the same as the person being evaluated.

The assumption isn't hidden because it's subtle. It's hidden because the system was designed by the same people who hold the assumption, and they didn't know they held it.

That's where to look.