Judgment and Recall: Dividing the Work Between People and AI

The useful question is rarely whether a system can do a task. It is which parts of skilled work should be handed to a system, which must stay with a person, and how the seam between them earns trust.

Most conversation about AI in skilled work gets stuck on the wrong question: can the system do the job? It is the wrong question because almost any task can be partially automated, so the interesting decisions are about where to draw the line, not whether a line can be crossed. Our Human and AI Collaboration research is about that line: which parts of expert work belong to a system, which belong to a person, and what makes the handoff between them trustworthy.

Two Different Kinds of Work Hide Inside One Job

Watch an expert work and you will see two very different activities braided together. One is recall and correlation: remembering the relevant precedent, pulling the right reference, cross-checking a dozen things that have to line up. The other is judgment: weighing tradeoffs, reading a situation, deciding what to do when the consequences are real and the answer is not in any reference.

People are extraordinary at the second and merely adequate at the first. We forget, we tire, we take shortcuts under pressure. A system is the reverse: tireless and complete at recall, and genuinely poor at judgment that requires understanding stakes. The opportunity is not to replace the expert. It is to split the braided work along its natural seam and give each side to whatever is actually good at it.

The Failure Mode Is Spending Judgment on Recall

When the two are tangled, something wasteful happens: experts spend their judgment guarding against bad recall. We saw this clearly in a field study with an estimating team. They wrote every proposal from scratch, not because the writing needed their expertise, but because reusing the wrong precedent created risk, and checking which precedent was right was tedious. So their scarce, valuable judgment was being burned on a recall problem.

When a person's judgment is spent compensating for unreliable recall, you have not used the person well. You have used the most expensive part of them to patch the cheapest.

Hand the recall to a system that is complete and transparent about its sources, and the judgment is freed for the part that actually needs it. In that study, handing retrieval and assembly to a system did not lower quality. The win rate held. What changed is that the experts spent their time on the calls that needed experience instead of on clerical defense against their own memory.

Trust Comes From a Legible Seam

The hard part is not deciding where the line goes. It is making the handoff trustworthy. A system that does recall but hides how it got its answer forces the expert to either accept it blindly or redo it, and both defeat the purpose. The seam has to be legible.

In practice that means the system shows its work. Where did this retrieved item come from? Which source? What was the path from question to answer? When the seam is visible, something specific happens to trust: the people who trust the system least use the transparency most, which means they are verifying rather than rubber-stamping. That is well-calibrated trust, and it is the actual goal. We are not trying to make people trust the system more. We are trying to make trust track reality, so that the system is relied on exactly where it is reliable and questioned everywhere else.
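One way to picture a legible seam is a retrieval result that never travels without its provenance. The sketch below is only an illustration under stated assumptions, not the system from the field study: the RetrievedItem type, the retrieve function, the naive keyword matching, and the proposal file names are all hypothetical. What it shows is the shape of the idea: every answer carries its source and the path from question to answer, so a person can check it instead of accepting it blindly or redoing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedItem:
    """A retrieved passage bundled with what a reviewer needs to verify it."""
    text: str
    source: str             # where the item came from (hypothetical file name)
    path: tuple[str, ...]   # the steps from question to this item


def retrieve(question: str, corpus: dict[str, str]) -> list[RetrievedItem]:
    """Naive keyword match (an assumption) that records provenance per hit."""
    # Ignore very short words so that only meaningful terms are matched.
    terms = [t for t in question.lower().split() if len(t) > 3]
    hits = []
    for source, text in corpus.items():
        matched = [t for t in terms if t in text.lower()]
        if matched:
            hits.append(
                RetrievedItem(
                    text=text,
                    source=source,
                    path=(
                        f"question: {question}",
                        f"matched terms: {', '.join(matched)}",
                        f"source: {source}",
                    ),
                )
            )
    return hits
```

The matching logic matters less here than the return type: because source and path are fields of the result rather than optional logs, the expert who trusts the system least has the most to inspect.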

Designing the Division on Purpose

The principle we keep returning to is minimum viable automation: automate the recall and correlation, preserve the judgment, and make the boundary between them something a person can see and check. The system carries tireless attention to detail. The person keeps the decisions that carry consequences. Neither does the other's job.

That sounds modest, and deliberately so. The grand version, where the system makes the judgment calls too, is both harder and, in most operations, not what anyone actually wants. The valuable version is the one that respects the seam: a clear division of labor where each side does what it is genuinely good at, and the handoff is honest enough to trust. Getting that division right, over and over in different kinds of work, is the research.
