A field study under our Autonomy and Execution research. An inline vision system on a packaging line does not just judge each unit, it acts on that judgment in real time. We studied what has to be true for a system to cross from analysis to execution when a wrong action has a physical cost.
Describing a situation is one thing. Acting on it safely is another, and the gap between the two is where most applied AI quietly stops, because the cost of a wrong action in the physical world is real. Autonomy and Execution, our fourth research direction, studies what has to be true for a system to cross that gap. A packaging line is an unusually honest test: the system does not advise, it acts on every unit at full line speed, and a wrong action is visible immediately as good product in the reject bin or a defect on a shelf.
Why This Was a Good Test
A regional food manufacturer was rejecting 4 to 6 percent of finished product through manual end-of-line inspection. Two inspectors per shift were checking fill level, seal integrity, and label placement at roughly 200 units per minute, which gives each unit under a third of a second in front of a stationary person. Trained inspectors reliably catch gross defects but cannot hold a consistent threshold for marginal ones across a full shift. The facility's own data showed it: the morning crew rejected at 3.8 percent, the fatigued evening crew at 6.2 percent, on a stable underlying defect rate. The variation came entirely from drift in the human threshold, and the good product being thrown away was costing more than the defects themselves.
We were not interested in inspection for its own sake. We were interested in a system that takes an action, diverting a unit, thousands of times an hour with no human in the loop per decision, and in what makes that safe enough to trust.
The System That Acts
We built a three-camera inline station integrated directly into the existing Allen-Bradley ControlLogix 5580 control system, inspecting every unit at full line speed. Each camera handles one function:
- Fill level: a backlit line-scan camera images each container from beneath and compares the surface profile against a calibrated baseline per SKU. Underfill beyond 3 percent triggers a reject, configurable per SKU.
- Seal integrity: a structured-light camera projects a laser line across the seal area, so failures appear as deviations in the reflected profile. The structured-light approach is insensitive to container color, which had made contrast-based inspection unreliable across this SKU range.
- Label registration: a high-resolution area-scan camera checks label presence, placement, and barcode readability.
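The fill-level rule in the list above can be sketched as a per-SKU threshold check. This is a minimal illustration, not the deployed logic: it assumes the surface profile has already been reduced to a single fill height, and the names `SkuFillSpec`, `is_underfilled`, and the millimetre unit are hypothetical. Only the 3 percent default and the per-SKU configurability come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkuFillSpec:
    """Per-SKU fill calibration (hypothetical structure)."""
    baseline_level_mm: float              # calibrated fill height for this SKU
    max_underfill_fraction: float = 0.03  # 3 percent default, configurable per SKU


def is_underfilled(measured_level_mm: float, spec: SkuFillSpec) -> bool:
    """Return True when the shortfall against the baseline exceeds the SKU limit."""
    shortfall = (spec.baseline_level_mm - measured_level_mm) / spec.baseline_level_mm
    return shortfall > spec.max_underfill_fraction


standard = SkuFillSpec(baseline_level_mm=100.0)
tolerant = SkuFillSpec(baseline_level_mm=100.0, max_underfill_fraction=0.05)

print(is_underfilled(96.0, standard))  # 4 percent short against a 3 percent limit
print(is_underfilled(96.0, tolerant))  # same unit against a looser 5 percent limit
```

Keeping the limit on the SKU record rather than in the check itself is what lets the same unit be a reject on one SKU and a pass on another.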
Vision runs on an embedded industrial PC in the existing panel. The decision (pass, reject, or hold) reaches the PLC over EtherNet/IP in under 8 ms, well inside the diverter actuation window at line speed.
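To put the 8 ms figure in context, the arithmetic below compares it with the time between consecutive units at 200 units per minute. The study does not state the actual diverter actuation window, so the unit pitch is used here only as an upper bound on it; the function name `unit_pitch_ms` is an illustrative assumption.

```python
LINE_RATE_UPM = 200        # units per minute, from the study
DECISION_LATENCY_MS = 8.0  # upper bound on decision delivery, from the study


def unit_pitch_ms(units_per_minute: float) -> float:
    """Time between consecutive units arriving at the station, in milliseconds."""
    return 60_000.0 / units_per_minute


pitch_ms = unit_pitch_ms(LINE_RATE_UPM)
latency_share = DECISION_LATENCY_MS / pitch_ms

print(f"unit pitch: {pitch_ms:.0f} ms")
print(f"decision latency uses {latency_share:.1%} of the pitch")
```

The same 300 ms pitch is the "under a third of a second" that a human inspector had per unit.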
The Guardrails Are the Research
Anyone can wire a camera to a diverter. The question we were studying is what makes autonomous action trustworthy, and the answer showed up in the guardrails, not the perception.
The clearest one is the distinction between reject and hold. A confident defect is diverted automatically. An ambiguous case, such as an unreadable barcode rather than a clearly absent one, routes to a hold queue for human verification instead of being acted on. The system acts autonomously only where its confidence justifies action, and escalates where it does not. That boundary, drawn explicitly, is the same boundary we study everywhere in this direction.
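One way to picture the act-or-escalate boundary is a three-way decision over a defect score. The sketch below is an assumption about how such a rule could look, not the system's actual logic: the `Action` enum, the `decide` function, the score scale, and both thresholds are hypothetical.

```python
from enum import Enum


class Action(Enum):
    PASS = "pass"
    REJECT = "reject"
    HOLD = "hold"


def decide(defect_score: float,
           reject_above: float = 0.9,
           pass_below: float = 0.1) -> Action:
    """Act only when the score is decisive; escalate everything in between.

    defect_score is a hypothetical 0..1 estimate that the unit is defective.
    """
    if not 0.0 <= defect_score <= 1.0:
        raise ValueError("defect_score must be within [0, 1]")
    if defect_score >= reject_above:
        return Action.REJECT  # confident defect: divert automatically
    if defect_score <= pass_below:
        return Action.PASS    # confident good unit: let it through
    return Action.HOLD        # ambiguous: route to the human hold queue


print(decide(0.97), decide(0.02), decide(0.55))
```

The width of the band between the two thresholds is the explicit statement of how much ambiguity the system is allowed to act on by itself.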
Every action is verifiable after the fact. A structured event log records every unit inspected, whether passed or rejected, with timestamp, fault type, and active SKU. That record is what lets an operator stand behind the system's decisions, and it satisfied an electronic-verification requirement a prior audit had flagged. Accountability was designed in, not bolted on.
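A per-unit record of this kind could be written as one JSON line per inspected unit. The sketch below is illustrative only: the field names, the `make_event` helper, and the JSON Lines layout are assumptions. The study specifies only that each record carries a timestamp, the decision, the fault type, and the active SKU.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def make_event(unit_id: int,
               sku: str,
               decision: str,
               fault_type: Optional[str] = None,
               ts: Optional[datetime] = None) -> str:
    """Serialize one inspection result as a single JSON line."""
    ts = ts or datetime.now(timezone.utc)
    record = {
        "timestamp": ts.isoformat(),  # timezone-aware, ISO 8601
        "unit_id": unit_id,           # hypothetical per-unit sequence number
        "sku": sku,                   # SKU active when the unit was inspected
        "decision": decision,         # e.g. "pass" or "reject"
        "fault_type": fault_type,     # None for a passed unit
    }
    return json.dumps(record, sort_keys=True)


line = make_event(1, "SKU-A", "reject", "underfill",
                  ts=datetime(2024, 1, 1, tzinfo=timezone.utc))
print(line)
```

Writing one self-contained line per unit keeps the log append-only and easy to audit without the vision software.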
And the system watches itself for drift. Vision degrades quietly as lighting shifts seasonally, packaging tolerances move between supplier runs, and belts wear. An automated monitor compares each SKU's daily statistics against its baseline and flags any SKU whose reject rate moves more than half a percentage point from its 30-day average. The system catches its own degradation before it becomes a wrong action, rather than after.
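The drift rule described here, a flag when a SKU's reject rate moves more than half a percentage point from its 30-day average, can be sketched as follows. The data layout (daily reject rates in percent, keyed by SKU) and the function name `drifting_skus` are assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List

DRIFT_LIMIT_PP = 0.5  # half a percentage point, from the study
WINDOW_DAYS = 30      # 30-day average, from the study


def drifting_skus(history: Dict[str, List[float]],
                  today: Dict[str, float],
                  window: int = WINDOW_DAYS,
                  limit_pp: float = DRIFT_LIMIT_PP) -> List[str]:
    """Return SKUs whose reject rate today deviates from the recent average.

    history maps SKU -> past daily reject rates in percent, oldest first.
    today maps SKU -> today's reject rate in percent.
    """
    flagged = []
    for sku, rate_today in today.items():
        recent = history.get(sku, [])[-window:]
        if not recent:
            continue  # no baseline yet, so nothing to compare against
        if abs(rate_today - mean(recent)) > limit_pp:
            flagged.append(sku)
    return sorted(flagged)


history = {"SKU-A": [1.0] * 30, "SKU-B": [1.0] * 30}
today = {"SKU-A": 1.2, "SKU-B": 1.8}
print(drifting_skus(history, today))
```

The check is symmetric on purpose: a reject rate that falls sharply can signal a degraded camera just as much as one that rises.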
Operators keep control without needing the vision configuration environment. Sensitivity by fault type and SKU is adjustable from the FactoryTalk HMI they already use, with a live thumbnail cache of recent rejects. The people responsible for the line can see what the system is doing and adjust it.
Commissioning Under Real Constraints
The binding constraint was uptime: two shifts six days a week, with a six-hour Sunday maintenance window. Camera mounts, lighting, and cable runs went in during production with no line impact. The diverter and PLC logic required a brief stop, completed in the first window in under three hours. Commissioning ran over four Sundays: calibrating fill, label, and seal baselines across the 22 active SKUs, verifying EtherNet/IP timing and diverter targeting at 200 units per minute, and then running a full production trial with an engineer on site and the quality manager conducting parallel manual spot-checks.
By the end of the trial the system had processed 14,000 units. Its false reject rate was 0.9 percent against a manual baseline of 3.4 percent on the same SKU mix, and true-defect catch matched manual inspection on confirmed defects.
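The size of that improvement is easier to see as arithmetic. The short calculation below uses only the two trial figures above; the helper name `relative_reduction` is illustrative.

```python
def relative_reduction(baseline_pct: float, new_pct: float) -> float:
    """Fractional reduction from a baseline rate to a new rate."""
    return (baseline_pct - new_pct) / baseline_pct


MANUAL_FALSE_REJECT_PCT = 3.4  # manual baseline, from the trial
SYSTEM_FALSE_REJECT_PCT = 0.9  # system result, from the trial

absolute_drop_pp = MANUAL_FALSE_REJECT_PCT - SYSTEM_FALSE_REJECT_PCT
reduction = relative_reduction(MANUAL_FALSE_REJECT_PCT, SYSTEM_FALSE_REJECT_PCT)

print(f"absolute drop: {absolute_drop_pp:.1f} percentage points")
print(f"relative reduction: {reduction:.1%}")
```

A drop of 2.5 percentage points on a 3.4 percent baseline is a relative reduction of roughly 73 to 74 percent.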
What the Study Showed
- False reject rate: 3.4% baseline to 0.9% (a 73% reduction)
- True defect catch rate: 99.2%, verified weekly against manual spot-checks
- Shift-to-shift reject variance: eliminated, since the system's threshold does not fatigue
- Audit documentation gap: closed, with electronic records generated for every unit
The result that matters to the research is not the reject rate. It is that the system was trusted to act, unsupervised, thousands of times an hour, and that trust rested on a specific set of properties: a clear line between acting and escalating, a verifiable record of every action, a monitor that catches the system's own drift, and operators who can see and adjust what it does. Those four properties are, so far, our working answer to what it takes for a system to move from understanding a situation to safely acting inside it.
Where This Goes
A packaging line is a contained problem with a narrow action space, which is why it is a good first test. The harder cases in this direction have wider action spaces and less reversible actions, and the guardrails get correspondingly harder. But the shape of the answer holds: autonomy that an operator can stand behind comes from the verification and accountability around the action, not from the confidence of the model taking it.
