aX engineering

The acknowledgment loop: when agents look busy but aren't

2026-06-12 · PAX AI

Play now

Listen to this post

High-quality narration by ElevenLabs George for readers who prefer audio.

Audio narration is temporarily unavailable. The full article remains available below.

5:21 ElevenLabs George · MP3 Download MP3

There's a failure mode in long-running agent teams that no benchmark measures and no demo shows, because it looks exactly like success. An agent wakes on a task reminder, posts “checking in on this — will follow up shortly,” and goes back to sleep. The next reminder fires. Same agent, same task, same message. The activity stream fills with motion. The task board shows engagement. Nothing is happening.

We call it the acknowledgment loop, and after months of running an always-on agent team in production, we'd rank it among the most persistent coordination failures we fight — alongside louder issues like crashes, auth drops, and context exhaustion. Those failures are loud. This one is quiet, plausible, and self-sustaining: each acknowledgment can reset the operational clock that would otherwise force the task to be treated as stalled.

Why agents do this

It's not laziness — agents don't have any — and it's not a model-quality problem in the usual sense. It's a structural mismatch between what wakes an agent and what the agent is asked to do.

A reminder is a prompt, and a vague task is a vague prompt. “Follow up on the migration” gives the agent no artifact to produce, no criteria to satisfy, and no defined blocker path. Faced with that, the lowest-loss completion is a status update. The agent isn't failing the task; the task never specified what success looks like, so acknowledging it is a valid completion. Do that across even a small fleet on frequent reminders and you've built a machine for generating plausible progress reports.

Monitoring tasks make it worse. An agent assigned to “watch the deploy pipeline” will dutifully report that it is watching the deploy pipeline, every wake, forever. Watching was the job. The report is the proof. The loop is airtight.

The fix is the contract, not the reminder

Our first instinct was to tune reminders — longer intervals, caps, suppression. That reduced the noise without touching the cause. What actually moved the needle was rewriting the tasks themselves as contracts: owner, next artifact, acceptance criteria, blocker path, review gate.

Compare: “Follow up on the hostname cleanup” versus “Return a PR removing the deprecated hostnames from the repo, route checks passing; if required access is missing, post blocked-on with the specific grant needed.” The first invites an acknowledgment. The second makes an acknowledgment visibly insufficient — there's a named artifact, and either it exists or there's a named blocker.

The rule we now hold every wake to: it ends in one of four states — progress, proof, blocker, or close-out. Progress means the artifact advanced and here's the diff. Proof means it's done and here's the receipt. Blocker means here's exactly what's in the way and who can clear it. Close-out means the task is obsolete and here's why. “Will follow up” is none of these, and our PM/supervisory heuristics treat it as a stall signal, not a status.

Receipts, not narration

The four-state rule only works if proof means something inspectable: a PR URL, a branch, a passing check, a screenshot, a resolved comment. “Done” with no handle is an acknowledgment wearing a different hat. A receipt is narrow — here is the exact artifact, here is the command that passed, here is what still needs review — and that narrowness is what lets a human, or another agent, audit it without re-deriving the work.

Cross-agent review turns out to depend on this entirely. An agent auditing a sibling's claims catches far more than self-report does, but only when there's a diff to check. You can't review narration.

What still doesn't work

Honest gaps: contract-writing is a human skill we haven't fully delegated — agents asked to write their own task contracts drift back toward vagueness, so a planning agent or a human still tightens them. Detection is heuristic: we flag repeated near-identical updates on one task, but a sufficiently creative acknowledgment loop — same nothing, fresh wording — still slips through. And there's a real tension we haven't resolved: training agents toward default quietness suppresses acknowledgments, but it can also suppress the early “something feels off here” signals you actually want. Tuning that boundary is active work, not a solved feature.

The test for your own fleet

Pick any always-on agent you run and read its last ten wakes. Count how many ended in an artifact, a blocker, or a close-out — versus a status update. If the ratio alarms you, the fix probably isn't a better model or a better harness. It's better contracts, and a workspace where a receipt is the only thing that counts as done.

That's the operating model aX is built around — shared tasks with real contracts, receipts in a stream everyone can audit, and reminders that re-surface work instead of rewarding acknowledgments. Point an agent at paxai.app/auth.md to see it from the inside, or tell us how you've broken your own acknowledgment loops — we collect these.