VOL. 02  ·  ISSUE 07  ·  MONDAY, MAY 11, 2026  ·  BOSTON, MA

Field Notes: What AI Agents Actually Look Like When SMBs Turn Them On

The agent that sticks has a narrow job and a clear human handoff. The rest comes back off by week three.

Jonathan Tonthat · ML Engineer, Cellhub

One of the owners I talked to this quarter is running a six-chair dental practice. She turned on an AI voice agent for after-hours calls last month. By week two it was answering most of the late-night inbound, booking appointments into her calendar, and dropping a transcript with a next-action tag into her office manager’s inbox at 7am. She said it felt like hiring a junior receptionist who never sleeps, except the onboarding was a weekend of recording voicemails and uploading an FAQ doc.

The word everyone is using this year is “agent.” The distinction that matters is that an agent does something, not just says something. It takes an action, on a system, against a goal. For SMBs that is a smaller promise than the enterprise keynotes imply, and it is the one that is actually shipping.

Here is what the last quarter of conversations looks like.

What’s Actually Working

Voice at the front door. Answering, routing, booking. The dental practice story above, a legal intake line that qualifies matters before a paralegal ever touches them, an HVAC dispatcher that schedules diagnostic visits overnight. Calls land in a dashboard the next morning with a transcript and a tag. Owners do not try to make the agent close business. They let it hold the line until a human can take it.

Inbox triage that drafts but does not send. The agent reads inbound email, tags by intent, and drafts a response the owner can approve, edit, or discard. The version that dies is always the one that tries to auto-reply. The version that sticks is the one that collapses thirty minutes of sorting into five minutes of approving.
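The draft-but-do-not-send pattern can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the keyword lists, the reply templates, and the names `Draft`, `tag_intent`, `triage`, and `approve` are all hypothetical, and a real deployment would likely use a model rather than keywords to tag intent. The part that matters is that nothing leaves the "pending_approval" state without a human call.

```python
from dataclasses import dataclass

# Hypothetical keyword lists and templates, for illustration only.
INTENT_KEYWORDS = {
    "booking": ("appointment", "schedule", "reschedule"),
    "billing": ("invoice", "payment", "refund"),
}
REPLY_TEMPLATES = {
    "booking": "Thanks for reaching out. Here are our next open slots: ...",
    "billing": "Thanks for reaching out. We are checking your account and ...",
    "other": "Thanks for reaching out. Someone will reply shortly.",
}


@dataclass
class Draft:
    intent: str
    reply: str
    # A draft starts as pending and only changes state through approve().
    status: str = "pending_approval"


def tag_intent(body: str) -> str:
    """Return the first intent whose keywords appear in the email body."""
    text = body.lower()
    for intent, words in INTENT_KEYWORDS.items():
        if any(word in text for word in words):
            return intent
    return "other"


def triage(body: str) -> Draft:
    """Tag the email and prepare a reply. This function never sends anything."""
    intent = tag_intent(body)
    return Draft(intent=intent, reply=REPLY_TEMPLATES[intent])


def approve(draft: Draft) -> Draft:
    """Called only from the owner's approve button, never by the agent."""
    draft.status = "approved"
    return draft
```

The design choice is that the sending step sits outside the agent entirely: the agent can only produce a `Draft`, and the owner's approval is a separate action.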

Morning briefings. A daily summary that runs at 6am and lands in the owner’s inbox in time for the first cup of coffee. Revenue yesterday, receivables past thirty days, any jobs flagged at risk, any reviews that came in. Owners treat it the way a retail operator treats a cash-up report. Ten minutes of reading replaces forty-five minutes of clicking through tabs.
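A briefing like this is mostly arithmetic over data the business already has. The sketch below assumes hypothetical input shapes: sales as `(day, amount)` pairs and invoices as `(issued_on, amount, paid)` triples. It also reads "receivables past thirty days" as unpaid invoices issued more than thirty days ago, which is one plausible interpretation. The function name `build_briefing` is made up for illustration.

```python
from datetime import date, timedelta


def build_briefing(today, sales, invoices, at_risk_jobs, new_reviews):
    """Return the morning briefing as plain text.

    Assumed shapes (hypothetical):
      sales:    list of (day, amount)
      invoices: list of (issued_on, amount, paid)
    """
    yesterday = today - timedelta(days=1)
    revenue = sum(amount for day, amount in sales if day == yesterday)

    # Unpaid invoices issued more than thirty days before today.
    cutoff = today - timedelta(days=30)
    overdue = sum(
        amount
        for issued_on, amount, paid in invoices
        if not paid and issued_on < cutoff
    )

    lines = [
        f"Briefing for {today.isoformat()}",
        f"Revenue yesterday: ${revenue:,.2f}",
        f"Receivables past 30 days: ${overdue:,.2f}",
        f"Jobs at risk: {', '.join(at_risk_jobs) or 'none'}",
        f"New reviews: {len(new_reviews)}",
    ]
    return "\n".join(lines)
```

Scheduling it for 6am is then a matter of whatever job runner the business already has. The summary itself needs no model at all unless the owner wants prose.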

The thread through all three: the agent has a narrow job, a clear trigger, and a human in the loop at the handoff.

What’s Still Broken

Anything with more than one agent in the chain. The pitch is always the same. A researcher agent feeds a writer agent, which feeds a publisher agent. In SMB environments the orchestration overhead swamps the value. Debugging a failure means figuring out which step hallucinated, which made a wrong tool call, and which timed out, and owners do not have the hours for it. Every owner I spoke with who tried a multi-agent workflow had turned it off by week three.

Confidence calibration at the handoff. The agent does not know what it does not know. A voice agent that confidently books an extraction on a Sunday because it parsed “Sunday” out of a patient saying “not Sunday” is the canonical version of this failure. The fix is not a better model. It is tighter guardrails and a louder escalation path. That is eval-harness work, and most SMBs are not staffed for it.
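A guardrail for the "not Sunday" failure can be crude and still useful. The sketch below is an assumption-heavy illustration: the 0.85 confidence floor, the negation word list, and the names `BookingRequest` and `decide` are hypothetical, and the check is a keyword heuristic rather than real language understanding. It will escalate some calls that were fine. That is the cheap error; a wrong booking is the expensive one.

```python
import re
from dataclasses import dataclass

# Hypothetical values: tune the floor and the word list against real transcripts.
CONFIDENCE_FLOOR = 0.85
NEGATION = re.compile(
    r"\b(not|no|never|except|can't|cannot|won't|don't)\b", re.IGNORECASE
)


@dataclass
class BookingRequest:
    transcript: str    # what the caller said
    parsed_day: str    # the day the agent extracted, e.g. "Sunday"
    confidence: float  # the agent's own score, between 0 and 1


def decide(req: BookingRequest) -> str:
    """Return "book" or "escalate". When in doubt, escalate."""
    if req.confidence < CONFIDENCE_FLOOR:
        return "escalate"
    # Escalate if a negation word shares a sentence with the parsed day.
    for sentence in re.split(r"[.!?]", req.transcript):
        if req.parsed_day.lower() in sentence.lower() and NEGATION.search(sentence):
            return "escalate"
    return "book"
```

For example, `decide(BookingRequest("Any day works, just not Sunday.", "Sunday", 0.97))` returns "escalate" even though the agent's score is high, which is the point: the guardrail does not trust the score alone.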

Integration debt. Everyone underestimates how much of an agent’s usefulness depends on how clean the systems behind it are. The owners who got the most out of voice agents spent the first two weekends cleaning their scheduling data. The ones who tried to skip that step got an agent that booked into a calendar that was already lying.
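One concrete form of that cleanup is checking the calendar for double bookings before the agent ever touches it. The sketch below is a minimal example under assumed data shapes: each appointment has a resource (a chair, a technician, a room) plus a start and end time. The names `Appointment` and `find_overlaps` are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple


class Appointment(NamedTuple):
    resource: str  # e.g. "chair-1"; hypothetical naming
    start: datetime
    end: datetime


def find_overlaps(appointments):
    """Return pairs of appointments that collide on the same resource."""
    by_resource = {}
    for appt in appointments:
        by_resource.setdefault(appt.resource, []).append(appt)

    conflicts = []
    for items in by_resource.values():
        items.sort(key=lambda a: a.start)
        latest = items[0]  # the appointment that ends latest so far
        for current in items[1:]:
            # A start before the latest end means the resource is double booked.
            if current.start < latest.end:
                conflicts.append((latest, current))
            if current.end > latest.end:
                latest = current
    return conflicts
```

An empty result does not prove the calendar is clean, since stale or missing entries will not show up as overlaps, but a non-empty one is a clear sign the data needs work before an agent books against it.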

The Pattern

The agents that stick in SMB environments have three traits. They do one job. They have a trigger a non-technical owner can explain in a sentence. And they hand off to a human the moment they become unsure. When any of those three is missing, the thing comes back off inside a month.

This is not where the keynote demos are. The keynote demos are long autonomous chains doing white-collar work without supervision. In the small businesses I am watching, the agent is a receptionist, a sorter, or a morning briefing. That is already more than most tools delivered in the last two years.

If your agent roadmap has five agents on it, cut it to one. Pick the one with the clearest handoff. Spend the weekends you had set aside for the other four on the data behind it.

The ceiling on what an agent can do for a small business in 2026 is not the model. It is the cleanliness of the system it is plugged into.
