A six-chair dental practice in Phoenix called me three weeks ago because their first HubSpot bill under the new outcome-based pricing came in at 3.4× what they had budgeted. They had switched on Breeze Customer Agent in March because the HubSpot announcement on April 14 was, in fact, attractive: $0.50 per resolved conversation, no seat fees, free 28-day trial.
The bill they got reflected 1,847 “resolved conversations” in 30 days. The practice has 11 employees. They handle, on a busy week, maybe 200 patient interactions across phone, email, and the website chat widget combined.
This is the practitioner conversation worth having about outcome-based AI pricing.
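The mismatch in those figures can be put into numbers. The following is a minimal sketch that uses only the figures quoted above; it assumes, for illustration, that the bill consisted entirely of per-resolution charges and that every week in the billing period was a "busy week." The function and constant names are hypothetical.

```python
# Figures are the ones quoted in the article. Treating the bill as made up
# entirely of per-resolution charges is an assumption made for illustration.
RATE_PER_RESOLUTION = 0.50    # USD per "resolved conversation"
BILLED_RESOLUTIONS = 1_847    # resolutions on the 30-day bill
BUDGET_MULTIPLE = 3.4         # bill as a multiple of the budget
BUSY_WEEK_INTERACTIONS = 200  # all channels combined, busy-week estimate
BILLING_DAYS = 30


def bill_total(resolutions: int, rate: float) -> float:
    """Charge for a given number of billed resolutions."""
    return resolutions * rate


def max_plausible_interactions(per_busy_week: int, days: int) -> float:
    """Upper bound: treat every week in the period as a busy week."""
    return per_busy_week * days / 7


bill = bill_total(BILLED_RESOLUTIONS, RATE_PER_RESOLUTION)
implied_budget = bill / BUDGET_MULTIPLE
ceiling = max_plausible_interactions(BUSY_WEEK_INTERACTIONS, BILLING_DAYS)

print(f"bill from resolutions: ${bill:,.2f}")
print(f"implied budget:        ${implied_budget:,.2f}")
print(f"interaction ceiling:   {ceiling:,.0f}")
print(f"billed / ceiling:      {BILLED_RESOLUTIONS / ceiling:.2f}x")
```

On these inputs the billed count is roughly twice the number of interactions the practice could plausibly have had across every channel combined, which suggests the metric is counting something other than patient interactions as the practice understands them.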
What’s actually working
1. The pricing model is genuinely better than seat-based for SMBs that don’t fully use the agent. HubSpot dropped Customer Agent’s per-conversation rate from $1.00 to $0.50 when they switched models, and removed the recurring monthly Prospecting Agent charge in favor of $1 per qualified lead. For a practice that runs 50 real customer-service interactions a month, the math now favors the customer in a way it did not before. The trial period is real and the rollback path is clean. This part is what HubSpot deserves credit for.
2. The deflection itself works on routine intake. On the same dental practice, the agent is genuinely handling new-patient inquiries, appointment-confirmation reschedules, and insurance pre-check questions. Those would otherwise tie up the front desk for 8–12 minutes per call. The win is real on a per-task basis. Practitioners are not arguing with that.
3. The shift to credits is operationally cleaner. HubSpot’s credit-based metering is easier to reconcile against actual usage than the per-seat alternative ever was. Finance teams at any business under 50 employees do not want to manage license counts on a tool whose adoption pattern is bursty. Credits map more cleanly to actual consumption.
What’s still broken
1. The customer cannot audit what counts as “resolved.” The April announcement defines resolution as the agent handling a conversation “without escalation,” which sounds clean and isn’t. In the Phoenix dental case, “resolved conversation” included every chat-widget exchange where the agent answered “what are your hours” — a query that should have been served from a static FAQ page and never billed at all. It also included sessions where the patient asked a clinical question, got the standard “I’m an AI assistant, please call our office” response, and dropped off. By the metric, that is “resolved.” By any practitioner definition of resolution, it is a deflection in the wrong direction.
2. The vendor controls both the agent and the eval. This is the structural issue. When HubSpot’s agent decides whether a conversation was resolved, HubSpot is grading its own homework against a metric that determines what HubSpot gets paid. The dental practice has no API access to the underlying transcripts at the resolution-classifier level; they have access only to the rolled-up usage report. The eval is a black box and the bill is the output of the eval. Anyone who has worked on production ML eval systems knows what that asymmetry produces over time: drift toward higher cost for the customer.
3. The “qualified lead” definition is even softer. Prospecting Agent now charges $1 per lead recommended for outreach. What counts as qualified is determined by the agent’s own scoring model. A 25-person home services company we spoke to last week reported that 40% of “qualified leads” handed off in March were existing customers re-emailing about open tickets. They were charged $1 per re-classification. The pattern repeats: the metric the customer pays on is computed by the system the customer is paying for.
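An independent check on the "resolved" count could look something like the sketch below. It assumes the buyer can get conversation transcripts out of the platform in some form, which, as noted above, is not a given. The `Conversation` record, its fields, and both regex patterns are hypothetical stand-ins, not HubSpot's export format or classifier; a real practice would tune the patterns against its own transcripts.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Conversation:
    """One billed conversation from a hypothetical transcript export."""
    conversation_id: str
    first_customer_message: str
    last_agent_message: str
    customer_dropped_off: bool  # no customer reply after the last agent message


# Hypothetical patterns for illustration only.
FAQ_PATTERN = re.compile(r"\b(hours|address|location|parking|directions)\b", re.I)
HANDOFF_PATTERN = re.compile(r"\b(please call|call our office)\b", re.I)


def classify(conv: Conversation) -> str:
    """Re-label a billed conversation using the buyer's own definition."""
    if HANDOFF_PATTERN.search(conv.last_agent_message) and conv.customer_dropped_off:
        return "handoff_then_dropoff"
    if FAQ_PATTERN.search(conv.first_customer_message):
        return "faq_shaped"
    return "plausibly_resolved"


def audit(conversations: list[Conversation], rate: float = 0.50) -> dict:
    """Count labels and total the charges the buyer might dispute."""
    counts = Counter(classify(c) for c in conversations)
    disputed = counts["handoff_then_dropoff"] + counts["faq_shaped"]
    return {
        "counts": dict(counts),
        "billed_usd": len(conversations) * rate,
        "disputed_usd": disputed * rate,
    }


sample = [
    Conversation("c1", "What are your hours on Saturday?",
                 "We are open 8 to 2 on Saturdays.", False),
    Conversation("c2", "My gum is bleeding after the extraction, is that normal?",
                 "I'm an AI assistant, please call our office.", True),
    Conversation("c3", "Can I move my cleaning to next Tuesday?",
                 "Done, you are booked for Tuesday at 10.", False),
]
report = audit(sample)
print(report)
```

Keyword matching is deliberately crude here. The point is only that the buyer applies a second definition of "resolved" to the same conversations and compares the two totals, rather than accepting the rolled-up usage report as the only count.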
The pattern
Three traits show up in every outcome-based AI pricing rollout I’ve audited this quarter, regardless of vendor.
It saves money against seat-based pricing for low-volume customers and costs more for high-volume ones. The crossover for HubSpot Customer Agent lands around 200 resolved conversations per month for an SMB. Below that, the new model wins. Above that, it is meaningfully more expensive than the prior $50/month flat tier on a like-for-like basis. The vendor knows where the crossover is. The customer does not, until the bill arrives.
The customer pays per unit on a metric they can’t independently verify. This is not unique to HubSpot — Salesforce’s Agentforce 2.0 pricing, Intercom’s Fin, and Zendesk’s resolution-billing model all share the same shape. The eval is owned by the seller. The integrity of the pricing depends entirely on the integrity of the eval, and there is no third-party audit standard for any of it.
The “outcome” is defined by the vendor’s success metric, not the buyer’s business outcome. A “resolved conversation” is not the same thing as a “satisfied patient” or a “kept appointment.” A “qualified lead” is not the same thing as “a contract signed.” The pricing model implies a contract about the buyer’s outcome and bills against the vendor’s metric. SMBs that don’t already have an analytics function set up to measure their own conversion outcomes will not catch the gap.
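The crossover in the first trait reduces to a one-line break-even: the flat fee divided by the per-resolution rate. The sketch below uses the $0.50 rate and the $50/month flat tier quoted above as default inputs; the function names are illustrative. On those bare inputs the break-even works out to 100 resolutions a month, so the roughly 200 figure cited above presumably reflects like-for-like adjustments that are not itemized here. Treat `flat_fee` as a parameter to set from your own prior invoice.

```python
def outcome_cost(resolutions: int, rate: float = 0.50) -> float:
    """Monthly cost under per-resolution billing."""
    return resolutions * rate


def break_even_resolutions(flat_fee: float, rate: float) -> float:
    """Monthly volume at which per-resolution billing equals the flat fee."""
    return flat_fee / rate


FLAT_FEE = 50.00  # USD per month, the prior flat tier quoted in the article
RATE = 0.50       # USD per resolved conversation

print(f"break-even: {break_even_resolutions(FLAT_FEE, RATE):.0f} resolutions/month")

# Compare the two models at a few monthly volumes taken from the article.
for volume in (50, 200, 1_847):
    cost = outcome_cost(volume, RATE)
    cheaper = "outcome-based" if cost < FLAT_FEE else "flat tier"
    print(f"{volume:>5} resolutions: ${cost:>8.2f} vs ${FLAT_FEE:.2f} flat -> {cheaper}")
```

Running this against last month's real volume, before switching models, is how the buyer finds the crossover before the bill does.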
The Phoenix practice is fine, by the way. We dropped their chat widget from the agent surface area, exempted FAQ-shaped queries from the resolution definition through HubSpot’s filter rules, and the May bill came in at $94. The work that fixed it took 90 minutes and required someone who knew to look. Most six-chair practices don’t have that someone.
Outcome-based AI pricing is not a scam. It is a contract whose enforceability depends on the buyer being technically literate enough to audit the seller’s eval. Most SMBs aren’t, and the vendors building these models know it.