Carrier Watch: How LLMs Are Starting to Manage Network Operations

LLMs are routing P1 tickets in production. Here's what that actually looks like.

The first time I saw a language model route a P1 incident ticket without human intervention, I assumed it was a demo environment. It wasn’t.

Tier-1 carriers are running LLM-assisted NOC workflows in production. Not as a showpiece—as infrastructure. The systems aren’t replacing engineers; they’re handling the classification and initial triage that previously ate the first 20 minutes of every incident.

What’s Actually Working

Ticket classification and routing. The pattern-matching problem that plagued rule-based systems largely disappears when you have a model that understands “CPE throughput degradation on GPON port 3” and “customer reporting slow Netflix” as potentially the same underlying issue.
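To make the routing step concrete, here is a minimal sketch of the glue code around the model, not any carrier's actual implementation. The queue names, the prompt wording, and the JSON reply shape are all assumptions for illustration; the model call itself is left out so the sketch runs offline. The point is that the model's reply is treated as untrusted input: anything that isn't a known queue falls back to human triage.

```python
import json

# Hypothetical queue names; a real NOC would load these from its ticketing system.
ALLOWED_QUEUES = {"access-network", "core-network", "billing-platform", "triage"}


def build_prompt(ticket_text: str) -> str:
    """Build a classification prompt that asks for a constrained JSON reply."""
    queues = ", ".join(sorted(ALLOWED_QUEUES))
    return (
        "Classify the following NOC ticket into exactly one queue.\n"
        f"Allowed queues: {queues}\n"
        'Reply with JSON only, in the form {"queue": "<name>"}.\n\n'
        f"Ticket: {ticket_text}"
    )


def parse_routing(raw_reply: str) -> str:
    """Validate the model's reply; anything unexpected goes to human triage."""
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        return "triage"
    queue = data.get("queue") if isinstance(data, dict) else None
    if isinstance(queue, str) and queue in ALLOWED_QUEUES:
        return queue
    return "triage"
```

In use, the prompt from build_prompt goes to whichever model the team runs, and the raw reply goes through parse_routing before anything touches the ticket. A malformed or made-up queue name never reaches the router.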

Runbook translation. Operators accumulate years of tribal knowledge in unstructured runbooks. LLMs can synthesize those into actionable steps during an incident, surfacing the right procedure without an engineer having to search.

Change window summarization. Before a maintenance window, the model generates a plain-language summary of what’s changing, what the rollback criteria are, and which customers are affected. Useful for executives. More useful for on-call engineers at 2am.

What’s Still Broken

Confidence calibration. The models don’t know what they don’t know. A P1 fiber cut and a P1 billing system outage require completely different responses, and right now the models need heavy prompting to distinguish operational from business-impact incidents.
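One common mitigation is to gate automation on a confidence score and send everything else to a person. The sketch below assumes the classifier returns a category and a score in [0, 1]; the category labels, destination names, and the 0.85 threshold are hypothetical. Because a model's self-reported confidence is often poorly calibrated, the threshold is something to tune against historical incidents rather than trust at face value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    category: str      # assumed labels: "operational" or "business-impact"
    confidence: float  # model-reported score in [0, 1]; not guaranteed to be calibrated


# Hypothetical destinations for each incident category.
DESTINATIONS = {
    "operational": "noc-oncall",
    "business-impact": "it-service-desk",
}


def route(result: Classification, threshold: float = 0.85) -> str:
    """Auto-route only confident, known categories; otherwise escalate to a human."""
    if result.confidence < threshold:
        return "human-triage"
    return DESTINATIONS.get(result.category, "human-triage")
```

With this in place, a confident "operational" result for the fiber cut goes straight to the NOC on-call, while a low-confidence guess on either incident type lands in front of an engineer.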

The real work is in the eval harness—building test suites against historical incidents to catch regressions before they hit production. That’s unglamorous but it’s where the reliability comes from.
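As a rough illustration of what such a harness can look like, the sketch below replays labeled historical incidents through a classifier and fails the run if accuracy drops below a baseline. The incident records, queue names, and the keyword_stub classifier are invented for the example; in practice the classifier would wrap the model call and the incidents would come from the ticketing archive.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class HistoricalIncident:
    ticket_text: str
    expected_queue: str  # queue where the incident was actually resolved


def evaluate(classify: Callable[[str], str],
             incidents: List[HistoricalIncident]) -> Dict[str, object]:
    """Replay historical incidents and report accuracy plus the misses."""
    misses = [i for i in incidents if classify(i.ticket_text) != i.expected_queue]
    accuracy = 1.0 - len(misses) / len(incidents) if incidents else 0.0
    return {"accuracy": accuracy, "misses": misses}


def passes_gate(report: Dict[str, object], baseline: float) -> bool:
    """Block a prompt or model change if accuracy regresses below the baseline."""
    return report["accuracy"] >= baseline


def keyword_stub(ticket_text: str) -> str:
    """Hypothetical stand-in for the model call so the sketch runs offline."""
    text = ticket_text.lower()
    if "fiber" in text or "gpon" in text:
        return "access-network"
    if "billing" in text:
        return "billing-platform"
    return "triage"


# Invented examples; real suites would be drawn from resolved tickets.
INCIDENTS = [
    HistoricalIncident("Fiber cut reported on metro ring", "access-network"),
    HistoricalIncident("CPE throughput degradation on GPON port 3", "access-network"),
    HistoricalIncident("Billing system outage, invoices failing", "billing-platform"),
    HistoricalIncident("Customer reporting slow Netflix", "access-network"),
]

report = evaluate(keyword_stub, INCIDENTS)
```

Here the keyword stub gets three of four right and misses the "slow Netflix" ticket, which is exactly the symptom-level case that rule-based matching struggles with. Keeping the misses in the report, not just the score, is what makes a regression debuggable.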


Next in Carrier Watch: Why 5G standalone core deployments are still 18 months behind every carrier’s press release.
