What 170,000 Users Taught Me About AI Trust at Scale
The number that changed how I think about AI wasn’t an accuracy metric or a latency benchmark. It was this: 78 percent of our customers preferred self-service, but only 9 percent could actually complete it successfully.
I’d been building AI-powered support systems at Cisco for years when that stat landed on my desk. We’d been measuring the wrong things. We’d been optimizing models, tuning retrieval pipelines, and celebrating chatbot deflection rates. Meanwhile, the vast majority of customers who wanted to help themselves were failing: silently, repeatedly, and at scale.
That gap between intent and success became the mission. And what I learned over the next three years, serving 170,000 users across dozens of products, changed my understanding of what AI systems actually need to work.
The light-bulb moment
Our first prototype lived inside Firewall Management Center. When disk usage spiked, instead of showing an error code and leaving the customer to figure it out, a guided fix appeared: context, steps, and a hand-off to a support case if needed. No screenshots to capture. No experience to retell. No friction.
A customer told us: “I didn’t have to leave the product to get help.”
That sentence sounds simple. It wasn’t. Behind it was a bet we’d made: that support shouldn’t be a destination customers travel to, but something woven into the product itself. The technical work to make that happen (contextual intelligence, dynamic overlays, real-time telemetry) was significant. But the real breakthrough wasn’t technical. It was the realization that customers don’t want “support.” They want progress. If you save them time and mental energy, they trust you. If you add friction, they leave, no matter how smart your AI is.
What scale breaks
Everything works differently at 170,000 users.
At 1,000 users, edge cases are rare. At 170,000, every edge case is a daily occurrence. The model that handles 95 percent of queries gracefully will fail dozens of times a day at scale, and each failure erodes trust for that specific customer in a way that aggregate accuracy numbers can’t capture.
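The arithmetic behind that is worth making explicit. A toy calculation sketches it; the per-user query volume here is an assumption for illustration, not our real traffic:

```python
# Illustrative arithmetic: why "95 percent accurate" still means a
# steady stream of failures at scale. All volumes are assumptions.

def daily_failures(active_users: int, queries_per_user_per_day: float,
                   success_rate: float) -> float:
    """Expected number of failed interactions per day."""
    total_queries = active_users * queries_per_user_per_day
    return total_queries * (1 - success_rate)

# At 1,000 users, a 95%-accurate system fails a handful of times a day.
print(daily_failures(1_000, 0.1, 0.95))    # ≈ 5 failures/day
# At 170,000 users, the same system fails hundreds of times a day.
print(daily_failures(170_000, 0.1, 0.95))  # ≈ 850 failures/day
```

The absolute numbers depend entirely on the assumed query volume; the point is that a fixed error rate multiplied by scale turns "rare" into "daily."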
We learned this the hard way. Our early GenAI integration, which plugged an LLM into support data for the Wireless Compatibility Matrix, answered confidently and wrongly. It was like watching a brilliant intern hallucinate. Support data isn’t prose; it’s logic. When you flatten structured relationships into text for retrieval, you lose meaning. We had to go back to the drawing board and teach the AI context, not just content.
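A minimal sketch shows the failure mode. The two-row compatibility matrix and the product names below are invented for illustration; this is not the real matrix or our retrieval pipeline:

```python
# Hypothetical compatibility data, kept as structured relationships.
compat = {
    ("AP-9120", "Controller-9800"): "supported",
    ("AP-9120", "Controller-5520"): "not supported",
}

# Flattened into prose for text retrieval, the rows blur together:
flattened = ("AP-9120 is supported on Controller-9800. "
             "AP-9120 is not supported on Controller-5520.")
# A keyword match for "AP-9120 Controller-5520 supported" hits this
# passage, and a model can plausibly answer "supported" even though
# the actual relationship is the opposite.

# Keeping the structure, the lookup is unambiguous:
def is_supported(ap: str, controller: str) -> str:
    return compat.get((ap, controller), "unknown")

print(is_supported("AP-9120", "Controller-5520"))  # → not supported
```

The structured lookup also knows when it doesn’t know: an unlisted pair returns "unknown" instead of a confident guess.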
But the deeper lesson about scale wasn’t about model accuracy. It was about trust architecture. When you serve 170,000 users, you need systems that fail gracefully, that know when they don’t know, and that hand off to humans without losing context. Confidence isn’t just a model output; it becomes a product feature. We started measuring confidence alongside resolution time because confidence is what drives loyalty.
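Confidence-gated routing with a context-preserving handoff can be sketched in a few lines. The threshold, field names, and `route` function here are illustrative assumptions, not the production design:

```python
# Sketch of confidence-gated routing: answer autonomously only when
# confident; otherwise hand off to a human with full context so the
# customer never retells their story. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Answer:
    text: str
    confidence: float          # model's self-reported confidence, 0.0-1.0
    context: dict = field(default_factory=dict)  # telemetry, steps tried

CONFIDENCE_FLOOR = 0.8  # below this, don't answer autonomously

def route(answer: Answer) -> str:
    if answer.confidence >= CONFIDENCE_FLOOR:
        return answer.text
    # Graceful failure: admit uncertainty, escalate, carry the context.
    return f"Escalating to support with context: {answer.context}"

print(route(Answer("Rotate the logs filling /var.", 0.92)))
print(route(Answer("Maybe reboot?", 0.41,
                   {"error": "disk_full", "steps_tried": 2})))
```

The threshold is a product decision, not a model parameter: raising it trades coverage for trust, which is exactly the trade-off worth making at scale.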
The behavior change nobody expected
After we’d been running embedded guidance on Cisco.com for about a year, with over 100,000 unique users, we noticed something we hadn’t predicted. Users stopped searching for help. They started expecting it to appear.
That behavioral shift was the moment I realized we’d crossed a threshold. We hadn’t just built a support tool. We’d changed what users believed a product should do for them. The bar had moved permanently.
One admin using Cyber Vision IoT put it in words I still think about: “For the first time, the product felt like it cared that I had a problem.”
That’s not a technical metric. It’s a trust signal. And it doesn’t come from model architecture; it comes from meeting customers in the moment of need, consistently, across enough interactions that they begin to rely on it.
What I’d tell you if you’re building this
Three things I wish I’d understood at the start:
First, the gap between “works in demo” and “works at scale” is almost entirely a trust gap. The demo shows the happy path. Scale exposes every failure mode, every edge case, and every moment where the system’s confidence doesn’t match the user’s expectation. Design for the failures, not just the successes.
Second, AI accuracy is necessary but not sufficient. Users don’t experience accuracy as a percentage. They experience it as “did this help me right now, with my specific problem, in my specific context?” A system that’s 95 percent accurate overall but wrong for your problem has failed you completely.
Third, the cultural shift is harder than the technical shift. Making support “part of the product DNA,” where engineering teams think about guided experiences alongside features, required changing how teams think about their responsibility to customers. That took longer than any architecture decision.
We started this journey asking “where do customers go to find support?” We ended up asking a different question entirely: “What if support found them first?”
The answer turned out to be less about AI and more about trust. Build systems that are honest about their limitations, graceful in their failures, and relentless about meeting people in the moment of need. Everything else is implementation detail.