
The world of AI models moves fast. And for customer experience leaders, the landscape can feel like a confusing mix of buzzwords, benchmarks, and bake-offs. With dozens of AI models available, each with different strengths and costs, how do you know which one to use for what? Do you build, buy or blend?
Truth is, there’s no one-size-fits-all AI model for customer experience, and any partner who tells you otherwise is selling simplicity, not strategy. It's about matching the right model to the right job and understanding that in customer experience, you'll likely need several models working together.
That’s where this decision framework comes in.
A framework for model selection
GPT-4o might excel at complex reasoning, but it's overkill (and expensive) for routing support tickets. Claude might be brilliant at nuanced conversations, but you don't need that sophistication to detect spam in contact forms.
Customer experience is actually dozens of micro-experiences, each with different requirements for speed, accuracy, cost, and complexity. So consider these four lenses when deciding which model to run with:
Scenario 1 – Complexity vs speed requirements
High Complexity, Speed Tolerant
Use case: complex customer inquiries, detailed product recommendations, multi-step problem-solving
Example scenario: A telecoms company's enterprise support team deals with complex network configuration issues. Customers might say: "Our VPN keeps dropping when employees connect from the London office, but only on Tuesdays between 2-4 PM." This requires understanding technical context, temporal patterns, and suggesting multi-step solutions.
Model rationale: These scenarios need models that can reason through complex, multi-variable problems. The extra processing time (2-5 seconds) is acceptable because the alternative is a frustrated enterprise customer and a potentially lost account worth millions.
Model choice: GPT-4o, Claude 3.5 Sonnet, Gemini Pro

Low Complexity, Speed Critical
Use case: intent classification, sentiment analysis, basic routing, spam detection
Example scenario: A bank's mobile app needs to instantly categorise customer messages: Is this a transaction dispute, account question, or general inquiry? This happens thousands of times per hour and needs sub-second response times.
Model rationale: A fine-tuned classification model can achieve 95%+ accuracy on this task in under 100ms, versus 2-3 seconds for a large language model. When you're processing 50,000 messages daily, those seconds add up to poor user experience and higher infrastructure costs.
Model choice: Smaller specialised models, fine-tuned BERT variants, or even traditional ML
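To make the speed-critical side concrete, here is a minimal sketch of fast intent routing. A keyword-matching classifier stands in for a fine-tuned model, and the labels and keywords are illustrative, not a production taxonomy; the point is that this kind of routing runs in microseconds with no LLM call at all.

```python
# Lightweight stand-in for a fine-tuned intent classifier.
# Keywords and labels are illustrative assumptions only.
INTENT_KEYWORDS = {
    "transaction_dispute": {"dispute", "unauthorised", "chargeback", "refund"},
    "account_question": {"password", "login", "statement", "balance"},
    "general_inquiry": set(),  # fallback when nothing matches
}

def classify_intent(message: str) -> str:
    """Return the intent label with the most keyword hits."""
    tokens = set(message.lower().split())
    best_label, best_score = "general_inquiry", 0
    for label, keywords in INTENT_KEYWORDS.items():
        score = len(tokens & keywords)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(classify_intent("I want to dispute an unauthorised charge"))
# -> transaction_dispute
```

A real deployment would swap the keyword lookup for a fine-tuned BERT-style model, but the routing shape stays the same: cheap, deterministic, and fast enough to run on every message.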
Scenario 2 – Accuracy vs cost
High Stakes, Accuracy Critical
Use case: financial advice, medical guidance, legal compliance, fraud detection
Example scenario: An insurance company's claims processing system needs to flag potentially fraudulent submissions. A false negative (missing fraud) could cost hundreds of thousands of dollars. A false positive (flagging legitimate claims) damages customer relationships.
Model rationale: This calls for multiple models voting on decisions, with human oversight for edge cases. You might use a specialised fraud detection model for initial screening, a large language model to analyse claim narratives for inconsistencies, and computer vision models to verify damage photos.
Model choice: Ensemble approach and human-in-the-loop workflows

Lower Stakes, Cost Sensitive
Use case: content generation, basic personalisation, routine comms
Example scenario: An e-commerce platform wants to generate personalised email subject lines for millions of customers daily. The impact of a mediocre subject line is low, but the volume is massive.
Model rationale: A smaller, specialised model trained on your email performance data will outperform a general-purpose model at a fraction of the cost. Even if it's only 80% as "creative" as GPT-4, the cost difference makes it the right choice.
Model choice: Smaller, more efficient models, open-source alternatives
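The ensemble-plus-human-oversight pattern for the claims example can be sketched as follows. The three scorers are stubs; in practice they would be a specialised fraud model, an LLM checking the claim narrative, and a vision model verifying photos. The thresholds and vote counts are illustrative assumptions.

```python
# Stub scorers standing in for three real models.
def fraud_model_score(claim: dict) -> float:
    # Stub: flag claims well above the policy's typical payout.
    return 0.9 if claim["amount"] > 50_000 else 0.2

def narrative_score(claim: dict) -> float:
    # Stub: an LLM would look for inconsistencies in the written account.
    return 0.8 if "inconsistent" in claim["narrative"] else 0.1

def photo_score(claim: dict) -> float:
    # Stub: a vision model would verify the damage photos.
    return 0.7 if not claim["photos_verified"] else 0.1

def triage(claim: dict) -> str:
    """Majority vote, with disagreement escalated to a human."""
    votes = sum(score(claim) > 0.5
                for score in (fraud_model_score, narrative_score, photo_score))
    if votes >= 2:
        return "investigate"   # strong agreement: likely fraud
    if votes == 1:
        return "human_review"  # models disagree: a person decides
    return "auto_approve"

claim = {"amount": 80_000, "narrative": "details are inconsistent",
         "photos_verified": True}
print(triage(claim))  # -> investigate
```

The design choice worth noting: disagreement between models is treated as a signal in itself, and that is exactly where the human in the loop earns their keep.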
Scenario 3 – Data sensitivity vs privacy
Highly Sensitive Data
Use case: personal financial information, health records, internal communications
Example scenario: A healthcare provider wants to use AI to help doctors draft patient notes faster. The AI needs to understand medical terminology and suggest relevant information, but patient data cannot leave the organisation's infrastructure.
Model rationale: This requires either deploying open-source models on-premises (like Llama 2) or using private cloud instances of commercial models with strict data handling agreements. The slight performance trade-off is worth the regulatory compliance and patient trust.
Model choice: On-premises deployment, private cloud instances, models with strong privacy guarantees

Less Sensitive Data
Use case: public website interactions, general customer enquiries, marketing content
Example scenario: A retail company's website chatbot helps customers find products and answers basic questions about shipping and returns.
Model rationale: Standard API calls to commercial models work fine here. The data isn't particularly sensitive, and the convenience and performance of cloud-based models outweigh privacy concerns.
Model choice: Cloud-based APIs, shared infrastructure
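In practice, many teams run both tiers at once and route each request by sensitivity. A minimal sketch of that routing rule is below; the endpoint URLs and the list of sensitive fields are hypothetical placeholders, not real services.

```python
# Route requests to an on-premises model when they touch sensitive data,
# and to a cloud API otherwise. Field names and URLs are illustrative.
SENSITIVE_FIELDS = {"patient_id", "diagnosis", "account_number"}

ON_PREM_ENDPOINT = "https://llm.internal.example/v1/generate"  # hypothetical
CLOUD_ENDPOINT = "https://api.provider.example/v1/generate"    # hypothetical

def choose_endpoint(payload: dict) -> str:
    """Sensitive payloads must never leave the organisation's infrastructure."""
    if SENSITIVE_FIELDS & payload.keys():
        return ON_PREM_ENDPOINT
    return CLOUD_ENDPOINT

print(choose_endpoint({"patient_id": "P-1234", "note": "draft visit summary"}))
# -> https://llm.internal.example/v1/generate
```

The rule is deliberately conservative: one sensitive field in the payload is enough to keep the whole request on-premises.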
Scenario 4 – Integration complexity
Complex Integration Needs
Use case: enterprise systems, real-time processing, custom workflows
Example scenario: A logistics company wants AI that can process customer shipping requests, check inventory across multiple warehouses, coordinate with carrier APIs, and send automated updates, all in real-time.
Model rationale: This requires custom integration work, possibly combining multiple models with traditional systems. You might use a language model to parse customer requests, a recommendation system to optimise routing, and rule-based systems for final validation.
Model choice: Custom implementations, hybrid architectures, specialised tooling

Simple Integration Needs
Use case: standalone applications, proof-of-concepts, simple API calls
Example scenario: A content marketing team wants to automatically generate social media captions from their blog posts. They need something that can take a blog URL, extract key points, and create 3-4 caption variations optimised for different platforms, e.g. LinkedIn and Instagram.
Model rationale: This is a straightforward content task that can be handled with a simple API integration. A tool like GPT-3.5 through a no-code platform like Zapier can automate this workflow without any custom development. The team gets immediate value without technical complexity.
Model choice: Ready-to-use APIs, plug-and-play solutions
Making the decision
Most enterprise customer experience solutions use multiple models working together, but the decision will come down to key questions such as:
- How many interactions will this handle? And at what frequency?
- What’s the acceptable response time?
- How much accuracy do you need?
- What’s the cost of getting it wrong?
- Where does the data need to be processed?
- What systems might we need to integrate with?
- What’s the total cost of ownership?
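The questions above can be folded into a rough first-pass heuristic. The tiers and thresholds below are illustrative assumptions, mapping each answer back to one of the four scenarios; any real decision would weigh these factors with far more nuance.

```python
# Rough first-pass heuristic mapping the checklist answers to a model tier.
# Thresholds and tier names are illustrative assumptions only.
def recommend_tier(daily_volume: int, max_latency_ms: int,
                   error_cost_high: bool, data_sensitive: bool) -> str:
    if data_sensitive:
        return "on-prem or private-cloud deployment"   # Scenario 3
    if error_cost_high:
        return "ensemble with human-in-the-loop"       # Scenario 2
    if max_latency_ms < 500 or daily_volume > 10_000:
        return "small fine-tuned model"                # Scenario 1
    return "standard cloud API"                        # default

print(recommend_tier(daily_volume=50_000, max_latency_ms=100,
                     error_cost_high=False, data_sensitive=False))
# -> small fine-tuned model
```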
There are hundreds of AI models out there. But most CX teams don't need hundreds; they just need the right combination of models, stitched into a product experience that actually works.
The companies winning with AI in customer experience aren't using the fanciest models. They're using the right models for each job, creating systems that are fast, reliable, and cost-effective at scale. They’re also working with partners who aren’t just “AI fluent”, but who know how to ship.
At the end of the day, users don’t care which model you used. They care that it works and that it works for them.
For those looking to take the next step on their AI journey, here's how we implement AI that delivers.
For more insights around what's trending in AI and CX right now, visit our FutureCX Trends Hub.