AI models
Choosing your LLM
Compare the 13 available AI models by capability, context window, credits per reply, and ideal use cases.
The LLM is the “brain” of your agent: it determines how questions are understood, how the knowledge base is used, and what tone the replies take. IperChat gives you access to 13 models from OpenAI and Anthropic. Some are very fast and cheap; others reason deeply about complex problems. This page helps you pick the right one for your use case.
You set the model in the ops panel under “Agent configurations” (sidebar → Agents group). Open your origin and go to “Behavior” → “Model”. You can switch at any time: the new model applies to new conversations within 60 seconds.
Comparison table
| Model | Provider | Context | Credits / msg | Best for |
|---|---|---|---|---|
| GPT-5.5 | OpenAI | 128k | 7 | Deepest reasoning, specialist scenarios |
| Claude Opus 4.7 | Anthropic | 200k | 5 | Long documents, extended thinking |
| Claude Opus 4.6 | Anthropic | 200k | 5 | Opus alternative, slightly older |
| GPT-5.4 | OpenAI | 128k | 3 | Complex conversations, high accuracy |
| GPT-5.3 | OpenAI | 128k | 3 | Quality / cost balance |
| Claude Sonnet 4.6 | Anthropic | 200k | 3 | Large KBs, lead qualification |
| GPT-5.2 | OpenAI | 128k | 2 | Standard conversations with reasoning |
| GPT-5.2 Chat | OpenAI | 128k | 2 | Smooth chat, conversational tone |
| GPT-5 Mini | OpenAI | 128k | 1 | Customer support, high volume |
| Claude Haiku 4.5 | Anthropic | 200k | 1 | Fast replies on large KBs |
| GPT-4.1 Mini | OpenAI | 128k | 1 | Cheap generalist |
| GPT-4o | OpenAI | 128k | 1 | Battle-tested, multilingual |
| GPT-4o Mini | OpenAI | 128k | 1 | Maximum volume, minimum cost |
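Since each reply costs a fixed number of credits, you can get a rough monthly estimate by multiplying your expected reply volume by the rate in the table. The sketch below is only an illustration: the `CREDITS_PER_MSG` values are copied from the table, while the function name and the sample volume are assumptions made for this example, and it presumes every reply is billed at the flat listed rate.

```python
# Credits per message, copied from the comparison table above.
CREDITS_PER_MSG = {
    "GPT-5.5": 7,
    "Claude Opus 4.7": 5,
    "Claude Opus 4.6": 5,
    "GPT-5.4": 3,
    "GPT-5.3": 3,
    "Claude Sonnet 4.6": 3,
    "GPT-5.2": 2,
    "GPT-5.2 Chat": 2,
    "GPT-5 Mini": 1,
    "Claude Haiku 4.5": 1,
    "GPT-4.1 Mini": 1,
    "GPT-4o": 1,
    "GPT-4o Mini": 1,
}


def monthly_credits(model: str, replies_per_month: int) -> int:
    """Estimate monthly credit usage, assuming a flat per-reply rate."""
    if replies_per_month < 0:
        raise ValueError("replies_per_month must not be negative")
    return CREDITS_PER_MSG[model] * replies_per_month


if __name__ == "__main__":
    replies = 10_000  # hypothetical monthly volume, not a real figure
    for model in ("GPT-5 Mini", "Claude Sonnet 4.6", "GPT-5.5"):
        print(f"{model}: {monthly_credits(model, replies):,} credits")
```

At the same volume, a 7-credit model uses seven times the credits of a 1-credit model, so it is worth checking whether the extra capability is needed before moving up a tier.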
Tier 1 — Most capable (5–7 credits)
The premium models. Use them when accuracy matters more than per-message cost: regulated industries, specialist consulting, scenarios where a wrong answer is expensive.
GPT-5.5
OpenAI’s most capable model in our catalog. Configurable reasoning effort for problems that require multi-step thinking.
Best for:
- Legal, healthcare, financial consulting
- Multi-step reasoning over complex procedures
- Analytical summaries of technical documents
- Industries where a hallucination is costly
Claude Opus 4.7
Anthropic’s flagship with adaptive extended thinking: simple questions get an instant reply, complex ones trigger a “thinking” pass before answering. 200k-token context window — ideal for very large knowledge bases.
Best for:
- Questions that span many documents at once
- Analysis of contracts, reports, technical filings
- Conversations that need long-range context memory
- Natural, considered tone
Claude Opus 4.6
Previous Opus generation, kept available for setups already tuned on this version. Same characteristics as Opus 4.7 (200k context, extended thinking, 5 credits) with slightly less polished output.
Best for:
- Established configurations on 4.6 (for stability)
- A/B comparisons across Opus generations
Tier 2 — Balanced (2–3 credits)
The sweet spot for most production agents. Solid response quality, sustainable cost even at medium-to-high volume.
Claude Sonnet 4.6
Sonnet is Anthropic’s workhorse: 200k context, extended thinking on demand, moderate cost. Default choice for agents that work over large knowledge bases.
Best for:
- Large knowledge bases (100+ documents)
- Lead qualification with detailed criteria
- Multilingual agents with a polished tone
- Booking and conversational flow management
GPT-5.4
Recent OpenAI model with reasoning effort. Good quality/price tradeoff — great for conversations that need precision but not maximum analytical depth.
Best for:
- Complex conversations at moderate cost
- Questions requiring precise references to the KB
- Technical agents (IT, engineering, software)
GPT-5.3
The generation before GPT-5.4, at the same 3 credits and with similar capabilities. Worth considering if you are running comparison tests.
Best for:
- A/B tests across GPT-5 variants
- Configurations already stable on 5.3
GPT-5.2
Reasoning effort enabled at 2 credits per message. Cheaper than 5.3 and 5.4 while keeping reasoning capability.
Best for:
- Standard conversations with light reasoning
- Medium volume on a tight budget
GPT-5.2 Chat
Chat-tuned variant of 5.2: same cost (2 credits) but optimized for smooth dialogue and a conversational tone.
Best for:
- Customer-facing agents where tone matters
- Informal chats, communities, self-service support
- Small e-commerce with friendly interactions
Tier 3 — Fast & efficient (1 credit)
For high-volume use cases: standard customer support, repetitive FAQs, traffic deflection. Quick replies at low cost.
GPT-5 Mini
The mini version of GPT-5 with reasoning effort. Surprisingly capable for the price — a great default if you don’t want to overthink it.
Best for:
- Customer support over common FAQs
- Deflection of repetitive queries
- High traffic on a controlled budget
Claude Haiku 4.5
Anthropic’s fastest model, with the 200k context window typical of the Claude family. No extended thinking — it’s built for speed. Excellent for large knowledge bases that don’t require deep reasoning.
Best for:
- Large KBs with direct questions
- Multichannel customer support at high volume
- Fast replies over well-structured content
GPT-4.1 Mini
Classic OpenAI generalist. No reasoning, but supports temperature and top_p parameters if you want fine-grained control over response creativity.
Best for:
- Setups that need custom temperature
- Agents with very specific tone (creative or strict)
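To give an intuition for what these two parameters do, here is a generic sketch of how temperature and top_p are commonly applied when a language model picks its next word. It illustrates the general sampling technique only and is not IperChat's or OpenAI's actual implementation; the function names and the example scores are assumptions made for this illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the likeliest tokens until their combined probability
    reaches top_p, then renormalize what is left."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    mass = sum(p for _, p in kept)
    return {index: p / mass for index, p in kept}


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate words
    print(softmax_with_temperature(scores, 0.5))  # strict: top word dominates
    print(softmax_with_temperature(scores, 1.5))  # creative: flatter spread
    print(top_p_filter(softmax_with_temperature(scores, 1.0), 0.9))
```

In practice, a low temperature or a low top_p makes replies more predictable, while higher values allow more varied wording.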
GPT-4o
OpenAI’s battle-tested multimodal model. Excellent multilingual support, natural writing quality.
Best for:
- Multilingual agents with international users
- Cases where you want a “stable” model that’s been around for months
- Existing setups that already work well
GPT-4o Mini
The mini version of 4o: the cheapest model in the OpenAI pool. Maximum volume at minimum cost.
Best for:
- Demos, prototypes, test environments
- Very high traffic where budget beats perfection
- First-line agents doing triage only
How to choose
A quick guide to narrow the choice based on your use case:
- Customer support, fast FAQs, high traffic → Tier 3: Claude Haiku 4.5 or GPT-5 Mini.
- Lead qualification, booking, medium-complexity conversations → Tier 2: Claude Sonnet 4.6 or GPT-5.4.
- Regulated industries or specialist consulting (legal, healthcare, financial) → Tier 1: Claude Opus 4.7 or GPT-5.5.
- Very large knowledge base (100+ documents, long contracts) → Claude family (200k context): Sonnet 4.6 or Opus 4.7.
- English-language agent handling technical content → GPT-5.4 or Claude Sonnet 4.6.
- Demo, MVP, test traffic → GPT-4o Mini or GPT-5 Mini.
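The guide above can also be written as a simple lookup, which may be handy if you keep model choices in an internal checklist or script. This is a minimal sketch: the model names come from the list above, but the use-case keys and the `recommend` function are hypothetical names invented for this example and are not part of IperChat.

```python
# Recommendations copied from the "How to choose" list.
# The dictionary keys are hypothetical labels invented for this sketch.
RECOMMENDATIONS = {
    "support": ("Claude Haiku 4.5", "GPT-5 Mini"),
    "lead_qualification": ("Claude Sonnet 4.6", "GPT-5.4"),
    "regulated": ("Claude Opus 4.7", "GPT-5.5"),
    "large_kb": ("Claude Sonnet 4.6", "Claude Opus 4.7"),
    "technical_english": ("GPT-5.4", "Claude Sonnet 4.6"),
    "demo": ("GPT-4o Mini", "GPT-5 Mini"),
}


def recommend(use_case: str) -> tuple[str, ...]:
    """Return the suggested models for a use case, in the order listed."""
    try:
        return RECOMMENDATIONS[use_case]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown use case {use_case!r}; try one of: {known}") from None


if __name__ == "__main__":
    print(recommend("support"))
    print(recommend("regulated"))
```

Treat the result as a starting point: the same agent can be switched to another model later and compared on real conversations.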