Choosing your LLM

The LLM is the “brain” of your agent: it decides how questions are understood, how the knowledge base is used, and what tone the replies take. IperChat gives you access to 13 models from OpenAI and Anthropic. Some are very fast and cheap; others reason deeply on complex problems. This page helps you pick the right one for your use case.

You set the model in the ops panel under “Agent configurations” (sidebar → Agents group). Open your origin and go to “Behavior” → “Model”. You can switch at any time: the new model applies to new conversations within 60 seconds.

Comparison table

Model	Provider	Context	Credits / msg	Best for
GPT-5.5	OpenAI	128k	7	Deepest reasoning, specialist scenarios
Claude Opus 4.7	Anthropic	200k	5	Long documents, extended thinking
Claude Opus 4.6	Anthropic	200k	5	Opus alternative, slightly older
GPT-5.4	OpenAI	128k	3	Complex conversations, high accuracy
GPT-5.3	OpenAI	128k	3	Quality / cost balance
Claude Sonnet 4.6	Anthropic	200k	3	Large KBs, lead qualification
GPT-5.2	OpenAI	128k	2	Standard conversations with reasoning
GPT-5.2 Chat	OpenAI	128k	2	Smooth chat, conversational tone
GPT-5 Mini	OpenAI	128k	1	Customer support, high volume
Claude Haiku 4.5	Anthropic	200k	1	Fast replies on large KBs
GPT-4.1 Mini	OpenAI	128k	1	Cheap generalist
GPT-4o	OpenAI	128k	1	Battle-tested, multilingual
GPT-4o Mini	OpenAI	128k	1	Maximum volume, minimum cost
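
To see how the per-message prices add up, here is a minimal sketch that multiplies the credits from the table by a monthly message volume. It assumes one charge per agent reply, which is our reading of the “Credits / msg” column; the function name and the 10,000-message volume are hypothetical, chosen only for illustration.

```python
# Credits per message, copied from the comparison table above.
CREDITS_PER_MESSAGE = {
    "GPT-5.5": 7,
    "Claude Opus 4.7": 5,
    "Claude Opus 4.6": 5,
    "GPT-5.4": 3,
    "GPT-5.3": 3,
    "Claude Sonnet 4.6": 3,
    "GPT-5.2": 2,
    "GPT-5.2 Chat": 2,
    "GPT-5 Mini": 1,
    "Claude Haiku 4.5": 1,
    "GPT-4.1 Mini": 1,
    "GPT-4o": 1,
    "GPT-4o Mini": 1,
}


def monthly_credits(model: str, messages_per_month: int) -> int:
    """Estimate monthly credits, assuming one charge per agent reply."""
    if messages_per_month < 0:
        raise ValueError("messages_per_month must be non-negative")
    return CREDITS_PER_MESSAGE[model] * messages_per_month


if __name__ == "__main__":
    volume = 10_000  # hypothetical monthly volume
    for model in ("GPT-5.5", "Claude Sonnet 4.6", "GPT-5 Mini"):
        print(f"{model}: {monthly_credits(model, volume)} credits")
```

At the same volume, a Tier 1 model costs five to seven times as much as a Tier 3 model, which is why the tier choice matters most for high-traffic agents.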

Tier 1 — Most capable (5–7 credits)

The premium models. Use them when accuracy matters more than per-message cost: regulated industries, specialist consulting, scenarios where a wrong answer is expensive.

GPT-5.5

OpenAI’s most capable model in our catalog. Configurable reasoning effort for problems that require multi-step thinking.

Best for:

Legal, healthcare, financial consulting
Multi-step reasoning over complex procedures
Analytical summaries of technical documents
Industries where a hallucination is costly

Claude Opus 4.7

Anthropic’s flagship with adaptive extended thinking: simple questions get an instant reply, complex ones trigger a “thinking” pass before answering. 200k-token context window — ideal for very large knowledge bases.

Best for:

Questions that span many documents at once
Analysis of contracts, reports, technical filings
Conversations that need long-range context memory
Natural, considered tone
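
If you want a rough feel for what a 200k-token window means compared with 128k, the sketch below estimates token counts from character counts. It relies on the common rule of thumb of about 4 characters per token for English text, which is only an approximation; real tokenizers vary by model and language. It also assumes the text is placed in the context in full, which may not match how your agent retrieves knowledge base content. The function names and the reserve size are hypothetical.

```python
import math

# Context windows as listed in the comparison table.
CLAUDE_WINDOW = 200_000
OPENAI_WINDOW = 128_000

# Rough rule of thumb for English text; an assumption, not an exact figure.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a text from its length in characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_window(texts, window_tokens, reserve_tokens=8_000):
    """Check whether all texts fit, leaving room for the prompt and the reply.

    reserve_tokens is a hypothetical safety margin.
    """
    used = sum(estimate_tokens(t) for t in texts)
    return used + reserve_tokens <= window_tokens


if __name__ == "__main__":
    contract = "x" * 600_000  # stand-in for a document of about 600,000 characters
    print("Fits in 200k:", fits_in_window([contract], CLAUDE_WINDOW))
    print("Fits in 128k:", fits_in_window([contract], OPENAI_WINDOW))
```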

Claude Opus 4.6

Previous Opus generation, kept available for setups already tuned on this version. Same characteristics as Opus 4.7 (200k context, extended thinking, 5 credits) with slightly less polished output.

Best for:

Established configurations on 4.6 (for stability)
A/B comparisons across Opus generations

Tier 2 — Balanced (2–3 credits)

The sweet spot for most production agents. Solid response quality, sustainable cost even at medium-to-high volume.

Claude Sonnet 4.6

Sonnet is Anthropic’s workhorse: 200k context, extended thinking on demand, moderate cost. Default choice for agents that work over large knowledge bases.

Best for:

Large knowledge bases (100+ documents)
Lead qualification with detailed criteria
Multilingual agents with a polished tone
Booking and conversational flow management

GPT-5.4

Recent OpenAI model with reasoning effort. Good quality/price tradeoff — great for conversations that need precision but not maximum analytical depth.

Best for:

Complex conversations at moderate cost
Questions requiring precise references to the KB
Technical agents (IT, engineering, software)

GPT-5.3

The generation before GPT-5.4: same 3 credits, similar capabilities. Worth considering if you are running comparison tests or already have a setup tuned on it.

Best for:

A/B tests across GPT-5 variants
Configurations already stable on 5.3

GPT-5.2

Reasoning effort enabled at 2 credits per message. Cheaper than 5.3 and 5.4 while keeping reasoning capability.

Best for:

Standard conversations with light reasoning
Medium volume on a tight budget

GPT-5.2 Chat

Chat-tuned variant of 5.2: same cost (2 credits) but optimized for smooth dialogue and a conversational tone.

Best for:

Customer-facing agents where tone matters
Informal chats, communities, self-service support
Small e-commerce with friendly interactions

Tier 3 — Fast & efficient (1 credit)

For high-volume use cases: standard customer support, repetitive FAQs, traffic deflection. Quick replies at low cost.

GPT-5 Mini

The mini version of GPT-5 with reasoning effort. Surprisingly capable for the price — a great default if you don’t want to overthink it.

Best for:

Customer support over common FAQs
Deflection of repetitive queries
High traffic on a controlled budget

Claude Haiku 4.5

Anthropic’s fastest model, with the 200k context window typical of the Claude family. No extended thinking — it’s built for speed. Excellent for large knowledge bases that don’t require deep reasoning.

Best for:

Large KBs with direct questions
Multichannel customer support at high volume
Fast replies over well-structured content

GPT-4.1 Mini

Classic OpenAI generalist. No reasoning, but supports temperature and top_p parameters if you want fine-grained control over response creativity.

Best for:

Setups that need custom temperature
Agents with very specific tone (creative or strict)
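
For readers unfamiliar with these two parameters, the sketch below shows what temperature and top_p generally mean in text generation: temperature rescales the scores of candidate tokens before they become probabilities, and top_p keeps only the most likely tokens whose combined probability reaches the threshold. This is a generic, simplified illustration of the technique, not a description of how OpenAI or IperChat implement it internally; the function names and sample scores are made up.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw token scores into probabilities, scaled by temperature.

    Lower temperature sharpens the distribution (stricter replies);
    higher temperature flattens it (more varied replies).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, zero out the rest, and renormalize."""
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = set()
    cumulative = 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
    print("strict:  ", apply_temperature(scores, 0.5))
    print("creative:", apply_temperature(scores, 1.5))
    print("top_p:   ", top_p_filter([0.6, 0.3, 0.1], 0.8))
```

In practice, a low temperature suits agents that must stick closely to the knowledge base, while a higher one suits agents with a more creative tone.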

GPT-4o

OpenAI’s battle-tested multimodal model. Excellent multilingual support, natural writing quality.

Best for:

Multilingual agents with international users
Cases where you want a “stable” model that’s been around for months
Existing setups that already work well

GPT-4o Mini

The mini version of 4o and the lightest model in the OpenAI pool, billed at 1 credit like the rest of Tier 3. Built for maximum volume at minimum cost.

Best for:

Demos, prototypes, test environments
Very high traffic where budget beats perfection
First-line agents doing triage only

How to choose

A quick guide to narrow the choice based on your use case:

Customer support, fast FAQs, high traffic → Tier 3: Claude Haiku 4.5 or GPT-5 Mini.
Lead qualification, booking, medium-complexity conversations → Tier 2: Claude Sonnet 4.6 or GPT-5.4.
Regulated industries or specialist consulting (legal, healthcare, financial) → Tier 1: Claude Opus 4.7 or GPT-5.5.
Very large knowledge base (100+ documents, long contracts) → Claude family (200k context): Sonnet 4.6 or Opus 4.7.
English-language agent on technical content → GPT-5.4 or Claude Sonnet 4.6.
Demo, MVP, test traffic → GPT-4o Mini or GPT-5 Mini.
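
The guide above can also be written as a small lookup, which is handy if you manage several agents and want to document the choice for each. This is only a sketch: the use-case keys are hypothetical labels of our own, and the model pairs are copied from the list above in the same order.

```python
# Use-case keys are hypothetical labels; the model pairs come from the guide above.
RECOMMENDATIONS = {
    "support_high_traffic": ("Claude Haiku 4.5", "GPT-5 Mini"),
    "lead_qualification": ("Claude Sonnet 4.6", "GPT-5.4"),
    "regulated_or_specialist": ("Claude Opus 4.7", "GPT-5.5"),
    "very_large_kb": ("Claude Sonnet 4.6", "Claude Opus 4.7"),
    "technical_content": ("GPT-5.4", "Claude Sonnet 4.6"),
    "demo_or_test": ("GPT-4o Mini", "GPT-5 Mini"),
}


def recommend(use_case: str) -> tuple:
    """Return the two suggested models for a use case."""
    try:
        return RECOMMENDATIONS[use_case]
    except KeyError:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"Unknown use case {use_case!r}. Known: {known}") from None


if __name__ == "__main__":
    first, second = recommend("lead_qualification")
    print(f"Start with {first}; compare against {second} if needed.")
```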

