
Choosing your LLM model

Compare the 13 available AI models — capability, context window, credits per reply, and ideal use cases.

The LLM model is the “brain” of your agent: it decides how questions are understood, how the knowledge base is used, and what tone the replies take. IperChat gives you access to 13 models from OpenAI and Anthropic — some are very fast and cheap, others reason deeply on complex problems. This page helps you pick the right one for your use case.

You set the model in the ops panel under “Agent configurations” (sidebar → Agents group). Open your origin and go to “Behavior” → “Model”. You can switch any time: the new model takes effect within 60 seconds on new conversations.

Comparison table

Model               Provider    Context   Credits/msg   Best for
GPT-5.5             OpenAI      128k      7             Deepest reasoning, specialist scenarios
Claude Opus 4.7     Anthropic   200k      5             Long documents, extended thinking
Claude Opus 4.6     Anthropic   200k      5             Opus alternative, slightly older
GPT-5.4             OpenAI      128k      3             Complex conversations, high accuracy
GPT-5.3             OpenAI      128k      3             Quality / cost balance
Claude Sonnet 4.6   Anthropic   200k      3             Large KBs, lead qualification
GPT-5.2             OpenAI      128k      2             Standard conversations with reasoning
GPT-5.2 Chat        OpenAI      128k      2             Smooth chat, conversational tone
GPT-5 Mini          OpenAI      128k      1             Customer support, high volume
Claude Haiku 4.5    Anthropic   200k      1             Fast replies on large KBs
GPT-4.1 Mini        OpenAI      128k      1             Cheap generalist
GPT-4o              OpenAI      128k      1             Battle-tested, multilingual
GPT-4o Mini         OpenAI      128k      1             Maximum volume, minimum cost
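Credits per reply translate directly into monthly spend. As a back-of-the-envelope sketch (the traffic volumes below are invented for illustration; only the credit figures come from the table):

```python
# Estimate monthly credit consumption from expected traffic and the
# credits-per-reply column above. Reply volumes are illustrative.
CREDITS_PER_REPLY = {
    "GPT-5.5": 7,
    "Claude Opus 4.7": 5,
    "Claude Sonnet 4.6": 3,
    "GPT-5 Mini": 1,
}

def monthly_credits(model: str, replies_per_day: int, days: int = 30) -> int:
    """Total credits one agent consumes over a month."""
    return CREDITS_PER_REPLY[model] * replies_per_day * days

# At 200 replies/day, Sonnet costs 3x what GPT-5 Mini does:
# monthly_credits("Claude Sonnet 4.6", 200)  -> 18000
# monthly_credits("GPT-5 Mini", 200)         -> 6000
```

Running the numbers this way before switching tiers makes the quality/cost tradeoff concrete.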

Tier 1 — Most capable (5–7 credits)

The premium models. Use them when accuracy matters more than per-message cost: regulated industries, specialist consulting, scenarios where a wrong answer is expensive.

GPT-5.5

OpenAI’s most capable model in our catalog. Configurable reasoning effort for problems that require multi-step thinking.

Best for:

  • Legal, healthcare, financial consulting
  • Multi-step reasoning over complex procedures
  • Analytical summaries of technical documents
  • Industries where a hallucination is costly

Claude Opus 4.7

Anthropic’s flagship with adaptive extended thinking: simple questions get an instant reply, complex ones trigger a “thinking” pass before answering. 200k-token context window — ideal for very large knowledge bases.

Best for:

  • Questions that span many documents at once
  • Analysis of contracts, reports, technical filings
  • Conversations that need long-range context memory
  • Natural, considered tone
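To gauge whether your knowledge base actually fits a 200k-token window, a common rule of thumb is roughly 4 characters per token for English text. A quick estimate (the 4 chars/token ratio is a heuristic, not a real tokenizer, so treat the result as approximate):

```python
# Rough check: does a knowledge base fit a model's context window?
# Uses the common ~4 characters/token heuristic for English text,
# not an actual tokenizer, so results are estimates only.
def estimated_tokens(text: str) -> int:
    return len(text) // 4

def fits_context(documents: list[str], window: int = 200_000) -> bool:
    return sum(estimated_tokens(d) for d in documents) <= window

docs = ["x" * 100_000] * 6   # six ~25k-token documents (~150k total)
# fits_context(docs)                    -> True  (fits 200k Claude window)
# fits_context(docs, window=128_000)    -> False (overflows a 128k window)
```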

Claude Opus 4.6

Previous Opus generation, kept available for setups already tuned on this version. Same characteristics as Opus 4.7 (200k context, extended thinking, 5 credits) with slightly less polished output.

Best for:

  • Established configurations on 4.6 (for stability)
  • A/B comparisons across Opus generations

Tier 2 — Balanced (2–3 credits)

The sweet spot for most production agents. Solid response quality, sustainable cost even at medium-to-high volume.

Claude Sonnet 4.6

Sonnet is Anthropic’s workhorse: 200k context, extended thinking on demand, moderate cost. Default choice for agents that work over large knowledge bases.

Best for:

  • Large knowledge bases (100+ documents)
  • Lead qualification with detailed criteria
  • Multilingual agents with a polished tone
  • Booking and conversational flow management

GPT-5.4

Recent OpenAI model with reasoning effort. Good quality/price tradeoff — great for conversations that need precision but not maximum analytical depth.

Best for:

  • Complex conversations at moderate cost
  • Questions requiring precise references to the KB
  • Technical agents (IT, engineering, software)

GPT-5.3

The generation before GPT-5.4. Same 3 credits, similar capabilities; worth considering if you have active comparison tests.

Best for:

  • A/B tests across GPT-5 variants
  • Configurations already stable on 5.3

GPT-5.2

Reasoning effort enabled at 2 credits per message. Cheaper than 5.3 and 5.4 while keeping reasoning capability.

Best for:

  • Standard conversations with light reasoning
  • Medium volume on a tight budget

GPT-5.2 Chat

Chat-tuned variant of 5.2: same cost (2 credits) but optimized for smooth dialogue and a conversational tone.

Best for:

  • Customer-facing agents where tone matters
  • Informal chats, communities, self-service support
  • Small e-commerce with friendly interactions

Tier 3 — Fast & efficient (1 credit)

For high-volume use cases: standard customer support, repetitive FAQs, traffic deflection. Quick replies at low cost.

GPT-5 Mini

The mini version of GPT-5 with reasoning effort. Surprisingly capable for the price — a great default if you don’t want to overthink it.

Best for:

  • Customer support over common FAQs
  • Deflection of repetitive queries
  • High traffic on a controlled budget

Claude Haiku 4.5

Anthropic’s fastest model, with the 200k context window typical of the Claude family. No extended thinking — it’s built for speed. Excellent for large knowledge bases that don’t require deep reasoning.

Best for:

  • Large KBs with direct questions
  • Multichannel customer support at high volume
  • Fast replies over well-structured content

GPT-4.1 Mini

Classic OpenAI generalist. No reasoning, but supports temperature and top_p parameters if you want fine-grained control over response creativity.

Best for:

  • Setups that need custom temperature
  • Agents with very specific tone (creative or strict)
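For intuition on what those two knobs do: temperature rescales the model's token probabilities (lower values sharpen the distribution, higher values flatten it), while top_p keeps only the smallest set of most-probable tokens whose cumulative probability reaches p. A toy sketch over an invented three-token distribution (real models apply the same math over their full vocabulary):

```python
import math

# Toy illustration of temperature scaling and top_p (nucleus) filtering.
# The logits are invented for the example.
def apply_temperature(logits: dict[str, float], temperature: float) -> dict[str, float]:
    """Softmax over logits divided by the temperature."""
    scaled = {t: v / temperature for t, v in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    return {t: math.exp(v) / z for t, v in scaled.items()}

def top_p_filter(probs: dict[str, float], p: float) -> dict[str, float]:
    """Keep the most-probable tokens until cumulative mass reaches p, then renormalize."""
    kept, total = {}, 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[token] = prob
        total += prob
        if total >= p:
            break
    z = sum(kept.values())
    return {t: pr / z for t, pr in kept.items()}

logits = {"yes": 2.0, "maybe": 1.0, "no": 0.1}
cold = apply_temperature(logits, 0.2)    # nearly deterministic: "yes" dominates
hot = apply_temperature(logits, 2.0)     # flattened: all three tokens stay plausible
nucleus = top_p_filter(apply_temperature(logits, 1.0), 0.9)  # drops the "no" tail
```

Low temperature suits strict, factual agents; higher temperature or a looser top_p suits creative tone.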

GPT-4o

OpenAI’s battle-tested multimodal model. Excellent multilingual support, natural writing quality.

Best for:

  • Multilingual agents with international users
  • Cases where you want a “stable” model that’s been around for months
  • Existing setups that already work well

GPT-4o Mini

The mini version of 4o: the cheapest model in the OpenAI pool. Maximum volume at minimum cost.

Best for:

  • Demos, prototypes, test environments
  • Very high traffic where budget beats perfection
  • First-line agents doing triage only

How to choose

A quick guide to narrow the choice based on your use case:

  • Customer support, fast FAQs, high traffic → Tier 3: Claude Haiku 4.5 or GPT-5 Mini.
  • Lead qualification, booking, medium-complexity conversations → Tier 2: Claude Sonnet 4.6 or GPT-5.4.
  • Regulated industries or specialist consulting (legal, healthcare, financial) → Tier 1: Claude Opus 4.7 or GPT-5.5.
  • Very large knowledge base (100+ documents, long contracts) → Claude family (200k context): Sonnet 4.6 or Opus 4.7.
  • English agent on technical content → GPT-5.4 or Claude Sonnet 4.6.
  • Demo, MVP, test traffic → GPT-4o Mini or GPT-5 Mini.
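The guide above condenses naturally into a small lookup. A sketch only; the use-case keys are labels chosen here for illustration, not an official taxonomy:

```python
# Decision guide as a lookup table. The use-case keys are illustrative
# labels; the model pairings come from the guide above.
RECOMMENDATIONS = {
    "high-volume support":     ["Claude Haiku 4.5", "GPT-5 Mini"],
    "lead qualification":      ["Claude Sonnet 4.6", "GPT-5.4"],
    "regulated / specialist":  ["Claude Opus 4.7", "GPT-5.5"],
    "very large KB":           ["Claude Sonnet 4.6", "Claude Opus 4.7"],
    "technical content":       ["GPT-5.4", "Claude Sonnet 4.6"],
    "demo / MVP":              ["GPT-4o Mini", "GPT-5 Mini"],
}

def recommend(use_case: str) -> list[str]:
    """Candidate models for a use case; GPT-5 Mini is the page's suggested default."""
    return RECOMMENDATIONS.get(use_case, ["GPT-5 Mini"])
```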

Next steps