A KB by itself is just an index. The interesting questions are about how the agent uses it: when, with what query, and what to do when it misses.

Implicit vs explicit retrieval

Mode | When it triggers | Configuration
Implicit | Every user turn; the platform retrieves from all attached KBs | Attach KB in the Knowledge section
Explicit | Only when the LLM decides to call the knowledge_base tool | Add a tool in the Abilities section
Implicit retrieval is the default and is sufficient for ~80% of agents. Explicit retrieval is for cases where you want:
  • The agent to announce it’s looking something up (“Let me check on that”)
  • Different KBs queried for different question types
  • Different top_n values for different question types
You can combine them: attach a KB implicitly and expose an explicit tool against the same KB. The implicit retrieval covers casual questions; the explicit tool fires when the agent needs to make a deliberate lookup.
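
The combined setup can be pictured with a small Python model. This is an illustrative sketch, not the platform's implementation: `KnowledgeBase`, `ToolCall`, `build_context`, and the word-overlap scoring are all hypothetical stand-ins (a real KB would rank by embedding similarity).

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnowledgeBase:
    chunks: list[str]

    def search(self, query: str, top_n: int) -> list[str]:
        # Toy relevance score: number of words shared with the query.
        # Assumption: stands in for the platform's similarity search.
        words = set(query.lower().split())
        scored = [(len(words & set(c.lower().split())), c) for c in self.chunks]
        hits = sorted((sc for sc in scored if sc[0] > 0),
                      key=lambda sc: sc[0], reverse=True)
        return [c for _, c in hits[:top_n]]


@dataclass
class ToolCall:
    query: str  # query text written by the LLM
    top_n: int  # result count chosen for this kind of question


def build_context(user_text: str, kb: KnowledgeBase,
                  tool_call: Optional[ToolCall] = None) -> dict:
    # Implicit retrieval: always runs, using the raw user turn as the query.
    context = {"implicit": kb.search(user_text, top_n=5), "explicit": []}
    # Explicit retrieval: runs only if the LLM decided to call the tool.
    if tool_call is not None:
        context["explicit"] = kb.search(tool_call.query, tool_call.top_n)
    return context
```

In this model the implicit path has no say over the query or the result count, while the explicit path lets the LLM supply both, which is the practical difference between the two modes.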

Prompt patterns

Strict grounding

When hallucinations are unacceptable (medical, legal, financial):
When the caller asks a factual question:
1. Use the knowledge base content provided in this turn's context.
2. If the answer is not in the knowledge base, say "I don't have that
   information — let me get someone who does" and use the
   transfer_to_human tool.
3. Never guess.

Permissive grounding

When approximate answers are useful and the cost of “I don’t know” is high:
Prefer the knowledge base for factual questions. If the knowledge base
doesn't cover it, you may answer from general knowledge — but make it
clear that you're not sure ("I'm not certain, but I believe...").

Quote-and-summarize

For technical or policy answers where exact wording matters:
When using knowledge base content, briefly summarize in your own words,
then offer to read the exact policy if the caller wants more detail.

Multi-KB selection

When you attach two or more KBs to an agent, the platform queries all of them every turn and merges results by similarity. The LLM doesn’t choose between KBs; it sees only the merged top_n results. If you want explicit per-question routing (“for product questions, query Product KB; for billing questions, query Billing KB”), use explicit knowledge_base tools instead of implicit attachment.
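
A minimal sketch of what merging by similarity could look like, in Python. The `Hit` type and `merge_hits` function are hypothetical names, and the sketch assumes similarity scores are directly comparable across KBs; the platform's actual merge logic is not documented here.

```python
from __future__ import annotations
import heapq
from typing import NamedTuple


class Hit(NamedTuple):
    kb: str       # name of the KB the chunk came from
    text: str     # chunk content
    score: float  # similarity score; higher means more relevant


def merge_hits(per_kb_hits: dict[str, list[Hit]], top_n: int = 5) -> list[Hit]:
    # Pool the hits from every attached KB, then keep the top_n by score.
    # The source KB plays no role in the ranking.
    all_hits = [hit for hits in per_kb_hits.values() for hit in hits]
    return heapq.nlargest(top_n, all_hits, key=lambda h: h.score)
```

Because ranking ignores which KB a chunk came from, a large KB with many moderately similar chunks can crowd out a smaller, more relevant one. That is the case where explicit per-KB tools are the better fit.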

Failure handling

Two failures matter:

1. The KB returns irrelevant chunks

The agent will likely hallucinate by stitching the irrelevant chunks into something that sounds confident. Mitigations:
  • Lower top_n so only the highest-confidence matches survive
  • Add a strict grounding instruction (above)
  • Improve the source content (see Retrieval tuning)

2. The KB query itself fails (rare)

Backend errors during retrieval cause the LLM to see an empty context for that turn. Configure:
  • A fallback prompt instruction: “If you don’t have any retrieved content, say ‘one moment’ and use the transfer tool.”
  • An actions_on_error setting on the explicit knowledge_base tool, if you’re using one
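
The fallback behavior described above can be sketched as a wrapper around the retrieval call. This is a hedged Python illustration only: `safe_retrieve` and the `search` callable are hypothetical, and the fallback action simply reuses the transfer_to_human tool name from the strict grounding prompt.

```python
import logging

logger = logging.getLogger("kb_retrieval")

# Assumption: the agent has a transfer tool with this name.
FALLBACK_ACTION = "transfer_to_human"


def safe_retrieve(search, query, top_n=5):
    """Return (chunks, action).

    On success, action is None. On a backend failure, chunks is empty
    and action names the fallback the agent should take.
    """
    try:
        return search(query, top_n), None
    except Exception:
        # Log the failure so it can be told apart from a genuine "no match".
        logger.exception("KB retrieval failed for query %r", query)
        return [], FALLBACK_ACTION
```

Returning an explicit action alongside the empty result keeps a backend error distinguishable from a query that simply matched nothing, which the LLM cannot tell apart from the context alone.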

Performance considerations

Implicit retrieval adds ~50-150ms to every turn. This is usually invisible. If you observe perceptible lag:
  • Reduce top_n (default 5 is usually fine; 3 may shave latency)
  • Use text-embedding-3-small rather than -large (smaller embedding queries are faster end-to-end)
  • Split a large KB into several smaller ones only if you then attach just the relevant ones to the agent, since every attached KB is queried on each turn

Worked example

A dental scheduler agent has:
  • Implicit KB: Dental FAQ (50 sources, structure-based chunking, top_n=5) — handles “do you take Aetna?”, “what hours are you open?”, “how long is a cleaning?”
  • Explicit tool: lookup_provider_info querying a Providers KB (3 sources per dentist) — fires when the caller asks a specific question about a specific dentist by name
The system prompt directs:
Use the lookup_provider_info tool only when the caller mentions a specific dentist by name and asks about their background or specialty. For all other factual questions, answer from the knowledge base content.
This split keeps generic FAQ retrieval fast and implicit, while reserving the explicit tool for high-value targeted lookups.
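
When reviewing test transcripts for this agent, it can help to write down which path you expect each utterance to take. The Python sketch below is a rough keyword approximation of the prompt rule for offline checks, not how the LLM actually decides; the dentist names and topic keywords are hypothetical.

```python
# Hypothetical dentists and topic keywords, for illustration only.
PROVIDER_NAMES = ("dr. patel", "dr. nguyen")
PROVIDER_TOPICS = ("background", "specialty", "specialize", "experience", "trained")


def expected_route(utterance: str) -> str:
    """Return the retrieval path the system prompt is meant to produce."""
    text = utterance.lower()
    names_dentist = any(name in text for name in PROVIDER_NAMES)
    asks_about_provider = any(topic in text for topic in PROVIDER_TOPICS)
    # Both conditions must hold, mirroring the "by name AND about their
    # background or specialty" wording of the prompt.
    if names_dentist and asks_about_provider:
        return "lookup_provider_info"
    return "implicit_kb"
```

Comparing these expectations against the tool calls the agent actually made is one way to spot a prompt that fires the explicit tool too eagerly or too rarely.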