Implicit vs explicit retrieval
| Mode | When it triggers | Configuration |
|---|---|---|
| Implicit | Every user turn; the platform retrieves from all attached KBs | Attach KB in the Knowledge section |
| Explicit | Only when the LLM decides to call the knowledge_base tool | Add a tool in the Abilities section |
Use explicit retrieval when you want:
- The agent to announce it’s looking something up (“Let me check on that”)
- Different KBs queried for different question types
- Different top_n values for different question types
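
The difference between the two modes comes down to when retrieval runs. The following is a minimal sketch of that control flow, not the platform's implementation: `Agent`, `build_turn_context`, `call_tool`, and the `Retriever` signature are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical retriever signature: (query, top_n) -> list of chunk texts.
Retriever = Callable[[str, int], List[str]]


@dataclass
class Agent:
    # KBs attached for implicit retrieval (assumed structure).
    implicit_kbs: Dict[str, Retriever] = field(default_factory=dict)
    # KBs exposed as explicit tools the LLM may call (assumed structure).
    explicit_tools: Dict[str, Retriever] = field(default_factory=dict)
    top_n: int = 5


def build_turn_context(agent: Agent, user_text: str) -> List[str]:
    """Implicit mode: every attached KB is queried on every user turn."""
    context: List[str] = []
    for retrieve in agent.implicit_kbs.values():
        context.extend(retrieve(user_text, agent.top_n))
    return context


def call_tool(agent: Agent, tool_name: str, query: str, top_n: int) -> List[str]:
    """Explicit mode: runs only when the LLM emits a tool call.

    Each call names one tool and can carry its own top_n.
    """
    return agent.explicit_tools[tool_name](query, top_n)
```

In this sketch, `build_turn_context` would run unconditionally before each LLM call, while `call_tool` would run only if the LLM's response requests it, which is what gives the agent room to say “Let me check on that” first.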
Prompt patterns
Strict grounding
When hallucinations are unacceptable (medical, legal, financial):
Permissive grounding
When approximate answers are useful and the cost of “I don’t know” is high:
Quote-and-summarize
For technical or policy answers where exact wording matters:
Multi-KB selection
When you attach two or more KBs to an agent, the platform queries all of them every turn and merges results by similarity. The LLM doesn’t choose between KBs; it sees only the merged top top_n results.
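
One way to picture the merge is as a single ranking across all attached KBs. The sketch below assumes each chunk carries a similarity score where higher means more relevant; `Chunk` and `merge_top_n` are hypothetical names, and the platform's actual scoring and tie-breaking are not specified here.

```python
import heapq
from typing import Dict, List, NamedTuple


class Chunk(NamedTuple):
    kb: str       # name of the KB the chunk came from
    text: str     # chunk content
    score: float  # similarity to the query; higher is better (assumed)


def merge_top_n(results_by_kb: Dict[str, List[Chunk]], top_n: int = 5) -> List[Chunk]:
    """Pool chunks from every KB and keep the top_n by similarity."""
    all_chunks = [c for chunks in results_by_kb.values() for c in chunks]
    return heapq.nlargest(top_n, all_chunks, key=lambda c: c.score)


results = {
    "Product": [Chunk("Product", "Plan tiers", 0.82), Chunk("Product", "Setup guide", 0.40)],
    "Billing": [Chunk("Billing", "Refund policy", 0.77), Chunk("Billing", "Invoice dates", 0.35)],
}
merged = merge_top_n(results, top_n=3)
print([c.text for c in merged])  # ['Plan tiers', 'Refund policy', 'Setup guide']
```

Note that the result mixes KBs freely: a strong Billing chunk can displace a weak Product chunk even on a product question, which is why per-question routing needs a different mechanism.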
If you want explicit per-question routing (“for product questions, query Product KB; for billing questions, query Billing KB”), use explicit knowledge_base tools instead of implicit attachment.
Failure handling
Two failures matter:
1. The KB returns irrelevant chunks
The agent will likely hallucinate by stitching the irrelevant chunks into something that sounds confident. Mitigations:
- Lower top_n so only the highest-confidence matches survive
- Add a strict grounding instruction (above)
- Improve the source content (see Retrieval tuning)
2. The KB query itself fails (rare)
Backend errors during retrieval cause the LLM to see an empty context for that turn. Configure:
- A fallback prompt instruction: “If you don’t have any retrieved content, say ‘one moment’ and use the transfer tool.”
- An actions_on_error on the explicit knowledge_base tool, if you’re using one
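
To see why the fallback instruction matters, it can help to model the failure path. This is a rough sketch under stated assumptions: `retrieve_or_empty` and `build_prompt` are hypothetical helpers, and the way the platform actually assembles the prompt is not documented here. The only part taken from the text is the fallback wording.

```python
from typing import Callable, List

# Fallback wording from the text above; its placement in the prompt is an assumption.
FALLBACK_INSTRUCTION = (
    "If you don't have any retrieved content, say 'one moment' "
    "and use the transfer tool."
)


def retrieve_or_empty(retrieve: Callable[[str], List[str]], query: str) -> List[str]:
    """Return retrieved chunks, or an empty list if the backend call fails."""
    try:
        return retrieve(query)
    except Exception:
        # A backend error leaves the LLM with an empty context for this turn.
        # Catching Exception broadly keeps the sketch short; real code would
        # catch the specific error types its retrieval client raises.
        return []


def build_prompt(base_prompt: str, chunks: List[str]) -> str:
    """Assemble the prompt so an empty context is explicit rather than silent."""
    context = "\n".join(chunks) if chunks else "(no retrieved content)"
    return f"{base_prompt}\n{FALLBACK_INSTRUCTION}\n\nRetrieved content:\n{context}"
```

With the instruction in place, an empty context gives the LLM a defined behavior to follow, so it has less reason to improvise an answer.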
Performance considerations
Implicit retrieval adds ~50-150 ms to every turn. This is usually invisible. If you observe perceptible lag:
- Reduce top_n (default 5 is usually fine; 3 may shave latency)
- Use text-embedding-3-small rather than -large (smaller embedding queries are faster end-to-end)
- Split into multiple smaller KBs only if attached to the agent
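
Before changing settings, it may be worth measuring whether retrieval is really the source of the lag. The helper below is a minimal timing sketch; `median_latency_ms` is a hypothetical name, and `retrieve` stands in for whatever call you use to query a KB, assumed here to take a query and a top_n.

```python
import statistics
import time
from typing import Callable, List


def median_latency_ms(
    retrieve: Callable[[str, int], List[str]],
    query: str,
    top_n: int,
    runs: int = 20,
) -> float:
    """Time repeated retrieval calls and return the median in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        retrieve(query, top_n)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)
```

Comparing the median at top_n=5 against top_n=3 on representative queries shows whether the smaller value buys a measurable improvement in your setup.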
Worked example
A dental scheduler agent has:
- Implicit KB: Dental FAQ (50 sources, structure-based chunking, top_n=5). Handles “do you take Aetna?”, “what hours are you open?”, “how long is a cleaning?”
- Explicit tool: lookup_provider_info querying a Providers KB (3 sources per dentist). Fires when the caller asks a specific question about a specific dentist by name
The agent’s prompt tells the LLM when to use the tool:
“Use the lookup_provider_info tool only when the caller mentions a specific dentist by name and asks about their background or specialty. For all other factual questions, answer from the knowledge base content.”
This split keeps generic FAQ retrieval fast and implicit, while reserving the explicit tool for high-value targeted lookups.

