
# Retrieval tuning

> Improve answer quality by adjusting how the KB is queried.

Once you have a working KB, three levers control retrieval quality: **what** you index, **how** you chunk it, and **how many** chunks you pull at query time.

## The three levers

### 1. What you index

Garbage in, garbage out. Before tuning chunking or `top_n`, look at the actual chunks being retrieved (KB → source → view chunks) and ask:

* Is there boilerplate (nav, footers, copyright notices) competing with the real content?
* Are there outdated articles still in the index?
* Is the same fact present in multiple slightly-different versions, splitting the retrieval signal?

Cleaning up sources usually improves retrieval more than any chunking or `top_n` change.
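One way to spot boilerplate is to look for lines that recur across many chunks. The following is a minimal sketch, assuming you have copied the chunk texts out of the chunk view into a Python list. The function name, the 50% threshold, and the sample chunks are all illustrative and not part of the product.

```python
from collections import Counter


def find_repeated_lines(chunks: list[str], min_share: float = 0.5) -> list[str]:
    """Return lines that appear in at least `min_share` of the chunks.

    Lines repeated across many chunks are usually nav, footer, or
    copyright boilerplate rather than real content.
    """
    if not chunks:
        return []
    counts = Counter()
    for chunk in chunks:
        # Count each distinct (normalized) line once per chunk.
        lines = {line.strip().lower() for line in chunk.splitlines() if line.strip()}
        counts.update(lines)
    threshold = min_share * len(chunks)
    return sorted(line for line, n in counts.items() if n >= threshold)


# Made-up sample chunks, for illustration only.
chunks = [
    "Home | Products | Support\nAll products carry a 2-year limited warranty.",
    "Home | Products | Support\nReturns are handled by the support team.",
    "Home | Products | Support\nShipping options are listed at checkout.",
]
print(find_repeated_lines(chunks))  # → ['home | products | support']
```

Anything this flags is a candidate for removal from the source before re-indexing.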

### 2. How you chunk

See [Chunking and embeddings](/knowledge-bases/chunking-and-embeddings). Quick guide:

* Switch to **structure-based** chunking for any source where answers naturally live inside a single section or list.
* Lower the chunk size (e.g. 500 chars) if your content is dense and you want more focused retrieval.
* Raise the chunk size (e.g. 1500 chars) if answers span multiple sentences and the LLM keeps missing context.
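To get a feel for how chunk size changes what a single chunk contains, you can experiment offline. The sketch below is a simple sentence-packing splitter written for illustration. It is not the platform's chunker, and `chunk_by_size` is a hypothetical helper name.

```python
import re


def chunk_by_size(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` becomes its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Synthetic document: a 40-step process, one sentence per step.
doc = " ".join(f"Step {i}: do the thing number {i}." for i in range(1, 41))
small = chunk_by_size(doc, 500)
large = chunk_by_size(doc, 1500)
print(len(small), len(large))
```

Smaller chunks isolate individual steps, which sharpens matching. Larger chunks keep neighbouring steps together, which helps when an answer depends on surrounding context.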

### 3. How many chunks (`top_n`)

The default is `5`. Adjust it as follows:

| `top_n`       | Use when                                                                                         |
| ------------- | ------------------------------------------------------------------------------------------------ |
| `3`           | Content is highly precise and you don't want lower-ranked matches to distract the LLM            |
| `5` (default) | Most use cases                                                                                   |
| `8-10`        | Answers often require context from multiple chunks (e.g. a process spread over several sections) |
| `15+`         | Rarely useful; the LLM starts to lose focus                                                      |

Set `top_n` per agent attachment, or per explicit `knowledge_base` tool.
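If you keep a small set of test queries with a known correct chunk, you can pick `top_n` from data rather than by feel. The sketch below assumes you have already recorded, for each test query, the 1-based rank at which the correct chunk first appeared (or `None` if it never did). The helper name and the sample ranks are hypothetical.

```python
import math
from typing import Optional


def smallest_sufficient_top_n(
    first_hit_ranks: list[Optional[int]], coverage: float = 0.9
) -> Optional[int]:
    """Smallest top_n at which at least `coverage` of the queries retrieve their expected chunk.

    Returns None when the target cannot be reached at any depth, which
    points to a content or chunking problem rather than a top_n problem.
    """
    if not first_hit_ranks:
        return None
    needed = max(1, math.ceil(coverage * len(first_hit_ranks)))
    found = sorted(rank for rank in first_hit_ranks if rank is not None)
    if len(found) < needed:
        return None
    return found[needed - 1]


# Illustrative ranks for eight test queries; one query never found its chunk.
ranks = [1, 1, 2, 3, 3, 6, 9, None]
print(smallest_sufficient_top_n(ranks, coverage=0.75))  # → 6
```

A result well above 10 suggests fixing sources or chunking instead of raising `top_n` further.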

## Testing retrieval directly

Use the API to test queries without involving an agent:

```bash
curl -X POST https://api.anyreach.ai/knowledge-base/datasets/$DATASET_ID/query \
  -H "Authorization: Bearer $ANYREACH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what is the warranty period?",
    "results_count": 5
  }'
```

The response includes the matching chunks and their similarity scores. Use this to:

* Validate retrieval quality outside the agent loop
* Build automated regression tests for KB changes
* Wire KB queries into workflows directly via an HTTP step
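A regression check along these lines could look like the sketch below. It reuses the endpoint, headers, and request fields from the curl example. The response shape (`results`, `text`) is an assumption, since the exact schema is not shown here, so adjust the field names to what the API actually returns. `query_kb` and `failing_cases` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Callable

API_BASE = "https://api.anyreach.ai/knowledge-base/datasets"


def query_kb(dataset_id: str, token: str, query: str, results_count: int = 5) -> dict:
    """POST a query to the dataset query endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{API_BASE}/{dataset_id}/query",
        data=json.dumps({"query": query, "results_count": results_count}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def failing_cases(cases: dict[str, str], search: Callable[[str], dict]) -> list[str]:
    """Return the queries whose expected phrase is missing from every returned chunk.

    ASSUMPTION: the response looks like {"results": [{"text": "...", ...}]}.
    """
    failures = []
    for query, expected_phrase in cases.items():
        response = search(query)
        texts = [item.get("text", "") for item in response.get("results", [])]
        if not any(expected_phrase.lower() in text.lower() for text in texts):
            failures.append(query)
    return failures


# Example wiring (not executed here; needs real credentials):
#   import os
#   search = lambda q: query_kb(os.environ["DATASET_ID"], os.environ["ANYREACH_TOKEN"], q)
#   print(failing_cases({"what is the warranty period?": "2-year"}, search))
```

Passing the search function in as a parameter keeps the check itself testable without network access, and lets you run the same cases before and after a KB change.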

## Adding rewrites for hard cases

If callers ask a question in many different phrasings, add an FAQ-style rewrite source:

> **Q: How long is the warranty?**
> **A: All products carry a 2-year limited warranty from the purchase date. See \[warranty.md] for full details.**

This dense Q\&A chunk will match casual phrasings ("how long do I get to return", "is there a warranty", "what about defects") much better than the original policy doc.

A small "FAQ rewrites" source with 50-100 high-traffic Q\&As often outperforms ten times the volume of original docs.
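If your high-traffic questions live in a spreadsheet or ticketing export, a short script can turn them into a rewrite source in the format shown above. This is a sketch only: `render_faq_source` is a hypothetical helper, and separating pairs with a blank line is an assumption meant to help each pair land in its own chunk.

```python
def render_faq_source(pairs: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as one short, self-contained block per question."""
    blocks = []
    for question, answer in pairs:
        blocks.append(f"**Q: {question.strip()}**\n**A: {answer.strip()}**")
    # Blank line between pairs so each Q&A reads as its own unit.
    return "\n\n".join(blocks) + "\n"


pairs = [
    (
        "How long is the warranty?",
        "All products carry a 2-year limited warranty from the purchase date.",
    ),
]
print(render_faq_source(pairs))
```

Upload the rendered text as its own source so you can update the rewrites without touching the original docs.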

## Multi-KB strategies

When content spans clearly distinct domains, splitting into multiple KBs and attaching all of them to one agent works better than one large KB:

* The agent retrieves from each KB with the same `top_n`
* Each KB has more focused embeddings (less semantic crowding)
* You can swap one KB's content without recomputing others

Use this when domains are unambiguous (Product / Policy / Pricing). Don't use it if a single question could plausibly be answered from more than one KB: the LLM doesn't choose which KB to draw from, because every attached KB is queried.
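The behaviour described above can be modelled offline to see how much context a multi-KB agent ends up with. The sketch below is a simplified model and not the platform's implementation. `query_all` and the `search` callable are hypothetical, and you would supply your own function that queries one KB.

```python
from typing import Callable


def query_all(
    kb_ids: list[str],
    query: str,
    top_n: int,
    search: Callable[[str, str, int], list[dict]],
) -> list[dict]:
    """Query every KB with the same top_n and tag each chunk with the KB it came from."""
    combined = []
    for kb_id in kb_ids:
        # Every KB contributes up to top_n chunks, regardless of relevance to the query.
        for chunk in search(kb_id, query, top_n)[:top_n]:
            combined.append({**chunk, "kb": kb_id})
    return combined
```

With three KBs and `top_n` of 5, this model hands the LLM up to 15 chunks, so tagging each chunk with its KB makes it easy to see which KB actually contributes to good answers.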

## When tuning isn't enough

If you've tuned `top_n`, switched chunking, cleaned up sources, and still miss key answers, the underlying problem is usually:

1. The answer literally isn't in the source content. Add it.
2. The query's phrasing is so different from the doc that no embedding model bridges the gap. Add Q\&A rewrites.
3. Your agent's prompt is suppressing KB-grounded answers. Check the system prompt for instructions like "answer concisely", which can lead the LLM to skip retrieved context for the sake of brevity.
