- Sync content from your CMS on a schedule
- Build pattern-based ingestion (one KB per product, populated from canonical URLs)
- Run KB queries from workflows without going through an agent
All endpoints in this guide share the /knowledge-base prefix.
Create a KB
The response includes the dataset_id you’ll use for all subsequent calls. (In the API and database, a knowledge base is called a dataset.)
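Since every later call is scoped to the dataset_id, it can help to build paths in one place. The helper below is a minimal sketch, not part of any SDK: `dataset_path` is a hypothetical name, and only the /knowledge-base prefix and the datasets/{id}/query shape come from this guide.

```python
from urllib.parse import quote

PREFIX = "/knowledge-base"  # prefix named in this guide


def dataset_path(dataset_id: str, *parts: str) -> str:
    """Join the prefix, a dataset ID, and optional trailing path segments."""
    # Percent-encode the ID so an unexpected "/" cannot change the route.
    segments = ["datasets", quote(dataset_id, safe=""), *parts]
    return PREFIX + "/" + "/".join(segments)
```

For example, `dataset_path("ds_123", "query")` yields the query route shown later in this guide. Passing `"sources", "pattern"` as trailing segments assumes the pattern endpoint is nested under the dataset, which this guide does not confirm.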
Add sources
One file at a time
One URL at a time
Pattern-based bulk add
Poll source readiness
A source is queryable once its processing status reaches ready. Polling the dataset’s sources returns each source’s status (pending, converting_to_markdown, chunking, embedding, ready, failed) plus total_chunks and processed_chunks for in-flight sources.
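A polling loop over those statuses might look like the sketch below. It assumes each source is a dict with a `status` key (the field name is a guess) alongside the total_chunks and processed_chunks fields named above. `fetch_sources` is a hypothetical callable you supply that returns the dataset's current source list, so the sketch stays independent of the exact endpoint.

```python
import time
from typing import Callable

# Statuses listed in this guide; "ready" and "failed" are treated as terminal.
TERMINAL_STATUSES = {"ready", "failed"}


def progress(source: dict) -> float:
    """Fraction of chunks processed for one source (0.0 when the total is unknown)."""
    total = source.get("total_chunks") or 0
    done = source.get("processed_chunks") or 0
    return done / total if total else 0.0


def wait_until_settled(
    fetch_sources: Callable[[], list],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Poll until every source is ready or failed, or raise TimeoutError."""
    waited = 0.0
    while True:
        sources = fetch_sources()
        # "status" is an assumed field name for the processing status.
        if all(s["status"] in TERMINAL_STATUSES for s in sources):
            return sources
        if waited >= timeout_s:
            raise TimeoutError("sources still processing after %.0fs" % timeout_s)
        sleep(interval_s)
        waited += interval_s
```

The caller still needs to check the returned list for failed sources, since the loop stops on either terminal status.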
Query a KB
Run retrieval without going through an agent by calling the /knowledge-base/datasets/{id}/query endpoint.
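As a rough illustration, a workflow step could build the query request as follows. Only the endpoint path comes from this guide. The host, the POST method, the bearer-token header, and the `query` and `top_k` body fields are all assumptions to check against the API reference.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not from this guide


def build_query_request(
    dataset_id: str, query: str, token: str, top_k: int = 5
) -> urllib.request.Request:
    """Build (but do not send) a retrieval request for one dataset."""
    # Body field names are hypothetical; adjust to the real request schema.
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/knowledge-base/datasets/{dataset_id}/query",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the request easy to inspect and test.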
Sync pattern: idempotent CMS → KB
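There is no in-place “refresh” of a source; re-ingesting means deleting the source and adding it again. A typical nightly sync workflow therefore deletes sources whose pages changed or disappeared, then re-adds the current URLs.
Limits and rate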
- The add-source endpoint accepts an array of sources; use /sources/pattern for large URL batches
- Embedding throughput is bounded by your OpenAI/Azure quota
- Query rate is bounded by your plan
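Putting the sync pattern and the batching advice together, the planning half of a nightly sync can be a pure function. The sketch below is one possible approach, not the documented workflow. It assumes KB sources expose `id` and `url` keys (hypothetical names) and that your sync job stores a content fingerprint per URL between runs.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class SyncPlan:
    delete_ids: List[str]  # sources to delete (stale or removed from the CMS)
    add_urls: List[str]    # URLs to add, or re-add after deletion


def plan_sync(cms: Dict[str, str], kb_sources: List[dict], synced: Dict[str, str]) -> SyncPlan:
    """Diff the CMS against the KB.

    cms:        canonical URL -> content fingerprint right now
    kb_sources: current KB sources, each with assumed "id" and "url" keys
    synced:     canonical URL -> fingerprint recorded at the previous sync
    """
    in_kb = {s["url"]: s["id"] for s in kb_sources}
    delete_ids, add_urls = [], []
    for url, source_id in in_kb.items():
        # Delete when the page is gone or its content changed since last sync.
        if url not in cms or synced.get(url) != cms[url]:
            delete_ids.append(source_id)
    for url, fingerprint in cms.items():
        # Add when the page is new, or re-add when it was just marked stale.
        if url not in in_kb or synced.get(url) != fingerprint:
            add_urls.append(url)
    return SyncPlan(sorted(delete_ids), sorted(add_urls))


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Split URLs into fixed-size groups for array-style add-source calls."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Running the plan twice against an unchanged CMS produces an empty plan the second time, which is what keeps the sync idempotent. The batch size is left to the caller because this guide does not state a per-request limit.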

