Two settings on the text-to-speech (TTS) stack control how the agent speaks: pronunciations overrides how specific words are said, and text_normalization rewrites prices in the LLM’s output before they reach the TTS engine. Both live on the agent version’s stt_llm_tts stack alongside the rest of your voice and model config.

Pronunciations

tts.pronunciations is a dictionary that maps a word to the pronunciation the TTS engine should use for it. Use it to fix brand names, acronyms, and other terms that the default voice mispronounces.

Field | Type | Default | Description
--- | --- | --- | ---
pronunciations | object (string -> string) | null | Maps each source word to its pronunciation override.
{
  "tts": {
    "models": [ /* ... */ ],
    "pronunciations": {
      "Anyreach": "any reach",
      "SQL": "sequel"
    }
  }
}
When pronunciations is omitted or null, no overrides are applied.
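
To make the effect of the dictionary concrete, here is a minimal Python sketch of how such overrides could be applied to text before synthesis. The helper name apply_pronunciations is hypothetical, and the matching rules (case-sensitive, whole words only, longest key first) are assumptions for illustration; this page does not describe how the TTS engine actually matches words.

```python
import re


def apply_pronunciations(text, pronunciations):
    """Replace each whole-word match with its override (hypothetical helper)."""
    if not pronunciations:
        # Omitted, null, or empty: no overrides are applied.
        return text
    # Assumption: try longer keys first so a longer term wins over a shorter
    # one that it contains.
    keys = sorted(pronunciations, key=len, reverse=True)
    alternatives = "|".join(re.escape(key) for key in keys)
    # Assumption: only match when the term is not embedded in a longer word.
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)")
    return pattern.sub(lambda match: pronunciations[match.group(0)], text)


overrides = {"Anyreach": "any reach", "SQL": "sequel"}
print(apply_pronunciations("Anyreach stores calls in SQL.", overrides))
# -> any reach stores calls in sequel.
```

Under these assumed rules, a term embedded in a longer word (for example "MySQL") is left untouched.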

Text normalization

TextNormalizationConfig rewrites the LLM’s text before it is spoken so that prices are read aloud naturally. Normalization runs in a separate text-normalization-service that processes the text before it reaches the TTS engine. It is disabled by default (enabled is false). Set enabled to true to turn it on.
Field | Type | Default | Description
--- | --- | --- | ---
enabled | boolean | false | Enables text normalization for TTS.
price_verbosity | formal or conversational | formal | How prices are spoken.
shorthand_cents_extra | array<int> | [] | Extra cents endings that use shorthand under conversational.
{
  "text_normalization": {
    "enabled": true,
    "price_verbosity": "conversational",
    "shorthand_cents_extra": [25, 75]
  }
}

Price verbosity

price_verbosity controls how monetary amounts are spoken. It applies only to amounts written with specific currency symbols, including $ and £.
Value | Behavior
--- | ---
formal | Full long form, for example $9.99 becomes “nine dollars and ninety-nine cents”.
conversational | Retail shorthand for $1 to $99 amounts whose cents fall in the shorthand set, for example $9.99 becomes “nine ninety-nine”.

Under conversational, only amounts in the $1 to $99 range with a cents value in the shorthand set are spoken as shorthand. Other amounts fall back to the long form.
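
The rule above can be sketched in Python as follows. This is an illustration, not the text-normalization-service's code: speak_price and words are hypothetical helpers, only dollar amounts are handled, the $1 to $99 range is assumed to refer to the whole-dollar part, and the sketch does not cover amounts above $99 or special wording for zero cents.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
DEFAULT_SHORTHAND = frozenset({49, 50, 95, 99})


def words(n):
    """Spell out an integer from 0 to 99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def speak_price(dollars, cents, verbosity="formal", shorthand=DEFAULT_SHORTHAND):
    """Return the spoken form of a dollar amount (hypothetical helper)."""
    if not (0 <= dollars <= 99 and 0 <= cents <= 99):
        raise ValueError("this sketch only covers $0.00 to $99.99")
    # Shorthand applies only under conversational, for $1 to $99 amounts whose
    # cents value is in the shorthand set.
    if verbosity == "conversational" and 1 <= dollars <= 99 and cents in shorthand:
        return f"{words(dollars)} {words(cents)}"
    # Everything else falls back to the long form.
    dollar_unit = "dollar" if dollars == 1 else "dollars"
    cent_unit = "cent" if cents == 1 else "cents"
    return f"{words(dollars)} {dollar_unit} and {words(cents)} {cent_unit}"


print(speak_price(9, 99))                    # nine dollars and ninety-nine cents
print(speak_price(9, 99, "conversational"))  # nine ninety-nine
print(speak_price(9, 25, "conversational"))  # nine dollars and twenty-five cents
```

The last call falls back to the long form because 25 is not in the default shorthand set.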

Shorthand cents endings

The shorthand set used by conversational verbosity defaults to {49, 50, 95, 99}. Add more cents endings with shorthand_cents_extra. Each value must be an integer in the 0 to 99 range. shorthand_cents_extra is ignored when price_verbosity is formal.

Limit | Value
--- | ---
Default shorthand cents set | 49, 50, 95, 99
Allowed value range | 0 to 99
Values are validated, de-duplicated, and sorted. Out-of-range values or non-integers are rejected.
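
As a rough illustration of that validation, the Python sketch below checks, de-duplicates, and sorts a list of extra cents endings, then merges it with the default set. The function names are hypothetical, and raising an error for a bad value is an assumption: this page says such values are rejected but does not say whether the whole config fails or only the offending value is dropped.

```python
DEFAULT_SHORTHAND_CENTS = frozenset({49, 50, 95, 99})


def clean_shorthand_cents_extra(values):
    """Validate, de-duplicate, and sort extra cents endings (hypothetical helper)."""
    cleaned = set()
    for value in values:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"not an integer: {value!r}")
        if not 0 <= value <= 99:
            raise ValueError(f"outside the 0 to 99 range: {value!r}")
        cleaned.add(value)
    return sorted(cleaned)


def effective_shorthand_set(extra):
    """Combine the default shorthand set with the validated extras."""
    return DEFAULT_SHORTHAND_CENTS | set(clean_shorthand_cents_extra(extra))


print(clean_shorthand_cents_extra([75, 25, 25]))      # [25, 75]
print(sorted(effective_shorthand_set([75, 25, 25])))  # [25, 49, 50, 75, 95, 99]
```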
The legacy flat enable_text_normalization flag is auto-migrated. When an older agent version is loaded, that boolean is lifted into the nested text_normalization config as {"enabled": <value>}, so you only need to set the nested form going forward.
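
The migration can be pictured with the following Python sketch. It is only an illustration under stated assumptions: migrate_text_normalization is a hypothetical name, the config is treated as a flat dictionary, and an existing nested text_normalization block is assumed to take precedence over the legacy flag, which this page does not specify.

```python
def migrate_text_normalization(config):
    """Lift the legacy flat flag into the nested config (hypothetical helper)."""
    migrated = dict(config)  # shallow copy so the caller's dict is not mutated
    if "enable_text_normalization" in migrated:
        legacy_value = migrated.pop("enable_text_normalization")
        # Assumption: if a nested block is already present, it is kept as-is.
        migrated.setdefault("text_normalization", {"enabled": legacy_value})
    return migrated


old_config = {"enable_text_normalization": True}
print(migrate_text_normalization(old_config))
# -> {'text_normalization': {'enabled': True}}
```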

Voice and model config

Configure the STT, LLM, and TTS stack that these settings extend.