TTS - Text to Speech

What it's about

This section contains the blocks that can be used for text-to-speech conversion.

You can select your preferred provider from those integrated with XCALLY, which are described in the following paragraphs.


Please note that TTS providers are third-party applications, and their functionalities, costs, and behaviors depend on the provider you select.

An internet connection is required for the TTS blocks to properly function.



Google Cloud TTS

This block performs Text-to-Speech conversion using Google Cloud TTS.

  • Label: here you can type a brief description

  • Provider: select the Google Cloud Provider account from the dropdown list

  • Text Type: select Text (PlainText) or SSML (Speech Synthesis Markup Language; refer to the official Google documentation to learn how to use it)

  • Text: enter the text you would like to convert to speech using TTS

  • Language Code: the language you would like to use

  • Voice Type: select one of the available types, such as News, Standard, Studio, or Polyglot

  • Voice Name: select the voice to use; the available voices differ by gender (Female/Male) and voice type

  • Speaker Type: select the playback device profile, such as smartphones, headphones, or car speakers

  • Speed: define the speed at which the TTS engine reads the text

  • Pitch: adjust the intonation of the generated voice
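When Text Type is set to SSML, the Text field must contain well-formed markup. As a minimal sketch (the helper name `build_ssml` and the 500 ms pause are illustrative assumptions, not part of XCALLY or Google Cloud TTS), the following Python snippet assembles an SSML string from plain sentences, escaping characters that would otherwise break the XML:

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=500):
    """Wrap sentences in a <speak> element, separated by fixed pauses.

    Hypothetical helper for preparing the value of the Text field.
    """
    # escape() replaces &, < and > so the text cannot break the markup.
    pause = f'<break time="{pause_ms}ms"/>'
    body = pause.join(escape(sentence) for sentence in sentences)
    return f"<speak>{body}</speak>"


ssml = build_ssml(["Welcome to our support line.", "Press 1 for sales & billing."])
print(ssml)
```

The resulting string can be pasted into the Text field. Which SSML elements are honored depends on the selected voice, so check the provider documentation before relying on a specific tag.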


Exit Arrows

This block provides just one arrow out to the next step


See the documentation page How to retrieve Google Key for Cally Square blocks.


Google TTS

This block performs Text-to-Speech conversion using Google TTS and is intended for internal testing.

  • Label: here you can type a brief description

  • Text: enter the text you would like to convert to speech using TTS. The maximum text length allowed is 200 characters.

  • Language: select the language you would like to use from the dropdown list
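Because the Text field accepts at most 200 characters, longer messages have to be split before they are entered, for example across several consecutive blocks. The sketch below is one possible way to prepare such chunks offline; the function name `split_for_google_tts` is a hypothetical helper and not an XCALLY feature:

```python
import textwrap

MAX_CHARS = 200  # limit stated for the Text field of the Google TTS block


def split_for_google_tts(text, limit=MAX_CHARS):
    """Split text into chunks of at most `limit` characters.

    Breaks happen at whitespace; a single word longer than the limit
    is cut, which is textwrap's default behavior.
    """
    return textwrap.wrap(text, width=limit, break_on_hyphens=False)


message = "Thank you for calling. " * 20
for index, chunk in enumerate(split_for_google_tts(message), start=1):
    print(index, len(chunk), chunk)
```

Each printed chunk fits within the limit and can be used as the text of a separate block.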


Exit Arrows

This block provides just one arrow out to the next step


We do not recommend using this block in a production environment, as it is intended for testing purposes only.

For production, we recommend using Google Cloud TTS.


ISpeech TTS

This block performs Text-to-Speech conversion using ISpeech TTS with the following AGI parameters:

  • Label: here you can type a brief description

  • Text: enter the text you would like to convert to speech using TTS

  • Key: insert your license key from the ispeech.org account

  • Language: select the language to use for the synthesis from the dropdown list

  • Speed: define the speed at which the TTS engine reads the text

  • Interrupt key: set a key or digit that, when pressed by the user, interrupts the ongoing speech playback


Exit Arrows

This block provides just one arrow out to the next step.

AWS Polly

This block performs Text-to-Speech conversion using AWS Polly with the following AGI parameters:

  • Label: here you can type a brief description

  • Access Key ID and Secret Access Key: insert AWS security credentials. See AWS Polly documentation

  • Region: select the AWS regional endpoint. See AWS Polly documentation

  • Voice: select the voice used for the synthesis, from the dropdown list

  • Text: enter the text you would like to convert to speech using TTS

  • Text Type: specifies whether the input text is plain text or SSML. The default value is plain text. See AWS Polly documentation


Exit Arrows

This block provides just one arrow out to the next step

Lumenvox TTS

This block performs Text-to-Speech conversion using Lumenvox TTS.


To make this block work you must install Lumenvox on a machine that is reachable by your system.


  • Label: here you can type a brief description

  • Text: enter the text you would like to convert to speech using TTS

  • Options: here you can define details about the synthesis. Valid options are:

    • l - language to use (e.g. "en-GB", "en-US", "en-AU", etc.)

    • v - voice name to use (e.g. "Lindsey", "Chris", etc.)

    • g - voice gender to use (e.g. "male", "female")

    • p - profile to use, as specified in the mrcp.conf file

    • i - digits to allow the TTS to be interrupted with (can specify "any" to allow any digits to interrupt)

    • f - filename on disk to store audio to (audio not stored if not specified or empty)

    • epe - exit on a play error

    • pv - prosody volume (silent/x-soft/soft/medium/loud/x-loud/default)

    • pr - prosody rate (x-slow/slow/medium/fast/x-fast/default)

Multiple options can be provided by joining options with an ampersand, e.g. l=en-US&g=female
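The option string can be assembled programmatically, for example when it is generated from a configuration file. The following is a minimal sketch: `build_options` is a hypothetical helper, it only checks option names against the list above, and it assumes values need no additional escaping, which this page does not specify.

```python
# Option names taken from the list of valid options above.
VALID_KEYS = {"l", "v", "g", "p", "i", "f", "epe", "pv", "pr"}


def build_options(options):
    """Join option/value pairs with '&', rejecting unknown option names."""
    unknown = set(options) - VALID_KEYS
    if unknown:
        raise ValueError(f"Unknown option(s): {sorted(unknown)}")
    # Dictionaries keep insertion order, so the output order is predictable.
    return "&".join(f"{key}={value}" for key, value in options.items())


print(build_options({"l": "en-US", "g": "female"}))  # prints l=en-US&g=female
```

The same format applies to the Options field of the other MRCP-based blocks on this page.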


Exit Arrows

This block provides just one arrow out to the next step

Sestek TTS

This block performs Text-to-Speech conversion using Sestek TTS.



  • Label: here you can type a brief description

  • Text: enter the text you would like to convert to speech using TTS

  • Options: they control details about the synthesis. Valid options are:

    • l - language to use (e.g. "en-GB", "en-US", "en-AU", etc.)

    • v - voice name to use (e.g. "Lindsey", "Chris", etc.)

    • g - voice gender to use (e.g. "male", "female")

    • p - profile to use, as specified in the mrcp.conf file

    • i - digits to allow the TTS to be interrupted with (can specify "any" to allow any digits to interrupt)

    • f - filename on disk to store audio to (audio not stored if not specified or empty)

    • epe - exit on a play error

    • pv - prosody volume (silent/x-soft/soft/medium/loud/x-loud/default)

    • pr - prosody rate (x-slow/slow/medium/fast/x-fast/default)

Multiple options can be provided by joining options with an ampersand, e.g. l=en-US&g=female


Exit Arrows

This block provides just one arrow out to the next step

MRCP Synth

This block performs Text-to-Speech conversion using MRCP Synth.


To make this block work you must install MRCP Synth on a machine that is reachable by your system.


  • Label: here you can type a brief description

  • Text: enter the text you would like to convert to speech using TTS

  • Options: they control details about the synthesis. Valid options are:

    • l - language to use (e.g. "en-GB", "en-US", "en-AU", etc.)

    • v - voice name to use (e.g. "Lindsey", "Chris", etc.)

    • g - voice gender to use (e.g. "male", "female")

    • p - profile to use, as specified in the mrcp.conf file

    • i - digits to allow the TTS to be interrupted with (can specify "any" to allow any digits to interrupt)

    • f - filename on disk to store audio to (audio not stored if not specified or empty)

    • epe - exit on a play error

    • pv - prosody volume (silent/x-soft/soft/medium/loud/x-loud/default)

    • pr - prosody rate (x-slow/slow/medium/fast/x-fast/default)

Multiple options can be provided by joining options with an ampersand, e.g. l=en-US&g=female


Exit Arrows

This block provides just one arrow out to the next step
