Live Call Translator

What it’s about

Live Call Translator transcribes and translates what is said on a call in real time. Through a text channel, the agent can also write a message in their own language; the message is translated into the customer's language and played back as synthesized speech.


Video Demo

https://youtu.be/ouwudwYVl1g

 

Installation

First, access the machine via SSH and, as the root user, run the following command to install the package used to create Python virtual environments:
apt-get install python3-venv

The software runs as an XCALLY plugin and is distributed as a zip archive that is installed directly from the admin interface of XCALLY Motion V3, in the AppZone → Plugins section.

image-20240725-101125.png

Once the plugin installation file has been uploaded, you can proceed with the installation. Then you can create and configure all the elements required by this plugin.

Requirements

  • An Open Channel Account with the following URL in the Reply URL field: http://localhost:5000/api/createPlayTTS

  • On the Voice Queue, enable recordings and use the WAV format: select wav in the Recording Format field in the advanced section of the queue settings

  • On the Outbound Route, enable recordings and use the WAV format: select wav in the Recording Format field in the settings section of the outbound route

  • Enable audio split for voice recordings: go to Settings → General → Global and switch on this setting

  • At least one plugin-enabled agent in the desired queue: edit the agent, open the Permissions tab and, at the bottom, enable the Live Call Translator plugin. Then add the agent to the queue

  • Create a Context: add a new context in the Voice section and name it confbridge-context. This name cannot be changed.

  • Credentials for at least one of the following providers: Google, Amazon AWS, OpenAI. The credentials needed for each provider are described in detail in the Configure Cloud Provider section

  • Create an Internal Route and configure it as shown in the image below

image-20240725-123306.png

Phone Number: _X.

Context: confbridge-context

In the Actions tab enter the following actions:

These are two custom applications, so drag & drop two 'custom' applications and configure them as follows.

Application Name: Set
Arguments: CONFBRIDGE(user,quiet)=yes

Application Name: ConfBridge
Arguments: ${EXTEN}

Optional configuration, useful if you want the agent to accept the interaction automatically:

Staff → Agents → three-dot menu of the desired agent → Edit Agent → Other Channels → Enable Openchannel auto answer

 

Plugin configuration

In the general section of the plugin, you will need to enter the URL of the server in use and the 'motion apikey'. In the other sections, you will find the tabs of the individual providers where you will need to enter the required data.

To make the plugin work properly, at least one provider must be correctly configured and selected in the general configuration of each of the transcription, translation and TTS (text-to-speech) services.

For some providers, obtaining an API key may not be sufficient: the individual transcription, translation and text-to-speech services must also be enabled in the provider's portal.

Configure Cloud Provider

Google

Open the Service Accounts page of your Google Cloud account, choose a project and create a new service account.
Then click on the email address of the newly created service account and go to the Keys section to create a new key. You can choose the key type for the private key (JSON or P12). If you select JSON, a JSON file containing the information needed for the configuration is downloaded to your computer. Open the JSON file and copy the following values (without the quotation marks):

  • Client email

  • Private Key

Enter this information in the respective fields of the plugin interface. The Transcription, Translation and Text-To-Speech API services must now be enabled: return to the Google Cloud Console home page, look up each of the three API services in the search bar and click Enable.
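As a minimal sketch, assuming the downloaded file is a standard Google service-account JSON key (which stores these two values in the client_email and private_key fields), the values could be extracted as shown below. The helper name read_service_account_fields is hypothetical and is not part of the plugin.

```python
import json
from pathlib import Path


def read_service_account_fields(key_path):
    """Return (client_email, private_key) from a service-account JSON key file."""
    data = json.loads(Path(key_path).read_text(encoding="utf-8"))
    # json.loads removes the surrounding quotation marks and converts the
    # escaped "\n" sequences of the private key into real line breaks.
    # Assumption: this document does not state whether the plugin field
    # expects real line breaks or the escaped form, so check before pasting.
    return data["client_email"], data["private_key"]
```

Calling read_service_account_fields with the path of the downloaded file returns the two strings to paste into the Google tab of the plugin.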

 

 

Amazon AWS

To configure this provider you need this information, which you can obtain from your Amazon AWS account:

  • Access Key ID

  • Secret Access Key

You will also have to enter a region. Please note that the transcription service (used in streaming mode by this plugin) is only available in certain regions. Check the regions available for transcription in the AWS documentation and use one that supports streaming transcription.

OpenAI

For OpenAI you only need the API key. From your account, go to the API section → API keys → Create a new secret key, then copy and paste the API key into the plugin.

Services configurations

In the Transcript tab you can choose the transcript provider and other options:

  • Openchannel Account: an Openchannel Account must be indicated

  • Show original message: the agent will also be able to see the transcript and not just the translation

  • Enable Provider Change For Agent: the agent can change the following transcription parameters: provider, input language and output language

  • Skip Agent confirmation on start: when this option is enabled, the window for choosing the provider, input language(s) and output language is not displayed.

Other options only available for Google and OpenAI:

  • Processed audio length (between 2.0 and 8.0 seconds, recommended value 3.3):

    • The recording is divided into smaller audio files; this option sets the duration in seconds of the audio files sent to the provider for transcription. Reducing the duration reduces the latency between the moment you speak and the moment the transcription appears. However, it may also reduce the quality of the transcription.

  • Consecutive processed audio overlap (between 0.1 and 3.0 seconds, recommended value 0.7):

    • This option affects the overlapping time of two consecutive audio files to be transcribed. Increasing this value will reduce the probability of words being lost between two consecutive audios, but will increase the probability of the final part of the first transcript being repeated in the first part of the next transcript.
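To make the interaction between the two settings concrete, here is a minimal sketch of how a recording could be cut into overlapping windows. It is an illustration only, not the plugin's actual implementation: the function name chunk_windows, and the assumption that each window starts (length - overlap) seconds after the previous one, are hypothetical.

```python
def chunk_windows(total_seconds, length=3.3, overlap=0.7):
    """Return (start, end) times, in seconds, of overlapping audio windows."""
    if not 2.0 <= length <= 8.0:
        raise ValueError("length must be between 2.0 and 8.0 seconds")
    if not 0.1 <= overlap <= 3.0:
        raise ValueError("overlap must be between 0.1 and 3.0 seconds")
    if overlap >= length:
        raise ValueError("overlap must be shorter than the window length")
    step = length - overlap  # distance between two consecutive window starts
    windows = []
    index = 0
    while True:
        start = index * step
        end = min(start + length, total_seconds)  # last window may be shorter
        windows.append((round(start, 3), round(end, 3)))
        if end >= total_seconds:
            break
        index += 1
    return windows
```

With the recommended values, a 10-second recording yields four windows (0.0 to 3.3, 2.6 to 5.9, 5.2 to 8.5 and 7.8 to 10), each sharing 0.7 seconds with its neighbor. A shorter length produces the first transcript sooner; a larger overlap makes it less likely that a word is cut at a boundary, at the cost of repeated text.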

Provider features and constraints:

  • OpenAI:

    • Automatic language recognition is supported.

    • A specific language can be chosen instead, which may improve the speed and quality of transcription.

    • An input language for the caller must be indicated in any case: if the caller has not yet spoken and you want to synthesize a message, a target language is needed.

  • Google:

    • It’s possible to choose a maximum of 5 possible input languages to be transcribed. By reducing the number of languages, transcription speed and quality may be improved.

  • Amazon AWS:

    • Only one language can be chosen.

In the Translate tab you can choose the default translation provider and the language to translate into (the language of the called party):

In the sub-tab of the translation section you can choose the provider to be used for a specific language pair:

Note: it is not necessary to configure every pair. If you translate a pair that is not present in the configuration, the provider is chosen as follows. Given the output language, the system selects the provider of the last configured entry that has that output language. If no such entry exists, the system selects the first entry whose language belongs to the same macro-region (for example, when looking for en-GB the system will settle for en-US if it is found). If this search also fails, the system performs the translation with the default translation provider.

The TTS tab is similar to the translation section: there is a default provider, and you can also choose the provider, voice type and speech synthesis engine for a specific language:
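The fallback logic described in the note can be sketched as follows. This is one possible reading of the rules, not the plugin's code: the function name choose_provider, the tuple layout of the configured pairs, and the comparison of the part of the code before the hyphen to detect the same macro-region are all assumptions made for illustration.

```python
def choose_provider(source, target, pair_configs, default_provider):
    """Pick a translation provider for the (source, target) language pair.

    pair_configs is a list of (source, target, provider) tuples, in the
    order in which they were configured.
    """
    # 1. The pair is explicitly configured.
    for src, tgt, provider in pair_configs:
        if (src, tgt) == (source, target):
            return provider
    # 2. Last configured entry with the same output language.
    for _src, tgt, provider in reversed(pair_configs):
        if tgt == target:
            return provider
    # 3. First entry whose output language shares the base language
    #    (en-GB falls back to en-US, for example).
    base = target.split("-")[0].lower()
    for _src, tgt, provider in pair_configs:
        if tgt.split("-")[0].lower() == base:
            return provider
    # 4. Nothing matched: use the default translation provider.
    return default_provider
```

For example, with the pairs it-IT → en-US (Google) and fr-FR → en-US (OpenAI) configured, a request for es-ES → en-US would resolve to OpenAI (the last entry with en-US as output), and a request for es-ES → en-GB would resolve to Google (the first entry with an English output language).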

Finally, there is the LOGS tab where you can view the logs relating to the plugin.

Agent side

The agent cannot change all of the configurations shown in the previous section. To use the plugin, click on the icon at the bottom left.

After answering a call, you can click on the call.

If the Skip Agent confirmation on start option is not enabled, you will see this window:

In this window the agent can modify, only for the call in progress, the default configurations set by the admin.
If the Skip Agent confirmation on start option is enabled and the Openchannel auto answer is enabled, the openchannel interaction corresponding to the current conversation is opened and the transcription of the conversation begins.

The agent first sees the message announcing the start of transcription and translation. Then, as the customer speaks, the transcript appears on the left in messages with a white background, and the translation into the agent's language appears in messages with a yellow background.

The agent can write messages, also using canned answers; these are translated into the customer's language and played in the conversation as synthesized speech (TTS).

When the customer's language is auto-detected (a feature only available with the OpenAI provider), or when the Google provider is used with more than one transcription language, it may be useful for the agent to set the customer's language in order to improve the quality and speed of transcription. To do so, the agent types two asterisks followed by the language code without the sub-region. For example, typing **EN (upper or lower case makes no difference) fixes English as the customer's language. Typing **IT afterwards fixes Italian as the customer's language:

For the language codes that can be forced by agents, see the tables in the appendix at the end of the document.
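As an illustration of the command format, the following sketch recognizes a message such as **EN and extracts the language code. It is not the plugin's parser: the function name parse_forced_language is hypothetical, and the sketch assumes that the command is the whole message and that codes are two or three letters long, as in the appendix tables.

```python
import re

# Two asterisks followed by a language code without sub-region, e.g. "**EN".
# Assumption: codes are two or three letters, as in the appendix tables.
_FORCE_LANGUAGE = re.compile(r"\*\*([A-Za-z]{2,3})")


def parse_forced_language(message):
    """Return the lower-case language code if the message is a force command, else None."""
    match = _FORCE_LANGUAGE.fullmatch(message.strip())
    return match.group(1).lower() if match else None
```

Under these assumptions, "**EN" and "**en" both give "en", while an ordinary message or a code with a sub-region such as "**en-GB" gives None.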

Logs

The application stores log files for each component and/or function that is executed in the background. You can find these logs in the Log view, where you can check when a specific log was last updated, and download or delete it.

The logs allow you to verify that the application is working properly and to check for errors.

You can find the logs in the folder /var/log/xcally/live-call-translator
As mentioned, the log files are organized by component and by type. The following files are available:

  • <component_name>-combined.log: includes the list of all complete logs for that component;

  • <component_name>-combined.date.log (e.g. api-combined.2024-01-01.log): these files gather the log details for the specified component, including both debug and error messages, for the day given in the date part of the name. The logs are rotated daily, so there is a new combined log file per component per day. The rotation limit is 30 days: after 30 days the oldest logs are deleted;

  • <component_name>-error.log: includes the list of all error logs for that component;

  • <component_name>-error.date.log: these files gather the log details related to errors only, for the day given in the date part of the name. They are rotated daily with the same 30-day limit, so there is a new error log file per component per day and the oldest logs are deleted after 30 days;

  • TranslatorService_backend.log: includes the complete logs for the backend service.

  • TranslatorService_latency.log: includes information on provider latency times. Average, minimum and maximum latency are provided per call.
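The naming scheme of the daily files can be sketched as below, for instance to locate the file for a given day. The helper names are hypothetical, and the exact moment at which the plugin deletes an old file is an assumption (here: a file is considered expired once it is more than 30 days old).

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # rotation limit stated in this section


def daily_log_name(component, kind, day):
    """Build a rotated log file name such as 'api-combined.2024-01-01.log'."""
    if kind not in ("combined", "error"):
        raise ValueError("kind must be 'combined' or 'error'")
    return f"{component}-{kind}.{day.isoformat()}.log"


def is_expired(day, today):
    """True if a daily log dated `day` is older than the retention window."""
    return today - day > timedelta(days=RETENTION_DAYS)
```

For example, daily_log_name("api", "error", date(2024, 1, 1)) gives the name of the error log of the api component for 1 January 2024.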

TranslatorService_backend.log details:

Error type | Text | Description | Solution
Configuration | error: 'Plugin license is not enabled for this server' | The plugin license is not enabled for this server | Request machine token enabling. You can find the token under Settings → License
Configuration | ERROR provider initialization: error_specification | The configuration of the indicated provider contains the error specified in the message | Verify that all information entered for that provider is correct
Configuration | ERROR - asr_streaming_[provider_name]_translate: [Errno 2] No such file or directory: | The system cannot find the recording file | Check that recording is enabled for the queue or for the outbound route used, and that split recording is enabled. See the Requirements section
Configuration | ERROR - Exception on /api/createPlayTTS [POST] […] Invalid Value, Bad language pair: | The language code used does not exist or is not supported by the provider | Select, for the chosen language, a provider that supports it, or check that the language code entered exists
Configuration | ERROR - asr_streaming_openai_translate: Error code: 400 - {'error': {'message': "Invalid language 'arb'. Language parameter must be specified in ISO-639-1 format.", 'type': 'invalid_request_error', 'param': 'language', 'code': 'invalid_language_format'}} | The language code used does not exist or is not supported by the provider | Select, for the chosen language, a provider that supports it, or check that the language code entered exists
Configuration | ERROR - api rpc status is: 401 | The Motion username or Motion password is wrong | Check the Motion username and Motion password in the Live Call Translator General Settings
(not specified) | ERROR - asr_streaming_openai_translate: Request timed out. | The request to OpenAI timed out | If the error occurs constantly, check the connection status and check the status of the OpenAI API at https://status.openai.com/
Error | ERROR - asr_streaming_openai_translate: Error code: 400 - {'error': {'message': 'Audio file is too short. Minimum audio length is 0.1 seconds.', 'type': 'invalid_request_error', 'param': 'file', 'code': 'audio_too_short'}} | If it occurs at the end of the conversation it is not a problem. If it occurs several times in the same call it could be an error | Please collect the relevant logs and open a ticket
Error | ERROR - trim_silence: data length must be a multiple of '(sample_width * channels)' | If it only occurs at the beginning of the call it is not a problem. If it occurs repeatedly within the same call it could be an error | Please collect the relevant logs and open a ticket
Error | ERROR - : incomplete initialization | The system failed to initialize correctly | Please collect the relevant logs and open a ticket
Error | ERROR - update_call_info call not found | The system failed to update the call data | Please collect the relevant logs and open a ticket
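As a rough aid for reading the backend log, the sketch below matches log lines against fragments of the messages listed in the table above and returns a shortened form of the corresponding solution. It is only an illustration: the function name suggest_solutions is hypothetical, matching on substrings is an assumption about the log format, and falling back to "open a ticket" for any unrecognized error is a simplification.

```python
# Message fragments taken from the table above, each paired with a shortened
# form of the suggested solution. Substring matching is an assumption.
KNOWN_ERRORS = [
    ("Plugin license is not enabled", "Request machine token enabling (Settings → License)"),
    ("provider initialization", "Verify the information entered for that provider"),
    ("No such file or directory", "Check that recording and split recording are enabled"),
    ("Bad language pair", "Check the language code and the provider chosen for it"),
    ("invalid_language_format", "Check the language code and the provider chosen for it"),
    ("api rpc status is: 401", "Check the Motion credentials in the General Settings"),
    ("asr_streaming_openai_translate: Request timed out",
     "Check the connection and https://status.openai.com/"),
]
FALLBACK = "Collect the relevant logs and open a ticket"


def suggest_solutions(log_lines):
    """Yield (line, suggestion) for every line that contains 'ERROR' or 'error:'."""
    for line in log_lines:
        if "ERROR" not in line and "error:" not in line:
            continue  # not an error line
        for fragment, suggestion in KNOWN_ERRORS:
            if fragment in line:
                yield line.rstrip("\n"), suggestion
                break
        else:
            # No known fragment matched this error line.
            yield line.rstrip("\n"), FALLBACK
```

The function accepts any iterable of lines, so an open file object for TranslatorService_backend.log (in the log folder mentioned earlier) can be passed directly.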

translateService-error.yyyy-mm-dd.log details:

Error type | Text | Description | Solution
Error | Error starting translate service connect ECONNREFUSED 127.0.0.1:5000 | The frontend cannot communicate with the backend service | Check that the backend service is active by executing the command 'ps ax | grep python'. If no Python process related to the Live Call Translator plugin is present, restart the plugin. If it is present, contact support.
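In addition to inspecting the process list, you can test whether anything is accepting connections on the address reported in the error. The following is a minimal sketch using only the Python standard library; the function name backend_is_listening is hypothetical, and the default host and port are simply those shown in the error text above.

```python
import socket


def backend_is_listening(host="127.0.0.1", port=5000, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (ECONNREFUSED) and timeouts.
        return False
```

A False result is consistent with the ECONNREFUSED error above. A True result only shows that some process is listening on that port, not that it is the plugin backend.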

Language codes on different providers

Table 1, Google TTS language codes that can be forced by agents

Language code | Language | Provider
af | Afrikaans (South Africa) | Google
ar | Arabic | Google
eu | Basque (Spain) | Google
bn | Bengali (India) | Google
bg | Bulgarian (Bulgaria) | Google
ca | Catalan (Spain) | Google
cs | Czech (Czech Republic) | Google
da | Danish (Denmark) | Google
nl | Dutch (Belgium) | Google
en | English (UK) | Google
fil | Filipino (Philippines) | Google
fi | Finnish (Finland) | Google
fr | French (Canada) | Google
gl | Galician (Spain) | Google
de | German (Germany) | Google
el | Greek (Greece) | Google
gu | Gujarati (India) | Google
he | Hebrew (Israel) | Google
hi | Hindi (India) | Google
hu | Hungarian (Hungary) | Google
is | Icelandic (Iceland) | Google
id | Indonesian (Indonesia) | Google
it | Italian (Italy) | Google
ja | Japanese (Japan) | Google
kn | Kannada (India) | Google
ko | Korean (South Korea) | Google
lv | Latvian (Latvia) | Google
lt | Lithuanian (Lithuania) | Google
ms | Malay (Malaysia) | Google
ml | Malayalam (India) | Google
cmn | Mandarin Chinese | Google
mr | Marathi (India) | Google
nb | Norwegian (Norway) | Google
pl | Polish (Poland) | Google
pt | Portuguese (Brazil) | Google
pa | Punjabi (India) | Google
ro | Romanian (Romania) | Google
ru | Russian (Russia) | Google
sr | Serbian (Cyrillic) | Google
sk | Slovak (Slovakia) | Google
es | Spanish (Spain) | Google
sv | Swedish (Sweden) | Google
ta | Tamil (India) | Google
te | Telugu (India) | Google
th | Thai (Thailand) | Google
tr | Turkish (Turkey) | Google
uk | Ukrainian (Ukraine) | Google
vi | Vietnamese (Vietnam) | Google

Table 2, OpenAI TTS language codes that can be forced by agents

Language code | Language | Provider
af | Afrikaans | OpenAI
ar | Arabic | OpenAI
hy | Armenian | OpenAI
az | Azerbaijani | OpenAI
be | Belarusian | OpenAI
bs | Bosnian | OpenAI
bg | Bulgarian | OpenAI
ca | Catalan | OpenAI
zh | Chinese | OpenAI
hr | Croatian | OpenAI
cs | Czech | OpenAI
da | Danish | OpenAI
nl | Dutch | OpenAI
en | English | OpenAI
et | Estonian | OpenAI
fi | Finnish | OpenAI
fr | French | OpenAI
gl | Galician | OpenAI
de | German | OpenAI
el | Greek | OpenAI
he | Hebrew | OpenAI
hi | Hindi | OpenAI
hu | Hungarian | OpenAI
is | Icelandic | OpenAI
id | Indonesian | OpenAI
it | Italian | OpenAI
ja | Japanese | OpenAI
kn | Kannada | OpenAI
kk | Kazakh | OpenAI
ko | Korean | OpenAI
lv | Latvian | OpenAI
lt | Lithuanian | OpenAI
mk | Macedonian | OpenAI
ms | Malay | OpenAI
mr | Marathi | OpenAI
mi | Maori | OpenAI
ne | Nepali | OpenAI
no | Norwegian | OpenAI
fa | Persian | OpenAI
pl | Polish | OpenAI
pt | Portuguese | OpenAI
ro | Romanian | OpenAI
ru | Russian | OpenAI
sr | Serbian | OpenAI
sk | Slovak | OpenAI
sl | Slovenian | OpenAI
sw | Swahili | OpenAI
sv | Swedish | OpenAI
tl | Tagalog | OpenAI
ta | Tamil | OpenAI
th | Thai | OpenAI
tr | Turkish | OpenAI
uk | Ukrainian | OpenAI
ur | Urdu | OpenAI
vi | Vietnamese | OpenAI
cy | Welsh | OpenAI

Table 3, Amazon AWS TTS language codes that can be forced by agents

Language code | Language | Provider
arb | Arabic | Amazon AWS
ar | Arabic (Gulf) | Amazon AWS
cmn | Chinese (Mandarin) | Amazon AWS
da | Danish | Amazon AWS
en | English (British) | Amazon AWS
fr | French | Amazon AWS
de | German | Amazon AWS
it | Italian | Amazon AWS
ja | Japanese | Amazon AWS
ko | Korean | Amazon AWS
nb | Norwegian | Amazon AWS
pl | Polish | Amazon AWS
pt | Portuguese (Brazilian) | Amazon AWS
ro | Romanian | Amazon AWS
ru | Russian | Amazon AWS
es | Spanish (European) | Amazon AWS
sv | Swedish | Amazon AWS
tr | Turkish | Amazon AWS
cy | Welsh | Amazon AWS
