Lesson 1, Chapter 1

Speech Recognition – Python Functions

Yann KIDSHAKER, 17 March 2026

In this session we use the following two extensions:

  1. Speech Recognition
  2. Text to Speech

Let us understand the functions from both of these extensions:

Speech Recognition

Speech recognition is the ability of a machine to identify words and phrases in spoken language and convert them to a machine-readable format.

Object declaration for speech recognition:

  • sr = SpeechRecognition()

Functions

  1. setthreshold(): Sets a loudness threshold that filters background noise out of the audio being analyzed.
    1. Syntax: setthreshold(loudness = 30)
    2. Parameters:
      1. loudness = 1 to 100
  2. analysespeech(): Opens a recognition window and records whatever you say for the specified time. PictoBlox then converts the recording to text in the language you spoke and saves it locally.
    1. Syntax: analysespeech(time = 2, language = "en-US")
    2. Parameters:
      1. time = Any positive integer
      2. language = {"ar-AE", "ca-ES", "da-DK", "de-DE", "en-GB", "en-US", "es-ES", "fi-FI", "fr-FR", "gu-IN", "hi-IN", "it-IT", "ja-JP", "ko-KR", "mr-IN", "nb-NO", "nl-NL", "pl-PL", "pt-PT", "ru-RU", "sv-SE", "ta-IN", "te-IN", "th-TH", "tr-TR", "zh-CN", "zh-HK"}
        These values correspond to the following languages:
        {Arabic, Chinese (Mandarin), Danish, Dutch, English, French, German, Hindi, Icelandic, Italian, Japanese, Korean, Norwegian, Polish, Portuguese (Brazilian), Portuguese (European), Romanian, Russian, Spanish (European), Spanish (Latin American), Swedish, Turkish, Welsh}
  3. speechresult(): This function reports the last text detected from the speech.
    1. Syntax: speechresult()
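
The three functions above are typically called in sequence: set the noise threshold, record, then read the result. The following is a minimal sketch of that sequence, not PictoBlox's own code. `listen_once` and `FakeRecognizer` are hypothetical names introduced here, and the stand-in class exists only so the sketch can run outside PictoBlox. Inside PictoBlox you would pass in `sr = SpeechRecognition()` instead. The keyword names `loudness`, `time` and `language` are taken from the syntax lines above.

```python
def listen_once(recognizer, seconds=2, language="en-US", loudness=30):
    """Record for `seconds`, then return the recognized text.

    `recognizer` is expected to offer setthreshold(), analysespeech()
    and speechresult() as described in the list above.
    """
    # Check the documented ranges before opening the recognition window.
    if not 1 <= loudness <= 100:
        raise ValueError("loudness must be between 1 and 100")
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("seconds must be a positive integer")

    recognizer.setthreshold(loudness=loudness)
    recognizer.analysespeech(time=seconds, language=language)
    return recognizer.speechresult()


class FakeRecognizer:
    """Hypothetical stand-in for SpeechRecognition(), used outside PictoBlox."""

    def __init__(self, canned_text):
        self._canned_text = canned_text
        self._last_result = ""
        self.calls = []  # records each call so the sequence can be inspected

    def setthreshold(self, loudness=30):
        self.calls.append(("setthreshold", loudness))

    def analysespeech(self, time=2, language="en-US"):
        self.calls.append(("analysespeech", time, language))
        self._last_result = self._canned_text

    def speechresult(self):
        return self._last_result


heard = listen_once(FakeRecognizer("hello"), seconds=3, language="en-GB")
print(heard)
```

Passing the recognizer in as an argument keeps the helper independent of where the object comes from, so the same function works with the real extension object and with a test double.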

Text to Speech

The Text to Speech extension converts text to synthesized speech, which is useful for projects that need spoken output. The service is provided by Amazon Web Services.

Object declaration for text to speech:

  • ts = TexttoSpeech()

Functions

  1. speak(): Speaks the text passed to it as an argument.
    This function is limited to 128 characters. If a longer string is given, only the first 128 characters are spoken.

    1. Syntax: speak(text)
    2. Parameters:
      1. text = Any string (only the first 128 characters are spoken)
  2. setlanguageto(): Causes the text to be spoken with the pronunciation of the given language. It does not translate the text.
    1. Syntax: setlanguageto(language = "en-US")
    2. Parameters:
      1. language = {"ar-AE", "ca-ES", "da-DK", "de-DE", "en-GB", "en-US", "es-ES", "fi-FI", "fr-FR", "gu-IN", "hi-IN", "it-IT", "ja-JP", "ko-KR", "mr-IN", "nb-NO", "nl-NL", "pl-PL", "pt-PT", "ru-RU", "sv-SE", "ta-IN", "te-IN", "th-TH", "tr-TR", "zh-CN", "zh-HK"}
        These values correspond to the following languages:
        {Arabic, Chinese (Mandarin), Danish, Dutch, English, French, German, Hindi, Icelandic, Italian, Japanese, Korean, Norwegian, Polish, Portuguese (Brazilian), Portuguese (European), Romanian, Russian, Spanish (European), Spanish (Latin American), Swedish, Turkish, Welsh}
  3. setvoiceto(): This function changes the kind of voice used in Text to Speech.
    1. Syntax: setvoiceto(voice = "alto")
    2. Parameters:
      1. voice = {"alto", "tenor", "squeak", "giant", "kitten"}
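
Because speak() only voices the first 128 characters, longer text has to be split before it is spoken. The following is a minimal sketch of one way to do that, under stated assumptions. `split_for_speak`, `speak_all` and `FakeSpeaker` are hypothetical names introduced here, and the stand-in class exists only so the sketch can run outside PictoBlox, where you would pass in `ts = TexttoSpeech()` instead. The sketch also assumes that speak() takes the text as its first positional argument, as its description implies.

```python
SPEAK_LIMIT = 128  # character limit stated for speak() above
VOICES = {"alto", "tenor", "squeak", "giant", "kitten"}


def split_for_speak(text, limit=SPEAK_LIMIT):
    """Split text into pieces of at most `limit` characters, breaking at spaces."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit is cut into hard slices.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def speak_all(speaker, text, language="en-US", voice="alto"):
    """Set language and voice, then speak `text` one chunk at a time."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    speaker.setlanguageto(language=language)
    speaker.setvoiceto(voice=voice)
    chunks = split_for_speak(text)
    for chunk in chunks:
        speaker.speak(chunk)  # assumption: text is the first positional argument
    return chunks


class FakeSpeaker:
    """Hypothetical stand-in for TexttoSpeech(), used outside PictoBlox."""

    def __init__(self):
        self.language = None
        self.voice = None
        self.spoken = []

    def setlanguageto(self, language="en-US"):
        self.language = language

    def setvoiceto(self, voice="alto"):
        self.voice = voice

    def speak(self, text):
        # Mirror the documented behaviour: only the first 128 characters count.
        self.spoken.append(text[:SPEAK_LIMIT])


demo = FakeSpeaker()
pieces = speak_all(demo, "PictoBlox " * 30, language="en-GB", voice="tenor")
print(len(pieces), [len(p) for p in pieces])
```

Breaking at spaces keeps words whole, so each spoken chunk ends at a natural boundary instead of mid-word. Note that setlanguageto() only changes pronunciation, so the text passed to `speak_all` should already be written in the chosen language.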