ExAzureSpeech.SpeechToText.Recognizer (ex_azure_speech v0.1.0)

Speech-to-Text Recognizer module, which provides the functionality to recognize speech from audio input.

Internals

Communication with the Speech-to-Text API happens over a WebSocket connection. For safety and isolation, each recognition request is handled by its own WebSocket connection. This is achieved by spawning a new WebSocket process that is supervised by a DynamicSupervisor, which guarantees that the WebSocket connection is properly terminated once the recognition process is done.
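The per-request isolation described above can be sketched with plain OTP primitives. This is only an illustration, not the library's implementation: `IsolationSketch`, `FakeSocket`, and `with_connection/3` are hypothetical names, and an `Agent` stands in for the real WebSocket process.

```elixir
defmodule IsolationSketch do
  # Hypothetical stand-in for the per-request WebSocket process.
  # `restart: :temporary` means the supervisor never restarts it.
  defmodule FakeSocket do
    use Agent, restart: :temporary

    def start_link(request_id), do: Agent.start_link(fn -> request_id end)
  end

  # Starts a dedicated child for one request, runs `fun` with its pid,
  # and always terminates the child afterwards, even if `fun` raises.
  def with_connection(supervisor, request_id, fun) do
    {:ok, pid} = DynamicSupervisor.start_child(supervisor, {FakeSocket, request_id})

    try do
      fun.(pid)
    after
      DynamicSupervisor.terminate_child(supervisor, pid)
    end
  end
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

# Each call gets its own process; nothing is left running once it returns.
IsolationSketch.with_connection(sup, "req-1", fn pid -> Agent.get(pid, & &1) end)
```

Because the child is started on demand and torn down in the `after` clause, a failure in one request cannot leak a connection or affect another request's process.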

Supported Formats

Currently, the recognition service supports only the RIFF WAV (WAVE) audio format. The audio must be mono, with a sample rate of 16 kHz and 16-bit PCM encoding.
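To check an input against these constraints before sending it, the WAV header can be inspected with binary pattern matching. The sketch below is not part of the library: `WavCheck` is a hypothetical helper, and it assumes the canonical layout in which the `fmt ` chunk immediately follows the `WAVE` tag. Files with extra chunks before `fmt ` would need a real chunk parser.

```elixir
defmodule WavCheck do
  # Returns true only for mono, 16 kHz, 16-bit PCM (format tag 1) audio
  # whose header follows the canonical RIFF/WAVE layout.
  def supported?(
        <<"RIFF", _riff_size::little-32, "WAVE", "fmt ", _fmt_size::little-32,
          format::little-16, channels::little-16, sample_rate::little-32,
          _byte_rate::little-32, _block_align::little-16, bits::little-16,
          _rest::binary>>
      ) do
    format == 1 and channels == 1 and sample_rate == 16_000 and bits == 16
  end

  # Anything that does not match the expected header is rejected.
  def supported?(_other), do: false
end

# A minimal header for an empty mono, 16 kHz, 16-bit PCM file.
header =
  <<"RIFF", 36::little-32, "WAVE", "fmt ", 16::little-32, 1::little-16,
    1::little-16, 16_000::little-32, 32_000::little-32, 2::little-16,
    16::little-16, "data", 0::little-32>>

WavCheck.supported?(header)
# => true
```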

Summary

Types

opts()

See the SocketConfig.t() and SpeechContextConfig.t() types for more information on the available options.

Functions

recognize_continous(stream, opts \\ [])

Recognizes speech from the given audio input continuously. It immediately returns a stream that can be lazily consumed.

recognize_once(stream, opts \\ [])

Synchronously recognizes speech from the given audio input.

Types

@type opts() :: [
  socket_opts: ExAzureSpeech.SpeechToText.SocketConfig.t() | nil,
  speech_context_opts: ExAzureSpeech.SpeechToText.SpeechContextConfig.t() | nil,
  timeout: integer() | nil
]

See the SocketConfig.t() and SpeechContextConfig.t() types for more information on the available options.

Functions

recognize_continous(stream, opts \\ [])

Recognizes speech from the given audio input continuously. It immediately returns a stream that can be lazily consumed.
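"Lazily consumed" means no work happens until the caller asks for elements, and only as many elements are produced as the caller demands. The following sketch illustrates that behaviour with a stand-in stream built on `Stream.resource/3`. It does not call the library: `LazyStreamSketch` and `phrases/2` are hypothetical, and the shape of the elements actually emitted by `recognize_continous/2` is not shown here.

```elixir
defmodule LazyStreamSketch do
  # Hypothetical stand-in for a recognition stream. It emits one phrase per
  # demand and notifies `observer` each time a phrase is actually produced.
  def phrases(list, observer) do
    Stream.resource(
      fn -> list end,
      fn
        [] ->
          {:halt, []}

        [phrase | rest] ->
          send(observer, {:produced, phrase})
          {[phrase], rest}
      end,
      fn _remaining -> :ok end
    )
  end
end

# Building the stream does no work yet.
stream = LazyStreamSketch.phrases(["hello", "world", "again"], self())

# Only the first two phrases are produced; "again" is never computed.
Enum.take(stream, 2)
# => ["hello", "world"]
```

With a lazy stream, the consumer controls the pace: it can stop early, as with `Enum.take/2` above, without forcing the rest of the input to be processed.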

recognize_once(stream, opts \\ [])

Synchronously recognizes speech from the given audio input.

@spec start_link(any()) :: :ignore | {:error, any()} | {:ok, pid()}