ExAzureSpeech.SpeechToText.Recognizer (ex_azure_speech v0.1.1)
Speech-to-Text Recognizer module, which provides the functionality to recognize speech from audio input.
Internals
Communication with the Speech-to-Text API happens over a WebSocket connection. For safety and isolation, each recognition request is handled by its own WebSocket connection: a new WebSocket process is spawned under a DynamicSupervisor, which guarantees that the connection is properly terminated once the recognition process is done.
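The one-process-per-request pattern described above can be sketched with plain OTP primitives. This is not the library's actual implementation: SketchWorker and SketchRunner are hypothetical names, and the worker only stands in for a WebSocket process. It shows a temporary child being started under a DynamicSupervisor for each request and stopping itself once it has replied.

```elixir
defmodule SketchWorker do
  # Hypothetical stand-in for the per-request WebSocket process.
  # `restart: :temporary` means the supervisor never restarts it.
  use GenServer, restart: :temporary

  def start_link(request_id), do: GenServer.start_link(__MODULE__, request_id)

  @impl true
  def init(request_id), do: {:ok, request_id}

  @impl true
  def handle_call(:recognize, _from, request_id) do
    # Reply to the caller, then stop normally so nothing outlives the request.
    {:stop, :normal, {:ok, "result for #{request_id}"}, request_id}
  end
end

defmodule SketchRunner do
  # Starts one isolated worker per request and asks it for a result.
  def run(supervisor, request_id) do
    {:ok, pid} = DynamicSupervisor.start_child(supervisor, {SketchWorker, request_id})
    GenServer.call(pid, :recognize)
  end
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)
IO.inspect(SketchRunner.run(sup, "req-1"))
```

Because every request runs in its own process, a crash in one worker does not affect the others, and the supervisor can still shut down any remaining workers when it terminates.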
Supported Formats
Currently, the recognition service supports only the RIFF WAV (WAVE) audio format. The audio must be mono, with a sample rate of 16 kHz and 16-bit PCM encoding.
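To catch unsupported files before sending them, the header can be checked locally. The sketch below is not part of ex_azure_speech: WavCheck is a hypothetical helper, and it assumes the canonical WAV layout in which the "fmt " chunk immediately follows the "WAVE" tag. Files with extra chunks before "fmt " would need a fuller parser.

```elixir
defmodule WavCheck do
  # Hypothetical helper: checks that a binary starts with a canonical
  # RIFF/WAVE header describing mono, 16 kHz, 16-bit PCM audio.
  def check(
        <<"RIFF", _riff_size::little-integer-size(32), "WAVE", "fmt ",
          _fmt_size::little-integer-size(32), format::little-integer-size(16),
          channels::little-integer-size(16), sample_rate::little-integer-size(32),
          _byte_rate::little-integer-size(32), _block_align::little-integer-size(16),
          bits::little-integer-size(16), _rest::binary>>
      ) do
    cond do
      # Format tag 1 is uncompressed PCM.
      format != 1 -> {:error, :not_pcm}
      channels != 1 -> {:error, :not_mono}
      sample_rate != 16_000 -> {:error, :unsupported_sample_rate}
      bits != 16 -> {:error, :unsupported_bit_depth}
      true -> :ok
    end
  end

  # Anything that does not match the canonical header layout.
  def check(_other), do: {:error, :not_riff_wave}
end
```

Only the first 36 bytes of the file are needed for this check, so it can run on the first chunk of the audio stream.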
Summary
Types
opts()
See SocketConfig.t() and SpeechContextConfig.t() for more information on the available options.
Functions
recognize_continous
Recognizes speech from the given audio input continuously. It immediately returns a stream that can be lazily consumed.
recognize_once
Synchronously recognizes speech from the given audio input.
Types
@type opts() :: [
        socket_opts: ExAzureSpeech.SpeechToText.SocketConfig.t() | nil,
        speech_context_opts: ExAzureSpeech.SpeechToText.SpeechContextConfig.t() | nil,
        timeout: integer() | nil
      ]
See SocketConfig.t() and SpeechContextConfig.t() for more information on the available options.
Functions
@spec recognize_continous(
        audio_stream :: Enumerable.t(),
        recognition_options :: opts()
      ) ::
        {:ok, Enumerable.t()}
        | {:error,
           ExAzureSpeech.Common.Errors.Internal.t()
           | ExAzureSpeech.Common.Errors.InvalidResponse.t()
           | ExAzureSpeech.Common.Errors.Forbidden.t()
           | NimbleOptions.ValidationError.t()}
Recognizes speech from the given audio input continuously. It immediately returns a stream that can be lazily consumed.
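A minimal sketch of consuming the returned stream, based only on the spec above. ContinuousSketch and its on_result callback are hypothetical; the recognizer module is passed as an argument purely so the sketch can be exercised without network access, and passing an empty option list is an assumption about the defaults. The shape of each emitted element is not specified here, so the callback treats it as opaque.

```elixir
defmodule ContinuousSketch do
  # Hypothetical wrapper: runs continuous recognition and hands each
  # result to `on_result` as it is pulled from the lazy stream.
  def consume(audio_stream, on_result, recognizer \\ ExAzureSpeech.SpeechToText.Recognizer) do
    case recognizer.recognize_continous(audio_stream, []) do
      {:ok, results} ->
        # Nothing is processed until the stream is enumerated here.
        Enum.each(results, on_result)

      {:error, error} ->
        {:error, error}
    end
  end
end
```

The audio_stream argument can be any Enumerable, presumably one that yields the WAV data in binary chunks, such as a file stream opened in byte-chunk mode.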
@spec recognize_once(
        audio_stream :: Enumerable.t(),
        recognition_options :: opts()
      ) ::
        {:ok, [ExAzureSpeech.SpeechToText.Responses.SpeechPhrase.t()]}
        | {:error,
           ExAzureSpeech.Common.Errors.Internal.t()
           | ExAzureSpeech.Common.Errors.InvalidResponse.t()
           | ExAzureSpeech.Common.Errors.Forbidden.t()
           | NimbleOptions.ValidationError.t()
           | ExAzureSpeech.Common.Errors.Timeout.t()}
Synchronously recognizes speech from the given audio input.
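A minimal sketch of calling recognize_once and handling its tagged results, based only on the spec above. OnceSketch is hypothetical, the recognizer module is injectable only so the sketch can run without network access, and the timeout value is assumed to be in milliseconds, which this page does not state. The sketch also assumes the error types are structs, so it reports the struct module as the error kind.

```elixir
defmodule OnceSketch do
  # Hypothetical wrapper: blocks until recognition finishes or fails.
  def transcribe(audio_stream, recognizer \\ ExAzureSpeech.SpeechToText.Recognizer) do
    # Assumption: the timeout is expressed in milliseconds.
    case recognizer.recognize_once(audio_stream, timeout: 15_000) do
      {:ok, phrases} ->
        {:ok, phrases}

      {:error, %kind{} = error} ->
        # `kind` is the error struct's module, e.g. a timeout or forbidden error.
        {:error, kind, error}

      {:error, other} ->
        {:error, :unknown, other}
    end
  end
end
```

Callers can then branch on the returned kind, for example to treat ExAzureSpeech.Common.Errors.Timeout differently from ExAzureSpeech.Common.Errors.Forbidden.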