Deepgram.Speak (Deepgram v0.1.0)
Text-to-Speech services for the Deepgram API.
The Deepgram.Speak module provides comprehensive text-to-speech synthesis capabilities through Deepgram's API. It offers both synchronous (REST API) and asynchronous streaming (WebSocket API) approaches for converting text to natural-sounding speech.
Key Features
- Text-to-Speech Synthesis - Convert text to high-quality audio
- Multiple Voice Models - Access to various voice models like Aura 2
- Voice Customization - Control pitch, rate, and other voice characteristics
- Audio Format Options - Support for various audio formats and encodings
- Streaming TTS - Real-time text-to-speech via WebSocket connections
- SSML Support - Speech Synthesis Markup Language for fine-grained control
- Asynchronous Callbacks - Send results to a webhook when processing completes
Authentication
All functions in this module require a properly configured Deepgram.Client struct, which can be created using Deepgram.new/1.
Example:
# Create client with API key
client = Deepgram.new(api_key: System.get_env("DEEPGRAM_API_KEY"))
# Or with OAuth token
client = Deepgram.new(token: "your-oauth-token")
Basic Usage
Synthesize text to speech and get audio data:
client = Deepgram.new(api_key: System.get_env("DEEPGRAM_API_KEY"))
text_source = %{text: "Welcome to Deepgram's text to speech API."}
options = %{model: "aura-2-thalia-en", encoding: "mp3"}
{:ok, audio_data} = Deepgram.Speak.synthesize(client, text_source, options)
Save synthesized audio to a file:
{:ok, response} = Deepgram.Speak.save_to_file(client, "welcome.mp3", text_source, options)
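The examples above match only on `{:ok, ...}`, which raises a `MatchError` if the request fails. Since the spec for `synthesize/3` also allows `{:error, any()}`, a caller may want to handle both branches. The following is a minimal sketch under that assumption; `SpeakResult` and `write/2` are hypothetical names for illustration and are not part of the Deepgram library.

```elixir
defmodule SpeakResult do
  @moduledoc "Hypothetical helper for illustration; not part of the Deepgram library."

  # Accepts the tagged tuple that Deepgram.Speak.synthesize/3 is specified
  # to return and writes the audio binary to `path`.
  @spec write({:ok, binary()} | {:error, any()}, Path.t()) ::
          {:ok, non_neg_integer()} | {:error, any()}
  def write({:ok, audio}, path) when is_binary(audio) do
    case File.write(path, audio) do
      # Report how many bytes were written on success.
      :ok -> {:ok, byte_size(audio)}
      # Tag file-system failures so they can be told apart from API errors.
      {:error, reason} -> {:error, {:file, reason}}
    end
  end

  # Pass synthesis errors through, tagged with their origin.
  def write({:error, reason}, _path), do: {:error, {:synthesis, reason}}
end
```

With the variables from Basic Usage, this could be used as `Deepgram.Speak.synthesize(client, text_source, options) |> SpeakResult.write("welcome.mp3")`, which returns an error tuple instead of raising when either step fails.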
Advanced Usage
Using Speech Synthesis Markup Language (SSML):
# Create request with SSML
ssml_source = %{ssml: "<speak><p>Welcome to <emphasis>Deepgram's</emphasis> API.</p></speak>"}
{:ok, audio_data} = Deepgram.Speak.synthesize(client, ssml_source, options)
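When the SSML string is assembled from arbitrary text, characters such as `&` and `<` need escaping so the markup stays well-formed XML. The sketch below shows one way to build the `%{ssml: ...}` map used above; `SsmlSketch`, `escape/1`, and `source/1` are hypothetical names, and the sketch assumes the service expects standard XML escaping inside `<speak>`.

```elixir
defmodule SsmlSketch do
  @moduledoc "Hypothetical helper for illustration; not part of the Deepgram library."

  # Escape characters that have special meaning in XML. "&" is replaced
  # first so the entities produced by later steps are not escaped again.
  def escape(text) when is_binary(text) do
    text
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
  end

  # Wrap the escaped text in a single paragraph and return a map with the
  # same :ssml key as the example above.
  def source(text), do: %{ssml: "<speak><p>" <> escape(text) <> "</p></speak>"}
end
```

The resulting map can be passed wherever `ssml_source` is used, for example `Deepgram.Speak.synthesize(client, SsmlSketch.source("Q&A"), options)`.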
Live streaming synthesis via WebSocket:
options = %{model: "aura-2-thalia-en", encoding: "mp3"}
{:ok, ws} = Deepgram.Speak.live_synthesis(client, options)
Summary
Functions
- live_synthesis/2 - Starts a live text-to-speech WebSocket connection.
- save_to_file/4 - Synthesizes text to speech and saves it to a file.
- synthesize/3 - Synthesizes text to speech and returns the audio data.
- synthesize_callback/4 - Synthesizes text to speech with callback support (asynchronous).
Functions
@spec live_synthesis(Deepgram.Client.t(), Deepgram.Types.Speak.speak_ws_options()) :: {:ok, pid()} | {:error, any()}
Starts a live text-to-speech WebSocket connection.
Parameters
- client - A Deepgram.Client struct
- options - Optional live synthesis options (see Deepgram.Types.Speak.speak_ws_options/0)
Examples
iex> client = Deepgram.new(api_key: "your-api-key")
iex> options = %{model: "aura-2-thalia-en", encoding: "linear16"}
iex> {:ok, websocket} = Deepgram.Speak.live_synthesis(client, options)
{:ok, #PID<...>}
@spec save_to_file(Deepgram.Client.t(), String.t(), Deepgram.Types.Speak.text_source(), Deepgram.Types.Speak.speak_options()) :: {:ok, Deepgram.Types.Speak.speak_response()} | {:error, any()}
Synthesizes text to speech and saves it to a file.
Parameters
- client - A Deepgram.Client struct
- file_path - Path where the audio file should be saved
- text_source - A map containing the text: %{text: "Hello, world!"}
- options - Optional synthesis options (see Deepgram.Types.Speak.speak_options/0)
Examples
iex> client = Deepgram.new(api_key: "your-api-key")
iex> text_source = %{text: "Hello, world!"}
iex> options = %{model: "aura-2-thalia-en", encoding: "linear16"}
iex> {:ok, response} = Deepgram.Speak.save_to_file(client, "output.wav", text_source, options)
{:ok, %{content_type: "audio/wav", ...}}
@spec synthesize(Deepgram.Client.t(), Deepgram.Types.Speak.text_source(), Deepgram.Types.Speak.speak_options()) :: {:ok, binary()} | {:error, any()}
Synthesizes text to speech and returns the audio data.
Parameters
- client - A Deepgram.Client struct
- text_source - A map containing the text: %{text: "Hello, world!"}
- options - Optional synthesis options (see Deepgram.Types.Speak.speak_options/0)
Examples
iex> client = Deepgram.new(api_key: "your-api-key")
iex> text_source = %{text: "Hello, world!"}
iex> options = %{model: "aura-2-thalia-en", encoding: "linear16"}
iex> {:ok, audio_data} = Deepgram.Speak.synthesize(client, text_source, options)
{:ok, <<binary_audio_data>>}
@spec synthesize_callback(Deepgram.Client.t(), Deepgram.Types.Speak.text_source(), String.t(), Deepgram.Types.Speak.speak_options()) :: {:ok, map()} | {:error, any()}
Synthesizes text to speech with callback support (asynchronous).
Parameters
- client - A Deepgram.Client struct
- text_source - A map containing the text: %{text: "Hello, world!"}
- callback_url - URL to receive the audio result
- options - Optional synthesis options (see Deepgram.Types.Speak.speak_options/0)
Examples
iex> client = Deepgram.new(api_key: "your-api-key")
iex> text_source = %{text: "Hello, world!"}
iex> callback_url = "https://example.com/webhook"
iex> options = %{model: "aura-2-thalia-en", encoding: "linear16"}
iex> {:ok, response} = Deepgram.Speak.synthesize_callback(client, text_source, callback_url, options)
{:ok, %{request_id: "..."}}