ExLLM.Providers.Gemini.Live (ex_llm v0.8.1)
Google Gemini Live API implementation using WebSockets.
The Live API enables real-time bidirectional communication with Gemini models, supporting text, audio, and video inputs with streaming responses.
Features
- Real-time text, audio, and video streaming
- Bidirectional communication with interruption support
- Tool/function calling in real-time sessions
- Session resumption capabilities
- Activity detection and management
- Audio transcription for both input and output
Usage
alias ExLLM.Providers.Gemini.Live

# Start a live session
config = %{
  model: "models/gemini-2.5-flash-preview-05-20",
  generation_config: %{
    temperature: 0.7,
    response_modalities: ["TEXT", "AUDIO"]
  },
  system_instruction: "You are a helpful assistant."
}

{:ok, session} = Live.start_session(config, api_key: "your-api-key")

# Send text message
:ok = Live.send_text(session, "Hello, how are you?")

# Send audio data (audio_chunk is a binary)
:ok = Live.send_audio(session, audio_chunk)

# Listen for responses
receive do
  {:live_response, :server_content, content} ->
    IO.puts("Model response: #{content.model_turn_content}")

  {:live_response, :tool_call, tool_call} ->
    # Handle tool call; execute_tool/1 is your own function and should
    # return a list of function responses
    responses = execute_tool(tool_call)
    Live.send_tool_response(session, responses)
end

# Close session
Live.close_session(session)
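The `receive` in the usage example handles a single message, while a streaming session will usually deliver several messages per turn. The following is a minimal sketch of a collecting loop, assuming only the `{:live_response, type, payload}` message shape shown above. `LiveInbox` and `drain/2` are hypothetical names and are not part of ex_llm.

```elixir
defmodule LiveInbox do
  # Hypothetical helper, not part of ex_llm.
  # Collects {:live_response, type, payload} messages from the calling
  # process's mailbox until none arrives within `timeout` milliseconds.
  # Messages of any other shape are left in the mailbox untouched.
  def drain(timeout \\ 1_000, acc \\ []) do
    receive do
      {:live_response, type, payload} ->
        drain(timeout, [{type, payload} | acc])
    after
      timeout ->
        # Messages were prepended, so reverse to restore arrival order.
        Enum.reverse(acc)
    end
  end
end
```

Call `LiveInbox.drain()` from the process that receives session messages (the caller of `start_session`, unless `:owner_pid` is set), then dispatch on each `{type, payload}` pair, for example `:server_content` or `:tool_call`.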
Authentication
The Live API supports both API key and OAuth2 authentication:
- API key: Passed as a query parameter in the WebSocket URL
- OAuth2: Passed as an Authorization header during the WebSocket handshake
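To make the difference between the two modes concrete, here is a rough sketch of how a client could derive the handshake URL and headers from the options. It does not show how ex_llm builds the connection internally. The endpoint is a placeholder, and the `key` query parameter name and the `Bearer` scheme are assumptions based on common Google API and OAuth2 conventions. `AuthSketch` and `handshake/1` are hypothetical names.

```elixir
defmodule AuthSketch do
  # Placeholder endpoint for illustration only (assumption, not the real URL).
  @endpoint "wss://live.example.invalid/ws"

  # Returns {:ok, url, headers} for the WebSocket handshake.
  # If both credentials are given, the API key takes precedence in this sketch.
  def handshake(opts) do
    case {Keyword.get(opts, :api_key), Keyword.get(opts, :oauth_token)} do
      {key, _} when is_binary(key) ->
        # API key travels in the URL query string ("key" name is assumed).
        {:ok, @endpoint <> "?" <> URI.encode_query(%{"key" => key}), []}

      {nil, token} when is_binary(token) ->
        # OAuth2 token travels in the Authorization header.
        {:ok, @endpoint, [{"authorization", "Bearer " <> token}]}

      _ ->
        {:error, :missing_credentials}
    end
  end
end
```

One practical consequence of the split: an API key ends up in the URL, so it may appear in proxy or access logs, whereas an OAuth2 token stays in a header.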
Summary
Functions
Returns a specification to start this module under a supervisor.
Closes the Live API session.
Sends activity end signal to the session.
Sends activity start signal to the session.
Sends audio data to the session.
Sends real-time text input to the session.
Sends a text message to the session.
Sends tool/function response to the session.
Sends video data to the session.
Starts a new Live API session.
Types
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
@spec close_session(pid()) :: :ok
Closes the Live API session.
Sends activity end signal to the session.
This is used when automatic activity detection is disabled.
Sends activity start signal to the session.
This is used when automatic activity detection is disabled.
Sends audio data to the session.
Parameters
- session - Session process pid
- audio_data - Binary audio data
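Because audio is sent as a binary, a long recording is usually split into smaller pieces before being streamed. The sketch below shows one way to do that split. `AudioChunks` and `split/2` are hypothetical names, and the chunk size is an arbitrary choice for illustration, not a documented requirement of the Live API.

```elixir
defmodule AudioChunks do
  # Hypothetical helper, not part of ex_llm.
  # Splits a binary into chunks of at most `size` bytes, preserving order.
  def split(data, size) when is_binary(data) and is_integer(size) and size > 0 do
    do_split(data, size, [])
  end

  defp do_split(<<>>, _size, acc), do: Enum.reverse(acc)

  # The final (possibly shorter) chunk.
  defp do_split(data, size, acc) when byte_size(data) <= size do
    Enum.reverse([data | acc])
  end

  defp do_split(data, size, acc) do
    <<chunk::binary-size(size), rest::binary>> = data
    do_split(rest, size, [chunk | acc])
  end
end
```

A possible use, with an arbitrary 4096-byte chunk size: `Enum.each(AudioChunks.split(audio, 4096), &Live.send_audio(session, &1))`.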
Sends real-time text input to the session.
Unlike send_text/3, this function is designed for streaming text input that does not interrupt model generation.
Sends a text message to the session.
Parameters
- session - Session process pid
- text - Text content to send
- opts - Options
Options
- :turn_complete - Whether this completes the user's turn (default: true)
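Since `:turn_complete` defaults to true, a user turn that spans several messages needs the option set to false on every part except the last. A minimal sketch of that pattern follows. `TurnSketch` and `send_in_parts/2` are hypothetical names, and the sending function is injected so the helper stays independent of any session.

```elixir
defmodule TurnSketch do
  # Hypothetical helper, not part of ex_llm.
  # Calls send_fun.(text, turn_complete: boolean) for each part, marking
  # only the last part as completing the user's turn. Returns :ok.
  def send_in_parts(parts, send_fun)
      when is_list(parts) and parts != [] and is_function(send_fun, 2) do
    last_index = length(parts) - 1

    parts
    |> Enum.with_index()
    |> Enum.each(fn {text, index} ->
      send_fun.(text, turn_complete: index == last_index)
    end)
  end
end
```

With a live session this could be wired up as `TurnSketch.send_in_parts(["First part,", " second part."], &Live.send_text(session, &1, &2))`.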
Sends tool/function response to the session.
Parameters
- session - Session process pid
- function_responses - List of function response objects
Sends video data to the session.
Parameters
- session - Session process pid
- video_data - Binary video data
Starts a new Live API session.
Parameters
- config - Session configuration containing model and parameters
- opts - Authentication and options
Options
- :api_key - Google API key for authentication
- :oauth_token - OAuth2 token for authentication (alternative to API key)
- :owner_pid - Process to receive session messages (defaults to caller)
Returns
- {:ok, session_pid} - Session GenServer process
- {:error, reason} - Error details
Examples
config = %{
  model: "models/gemini-2.5-flash-preview-05-20",
  generation_config: %{
    temperature: 0.7,
    response_modalities: ["TEXT", "AUDIO"]
  }
}

{:ok, session} = Live.start_session(config, api_key: "your-api-key")