ExLLM.Providers.Gemini.Live (ex_llm v0.8.1)

Google Gemini Live API implementation using WebSockets.

The Live API enables real-time bidirectional communication with Gemini models, supporting text, audio, and video inputs with streaming responses.

Features

  • Real-time text, audio, and video streaming
  • Bidirectional communication with interruption support
  • Tool/function calling in real-time sessions
  • Session resumption capabilities
  • Activity detection and management
  • Audio transcription for both input and output

Usage

alias ExLLM.Providers.Gemini.Live

# Start a live session
config = %{
  model: "models/gemini-2.5-flash-preview-05-20",
  generation_config: %{
    temperature: 0.7,
    response_modalities: ["TEXT", "AUDIO"]
  },
  system_instruction: "You are a helpful assistant."
}

{:ok, session} = Live.start_session(config, api_key: "your-api-key")

# Send text message
:ok = Live.send_text(session, "Hello, how are you?")

# Send audio data
:ok = Live.send_audio(session, audio_chunk)

# Wait for the next response (each receive handles a single message)
receive do
  {:live_response, :server_content, content} ->
    IO.puts("Model response: #{content.model_turn_content}")
  {:live_response, :tool_call, tool_call} ->
    # Handle tool call; execute_tool/1 is an application-defined function
    # that returns a list of function response maps
    response = execute_tool(tool_call)
    Live.send_tool_response(session, response)
end

# Close session
Live.close_session(session)
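
The receive block above handles a single message, while a live session usually delivers several. As a minimal sketch, assuming only the `{:live_response, type, payload}` message shape shown above, a small helper can drain the mailbox until the session goes quiet. The `LiveInbox` module and its `idle_ms` parameter are hypothetical names and are not part of ex_llm.

```elixir
defmodule LiveInbox do
  @moduledoc false

  # Collects `{:live_response, type, payload}` messages from the calling
  # process's mailbox until none arrives within `idle_ms` milliseconds.
  # Returns the messages in arrival order as `{type, payload}` tuples.
  def drain(idle_ms \\ 1_000), do: drain(idle_ms, [])

  defp drain(idle_ms, acc) do
    receive do
      {:live_response, type, payload} ->
        drain(idle_ms, [{type, payload} | acc])
    after
      idle_ms -> Enum.reverse(acc)
    end
  end
end
```

Calling `LiveInbox.drain(2_000)` from the owner process after `Live.send_text/3` would return every response received before a two-second pause. Messages that do not match the tuple shape are left in the mailbox.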

Authentication

The Live API supports both API key and OAuth2 authentication:

  • API key: Passed as a query parameter in the WebSocket URL
  • OAuth2: Passed as an Authorization header during the WebSocket handshake
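
Either credential is supplied through the options of `start_session/2` (`:api_key` or `:oauth_token`). The sketch below shows one way to choose between them at runtime. It is an illustration rather than part of ex_llm: the `LiveAuth` module and the environment variable names are hypothetical, and preferring the OAuth2 token when both are set is an arbitrary choice.

```elixir
defmodule LiveAuth do
  @moduledoc false

  # Builds the keyword options for `start_session/2` from two optional
  # credentials, preferring the OAuth2 token when both are present.
  def opts(api_key, oauth_token) do
    cond do
      is_binary(oauth_token) and oauth_token != "" ->
        {:ok, [oauth_token: oauth_token]}

      is_binary(api_key) and api_key != "" ->
        {:ok, [api_key: api_key]}

      true ->
        {:error, :missing_credentials}
    end
  end

  # Reads the credentials from environment variables.
  # ASSUMPTION: the variable names are placeholders, not an ex_llm convention.
  def from_env do
    opts(System.get_env("GEMINI_API_KEY"), System.get_env("GEMINI_OAUTH_TOKEN"))
  end
end
```

With this helper, `{:ok, auth} = LiveAuth.from_env()` followed by `Live.start_session(config, auth)` would start the session with whichever credential is available.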

Types

t()

@type t() :: %ExLLM.Providers.Gemini.Live{
  api_key: String.t() | nil,
  config: map(),
  conn_pid: pid() | nil,
  oauth_token: String.t() | nil,
  owner_pid: pid(),
  status: :connecting | :connected | :ready | :closed | :error,
  stream_ref: reference() | nil
}

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

close_session(session)

@spec close_session(pid()) :: :ok

Closes the Live API session.

send_activity_end(session)

@spec send_activity_end(pid()) :: :ok | {:error, term()}

Sends activity end signal to the session.

Use this to mark the end of user activity when automatic activity detection is disabled.

send_activity_start(session)

@spec send_activity_start(pid()) :: :ok | {:error, term()}

Sends activity start signal to the session.

Use this to mark the start of user activity when automatic activity detection is disabled.
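
With manual activity signalling, each user utterance is bracketed by a start and an end signal. A minimal sketch of that sequence follows, assuming the session was configured with automatic activity detection disabled (how that is configured is not shown here). The `LiveTurn` module is hypothetical, and the `live` argument exists only so that a stub module can stand in for the real provider in tests.

```elixir
defmodule LiveTurn do
  @moduledoc false

  # Sends one user utterance as: activity start, audio chunks, activity end.
  # `live` is the module providing the send_* functions. It defaults to the
  # real provider and can be replaced with a stub.
  def send_utterance(session, chunks, live \\ ExLLM.Providers.Gemini.Live) do
    with :ok <- live.send_activity_start(session),
         :ok <- send_chunks(live, session, chunks) do
      live.send_activity_end(session)
    end
  end

  # Sends chunks in order, stopping at the first one that fails and
  # returning that error.
  defp send_chunks(live, session, chunks) do
    Enum.reduce_while(chunks, :ok, fn chunk, :ok ->
      case live.send_audio(session, chunk) do
        :ok -> {:cont, :ok}
        {:error, _reason} = error -> {:halt, error}
      end
    end)
  end
end
```

Note that if a chunk fails to send, this sketch returns the error without sending the end signal. Whether to send it anyway depends on how the application wants to recover.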

send_audio(session, audio_data)

@spec send_audio(pid(), binary()) :: :ok | {:error, term()}

Sends audio data to the session.

Parameters

  • session - Session process pid
  • audio_data - Binary audio data
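
Streaming audio is normally sent as a series of small chunks rather than one large binary. The helper below is a sketch of splitting a buffer before calling `send_audio/2` on each piece. The `AudioChunker` module is hypothetical, and the right chunk size depends on the audio format the session expects, which this page does not specify. As one illustration, 16-bit mono PCM at 16 kHz uses 32,000 bytes per second, so 3,200 bytes would correspond to 100 ms under that assumed format.

```elixir
defmodule AudioChunker do
  @moduledoc false

  # Splits a binary into chunks of at most `size` bytes, preserving order.
  # The final chunk may be shorter than `size`.
  def chunk(data, size) when is_binary(data) and is_integer(size) and size > 0 do
    do_chunk(data, size, [])
  end

  defp do_chunk(<<>>, _size, acc), do: Enum.reverse(acc)

  defp do_chunk(data, size, acc) when byte_size(data) <= size,
    do: Enum.reverse([data | acc])

  defp do_chunk(data, size, acc) do
    <<head::binary-size(size), rest::binary>> = data
    do_chunk(rest, size, [head | acc])
  end
end
```

Each element of `AudioChunker.chunk(pcm, 3_200)` could then be passed to `Live.send_audio(session, chunk)` in turn.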

send_realtime_text(session, text)

@spec send_realtime_text(pid(), String.t()) :: :ok | {:error, term()}

Sends real-time text input to the session.

Unlike send_text/3, this function is intended for streaming text input and does not interrupt model generation.

send_text(session, text, opts \\ [])

@spec send_text(pid(), String.t(), keyword()) :: :ok | {:error, term()}

Sends a text message to the session.

Parameters

  • session - Session process pid
  • text - Text content to send
  • opts - Keyword list of options

Options

  • :turn_complete - Whether this completes the user's turn (default: true)
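
The `:turn_complete` option makes it possible to build up a user turn from several messages. As a sketch, the hypothetical `LiveParts` module below sends every part except the last with `turn_complete: false` and the last with `turn_complete: true`. The `live` argument is an assumption added so that a stub can replace the real provider in tests.

```elixir
defmodule LiveParts do
  @moduledoc false

  # Sends every part except the last with `turn_complete: false`, then the
  # last part with `turn_complete: true` to close the user's turn.
  def send_parts(session, parts, live \\ ExLLM.Providers.Gemini.Live)

  def send_parts(_session, [], _live), do: {:error, :no_parts}

  def send_parts(session, [last], live),
    do: live.send_text(session, last, turn_complete: true)

  def send_parts(session, [part | rest], live) do
    # Stop at the first failure and return its error tuple unchanged.
    with :ok <- live.send_text(session, part, turn_complete: false) do
      send_parts(session, rest, live)
    end
  end
end
```

For example, `LiveParts.send_parts(session, ["Here is the log:", log_text, "What went wrong?"])` would deliver three messages as a single turn.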

send_tool_response(session, function_responses)

@spec send_tool_response(pid(), [map()]) :: :ok | {:error, term()}

Sends tool/function response to the session.

Parameters

  • session - Session process pid
  • function_responses - List of function response objects
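
This page does not document the shape of the function response maps, so the following is only a sketch under stated assumptions: each incoming call is taken to be a map with `:id`, `:name`, and `:args` keys, and each response a map with `:id`, `:name`, and `:response` keys. Check the actual payloads delivered in `{:live_response, :tool_call, tool_call}` before relying on these names. The `ToolDispatch` module and the `handlers` map are hypothetical.

```elixir
defmodule ToolDispatch do
  @moduledoc false

  # ASSUMPTION: the key names used for calls (:id, :name, :args) and for
  # responses (:id, :name, :response) are illustrative, not confirmed by
  # the ex_llm documentation.
  #
  # `handlers` maps a function name to a one-argument function. Calls to
  # unknown names produce an error payload instead of raising.
  def build_responses(calls, handlers) do
    Enum.map(calls, fn %{id: id, name: name, args: args} ->
      result =
        case Map.fetch(handlers, name) do
          {:ok, fun} -> %{result: fun.(args)}
          :error -> %{error: "unknown function: #{name}"}
        end

      %{id: id, name: name, response: result}
    end)
  end
end
```

The resulting list could then be passed to `Live.send_tool_response(session, responses)`.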

send_video(session, video_data)

@spec send_video(pid(), binary()) :: :ok | {:error, term()}

Sends video data to the session.

Parameters

  • session - Session process pid
  • video_data - Binary video data

start_session(config, opts \\ [])

@spec start_session(
  map(),
  keyword()
) :: {:ok, pid()} | {:error, term()}

Starts a new Live API session.

Parameters

  • config - Session configuration containing model and parameters
  • opts - Authentication credentials and other options

Options

  • :api_key - Google API key for authentication
  • :oauth_token - OAuth2 token for authentication (alternative to API key)
  • :owner_pid - Process to receive session messages (defaults to caller)

Returns

  • {:ok, session_pid} - Pid of the session GenServer process
  • {:error, reason} - Error details

Examples

alias ExLLM.Providers.Gemini.Live

config = %{
  model: "models/gemini-2.5-flash-preview-05-20",
  generation_config: %{
    temperature: 0.7,
    response_modalities: ["TEXT", "AUDIO"]
  }
}

{:ok, session} = Live.start_session(config, api_key: "your-api-key")