Browser automation for Jido AI agents.
Overview
Jido.Browser is organized around three simple lanes:
- web_fetch/2 for stateless HTTP-first retrieval
- start_session/1 and end_session/1 for browser-backed workflows
- Jido.Browser.Pool plus start_session(pool: ...) as an optional acceleration layer
agent-browser remains the default adapter. Web also supports warm pools when
you want browser-backed sessions with lower cold-start overhead. Vibium
remains available without warm-pool support.
The Hex package and OTP app remain jido_browser, while the public Elixir namespace is Jido.Browser.*.
Installation
Add the dependency:
def deps do
  [
    {:jido_browser, "~> 2.0"}
  ]
end
Install the default browser backend:
mix jido_browser.install
That installs the pinned agent-browser binary for the current platform and runs agent-browser install to provision the browser runtime.
Recommended Alias Setup
defp aliases do
  [
    setup: ["deps.get", "jido_browser.install --if-missing"],
    test: ["jido_browser.install --if-missing", "test"]
  ]
end
Installing Specific Backends
mix jido_browser.install agent_browser
mix jido_browser.install vibium
mix jido_browser.install web
Quick Start
{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, snapshot} = Jido.Browser.snapshot(session)
snapshot["snapshot"] || snapshot[:snapshot]
{:ok, session, _} = Jido.Browser.click(session, "@e1")
{:ok, _session, %{content: markdown}} = Jido.Browser.extract_content(session, format: :markdown)
:ok = Jido.Browser.end_session(session)
Selectors remain supported, but ref-based interaction is the preferred 2.0 flow:
snapshot -> act on @eN refs -> re-snapshot
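The act and re-snapshot steps of this flow can be wrapped in a small helper. This is a minimal sketch, not part of Jido.Browser: the module `RefFlowSketch` and the function `click_and_resnapshot/3` are hypothetical names, and it assumes that `click` and `snapshot` return `{:ok, session, result}` as in the Quick Start and that failures come back as some other tuple (for example `{:error, reason}`). The browser module is passed as an argument so the helper can also be exercised against a stub.

```elixir
defmodule RefFlowSketch do
  @moduledoc "Hypothetical helper illustrating the act -> re-snapshot steps."

  # `browser` defaults to Jido.Browser. Any module exposing click/2 and
  # snapshot/1 with the same return shapes works (useful in tests).
  def click_and_resnapshot(session, ref, browser \\ Jido.Browser) do
    with {:ok, session, _result} <- browser.click(session, ref),
         # Re-snapshot after acting, per the flow above, before using
         # any further refs.
         {:ok, session, snapshot} <- browser.snapshot(session) do
      {:ok, session, snapshot}
    end
  end
end
```

Calling `RefFlowSketch.click_and_resnapshot(session, "@e1")` would return the updated session together with the fresh snapshot; any value that does not match `{:ok, _, _}` is passed through unchanged by `with`.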
Stateless Web Fetch
{:ok, result} =
  Jido.Browser.web_fetch(
    "https://example.com/docs",
    format: :markdown,
    allowed_domains: ["example.com"],
    focus_terms: ["API", "authentication"],
    citations: true
  )
result.content
result.passages
result.metadata # present when extraction returns document metadata
web_fetch/2 keeps HTML handling native for selector extraction and markdown conversion, and uses extractous_ex for fetched binary documents such as PDFs, Word, Excel, PowerPoint, OpenDocument, EPUB, and common email formats. Binary document responses may also include result.metadata when extraction returns document metadata.
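Because result.metadata is only present for some responses, callers may want to read it defensively. The following is a rough sketch under stated assumptions: `FetchSketch` and `fetch_markdown/2` are hypothetical names, the result is assumed to be a map or struct with atom keys (as the `result.content` access above suggests), and the `{:error, reason}` failure shape is an assumption rather than documented behavior.

```elixir
defmodule FetchSketch do
  @moduledoc "Hypothetical wrapper around web_fetch/2 that tolerates missing metadata."

  # `browser` defaults to Jido.Browser; a stub module with web_fetch/2
  # can be passed instead when testing.
  def fetch_markdown(url, browser \\ Jido.Browser) do
    case browser.web_fetch(url, format: :markdown, citations: true) do
      {:ok, result} ->
        # Metadata may be absent or nil, so fall back to an empty map.
        metadata = Map.get(result, :metadata) || %{}
        {:ok, %{content: result.content, metadata: metadata}}

      # Assumed error shape; adjust to what web_fetch/2 actually returns.
      {:error, reason} ->
        {:error, reason}
    end
  end
end
```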
State Persistence
state_path = Path.expand("tmp/browser-state.json")
File.mkdir_p!(Path.dirname(state_path))
{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, _} = Jido.Browser.save_state(session, state_path)
:ok = Jido.Browser.end_session(session)
{:ok, restored} = Jido.Browser.start_session()
{:ok, restored, _} = Jido.Browser.load_state(restored, state_path)
Tab Workflow
{:ok, session} = Jido.Browser.start_session()
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
{:ok, session, _} = Jido.Browser.new_tab(session, "https://example.org")
{:ok, session, tabs} = Jido.Browser.list_tabs(session)
{:ok, session, _} = Jido.Browser.switch_tab(session, 1)
{:ok, session, _} = Jido.Browser.close_tab(session, 1)
Warm Session Pools
Warm pools are explicit and optional. They speed up browser-backed workflows,
while web_fetch/2 stays stateless and never uses pools.
For OTP applications, prefer adding a named pool to your supervision tree:
defmodule MyApp.Application do
  use Application
  def start(_type, _args) do
    children = [
      {Jido.Browser.Pool,
       name: :default,
       size: 2,
       headless: true,
       startup_timeout: 60_000}
    ]
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
Then check out pooled sessions by name:
{:ok, session} =
  Jido.Browser.start_session(
    pool: :default,
    checkout_timeout: 5_000
  )
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
:ok = Jido.Browser.end_session(session)
Use start_pool/1 for scripts, tests, or ad hoc startup:
{:ok, _pool} =
  Jido.Browser.start_pool(
    name: :default,
    size: 2,
    headless: true
  )
{:ok, session} =
  Jido.Browser.start_session(
    pool: :default,
    checkout_timeout: 5_000
  )
{:ok, session, _} = Jido.Browser.navigate(session, "https://example.com")
:ok = Jido.Browser.end_session(session)
Warm pools are currently supported by Jido.Browser.Adapters.AgentBrowser and
Jido.Browser.Adapters.Web.
- AgentBrowser pools keep full warm daemon-backed sessions ready for checkout.
- Web pools keep reserved warmed profiles ready for checkout.
- end_session/1 always recycles the checked-out worker and warms a replacement in the background.
For the Web adapter, pooled sessions are still browser sessions, not HTTP
fetches. Use web_fetch/2 when you want the simplest request/response API
without browser state.
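Since a pooled worker is only recycled when end_session/1 is called, it can help to guarantee that call even when the work in between raises. The sketch below is one way to do that and is not part of Jido.Browser: `PooledSessionSketch` and `with_pooled_session/3` are hypothetical names, the `{:error, reason}` checkout failure shape is assumed, and it also assumes that passing the originally checked-out session value to end_session/1 is acceptable even after later calls have returned updated session values.

```elixir
defmodule PooledSessionSketch do
  @moduledoc "Hypothetical helper that always returns a pooled session."

  # `browser` defaults to Jido.Browser; a stub module can be injected in tests.
  def with_pooled_session(pool, fun, browser \\ Jido.Browser) do
    case browser.start_session(pool: pool, checkout_timeout: 5_000) do
      {:ok, session} ->
        try do
          fun.(session)
        after
          # Runs on success and on raise, so the worker is always handed back.
          # Assumption: the original session value is still valid here.
          browser.end_session(session)
        end

      # Assumed shape for a failed or timed-out checkout.
      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

The helper returns whatever `fun` returns, for example `PooledSessionSketch.with_pooled_session(:default, fn session -> Jido.Browser.navigate(session, "https://example.com") end)`.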
Plugin Setup
defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "browser_agent",
    plugins: [
      {Jido.Browser.Plugin,
       [
         adapter: Jido.Browser.Adapters.AgentBrowser,
         pool: :default,
         checkout_timeout: 5_000,
         headless: true,
         timeout: 30_000
       ]}
    ]
end
Configuration
config :jido_browser,
  adapter: Jido.Browser.Adapters.AgentBrowser
config :jido_browser, :agent_browser,
  binary_path: "/usr/local/bin/agent-browser",
  headed: false
Other adapters can still be configured explicitly:
config :jido_browser, :vibium,
  binary_path: "/path/to/vibium"
config :jido_browser, :web,
  binary_path: "/usr/local/bin/web",
  profile: "default"
Optional web fetch settings:
config :jido_browser, :web_fetch,
  cache_ttl_ms: 300_000,
  extractous: [
    pdf: [extract_annotation_text: true],
    office: [include_headers_and_footers: true]
  ]
Configured extractous options are merged with any per-call extractous: keyword options passed to Jido.Browser.web_fetch/2.
Backends
AgentBrowser (Default)
- native snapshot support with refs
- supervised daemon per session
- optional warm session pools with explicit checkout
- direct JSON IPC from Elixir
- built-in state save/load and tab management support
Vibium (Legacy)
- retained for transitional compatibility
- feature-frozen in 2.0
Web (Legacy)
- retained for transitional compatibility
- feature-frozen in 2.0
Public API
Core operations:
- start_pool/1
- stop_pool/1
- start_session/1
- end_session/1
- navigate/3
- click/3
- type/4
- screenshot/2
- extract_content/2
- web_fetch/2
- evaluate/3
Agent-browser-native operations:
- snapshot/2
- wait_for_selector/3
- wait_for_navigation/2
- query/3
- get_text/3
- get_attribute/4
- is_visible/3
- save_state/3
- load_state/3
- list_tabs/2
- new_tab/3
- switch_tab/3
- close_tab/3
- console/2
- errors/2
Available Actions
Session
- StartSession
- EndSession
- GetStatus
- SaveState
- LoadState
Navigation
- Navigate
- Back
- Forward
- Reload
- GetUrl
- GetTitle
Interaction
- Click
- Type
- Hover
- Focus
- Scroll
- SelectOption
Waiting and Queries
- Wait
- WaitForSelector
- WaitForNavigation
- Query
- GetText
- GetAttribute
- IsVisible
Content and Diagnostics
- Snapshot
- Screenshot
- ExtractContent
- Console
- Errors
Tabs
- ListTabs
- NewTab
- SwitchTab
- CloseTab
Advanced and Composite
- Evaluate
- ReadPage
- SnapshotUrl
- SearchWeb
- WebFetch
Using With Jido Agents
defmodule MyBrowsingAgent do
  use Jido.Agent,
    name: "web_browser",
    description: "An agent that can browse the web",
    plugins: [{Jido.Browser.Plugin, [headless: true]}]
end
Jido.Browser.Plugin now exposes 37 browser actions, including snapshot/refs workflows, browser state actions, diagnostics, and tab management.
License
Apache-2.0 - See LICENSE for details.