Telemetry.Poller v0.1.0
A time-based poller to periodically dispatch Telemetry events.
Measurements are MFAs called periodically by the Poller process. Each MFA should collect
a value (if possible) and dispatch an event using the Telemetry.execute/3 function.
If the invocation of the MFA fails, the measurement is removed from the Poller.
See the “Example - (…)” sections for more concrete examples.
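The removal-on-failure rule can be pictured with a small sketch. This is not the Poller's actual implementation: MeasurementSketch and run_and_filter/1 are hypothetical names used only for illustration, and the sketch assumes that "fails" means any raise, throw, or exit.

```elixir
defmodule MeasurementSketch do
  # Hypothetical helper, not part of Telemetry.Poller: invokes every
  # {module, function, args} tuple and keeps only the ones that did not fail.
  def run_and_filter(measurements) do
    Enum.filter(measurements, fn {module, function, args} ->
      try do
        apply(module, function, args)
        true
      catch
        # Assumption: raises, throws and exits all count as a failure.
        _kind, _reason -> false
      end
    end)
  end
end

measurements = [
  # Succeeds, so it stays in the list.
  {Kernel, :abs, [-1]},
  # Raises ArgumentError, so it is dropped.
  {String, :to_integer, ["not a number"]}
]

remaining = MeasurementSketch.run_and_filter(measurements)
IO.inspect(remaining, label: "remaining measurements")
```

A measurement that raises once is never retried in this sketch, which is why a measurement function should handle expected "no value available" cases itself rather than crash.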
Starting and stopping
You can start the Poller using the start_link/1 function. The Poller can also be started as
part of your supervision tree, using either the old-style or the new-style child specification:
# pre Elixir 1.5.0
children = [Supervisor.Spec.worker(Telemetry.Poller, [[period: 5000]])]
# post Elixir 1.5.0
children = [{Telemetry.Poller, [period: 5000]}]
Supervisor.start_link(children, [strategy: :one_for_one])
You can start as many Pollers as you wish, but you generally shouldn’t need more than one unless a single Poller can’t keep up with collecting all the specified measurements.
Measurements need to be provided via the :measurements option.
VM measurements
The vm_measurements/1 function returns common measurements related to Erlang virtual machine
metrics. See its documentation for more information.
Example - measuring message queue length of the process
Measuring a process’s message queue length is a good way to find out if and when the process becomes a bottleneck. If the queue keeps growing, the process is not keeping up with the work it has been assigned, and other processes asking it to do work will get timeouts. Let’s simulate that situation using the following GenServer:
defmodule Worker do
  use GenServer

  def start_link(name) do
    GenServer.start_link(__MODULE__, [], name: name)
  end

  def do_work(name) do
    # The third argument is the call timeout in milliseconds.
    GenServer.call(name, :do_work, 5_000)
  end

  def init([]) do
    {:ok, %{}}
  end

  def handle_call(:do_work, _, state) do
    Process.sleep(1000)
    {:reply, :ok, state}
  end
end
When assigned work (handle_call/3), the worker sleeps for 1 second to imitate a long-running task.
Now we need a measurement dispatching the message queue length of the worker:
defmodule ExampleApp.Measurements do
  def message_queue_length(name) do
    with pid when is_pid(pid) <- Process.whereis(name),
         {:message_queue_len, length} <- Process.info(pid, :message_queue_len) do
      Telemetry.execute([:example_app, :message_queue_length], length, %{name: name})
    end
  end
end
Let’s start the worker and a Poller with the measurement we just defined:
iex> name = MyWorker
iex> {:ok, pid} = Worker.start_link(name)
iex> Telemetry.Poller.start_link(
...> measurements: [{ExampleApp.Measurements, :message_queue_length, [MyWorker]}],
...> period: 2000
...> )
{:ok, _}
To observe the message queue length, we can install an event handler that prints it to the console:
iex> defmodule Handler do
...> def handle([:example_app, :message_queue_length], length, %{name: name}, _) do
...> IO.puts("Process #{inspect(name)} message queue length: #{length}")
...> end
...> end
iex> Telemetry.attach(:handler, [:example_app, :message_queue_length], Handler, :handle)
:ok
Now let’s start assigning work to the worker:
iex> for _ <- 1..1000 do
...> spawn_link(fn -> Worker.do_work(name) end)
...> Process.sleep(500)
...> end
iex> :ok
:ok
Here we start 1000 processes placing a work order, waiting 500 milliseconds after starting each one. Given that the worker does its work in 1000 milliseconds, it means that new work orders come twice as fast as the worker is able to complete them. In the console, you’ll see something like this:
Process MyWorker message queue length: 1
Process MyWorker message queue length: 3
Process MyWorker message queue length: 5
Process MyWorker message queue length: 7
and finally:
** (EXIT from #PID<0.168.0>) shell process exited with reason: exited in: GenServer.call(MyWorker, :do_work, 5000)
** (EXIT) time out
The worker wasn’t able to complete the work on time (we set the 5000 millisecond timeout) and
Worker.do_work/1 finally failed. Observing the message queue length metric allowed us to notice
that the worker is the system’s bottleneck. In a healthy situation the message queue length would
be roughly constant.
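To see the value that the measurement reads in isolation, the sketch below builds up a mailbox by hand and inspects it with Process.info/2. It uses only the standard library; the idle process and the {:work, n} messages are made up for illustration.

```elixir
# Spawn a process that never reads its mailbox, then send it three messages.
pid = spawn(fn -> Process.sleep(:infinity) end)
for n <- 1..3, do: send(pid, {:work, n})

# For a live process, Process.info/2 returns a tagged tuple;
# for a dead process it returns nil, which the `with` in the
# measurement above silently skips.
{:message_queue_len, len} = Process.info(pid, :message_queue_len)
IO.puts("queue length: #{len}")

# Clean up the idle process.
Process.exit(pid, :kill)
```

Because the idle process never calls receive, every message sent to it stays in the queue, which is the same pattern the overloaded worker shows on a larger scale.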
Example - tracking number of active sessions in web application
Let’s imagine that you have a web application and you would like to periodically measure the number of active user sessions.
defmodule ExampleApp do
  def session_count() do
    # logic for calculating session count
    ...
  end
end
To achieve that, we need a measurement dispatching the value we’re interested in:
defmodule ExampleApp.Measurements do
  def dispatch_session_count() do
    Telemetry.execute([:example_app, :session_count], ExampleApp.session_count())
  end
end
and tell the Poller to invoke it periodically:
Telemetry.Poller.start_link(measurements: [
  {ExampleApp.Measurements, :dispatch_session_count, []}
])
If you find that you need to label the event values, e.g. to differentiate between the number of sessions of regular and admin users, you can use event metadata:
defmodule ExampleApp.Measurements do
  def dispatch_session_count() do
    regulars = ExampleApp.regular_users_session_count()
    admins = ExampleApp.admin_users_session_count()
    Telemetry.execute([:example_app, :session_count], regulars, %{role: :regular})
    Telemetry.execute([:example_app, :session_count], admins, %{role: :admin})
  end
end
Note: the other solution would be to dispatch two different events by hooking up the ExampleApp.regular_users_session_count/0 and ExampleApp.admin_users_session_count/0 functions directly. However, if you add more and more user roles to your app, you’ll find yourself creating a new event for each one of them, which will force you to modify existing event handlers. If you can break down the event value by some feature, like user role in this example, it’s usually better to use event metadata than to add new events.
This is a perfect use case for Poller, because you don’t need to write a dedicated process which would call these functions periodically. Additionally, if you find that you need to collect more statistics like this in the future, you can easily hook them up to the same Poller process and avoid creating lots of processes which would stay idle most of the time.
Summary
Functions
Returns a child specification for Poller
Returns a list of measurements used by the poller
Starts a Poller linked to the calling process
Stops the poller with specified reason
Returns measurements dispatching events with Erlang virtual machine metrics
Types
option() :: {:name, GenServer.name()} | {:period, period()} | {:measurements, [measurement()]}
vm_measurement() :: :total_memory | :processes_memory | :processes_used_memory | :system_memory | :atom_memory | :atom_used_memory | :binary_memory | :code_memory | :ets_memory
Functions
Returns a child specification for Poller.
It accepts options/0 as an argument, meaning that it’s valid to start it under the supervisor
as follows:
alias Telemetry.Poller
# use default options
Supervisor.start_link([Poller], supervisor_opts)
# customize options
Supervisor.start_link([{Poller, period: 10_000}], supervisor_opts)
# modify the child spec (the children argument must be a list)
Supervisor.start_link([Supervisor.child_spec(Poller, id: MyPoller)], supervisor_opts)
list_measurements(t()) :: [measurement()]
Returns a list of measurements used by the poller.
start_link(options()) :: GenServer.on_start()
Starts a Poller linked to the calling process.
Useful for starting Pollers as a part of a supervision tree.
Options
:measurements - a list of measurements used by Poller. For a description of possible values see the Telemetry.Poller module documentation;
:period - time period before performing the same measurement again, in milliseconds. Default value is 10000 ms;
:name - the name of the Poller process. See the “Name Registration” section of the GenServer documentation for information about allowed values.
Stops the poller with specified reason.
See documentation for GenServer.stop/3 to learn more about the behaviour of this function.
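As a reminder of how GenServer.stop/3 behaves, here is a minimal sketch that uses a standard-library Agent as a stand-in for a Poller. The stand-in is an assumption made for illustration only, so that the snippet runs without the Telemetry dependency.

```elixir
# An Agent stands in for the Poller here (illustration only).
{:ok, pid} = Agent.start_link(fn -> 0 end)

# Monitor the process so we can observe its exit reason.
ref = Process.monitor(pid)

# Arguments: the server, the exit reason, and a timeout in milliseconds.
# The call blocks until the process has terminated.
:ok = GenServer.stop(pid, :normal, 5_000)

receive do
  {:DOWN, ^ref, :process, ^pid, reason} ->
    IO.puts("stopped with reason: #{inspect(reason)}")
after
  1_000 -> IO.puts("no :DOWN message received")
end
```

Stopping with :normal does not affect linked processes, whereas other reasons propagate through links in the usual way.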
vm_measurements([vm_measurement()]) :: [measurement()]
Returns measurements dispatching events with Erlang virtual machine metrics.
It accepts a list of vm_measurement/0s and returns a list of measurement/0s which can
be provided to start_link/1’s :measurements option.
Do not rely on the exact values returned by this function - the only guarantee is that they
are of type measurement/0 and their modification will not be considered a breaking change,
unless the shape of events dispatched by returned measurements changes.
Returned measurements are unique.
Available measurements
Memory
See documentation for the :erlang.memory/0 function for more information about each type of memory
measured.
:total_memory - dispatches an event with total amount of currently allocated memory, in bytes. Event name is [:vm, :memory, :total] and event metadata is empty;
:processes_memory - dispatches an event with amount of memory currently allocated for processes, in bytes. Event name is [:vm, :memory, :processes] and event metadata is empty;
:processes_used_memory - dispatches an event with amount of memory currently used for processes, in bytes. Event name is [:vm, :memory, :processes_used] and event metadata is empty. Memory measured is a fraction of value collected by :processes_memory measurement;
:binary_memory - dispatches an event with amount of memory currently allocated for binaries. Event name is [:vm, :memory, :binary] and event metadata is empty;
:ets_memory - dispatches an event with amount of memory currently allocated for ETS tables. Event name is [:vm, :memory, :ets] and event metadata is empty;
:system_memory - dispatches an event with amount of currently allocated memory not directly related to any process running in the VM, in bytes. Event name is [:vm, :memory, :system] and event metadata is empty;
:atom_memory - dispatches an event with amount of memory currently allocated for atoms. Event name is [:vm, :memory, :atom] and event metadata is empty;
:atom_used_memory - dispatches an event with amount of memory currently used for atoms. Event name is [:vm, :memory, :atom_used] and event metadata is empty;
:code_memory - dispatches an event with amount of memory currently allocated for code. Event name is [:vm, :memory, :code] and event metadata is empty.
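To get a feel for the numbers involved, the sketch below reads each memory type directly with :erlang.memory/1. The pairing of measurement names with memory types is an assumption inferred from the event names listed above, not taken from the Poller's source.

```elixir
# Assumed pairing of vm_measurement names with :erlang.memory/1 types,
# inferred from the last segment of each event name.
memory_types = [
  total_memory: :total,
  processes_memory: :processes,
  processes_used_memory: :processes_used,
  binary_memory: :binary,
  ets_memory: :ets,
  system_memory: :system,
  atom_memory: :atom,
  atom_used_memory: :atom_used,
  code_memory: :code
]

# :erlang.memory/1 returns the size in bytes for the given memory type.
readings =
  for {measurement, type} <- memory_types do
    {measurement, :erlang.memory(type)}
  end

for {measurement, bytes} <- readings do
  IO.puts("#{measurement}: #{bytes} bytes")
end
```

Each value is read by a separate call, so the figures are not a single consistent snapshot of the VM's memory.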
Default measurements
The 0-arity version of this function includes :total_memory, :processes_memory,
:processes_used_memory, :binary_memory and :ets_memory measurements by default.
Examples
alias Telemetry.Poller
Poller.start_link(
measurements: Poller.vm_measurements() ++ Poller.vm_measurements([:atom_memory])
)