vowpal_fleet v0.1.5 VowpalFleet

Vowpal Fleet - manage Vowpal Wabbit instances using Swarm

Installation

  • Make sure Vowpal Wabbit is installed and findable in $PATH
  • Add the dependency to your mix.exs
def deps do
  [
    {:vowpal_fleet, "~> 0.1.0"}
  ]
end

def application do
  [
    extra_applications: [:vowpal_fleet]
  ]
end
  • Configure the parameters in config/config.exs
config :vowpal_fleet,
  root: "/tmp/vw",
  some_cluster_id: %{:autosave => 300_000, :args => ["--random_seed", "123"]},
  some_bandit_cluster: %{:autosave => 300_000, :args => ["--random_seed", "123", "--cb_explore", "3"]}

Work In Progress

More testing is needed to ensure that the failure scenarios are covered. At the moment the code works, but take it with a grain of salt.

Examples

iex> VowpalFleet.start_worker(:some_cluster_id, :instance_1)
...
:ok
iex> VowpalFleet.start_worker(:some_cluster_id, :instance_2)
...
:ok
iex> VowpalFleet.train(:some_cluster_id, 1, [{"features", [1, 2, 3]}])
:ok
iex> VowpalFleet.predict(:some_cluster_id, [{"features", [1, 2, 3]}])
1.0
iex> VowpalFleet.start_worker(:test_abc_bandit, :a_1, %{:autosave => 60_000, :args => ["--cb_explore", "3"]})
:ok
iex> VowpalFleet.train(:test_abc_bandit, [{1, 100, 0.7}, {3, 70, 0.3}], [{:test_abc, [1, 2, 3]}])
:ok
iex> VowpalFleet.predict(:test_abc_bandit, [{:test_abc, [1, 2, 3]}])
[0.016667, 0.966667, 0.016667]

Configuration

config :vowpal_fleet,
  root: "/tmp/vw",
  some_cluster_id: %{:autosave => 300_000, :args => ["--random_seed", "123"]},
  some_bandit: %{:autosave => 300_000, :args => ["--random_seed", "123", "--cb_explore", "3"]}

Handoff

When a process has to be moved to a different node, the working model is saved and then handed off to the process that is starting on the new node.
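
The handoff described above can be illustrated with the message protocol that Swarm documents for process handoff. This is a minimal sketch, not VowpalFleet's actual implementation: the module name HandoffSketch and the use of a plain binary as the "model" are assumptions made for illustration, and the Swarm messages are simulated by hand so the example runs without a cluster.

```elixir
defmodule HandoffSketch do
  # Hypothetical module for illustration; not part of VowpalFleet.
  use GenServer

  def start_link(model \\ nil), do: GenServer.start_link(__MODULE__, model)
  def model(pid), do: GenServer.call(pid, :model)

  @impl true
  def init(model), do: {:ok, model}

  @impl true
  def handle_call(:model, _from, model), do: {:reply, model, model}

  # Swarm asks the old process which state should travel to the new node.
  def handle_call({:swarm, :begin_handoff}, _from, model) do
    {:reply, {:resume, model}, model}
  end

  # Swarm delivers the handed-off state to the freshly started process.
  @impl true
  def handle_cast({:swarm, :end_handoff, model}, _old), do: {:noreply, model}

  # Swarm tells the old process to stop once the handoff is done.
  @impl true
  def handle_info({:swarm, :die}, model), do: {:stop, :shutdown, model}
end

# Simulate what Swarm would do when moving the process.
{:ok, old} = HandoffSketch.start_link(<<1, 2, 3>>)
{:ok, new} = HandoffSketch.start_link()
{:resume, model} = GenServer.call(old, {:swarm, :begin_handoff})
GenServer.cast(new, {:swarm, :end_handoff, model})
HandoffSketch.model(new)
# => <<1, 2, 3>>
```

In a real cluster Swarm sends these messages itself; the manual call and cast here only stand in for that.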

license - MIT

Summary

Functions

exit(group, name)

Shuts the instance down: kills the pid and closes the socket.

load(group, model)

Loads a binary model on all the nodes in a group.

predict(group, namespaces)

Sends |namespace feature1 feature2:1\n … to a random instance chosen from Swarm.members/1 for the specified group.

save(group)

Saves a binary model on all the nodes in a group and returns a list of all the models. Each model can be fed to VowpalFleet.load/2.

start_worker(group, name, settings \\ nil)

Starts a Vowpal Wabbit instance (running a local vw --port 0 --daemon ...), connects to it using TCP, and then publishes the new node in the Swarm using Swarm.register_name/5.

train(group, label, namespaces)

Sends label |namespace feature1 feature2:1\n … to vw in all the active instances of the selected cluster using Swarm.publish/2. If label is a list, it assumes --cb_explore and sends action_idx:cost:prob (read more in Logged-Contextual-Bandit-Example).

Parameters

  • group: some kind of cluster id, for example model_name (:linear_abc_something)
  • label: the training label (for example -1 for click, 1 for convert), or a list of VowpalFleet.Type.action/0 such as [{1, 100, 0.3}, {3, 70, 0.3}]
  • namespaces: training features of that example, a list of VowpalFleet.Type.namespace/0

Functions

exit(group, name)
exit(atom(), atom()) :: :ok

Shuts the instance down: kills the pid and closes the socket.

load(group, model)
load(atom(), binary()) :: [:ok]

Loads a binary model on all the nodes in a group.

Parameters

  • group: some kind of cluster id, for example model_name (:linear_abc_something)
  • model: the binary contents of a Vowpal Wabbit regressor file (as returned by File.read!), or one of the models returned by VowpalFleet.save, for example Enum.random(VowpalFleet.save(:some_cluster_id))

Examples

iex> VowpalFleet.start_worker(:some_cluster_id, :instance_1)
:ok
iex> VowpalFleet.load(:some_cluster_id, Enum.random(VowpalFleet.save(:some_cluster_id)))
[:ok]
iex>
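
Since the model parameter is a plain binary, it can be written to disk and read back before being passed to VowpalFleet.load/2. The sketch below shows only that round trip; the short binary is a stand-in for one element of the list returned by VowpalFleet.save, and the file name vw_model_sketch.bin is made up for illustration.

```elixir
# Stand-in for one model from VowpalFleet.save(:some_cluster_id).
model = <<6, 0, 0, 0, 56, 46, 54, 46, 49, 0>>

# Hypothetical file name in the system temp directory.
path = Path.join(System.tmp_dir!(), "vw_model_sketch.bin")

# Persist the binary and read it back unchanged.
File.write!(path, model)
restored = File.read!(path)
File.rm!(path)

# restored could now be passed on, e.g.
# VowpalFleet.load(:some_cluster_id, restored)
true = restored == model
```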

predict(group, namespaces)
predict(atom(), [VowpalFleet.Type.namespace()]) :: float()

Sends |namespace feature1 feature2:1\n … to a random instance chosen from Swarm.members/1 for the specified group.

Parameters

  • group: some kind of cluster id, for example model_name (:linear_abc_something)
  • namespaces: features of the example to predict on, a list of VowpalFleet.Type.namespace/0

Examples

iex> VowpalFleet.start_worker(:some_cluster_id, :instance_1)
:ok
iex> VowpalFleet.predict(:some_cluster_id, [{"features", [1, 2, 3]}])
0.632031

save(group)

Saves a binary model on all the nodes in a group and returns a list of all the models. Each model can be fed to VowpalFleet.load/2.

Parameters

  • group: some kind of cluster id, for example model_name (:linear_abc_something)

Examples

iex> VowpalFleet.start_worker(:some_cluster_id, :instance_1)
:ok
iex> VowpalFleet.save(:some_cluster_id)

15:04:10.402 [info]  waiting for /tmp/vw/vw.some_cluster_id_instance_1.model

15:04:11.403 [info]  waiting for /tmp/vw/vw.some_cluster_id_instance_1.model
[
  <<6, 0, 0, 0, 56, 46, 54, 46, 49, 0, 1, 0, 0, 0, 0, 109, 0, 0, 0, 0, 0, 0, 0,
    0, 18, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 31, 0, 0, 0, 32, 45, 45,
    104, 97, ...>>
]
iex>

start_worker(group, name, settings \\ nil)
start_worker(atom(), atom(), %{} | nil) :: :ok

Starts a Vowpal Wabbit instance (running a local vw --port 0 --daemon ...), connects to it using TCP, and then publishes the new node in the Swarm using Swarm.register_name/5.

Parameters

  • group: some kind of cluster id, for example model_name (:linear_abc_something)
  • name: instance id (e.g. :xyz)
  • settings: optional map with the same shape as the per-cluster entry under config :vowpal_fleet, for example %{:autosave => 300_000, :args => ["--random_seed", "123", "--cb_explore", "3"]}

Examples

iex> VowpalFleet.start_worker(:some_cluster_id, :instance_1)
11:42:38.973 [info]  [swarm on nonode@nohost] [tracker:cluster_wait] joining cluster..

11:42:38.973 [info]  [swarm on nonode@nohost] [tracker:cluster_wait] no connected nodes, proceeding without sync

11:42:38.984 [debug] [swarm on nonode@nohost] [tracker:handle_call] registering :some_cluster_id_instance_1 as process started by Elixir.VowpalFleet.Supervisor.register/1 with args [some_cluster_id: :instance_1]

11:42:38.984 [debug] [swarm on nonode@nohost] [tracker:do_track] starting :some_cluster_id_instance_1 on nonode@nohost

11:42:38.984 [debug] killing 75225

11:42:38.991 [info]  waiting for /tmp/vw/vw.some_cluster_id_instance_1.port

11:42:39.992 [info]  waiting for /tmp/vw/vw.some_cluster_id_instance_1.port

11:42:39.993 [info]  waiting for /tmp/vw/vw.some_cluster_id_instance_1.pid

11:42:39.997 [debug] starting group: some_cluster_id, vw some_cluster_id_instance_1 60895 75980

11:42:39.997 [debug] autosaving every 3600000

11:42:40.000 [debug] [swarm on nonode@nohost] [tracker:do_track] started :some_cluster_id_instance_1 on nonode@nohost

11:42:40.002 [debug] [swarm on nonode@nohost] [tracker:handle_call] add_meta {:some_cluster_id, true} to #PID<0.218.0>
:ok
iex> VowpalFleet.start_worker(:test_abc_bandit, :a_1, %{:autosave => 60_000, :args => ["--cb_explore", "3"]})
...
:ok
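
One way to picture how the settings argument relates to the config :vowpal_fleet entries is a lookup that prefers explicit settings, then the per-cluster config, then defaults. This is only a sketch under assumptions: the module SettingsSketch is hypothetical, the precedence order is a guess rather than VowpalFleet's documented behaviour, and the default autosave of 3_600_000 is taken from the "autosaving every 3600000" log line above.

```elixir
defmodule SettingsSketch do
  # Hypothetical helper; not part of VowpalFleet.
  # Default autosave value assumed from the log output above.
  @defaults %{:autosave => 3_600_000, :args => []}

  # Explicit settings win; otherwise use the cluster's config entry.
  def resolve(group, settings \\ nil) do
    configured = Application.get_env(:vowpal_fleet, group) || %{}
    Map.merge(@defaults, settings || configured)
  end
end

# Equivalent to a some_cluster_id entry under config :vowpal_fleet.
Application.put_env(:vowpal_fleet, :some_cluster_id,
  %{:autosave => 300_000, :args => ["--random_seed", "123"]})

SettingsSketch.resolve(:some_cluster_id)
# => %{autosave: 300000, args: ["--random_seed", "123"]}

SettingsSketch.resolve(:some_cluster_id, %{:args => ["--cb_explore", "3"]})
# => %{autosave: 3600000, args: ["--cb_explore", "3"]}
```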

train(group, label, namespaces)

Sends label |namespace feature1 feature2:1\n … to vw in all the active instances of the selected cluster using Swarm.publish/2. If label is a list, it assumes --cb_explore and sends action_idx:cost:prob (read more in Logged-Contextual-Bandit-Example).

Parameters

  • group: some kind of cluster id, for example model_name (:linear_abc_something)
  • label: the training label (for example -1 for click, 1 for convert), or a list of VowpalFleet.Type.action/0 such as [{1, 100, 0.3}, {3, 70, 0.3}]
  • namespaces: training features of that example, a list of VowpalFleet.Type.namespace/0

Examples

iex> VowpalFleet.start_worker(:some_cluster_id, :instance_1)
:ok
iex> VowpalFleet.train(:some_cluster_id, 1, [{"features", [1, 2, 3]}])
:ok
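
To make the wire format concrete, here is a rough sketch of how a label and a list of namespaces could be turned into the label |namespace feature1 feature2:1\n line mentioned above. The module VwLineSketch is hypothetical and is not VowpalFleet's serializer; in particular, representing a weighted feature as a {name, value} tuple is an assumption, since the definition of VowpalFleet.Type.namespace/0 is not shown here.

```elixir
defmodule VwLineSketch do
  # Hypothetical helper; not part of VowpalFleet.

  # Builds "label |ns f1 f2:1\n" from a label and {name, features} tuples.
  def line(label, namespaces) do
    body = Enum.map_join(namespaces, " ", &namespace/1)
    "#{label} #{body}\n"
  end

  defp namespace({name, features}) do
    "|#{name} " <> Enum.map_join(features, " ", &feature/1)
  end

  # Assumed shape: {name, value} for weighted features, bare term otherwise.
  defp feature({name, value}), do: "#{name}:#{value}"
  defp feature(name), do: "#{name}"
end

VwLineSketch.line(1, [{"features", [1, 2, 3]}])
# => "1 |features 1 2 3\n"

VwLineSketch.line(-1, [{"a", ["x", {"y", 1}]}, {"b", [7]}])
# => "-1 |a x y:1 |b 7\n"
```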

Bandit Example

If you start vw with "--cb_explore", "3", you can send a list of actions as the label.

iex> VowpalFleet.start_worker(:bandit, :node, %{:autosave => 300_000, :args => ["--random_seed", "123", "--cb_explore", "3"]})
iex> VowpalFleet.train(:bandit, [{1, 100, 0.7}, {3, 70, 0.3}], [{:namespace, [1, 2, 3]}])
:ok
iex> VowpalFleet.predict(:bandit, [{:namespace, [1, 2, 3]}])
[0.016667, 0.966667, 0.016667]
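
For the bandit case, each {action, cost, probability} tuple corresponds to an action_idx:cost:prob label. The sketch below shows one plausible way such a list could be rendered into a single training line; BanditLineSketch is a hypothetical module, and joining several action labels on one line with spaces is an assumption about the output, not a statement of what VowpalFleet sends.

```elixir
defmodule BanditLineSketch do
  # Hypothetical helper; not part of VowpalFleet.

  # Renders [{action, cost, prob}] plus namespaces as one vw input line.
  def line(actions, namespaces) do
    label =
      Enum.map_join(actions, " ", fn {action, cost, prob} ->
        "#{action}:#{cost}:#{prob}"
      end)

    body =
      Enum.map_join(namespaces, " ", fn {name, features} ->
        "|#{name} " <> Enum.join(features, " ")
      end)

    label <> " " <> body <> "\n"
  end
end

BanditLineSketch.line([{1, 100, 0.7}, {3, 70, 0.3}], [{:namespace, [1, 2, 3]}])
# => "1:100:0.7 3:70:0.3 |namespace 1 2 3\n"
```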