Penelope v0.1.0 Penelope.ML.Word2vec.Index
This module represents a word2vec-style vector set, compiled into a set of hash-partitioned DETS files. Each record is a tuple consisting of the term (word) and a set of weights (vector). This module also supports parsing the standard text representation of word vectors via the compile!/2 function.
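To illustrate what hash partitioning means here, the following is a minimal sketch of routing a term to one of several tables. It is not Penelope's implementation: the module `PartitionSketch`, the use of `:erlang.phash2/2`, and the `name_N` table naming scheme are all assumptions made for illustration.

```elixir
defmodule PartitionSketch do
  # Hypothetical helper, not part of Penelope: maps a term to a bucket in
  # 0..partitions-1 using Erlang's portable hash function.
  def partition_for(term, partitions) when is_binary(term) and partitions > 0 do
    :erlang.phash2(term, partitions)
  end

  # Hypothetical naming scheme for the per-partition table atoms.
  def table_for(name, term, partitions) do
    String.to_atom("#{name}_#{partition_for(term, partitions)}")
  end
end

# The same term always maps to the same table, so a lookup only has to
# consult one partition.
for term <- ["king", "queen", "apple"] do
  IO.inspect({term, PartitionSketch.table_for("demo", term, 4)})
end
```

Because the hash is deterministic, inserts and lookups agree on the partition without any shared directory of terms.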
On disk, the following files are created:
Summary
Functions
closes the index
inserts word vectors from a text file into a word2vec index
creates a new word2vec index
inserts a word vector tuple into a word2vec index
searches for a term in the word2vec index
opens an existing word2vec index at the specified path
parses and inserts a single word vector text line into a word2vec index
parses a word vector line: “
Types
t() :: %Penelope.ML.Word2vec.Index{name: atom, partitions: pos_integer, tables: [atom], vector_size: pos_integer, version: pos_integer}
Functions
closes the index
compile!(index :: Penelope.ML.Word2vec.Index.t, path :: String.t) :: :ok
inserts word vectors from a text file into a word2vec index
the index must have been opened using create!/3
create!(path :: String.t, name :: String.t, [partitions: pos_integer, size_hint: pos_integer, vector_size: pos_integer]) :: Penelope.ML.Word2vec.Index.t
creates a new word2vec index
files will be created as
insert!(index :: Penelope.ML.Word2vec.Index.t, record :: {String.t, Penelope.ML.Vector.t}) :: :ok
inserts a word vector tuple into a word2vec index
lookup!(index :: Penelope.ML.Word2vec.Index.t, term :: String.t) :: Penelope.ML.Vector.t
searches for a term in the word2vec index
if the term is found, returns its word vector (without the term); otherwise, returns nil
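The found-or-nil behaviour described above can be sketched against a single DETS table using Erlang's standard `:dets` module. This is an illustration only, not Penelope's code: the module `DetsLookupSketch`, the table name `:lookup_sketch`, the temporary file name, and the use of a plain list of floats as the vector are all assumptions.

```elixir
defmodule DetsLookupSketch do
  # Hypothetical helper, not Penelope's implementation.
  def open(path) do
    # :dets expects a charlist file name; :set keeps one record per key.
    {:ok, table} =
      :dets.open_file(:lookup_sketch, file: String.to_charlist(path), type: :set)

    table
  end

  # Stores a {term, vector} record, mirroring the record shape described above.
  def insert(table, {term, vector}) do
    :ok = :dets.insert(table, {term, vector})
  end

  # Returns only the vector when the term exists, nil otherwise.
  def lookup(table, term) do
    case :dets.lookup(table, term) do
      [{^term, vector}] -> vector
      [] -> nil
    end
  end

  def close(table), do: :dets.close(table)
end

path = Path.join(System.tmp_dir!(), "lookup_sketch.dets")
File.rm(path)
table = DetsLookupSketch.open(path)
:ok = DetsLookupSketch.insert(table, {"king", [0.1, 0.2, 0.3]})
IO.inspect(DetsLookupSketch.lookup(table, "king"))    # prints [0.1, 0.2, 0.3]
IO.inspect(DetsLookupSketch.lookup(table, "unknown")) # prints nil
:ok = DetsLookupSketch.close(table)
File.rm(path)
```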
open!(path :: String.t, [{:cache_size, pos_integer}]) :: Penelope.ML.Word2vec.Index.t
opens an existing word2vec index at the specified path
parse_insert!(index :: Penelope.ML.Word2vec.Index.t, line :: String.t) :: {String.t, Penelope.ML.Vector.t}
parses and inserts a single word vector text line into a word2vec index
parse_line!(line :: String.t) :: {String.t, Penelope.ML.Vector.t}
parses a word vector line: “
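As a rough illustration of line parsing, the sketch below assumes the conventional word2vec text layout: a term followed by whitespace-separated floating-point weights. The module `ParseLineSketch` is hypothetical and is not Penelope's parse_line!/1; in particular, it returns the vector as a plain list of floats, which may differ from the actual Penelope.ML.Vector.t representation.

```elixir
defmodule ParseLineSketch do
  # Hypothetical parser, not Penelope's parse_line!/1.
  # Raises a MatchError on an empty line or a weight that is not a number.
  def parse_line(line) when is_binary(line) do
    # String.split/1 splits on whitespace and drops empty fields.
    [term | weights] = String.split(line)

    vector =
      Enum.map(weights, fn weight ->
        # Require the whole token to be consumed by the float parser.
        {value, ""} = Float.parse(weight)
        value
      end)

    {term, vector}
  end
end

IO.inspect(ParseLineSketch.parse_line("king 0.5 -1.25 2.0"))
# prints {"king", [0.5, -1.25, 2.0]}
```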