Penelope v0.1.0 Penelope.ML.Word2vec.Index

This module represents a word2vec-style vector set, compiled into a set of hash-partitioned DETS files. Each record is a tuple consisting of the term (word) and a set of weights (vector). This module also supports parsing the standard text representation of word vectors via the compile! function.

On disk, the following files are created:

/header.dets: index header (version, metadata)

/_.dets: partition file
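To illustrate what hash partitioning means here, the sketch below assigns each term to one of a fixed number of partitions. The module name `PartitionSketch` and its helpers are hypothetical, and `:erlang.phash2/2` is only a stand-in: this documentation does not state which hash function Penelope uses.

```elixir
defmodule PartitionSketch do
  # Hypothetical helper: maps a term to a partition number in 0..partitions-1.
  # :erlang.phash2/2 is an assumption, not necessarily Penelope's hash.
  def partition_for(term, partitions) when is_binary(term) and partitions > 0 do
    :erlang.phash2(term, partitions)
  end

  # Hypothetical helper: groups a list of terms by their partition number.
  def group(terms, partitions) do
    Enum.group_by(terms, &partition_for(&1, partitions))
  end
end

# The same term always lands in the same partition, so a lookup only
# needs to consult one partition file.
groups = PartitionSketch.group(["king", "queen", "man", "woman"], 4)
IO.inspect(groups)
```

Because the partition is a pure function of the term, a reader and a writer that agree on the partition count will always pick the same file for a given word.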

Summary

Functions

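close(index): closes the index

compile!(index, path): inserts word vectors from a text file into a word2vec index

create!(path, name, options \\ []): creates a new word2vec index

insert!(index, record): inserts a word vector tuple into a word2vec index

lookup!(index, term): searches for a term in the word2vec index

open!(path, options \\ []): opens an existing word2vec index at the specified path

parse_insert!(index, line): parses and inserts a single word vector text line into a word2vec index

parse_line!(line): parses a word vector line: “ …”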

Types

t()
t() :: %Penelope.ML.Word2vec.Index{name: atom, partitions: pos_integer, tables: [atom], vector_size: pos_integer, version: pos_integer}

Functions

close(index)
close(index :: Penelope.ML.Word2vec.Index.t) :: :ok

closes the index

compile!(index, path)
compile!(index :: Penelope.ML.Word2vec.Index.t, path :: String.t) :: :ok

inserts word vectors from a text file into a word2vec index

the index must have been opened using create!()
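As a rough picture of what compiling a text file involves, the sketch below streams a file line by line and folds each record into a plain map. Everything in it is an assumption for illustration: `CompileSketch` is a hypothetical module, the map stands in for the on-disk index, and the line layout (a term followed by whitespace-separated weights) is the usual word2vec text form rather than something this page confirms. How Penelope itself reads the file is not described here.

```elixir
defmodule CompileSketch do
  # Streams the file lazily so the whole vector set never has to fit in
  # memory as raw text, then folds each parsed record into a map.
  def compile!(path) do
    path
    |> File.stream!()
    |> Stream.map(&String.split/1)
    # Skip blank lines, which split into an empty list.
    |> Stream.reject(&(&1 == []))
    |> Enum.reduce(%{}, fn [term | weights], acc ->
      # String.to_float/1 raises on text that is not a float such as "0.5".
      Map.put(acc, term, Enum.map(weights, &String.to_float/1))
    end)
  end
end

# Write a small, made-up vector file and compile it.
path = Path.join(System.tmp_dir!(), "compile_sketch_#{System.unique_integer([:positive])}.txt")
File.write!(path, "king 0.5 0.25\nqueen 0.75 -1.0\n")
IO.inspect(CompileSketch.compile!(path))
File.rm!(path)
```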

create!(path, name, options \\ [])
create!(path :: String.t, name :: String.t, [partitions: pos_integer, size_hint: pos_integer, vector_size: pos_integer]) :: Penelope.ML.Word2vec.Index.t

creates a new word2vec index

files will be created as /_.dets, one per partition

do_lookup(index, term)

insert!(index, record)
insert!(index :: Penelope.ML.Word2vec.Index.t, record :: {String.t, Penelope.ML.Vector.t}) :: :ok

inserts a word vector tuple into a word2vec index

lookup!(index, term)

searches for a term in the word2vec index

if found, returns the word vector (without the term); otherwise, returns nil
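The return convention described above (the vector alone when the term exists, nil otherwise) can be sketched against a single DETS table using Erlang's `:dets` module. This is a minimal sketch, not Penelope's implementation: `DetsLookupSketch` is a hypothetical module, it uses one table instead of several partitions, and storing the vector as a list of floats is an assumption, since the representation of `Penelope.ML.Vector.t` is not given here.

```elixir
defmodule DetsLookupSketch do
  # Opens (or creates) a single DETS table backed by the file at `path`.
  def open!(path) do
    {:ok, table} =
      :dets.open_file(String.to_atom(path), file: String.to_charlist(path), type: :set)

    table
  end

  # Stores a {term, vector} record, keyed by the term.
  def insert!(table, {term, vector}) when is_binary(term) do
    :ok = :dets.insert(table, {term, vector})
  end

  # Returns only the vector when the term is present, nil otherwise.
  def lookup(table, term) do
    case :dets.lookup(table, term) do
      [{^term, vector}] -> vector
      [] -> nil
    end
  end
end

path = Path.join(System.tmp_dir!(), "lookup_sketch_#{System.unique_integer([:positive])}.dets")
table = DetsLookupSketch.open!(path)
DetsLookupSketch.insert!(table, {"king", [0.1, 0.2]})
IO.inspect(DetsLookupSketch.lookup(table, "king"))
IO.inspect(DetsLookupSketch.lookup(table, "missing"))
:ok = :dets.close(table)
File.rm!(path)
```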

open!(path, options \\ [])
open!(path :: String.t, [{:cache_size, pos_integer}]) :: Penelope.ML.Word2vec.Index.t

opens an existing word2vec index at the specified path

parse_insert!(index, line)
parse_insert!(index :: Penelope.ML.Word2vec.Index.t, line :: String.t) :: {String.t, Penelope.ML.Vector.t}

parses and inserts a single word vector text line into a word2vec index

parse_line!(line)
parse_line!(line :: String.t) :: {String.t, Penelope.ML.Vector.t}

parses a word vector line: “ …”
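For orientation, here is a minimal sketch of parsing one line into a {term, vector} tuple. It assumes the common word2vec text layout, a term followed by whitespace-separated weights; the exact line format is elided on this page, so treat that layout as an assumption. `WordVectorLineSketch` is a hypothetical module, and a plain list of floats stands in for `Penelope.ML.Vector.t`.

```elixir
defmodule WordVectorLineSketch do
  # Parses "term w1 w2 ... wn" into {term, [w1, ..., wn]}.
  # Raises MatchError on an empty line and ArgumentError on a bad weight.
  def parse_line!(line) do
    # String.split/1 splits on whitespace and drops the trailing newline.
    [term | weights] = String.split(line)
    {term, Enum.map(weights, &parse_weight!/1)}
  end

  # Accepts a token only when the whole token is a valid number.
  defp parse_weight!(text) do
    case Float.parse(text) do
      {value, ""} -> value
      _ -> raise ArgumentError, "invalid weight: #{inspect(text)}"
    end
  end
end

IO.inspect(WordVectorLineSketch.parse_line!("king 0.5 -1.25 2.0\n"))
```

Requiring an empty remainder from `Float.parse/1` rejects tokens such as "0.5abc" instead of silently truncating them, which matches the raising behavior suggested by the `!` suffix.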