Tsne (t-SNE v0.1.0)
t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data in two or three dimensions.
This library provides bindings to fast exact and Barnes-Hut implementations of t-SNE in Rust using the bhtsne crate.
Functions
Barnes-Hut t-SNE.
Barnes-Hut is a tree-based algorithm for accelerating t-SNE. It runs in O(N log N) time, whereas exact t-SNE runs in O(N^2) time.
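To give a feel for what the two complexity classes mean in practice, here is a minimal Python sketch that compares per-iteration work estimates. It is not a benchmark of this library: the function names are hypothetical, constant factors are ignored, and the Barnes-Hut figure is only the N log N order of growth.

```python
import math


def exact_interactions(n: int) -> int:
    """Pairwise interactions per iteration for an O(N^2) method: every ordered pair."""
    return n * (n - 1)


def barnes_hut_interactions(n: int) -> float:
    """Order-of-growth estimate for an O(N log N) method; constants are ignored."""
    return n * math.log2(n)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        exact = exact_interactions(n)
        approx = barnes_hut_interactions(n)
        # The ratio grows with N, which is why the tree-based variant
        # is usually preferred for larger datasets.
        print(f"N={n:>7}: exact ~{exact:.2e}, Barnes-Hut ~{approx:.2e}, "
              f"ratio ~{exact / approx:.0f}x")
```

The gap widens as N grows, so the exact variant is typically reserved for small datasets.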
Options
:embedding_dimensions - Dimension of the embedded space. The default value is 2.
:learning_rate - The learning rate for t-SNE is usually in the range [10.0, 1000.0]. If the learning rate is too high, the data may look like a ‘ball’ with any point approximately equidistant from its nearest neighbours. If the learning rate is too low, most points may look compressed in a dense cloud with few outliers. If the cost function gets stuck in a bad local minimum, increasing the learning rate may help. The default value is 200.0.
:epochs - Maximum number of iterations for the optimization. Should be at least 250. The default value is 1000.
:perplexity - The perplexity is related to the number of nearest neighbors used in other manifold learning algorithms. Larger datasets usually require a larger perplexity. Consider selecting a value between 5 and 50. Different values can produce significantly different embeddings. The perplexity must be less than the number of samples. The default value is 30.0.
:final_momentum - The value for momentum after the initial early exaggeration phase. See :momentum for more info. The default value is 0.8.
:momentum - Gradient descent with momentum keeps a sum of exponentially decaying weights from previous iterations, speeding up convergence. In the early stages of the optimization, this is typically set to a lower value (0.5 in most implementations), since points generally move around quite a bit in this phase. It is increased after the initial early exaggeration phase (typically to 0.8, see :final_momentum) to speed up convergence. The default value is 0.5.
:metric - The distance metric to use. Must be either :euclidean or :cosine. The default value is :euclidean.
:theta - The tradeoff parameter between accuracy (0) and speed (1). The default value is 0.5.
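The role of :theta can be illustrated with the classic Barnes-Hut acceptance test: a tree cell is summarized by its center of mass when the cell looks small from the point's position. The Python sketch below shows one common formulation of that test. Whether the bhtsne crate uses exactly this criterion is an assumption, and the names can_summarize, cell_width, and cell_center are hypothetical.

```python
import math


def can_summarize(cell_width: float, point: tuple, cell_center: tuple,
                  theta: float) -> bool:
    """Return True if a tree cell may be treated as a single point.

    Uses the classic Barnes-Hut test: cell_width / distance < theta.
    (Assumed formulation; a given implementation may differ in detail.)
    """
    distance = math.dist(point, cell_center)
    if distance == 0.0:
        # The point coincides with the cell's center of mass, so the
        # cell cannot be approximated and must be opened.
        return False
    return cell_width / distance < theta


if __name__ == "__main__":
    point, center, width = (0.0, 0.0), (10.0, 0.0), 2.0
    for theta in (0.0, 0.1, 0.5, 1.0):
        print(f"theta={theta}: summarize={can_summarize(width, point, center, theta)}")
```

Under this test, a theta of 0 never accepts a summary, so every interaction is computed individually, while larger values accept coarser summaries and trade accuracy for speed.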
Exact t-SNE.
Options
:embedding_dimensions - Dimension of the embedded space. The default value is 2.
:learning_rate - The learning rate for t-SNE is usually in the range [10.0, 1000.0]. If the learning rate is too high, the data may look like a ‘ball’ with any point approximately equidistant from its nearest neighbours. If the learning rate is too low, most points may look compressed in a dense cloud with few outliers. If the cost function gets stuck in a bad local minimum, increasing the learning rate may help. The default value is 200.0.
:epochs - Maximum number of iterations for the optimization. Should be at least 250. The default value is 1000.
:perplexity - The perplexity is related to the number of nearest neighbors used in other manifold learning algorithms. Larger datasets usually require a larger perplexity. Consider selecting a value between 5 and 50. Different values can produce significantly different embeddings. The perplexity must be less than the number of samples. The default value is 30.0.
:final_momentum - The value for momentum after the initial early exaggeration phase. See :momentum for more info. The default value is 0.8.
:momentum - Gradient descent with momentum keeps a sum of exponentially decaying weights from previous iterations, speeding up convergence. In the early stages of the optimization, this is typically set to a lower value (0.5 in most implementations), since points generally move around quite a bit in this phase. It is increased after the initial early exaggeration phase (typically to 0.8, see :final_momentum) to speed up convergence. The default value is 0.5.
:metric - The distance metric to use. Must be either :euclidean or :cosine. The default value is :euclidean.
:theta - The tradeoff parameter between accuracy (0) and speed (1). The default value is 0.5.
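The interplay of :momentum, :final_momentum, and :learning_rate described above can be sketched as a plain momentum update with a two-stage schedule. This is a minimal Python illustration, not the library's implementation: the switch_epoch parameter is hypothetical (the documentation does not state when the early exaggeration phase ends), and the toy gradient stands in for the real t-SNE gradient.

```python
def momentum_for(epoch: int, momentum: float = 0.5,
                 final_momentum: float = 0.8,
                 switch_epoch: int = 250) -> float:
    """Pick the momentum for an epoch.

    switch_epoch is a hypothetical stand-in for the end of the early
    exaggeration phase.
    """
    return momentum if epoch < switch_epoch else final_momentum


def step(position, velocity, gradient, learning_rate, mu):
    """One gradient-descent-with-momentum update on a list of coordinates."""
    # The velocity keeps an exponentially decaying memory of past updates.
    new_velocity = [mu * v - learning_rate * g
                    for v, g in zip(velocity, gradient)]
    new_position = [p + v for p, v in zip(position, new_velocity)]
    return new_position, new_velocity


if __name__ == "__main__":
    # Toy objective f(y) = y^2 with gradient 2y, minimized at y = 0.
    position, velocity = [1.0], [0.0]
    for epoch in range(500):
        gradient = [2.0 * p for p in position]
        mu = momentum_for(epoch)
        position, velocity = step(position, velocity, gradient, 0.1, mu)
    print(f"final position: {position[0]:.6f}")
```

A lower momentum early on damps the large, erratic moves of the first iterations, and the higher final value lets consistent gradient directions accumulate once the layout has settled.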