t-distributed Stochastic Neighbor Embedding (t-SNE) for visualizing high-dimensional data.

    cuda_ml_tsne(
      x,
      n_components = 2L,
      n_neighbors = ceiling(3 * perplexity),
      method = c("barnes_hut", "fft", "exact"),
      angle = 0.5,
      n_iter = 1000L,
      learning_rate = 200,
      learning_rate_method = c("adaptive", "none"),
      perplexity = 30,
      perplexity_max_iter = 100L,
      perplexity_tol = 1e-05,
      early_exaggeration = 12,
      late_exaggeration = 1,
      exaggeration_iter = 250L,
      min_grad_norm = 1e-07,
      pre_momentum = 0.5,
      post_momentum = 0.8,
      square_distances = TRUE,
      seed = NULL,
      cuML_log_level = c("off", "critical", "error", "warn", "info", "debug", "trace")
    )

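
The signature uses two standard R idioms. A character-vector default such as `method = c("barnes_hut", "fft", "exact")` lists the allowed choices, with the first one acting as the default, and `n_neighbors = ceiling(3 * perplexity)` is evaluated lazily, so it follows whatever `perplexity` the caller passes. The sketch below reproduces both in base R with a hypothetical stand-in function, `tsne_defaults()`. It assumes the choices are resolved with `match.arg()`, which is the usual convention but is not confirmed by this page.

```r
# Hypothetical stand-in that mirrors part of the cuda_ml_tsne() signature.
# Assumption: choice arguments are resolved with match.arg().
tsne_defaults <- function(perplexity = 30,
                          n_neighbors = ceiling(3 * perplexity),
                          method = c("barnes_hut", "fft", "exact")) {
  method <- match.arg(method)  # first element is used when `method` is missing
  list(perplexity = perplexity, n_neighbors = n_neighbors, method = method)
}

str(tsne_defaults())                   # method "barnes_hut", n_neighbors 90
str(tsne_defaults(perplexity = 12.5))  # n_neighbors becomes ceiling(37.5) = 38
```

Under this convention, a value outside the listed choices makes `match.arg()` raise an error instead of silently falling back to the default.
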
Argument | Description |
---|---|
x | The input matrix or dataframe. Each data point should be a row and should consist of numeric values only. |
n_components | Dimension of the embedded space. |
n_neighbors | The number of datapoints to use in the attractive forces. Default: ceiling(3 * perplexity). |
method | t-SNE method, must be one of "barnes_hut", "fft", "exact". The "exact" method is more accurate but slower. Both "barnes_hut" and "fft" are fast approximations. |
angle | Valid values are between 0.0 and 1.0 and trade accuracy for speed: smaller values are more accurate but slower, larger values are faster but less accurate. Values are generally set between 0.2 and 0.8. (Barnes-Hut only.) |
n_iter | Maximum number of iterations for the optimization. Should be at least 250. Default: 1000L. |
learning_rate | Learning rate of the t-SNE algorithm, usually between 10 and 1000. If the learning rate is too high, the t-SNE result may look like a cloud or ball of points. |
learning_rate_method | Must be one of "adaptive", "none". If "adaptive", then learning rate, early exaggeration, and perplexity are automatically tuned based on input size. Default: "adaptive". |
perplexity | The target value of the conditional distribution's perplexity (see https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding for details). |
perplexity_max_iter | Number of epochs spent searching for the best Gaussian bands. Default: 100L. |
perplexity_tol | Stop optimizing the Gaussian bands when the conditional distribution's perplexity is within this tolerance of its target value. Default: 1e-5. |
early_exaggeration | Controls the space between clusters. Not critical to tune this. Default: 12.0. |
late_exaggeration | Controls the space between clusters. It may be beneficial to increase this slightly to improve cluster separation. This will be applied after `exaggeration_iter` iterations (FFT only). |
exaggeration_iter | Number of exaggeration iterations. Default: 250L. |
min_grad_norm | If the gradient norm is below this threshold, the optimization will be stopped. Default: 1e-7. |
pre_momentum | During the exaggeration iteration, more forcefully apply gradients. Default: 0.5. |
post_momentum | During the late phases, less forcefully apply gradients. Default: 0.8. |
square_distances | Whether TSNE should square the distance values. |
seed | Seed for the pseudorandom number generator. Setting this can make repeated runs look more similar. Note, however, that this highly parallelized t-SNE implementation is not completely deterministic between runs, even with the same seed. |
cuML_log_level | Log level within cuML library functions. Must be one of "off", "critical", "error", "warn", "info", "debug", "trace". Default: "off". |
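
The `perplexity`, `perplexity_max_iter`, and `perplexity_tol` arguments describe a search: for each point, a Gaussian bandwidth is adjusted until the perplexity of its conditional distribution matches the target. The base-R sketch below illustrates that idea with a bisection over the precision `beta = 1 / (2 * sigma^2)`. It follows the textbook t-SNE formulation and is not cuML's GPU implementation; `conditional_perplexity()` and `calibrate_beta()` are hypothetical names introduced here.

```r
# Perplexity of the conditional distribution p_{j|i} for one point, given
# squared distances to the other points and a Gaussian precision `beta`
# (beta = 1 / (2 * sigma^2)).
conditional_perplexity <- function(sq_dist, beta) {
  p <- exp(-beta * (sq_dist - min(sq_dist)))  # shift for numerical stability
  p <- p / sum(p)
  entropy <- -sum(p[p > 0] * log2(p[p > 0]))  # Shannon entropy in bits
  2^entropy
}

# Bisection on beta until the perplexity is within `tol` of `target`
# or `max_iter` steps have been taken.
calibrate_beta <- function(sq_dist, target = 30, max_iter = 100L, tol = 1e-5) {
  lo <- 0
  hi <- Inf
  beta <- 1
  for (iter in seq_len(max_iter)) {
    perp <- conditional_perplexity(sq_dist, beta)
    if (abs(perp - target) < tol) break
    if (perp > target) {  # distribution too flat: increase beta
      lo <- beta
      beta <- if (is.finite(hi)) (lo + hi) / 2 else beta * 2
    } else {              # distribution too peaked: decrease beta
      hi <- beta
      beta <- (lo + hi) / 2
    }
  }
  list(beta = beta, perplexity = perp, iterations = iter)
}

x <- as.matrix(iris[1:4])
sq_dist <- as.matrix(dist(x))[1, -1]^2  # squared distances from row 1 to the rest
fit <- calibrate_beta(sq_dist, target = 30)
fit$perplexity
```

Perplexity is 2 raised to the Shannon entropy of the distribution, so a uniform distribution over k neighbors has perplexity k. This is why the target is often read as an effective number of neighbors.
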
A matrix containing the embedding of the input data in a low-dimensional space, with each row representing an embedded data point.

    library(cuda.ml)

    embedding <- cuda_ml_tsne(iris[1:4], method = "exact")

    set.seed(0L)
    print(kmeans(embedding, centers = 3))
    #> K-means clustering with 3 clusters of sizes 148, 1, 1
    #>
    #> Cluster means:
    #>          [,1]        [,2]        [,3]        [,4]        [,5]        [,6]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>          [,7]        [,8]        [,9]       [,10]       [,11]       [,12]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,13]       [,14]       [,15]       [,16]       [,17]       [,18]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,19]       [,20]       [,21]       [,22]       [,23]       [,24]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,25]       [,26]       [,27]       [,28]       [,29]       [,30]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,31]       [,32]       [,33]       [,34]       [,35]       [,36]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,37]       [,38]       [,39]       [,40]       [,41]       [,42]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,43]       [,44]       [,45]       [,46]       [,47]       [,48]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,49]       [,50]       [,51]       [,52]       [,53]       [,54]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,55]       [,56]       [,57]       [,58]       [,59]       [,60]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,61]       [,62]       [,63]       [,64]       [,65]       [,66]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,67] [,68]       [,69]       [,70]       [,71]       [,72]       [,73]
    #> 1 0.006756757     0 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000     1 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000     0 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,74]       [,75]       [,76]       [,77]       [,78]       [,79]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,80]       [,81]       [,82]       [,83]       [,84]       [,85]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,86]       [,87]       [,88]       [,89]       [,90]       [,91]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,92]       [,93]       [,94]       [,95]       [,96]       [,97]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>         [,98]       [,99]      [,100]      [,101]      [,102]      [,103]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,104]      [,105]      [,106]      [,107]      [,108]      [,109]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,110]      [,111]      [,112]      [,113]      [,114]      [,115]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,116]      [,117]      [,118]      [,119]      [,120]      [,121]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,122]      [,123]      [,124]      [,125]      [,126]      [,127]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,128] [,129]      [,130]      [,131]      [,132]      [,133]
    #> 1 0.006756757      0 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000      0 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000      1 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,134]      [,135]      [,136]      [,137]      [,138]      [,139]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,140]      [,141]      [,142]      [,143]      [,144]      [,145]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>        [,146]      [,147]      [,148]      [,149]      [,150]
    #> 1 0.006756757 0.006756757 0.006756757 0.006756757 0.006756757
    #> 2 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #> 3 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
    #>
    #> Clustering vector:
    #>   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    #>  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1
    #>  [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    #> [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
    #> [149] 1 1
    #>
    #> Within cluster sum of squares by cluster:
    #> [1] 147   0   0
    #>  (between_SS / total_SS =   1.3 %)
    #>
    #> Available components:
    #>
    #> [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
    #> [6] "betweenss"    "size"         "iter"         "ifault"
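
Before clustering, it can help to confirm that the embedding has the documented shape (one row per input row, `n_components` columns) and then to compare the clusters against known labels. The sketch below does this with a hypothetical helper, `cluster_vs_labels()`. So that it runs without a GPU, it uses the first two principal components from `prcomp()` as a stand-in for the matrix returned by `cuda_ml_tsne()`; that substitution is an assumption made for illustration only.

```r
# Hypothetical helper: validate an embedding, cluster it with k-means,
# and cross-tabulate the clusters against known labels.
cluster_vs_labels <- function(embedding, labels, centers = 3L, n_components = 2L) {
  stopifnot(
    is.matrix(embedding), is.numeric(embedding),
    nrow(embedding) == length(labels),
    ncol(embedding) == n_components
  )
  fit <- kmeans(embedding, centers = centers, nstart = 10L)
  table(cluster = fit$cluster, label = labels)
}

# Stand-in for `cuda_ml_tsne(iris[1:4])`: first two principal components.
embedding <- unname(prcomp(iris[1:4])$x[, 1:2])

set.seed(0L)
tab <- cluster_vs_labels(embedding, iris$Species)
print(tab)
```

With a real t-SNE embedding, the same call would take the matrix returned by `cuda_ml_tsne()` in place of the PCA stand-in. The shape check stops early if the matrix does not have `n_components` columns.
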