Run the DBSCAN (Density-based spatial clustering of applications with noise) clustering algorithm.
cuda_ml_dbscan(
  x,
  min_pts,
  eps,
  cuML_log_level = c("off", "critical", "error", "warn", "info", "debug", "trace")
)
Argument | Description |
---|---|
x | The input matrix or data frame. Each row is one data point and must contain numeric values only. |
min_pts, eps | A point `p` is a core point if at least `min_pts` points lie within distance `eps` of it. |
cuML_log_level | Log level within cuML library functions. Must be one of "off", "critical", "error", "warn", "info", "debug", "trace". Default: "off". |
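The core-point rule for `min_pts` and `eps` can be illustrated on the CPU with base R alone. This is only a sketch of the definition, not the cuML implementation: the helper `is_core_point` is hypothetical, and it assumes Euclidean distance and that a point counts as its own neighbor, neither of which is stated on this page.

```r
# Hypothetical helper (not part of cuda.ml): flag the rows of x that have
# at least min_pts points within distance eps.
# Assumptions: Euclidean distance; the point itself counts toward min_pts.
is_core_point <- function(x, min_pts, eps) {
  d <- as.matrix(dist(x))               # pairwise Euclidean distances
  neighbor_counts <- rowSums(d <= eps)  # includes the point itself (distance 0)
  neighbor_counts >= min_pts
}

x <- rbind(
  c(0, 0), c(0, 1), c(1, 0), c(1, 1),  # four points on a unit square
  c(10, 10)                            # one isolated point
)

core <- is_core_point(x, min_pts = 4, eps = 1.5)
print(core)
```

Under these assumptions the four points of the square are core points and the isolated point is not. Raising `min_pts` or lowering `eps` makes the criterion stricter.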
A list containing the cluster assignments of all data points. A data point that does not belong to any cluster (i.e., "noise") will have NA as its cluster assignment.
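Because noise is encoded as NA, the assignments can be summarised with ordinary base R tools. The sketch below uses a made-up `labels` vector standing in for the `labels` element of the returned list; the values are hypothetical and are not output from `cuda_ml_dbscan()`.

```r
# Hypothetical cluster assignments; NA marks noise points.
labels <- c(1L, 1L, NA, 2L, 2L, NA)

# Row indices of the points classified as noise.
noise_idx <- which(is.na(labels))

# Cluster sizes, with noise reported as its own <NA> group.
cluster_sizes <- table(labels, useNA = "ifany")

print(noise_idx)
print(cluster_sizes)
```

Note that `table()` drops NA values unless `useNA` is set, so the noise points would otherwise be silently omitted from the counts.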
library(cuda.ml)
library(magrittr)

gen_pts <- function() {
  centroids <- list(c(1000, 1000), c(-1000, -1000), c(-1000, 1000))
  pts <- centroids %>%
    purrr::map(~ MASS::mvrnorm(10, mu = .x, Sigma = diag(2)))

  rlang::exec(rbind, !!!pts)
}

m <- gen_pts()
clusters <- cuda_ml_dbscan(m, min_pts = 5, eps = 3)

print(clusters)
#> $labels
#> logical(0)
#>