Man page - mlpack_dbscan(1)
apt-get install mlpack-bin
NAME
mlpack_dbscan - DBSCAN clustering
SYNOPSIS
mlpack_dbscan -i unknown [ -e double ] [ -m int ] [ -N bool ] [ -s string ] [ -S bool ] [ -t string ] [ -V bool ] [ -a unknown ] [ -C unknown ] [ -h -v ]
DESCRIPTION
This program implements the DBSCAN clustering algorithm using accelerated tree-based range search. The type of tree used may be selected, or brute-force range search may be used instead.
The input dataset to be clustered may be specified with the ’--input_file (-i)’ parameter, the radius of each range search with the ’--epsilon (-e)’ parameter, and the minimum number of points in a cluster with the ’--min_size (-m)’ parameter.
The ’--assignments_file (-a)’ and ’--centroids_file (-C)’ output parameters may be used to save the output of the clustering. ’--assignments_file (-a)’ contains the cluster assignment of each point, and ’--centroids_file (-C)’ contains the centroid of each cluster.
The range search may be controlled with the ’--tree_type (-t)’, ’--single_mode (-S)’, and ’--naive (-N)’ parameters. ’--tree_type (-t)’ selects the type of tree used for range search and accepts one of the following values: ’kd’, ’r’, ’r-star’, ’x’, ’hilbert-r’, ’r-plus’, ’r-plus-plus’, ’cover’, ’ball’. The ’--single_mode (-S)’ parameter forces single-tree search (as opposed to the default dual-tree search), and ’--naive (-N)’ forces brute-force range search.
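The search variants described above could be invoked as in the following sketch. It uses only options listed on this page. The file name input.csv, the DRY_RUN switch, and the dbscan helper function are illustrative assumptions, and the boolean options are assumed to be given as bare flags. By default the script only prints the commands; setting DRY_RUN=0 would execute them.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set in the environment.
# "input.csv" is a placeholder dataset name, not a file shipped with mlpack.
DRY_RUN=${DRY_RUN:-1}
INPUT=${INPUT:-input.csv}

# Hypothetical helper: run mlpack_dbscan, or just print the command line.
dbscan() {
  if [ "$DRY_RUN" = "0" ]; then
    mlpack_dbscan "$@"
  else
    echo "mlpack_dbscan $*"
  fi
}

# Default search: dual-tree range search on a kd-tree.
dbscan -i "$INPUT" -e 0.5 -m 5

# Dual-tree range search on a ball tree.
dbscan -i "$INPUT" -e 0.5 -m 5 --tree_type ball

# Single-tree range search on an R*-tree.
dbscan -i "$INPUT" -e 0.5 -m 5 --tree_type r-star --single_mode

# Brute-force range search; no tree is built.
dbscan -i "$INPUT" -e 0.5 -m 5 --naive
```

Running the same data through several variants with --verbose (-v) is one way to compare their timers, since that option prints the timers at the end of execution.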
For example, to run DBSCAN on the dataset in ’input.csv’ with a radius of 0.5 and a minimum cluster size of 5:
$ mlpack_dbscan --input_file input.csv --epsilon 0.5 --min_size 5
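The example above does not save any results. A minimal sketch of how the output parameters might be combined with a small sweep over the radius is shown here. The epsilon values, the output file names, the DRY_RUN switch, and the run and sweep_epsilon helpers are all assumptions made for illustration; only the mlpack_dbscan options themselves come from this page. The script prints the commands by default and would execute them only with DRY_RUN=0.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set in the environment.
# File names and epsilon values are placeholders, not recommendations.
DRY_RUN=${DRY_RUN:-1}
INPUT=${INPUT:-input.csv}

# Hypothetical helper: execute a command, or just print it in dry-run mode.
run() {
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Cluster the same dataset once per radius, saving the assignments and
# centroids of each run to separate files named after the radius.
sweep_epsilon() {
  for eps in "$@"; do
    run mlpack_dbscan --input_file "$INPUT" \
        --epsilon "$eps" --min_size 5 \
        --assignments_file "assignments_${eps}.csv" \
        --centroids_file "centroids_${eps}.csv"
  done
}

sweep_epsilon 0.25 0.5 1.0
```

Keeping one pair of output files per radius makes it possible to compare the resulting clusterings afterwards without rerunning the program.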
REQUIRED INPUT OPTIONS
--input_file (-i) [ unknown ]
Input dataset to cluster.
OPTIONAL INPUT OPTIONS
--epsilon (-e) [ double ]
Radius of each range search. Default value 1.
--help (-h) [ bool ]
Default help info.
--info [ string ]
Print help on a specific option. Default value ’’.
--min_size (-m) [ int ]
Minimum number of points for a cluster. Default value 5.
--naive (-N) [ bool ]
If set, brute-force range search (not tree-based) will be used.
--selection_type (-s) [ string ]
If using point selection policy, the type of selection to use (’ordered’, ’random’). Default value ’ordered’.
--single_mode (-S) [ bool ]
If set, single-tree range search (not dual-tree) will be used.
--tree_type (-t) [ string ]
If using single-tree or dual-tree search, the type of tree to use (’kd’, ’r’, ’r-star’, ’x’, ’hilbert-r’, ’r-plus’, ’r-plus-plus’, ’cover’, ’ball’). Default value ’kd’.
--verbose (-v) [ bool ]
Display informational messages and the full list of parameters and timers at the end of execution.
--version (-V) [ bool ]
Display the version of mlpack.
OPTIONAL OUTPUT OPTIONS
--assignments_file (-a) [ unknown ]
Output matrix for assignments of each point.
--centroids_file (-C) [ unknown ]
Matrix to save output centroids to.
ADDITIONAL INFORMATION
For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.