Man page - mlpack_decision_tree(1)
apt-get install mlpack-bin
Manual
mlpack_decision_tree
NAME
SYNOPSIS
DESCRIPTION
OPTIONAL INPUT OPTIONS
OPTIONAL OUTPUT OPTIONS
ADDITIONAL INFORMATION
NAME
mlpack_decision_tree - decision tree
SYNOPSIS
mlpack_decision_tree [ -m unknown ] [ -l unknown ] [ -D int ] [ -g double ] [ -n int ] [ -a bool ] [ -T string ] [ -L unknown ] [ -t string ] [ -V bool ] [ -w unknown ] [ -M unknown ] [ -p unknown ] [ -P unknown ] [ -h -v ]
DESCRIPTION
Train and evaluate using a decision tree. Given a dataset containing numeric or categorical features, and associated labels for each point in the dataset, this program can train a decision tree on that data.
The training set and associated labels are specified with the '--training_file (-t)' and '--labels_file (-l)' parameters, respectively. The labels should be in the range '[0, num_classes - 1]'. If '--labels_file (-l)' is not specified, the labels are assumed to be the last dimension of the training dataset.
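Labels that do not already lie in '[0, num_classes - 1]' need to be remapped before training. The following is a minimal sketch of one way to do that with standard tools; the file names and the sample label values are made up for illustration, and the script assumes one label per line.

```shell
#!/bin/sh
# Hypothetical sample data: raw labels 3, 7 and 9, one per line.
label_dir=$(mktemp -d)
printf '%s\n' 3 7 3 9 7 > "$label_dir/raw_labels.csv"

# List the distinct labels in ascending numeric order.
sort -n -u "$label_dir/raw_labels.csv" > "$label_dir/classes.txt"

# First file: assign each distinct label an index starting at 0.
# Second file: print the index of each raw label.
awk 'NR == FNR { id[$1] = NR - 1; next } { print id[$1] }' \
    "$label_dir/classes.txt" "$label_dir/raw_labels.csv" \
    > "$label_dir/labels.csv"

num_classes=$(wc -l < "$label_dir/classes.txt" | tr -d ' ')
remapped=$(paste -s -d, "$label_dir/labels.csv")
echo "num_classes: $num_classes"
echo "remapped labels: $remapped"
```

The resulting 'labels.csv' could then be passed with '--labels_file (-l)'. Keeping 'classes.txt' makes it possible to translate predicted indices back to the original label values.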
When a model is trained, the '--output_model_file (-M)' output parameter may be used to save the trained model. A model may be loaded for predictions with the '--input_model_file (-m)' parameter. The '--input_model_file (-m)' parameter may not be specified when the '--training_file (-t)' parameter is specified. The '--minimum_leaf_size (-n)' parameter specifies the minimum number of training points that must fall into each leaf for it to be split. The '--minimum_gain_split (-g)' parameter specifies the minimum gain that is needed for the node to split. The '--maximum_depth (-D)' parameter specifies the maximum depth of the tree. If '--print_training_accuracy (-a)' is specified, the training accuracy will be printed.
Test data may be specified with the '--test_file (-T)' parameter, and if performance numbers are desired for that test set, labels may be specified with the '--test_labels_file (-L)' parameter. Predictions for each test point may be saved via the '--predictions_file (-p)' output parameter. Class probabilities for each prediction may be saved with the '--probabilities_file (-P)' output parameter.
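As an illustration of how a saved probabilities file might be post-processed, the sketch below picks the most likely class and its probability for each test point. It assumes, without confirmation from this manual, that the file is a CSV with one test point per line and one column per class; the file name and the values are invented for the example.

```shell
#!/bin/sh
# Hypothetical probabilities for two test points and three classes.
# Assumed layout (not confirmed by this manual): one row per point,
# one comma-separated column per class.
prob_dir=$(mktemp -d)
printf '%s\n' '0.7,0.2,0.1' '0.1,0.3,0.6' > "$prob_dir/probabilities.csv"

# For each row, find the column with the largest value and print the
# zero-based class index followed by that probability.
top_classes=$(awk -F, '{
    best = 1
    for (i = 2; i <= NF; i++)
        if ($i > $best) best = i
    printf "%d,%s\n", best - 1, $best
}' "$prob_dir/probabilities.csv")
echo "$top_classes"
```

If the file turns out to be laid out the other way round (one row per class), the same loop would need to run over columns instead of rows.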
For example, to train a decision tree with a minimum leaf size of 20 and a minimum gain of 0.001 on the dataset contained in 'data.arff' with labels 'labels.csv', saving the output model to 'tree.bin' and printing the training accuracy, one could call
$ mlpack_decision_tree --training_file data.arff --labels_file labels.csv --output_model_file tree.bin --minimum_leaf_size 20 --minimum_gain_split 0.001 --print_training_accuracy
Then, to use that model to classify the points in 'test_set.arff' and print the test accuracy given the labels 'test_labels.csv', while saving the predictions for each point to 'predictions.csv', one could call
$ mlpack_decision_tree --input_model_file tree.bin --test_file test_set.arff --test_labels_file test_labels.csv --predictions_file predictions.csv
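Once predictions have been saved, the accuracy can also be checked independently of the program. The sketch below compares a predictions file against a labels file line by line; it assumes both files hold one integer label per line, and the sample values are invented so the script runs without mlpack installed.

```shell
#!/bin/sh
# Hypothetical stand-ins for test_labels.csv and predictions.csv.
eval_dir=$(mktemp -d)
printf '%s\n' 0 1 1 2 > "$eval_dir/test_labels.csv"
printf '%s\n' 0 1 2 2 > "$eval_dir/predictions.csv"

# Join the two files side by side, then count the matching rows.
accuracy=$(paste -d, "$eval_dir/predictions.csv" "$eval_dir/test_labels.csv" |
    awk -F, '{ total++; if ($1 == $2) correct++ }
             END { printf "%.2f", correct / total }')
echo "accuracy: $accuracy"
```

Here three of the four sample predictions match their labels, so the script reports 0.75. The check assumes the two files have the same number of lines.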
OPTIONAL INPUT OPTIONS
--help (-h) [ bool ]
Default help info.
--info [ string ]
Print help on a specific option. Default value ''.
--input_model_file (-m) [ unknown ]
Pre-trained decision tree, to be used with test points.
--labels_file (-l) [ unknown ]
Training labels.
--maximum_depth (-D) [ int ]
Maximum depth of the tree (0 means no limit). Default value 0.
--minimum_gain_split (-g) [ double ]
Minimum gain for node splitting. Default value 1e-07.
--minimum_leaf_size (-n) [ int ]
Minimum number of points in a leaf. Default value 20.
--print_training_accuracy (-a) [ bool ]
Print the training accuracy.
--test_file (-T) [ string ]
Testing dataset (may be categorical).
--test_labels_file (-L) [ unknown ]
Test point labels, if accuracy calculation is desired.
--training_file (-t) [ string ]
Training dataset (may be categorical).
--verbose (-v) [ bool ]
Display informational messages and the full list of parameters and timers at the end of execution.
--version (-V) [ bool ]
Display the version of mlpack.
--weights_file (-w) [ unknown ]
Weights for the training labels.
OPTIONAL OUTPUT OPTIONS
--output_model_file (-M) [ unknown ]
Output for trained decision tree.
--predictions_file (-p) [ unknown ]
Class predictions for each test point.
--probabilities_file (-P) [ unknown ]
Class probabilities for each test point.
ADDITIONAL INFORMATION
For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.