mlpack_hoeffding_tree(1) | User Commands | mlpack_hoeffding_tree(1) |
mlpack_hoeffding_tree - Hoeffding trees
mlpack_hoeffding_tree [-b bool] [-B int] [-c double] [-m unknown] [-l string] [-n int] [-I int] [-N string] [-o int] [-s int] [-T string] [-L string] [-t string] [-V bool] [-M unknown] [-p string] [-P string] [-h -v]
This program implements Hoeffding trees, a form of streaming decision tree best suited to large (or streaming) datasets. It supports both categorical and numeric data. Given an input dataset, the program can train the tree with numerous training options and save the model to a file. It can also use a trained model, or a model loaded from file, to predict classes for a given test set.
The training file and associated labels are specified with the '--training_file (-t)' and '--labels_file (-l)' parameters, respectively. If '--labels_file (-l)' is not specified, the labels are assumed to be the last dimension of the training dataset.
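The two label layouts can be related with a small sketch. It assumes a comma-separated file with one point per row and the label in the last column, which is an assumption about the file layout; the file names 'points.csv', 'features.csv', 'labels.csv', 'tree_a.bin' and 'tree_b.bin' are hypothetical. The mlpack commands are run only if the program is installed.

```shell
# Hypothetical toy dataset: two numeric features, then the class label.
cat > points.csv <<'EOF'
1.0,2.0,0
1.5,1.8,0
5.0,8.0,1
6.0,9.0,1
EOF

# Count the columns, then split the label (last column) from the features.
ncol=$(head -n 1 points.csv | awk -F, '{print NF}')
cut -d, -f1-"$((ncol - 1))" points.csv > features.csv
cut -d, -f"$ncol" points.csv > labels.csv

if command -v mlpack_hoeffding_tree >/dev/null 2>&1; then
  # Labels are taken from the last dimension of the training file.
  mlpack_hoeffding_tree --training_file points.csv --output_model_file tree_a.bin
  # Labels are supplied in a separate file.
  mlpack_hoeffding_tree --training_file features.csv --labels_file labels.csv --output_model_file tree_b.bin
fi
```

Both invocations describe the same training set; only the place where the labels are read from differs.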
The training may be performed in batch mode (like a typical decision tree algorithm) by specifying the '--batch_mode (-b)' option, but this may not be the best option for large datasets.
When a model is trained, it may be saved via the '--output_model_file (-M)' output parameter. A model may be loaded from file for further training or testing with the '--input_model_file (-m)' parameter.
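Because a saved model can be loaded for further training, a large dataset can be fed to the tree in pieces. The sketch below assumes the data has already been split into the hypothetical files 'chunk1.csv' and 'chunk2.csv'. The 'run' helper is an illustration and not part of mlpack; it only prints the command when the program is not installed.

```shell
# Helper (not part of mlpack): run a command if it is installed,
# otherwise print what would have been run.
run() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    echo "would run: $*"
  fi
}

# Train on the first chunk and save the tree.
run mlpack_hoeffding_tree --training_file chunk1.csv --output_model_file tree.bin

# Load the saved tree, continue training on the second chunk,
# and save the updated tree under a new (hypothetical) name.
run mlpack_hoeffding_tree --input_model_file tree.bin --training_file chunk2.csv --output_model_file tree_updated.bin
```

Writing the updated tree to a new file keeps the earlier model available for comparison.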
Test data may be specified with the '--test_file (-T)' parameter, and if performance statistics are desired for that test set, labels may be specified with the '--test_labels_file (-L)' parameter. Predictions for each test point may be saved with the '--predictions_file (-p)' output parameter, and class probabilities for each prediction may be saved with the '--probabilities_file (-P)' output parameter.
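When statistics are wanted outside the program, saved predictions can be compared against the true labels by hand. This sketch assumes both files hold one integer label per line in the same order, which is an assumption about the saved format rather than something stated above. The sample files and their names are made up for illustration.

```shell
# Hypothetical predictions and true labels, one label per line.
printf '0\n1\n1\n0\n' > sample_predictions.csv
printf '0\n1\n0\n0\n' > sample_test_labels.csv

# Pair the files line by line and count the matching labels,
# reporting the result as a whole-number percentage.
accuracy=$(paste -d, sample_predictions.csv sample_test_labels.csv |
  awk -F, '$1 == $2 { correct++ } END { printf "%d", 100 * correct / NR }')
echo "accuracy: ${accuracy}%"
```

In this toy data three of the four predictions match, so the reported accuracy is 75%.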
For example, to train a Hoeffding tree with confidence 0.99 on the data 'dataset.arff', saving the trained tree to 'tree.bin', the following command may be used:
$ mlpack_hoeffding_tree --training_file dataset.arff --confidence 0.99 --output_model_file tree.bin
Then, this tree may be used to make predictions on the test set 'test_set.arff', saving the predictions into 'predictions.csv' and the class probabilities into 'class_probs.csv' with the following command:
$ mlpack_hoeffding_tree --input_model_file tree.bin --test_file test_set.arff --predictions_file predictions.csv --probabilities_file class_probs.csv
For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.
18 November 2018 | mlpack-3.0.4 |