mlpack_preprocess_scale(1) User Commands mlpack_preprocess_scale(1)

mlpack_preprocess_scale - scale data


mlpack_preprocess_scale -i string [-r double] [-m unknown] [-f bool] [-e int] [-b int] [-a string] [-s int] [-V bool] [-o string] [-M unknown] [-h -v]

This utility takes a dataset and performs feature scaling using one of six scaler methods: 'max_abs_scaler', 'mean_normalization', 'min_max_scaler', 'standard_scaler', 'pca_whitening' and 'zca_whitening'. It takes a matrix via the '--input_file (-i)' parameter and a scaling method via the '--scaler_method (-a)' parameter (the default is 'standard_scaler'), and outputs a matrix with scaled features.

The scaled feature matrix may be saved with the '--output_file (-o)' output parameter.

The model used to scale features can be saved using '--output_model_file (-M)' and later loaded back using '--input_model_file (-m)'.
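The save-and-reload workflow described above might look like the following sketch. It only uses options named on this page; the file names 'train.csv', 'test.csv' and 'scaler.bin' and the sample values are hypothetical, and the sketch assumes that a model loaded with '--input_model_file' can be applied to an input file other than the one it was fitted on.

```shell
#!/bin/sh
# Sketch: fit a scaler on one dataset, save it, and reuse it on another.
# All file names and sample values below are hypothetical.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Two tiny example datasets with the same two features.
printf '1,10\n2,20\n3,30\n' > train.csv
printf '4,40\n5,50\n' > test.csv

if command -v mlpack_preprocess_scale >/dev/null 2>&1; then
  # Fit the scaler on train.csv and save the model.
  mlpack_preprocess_scale --input_file train.csv \
    --output_file train_scaled.csv \
    --scaler_method standard_scaler \
    --output_model_file scaler.bin

  # Apply the saved model to test.csv (assumed usage, see the note above).
  mlpack_preprocess_scale --input_file test.csv \
    --output_file test_scaled.csv \
    --input_model_file scaler.bin
else
  echo "mlpack_preprocess_scale not found; skipping" >&2
fi
```

Reusing one saved model keeps both files on the same scale, which refitting the scaler on each file separately would not guarantee.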

For example, to scale the dataset 'X.csv' into 'X_scaled.csv' using standard_scaler as the scaler method, run:

$ mlpack_preprocess_scale --input_file X.csv --output_file X_scaled.csv --scaler_method standard_scaler

To whiten the dataset 'X.csv' into 'X_whitened.csv' using PCA whitening with a regularization parameter of 0.01, run:

$ mlpack_preprocess_scale --input_file X.csv --output_file X_whitened.csv --scaler_method pca_whitening --epsilon 0.01

You can also transform a scaled dataset back to its original scale using '--inverse_scaling (-f)'. For example, to rescale 'X_scaled.csv' back into 'X.csv' using a saved model loaded with '--input_model_file (-m)', run:

$ mlpack_preprocess_scale --input_file X_scaled.csv --output_file X.csv --inverse_scaling --input_model_file saved.bin
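The inverse-scaling command above expects a model file such as 'saved.bin' to exist already. A minimal round-trip sketch, using only options listed on this page, could first produce that file with '--output_model_file' and then invert the scaling. The sample values and the output name 'X_restored.csv' are hypothetical; writing to a separate file rather than back to 'X.csv' is a choice made here so the original and the restored data can be compared.

```shell
#!/bin/sh
# Sketch: scale a dataset, save the model, then invert the scaling.
# Sample values and the name X_restored.csv are hypothetical.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf '1,10\n2,20\n3,30\n' > X.csv

if command -v mlpack_preprocess_scale >/dev/null 2>&1; then
  # Step 1: scale X.csv and save the fitted model as saved.bin.
  mlpack_preprocess_scale --input_file X.csv \
    --output_file X_scaled.csv \
    --scaler_method standard_scaler \
    --output_model_file saved.bin

  # Step 2: undo the scaling with the saved model.
  mlpack_preprocess_scale --input_file X_scaled.csv \
    --output_file X_restored.csv \
    --inverse_scaling \
    --input_model_file saved.bin

  # Print original and restored rows side by side for a visual check.
  paste -d ' ' X.csv X_restored.csv
else
  echo "mlpack_preprocess_scale not found; skipping" >&2
fi
```

The restored values should match the originals up to floating-point rounding and output formatting.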

As another example, to scale the dataset 'X.csv' into 'X_scaled.csv' using min_max_scaler with a scaling range of 1 to 3 instead of the default 0 to 1, run:

$ mlpack_preprocess_scale --input_file X.csv --output_file X_scaled.csv --scaler_method min_max_scaler --min_value 1 --max_value 3
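To make the effect of '--min_value' and '--max_value' concrete, the sketch below rescales a single column of numbers with awk using the textbook min-max formula, new = lo + (x - min) * (hi - lo) / (max - min). This formula is an assumption about what min_max_scaler computes for each feature, not something taken from the mlpack source, and the helper name 'rescale_column' is hypothetical.

```shell
#!/bin/sh
# Sketch: textbook min-max rescaling of one numeric column into [lo, hi].
# rescale_column is a hypothetical helper, not part of mlpack.
rescale_column() {
  awk -v lo="$1" -v hi="$2" '
    {
      x[NR] = $1
      if (NR == 1 || $1 < min) min = $1   # smallest value seen so far
      if (NR == 1 || $1 > max) max = $1   # largest value seen so far
    }
    END {
      # Assumes max != min; a constant column would divide by zero.
      for (i = 1; i <= NR; i++)
        printf "%g\n", lo + (x[i] - min) * (hi - lo) / (max - min)
    }'
}

# The values 2, 4, 6 rescaled into the range 1 to 3 print 1, 2, 3.
printf '2\n4\n6\n' | rescale_column 1 3
```

With the range 1 to 3, the smallest input maps to 1, the largest to 3, and values in between are placed proportionally.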

--input_file (-i)  Matrix containing data.

--epsilon (-r)  Regularization parameter for pca_whitening or zca_whitening; should be between -1 and 1. Default value 1e-06.
--help (-h)  Default help info.
--info  Print help on a specific option. Default value ''.
--input_model_file (-m)  Input scaling model.
--inverse_scaling (-f)  Inverse scaling to get the original dataset.
--max_value (-e)  Ending value of range for min_max_scaler. Default value 1.
--min_value (-b)  Starting value of range for min_max_scaler. Default value 0.
--scaler_method (-a)  Method to use for scaling. Default value 'standard_scaler'.
--seed (-s)  Random seed (0 for std::time(NULL)). Default value 0.
--verbose (-V)  Display informational messages and the full list of parameters and timers at the end of execution.
--version (-v)  Display the version of mlpack.

--output_file (-o)  Matrix to save scaled data to.
--output_model_file (-M)  Output scaling model.

For further information, including relevant papers, citations, and theory, consult the documentation found at http://www.mlpack.org or included with your distribution of mlpack.

12 December 2020 mlpack-3.4.2