
NAME

pksvm - classify raster image using Support Vector Machine

SYNOPSIS

pksvm
-t training [-i input] [-o output] [-cv value] [options] [advanced options]

DESCRIPTION

pksvm implements a support vector machine (SVM) to solve a supervised classification problem. The implementation is based on the open source C++ library libSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm). Both raster and vector files are supported as input. The output contains the classification result in raster or vector format, matching the format of the input. A training sample must be provided as an OGR vector dataset that contains the class labels and the features for each training point. The point locations are not considered in the training step. You can use the same training sample to classify different images, provided the images have the same number of bands. Use the utility pkextract to create a suitable training sample, based on a sample of points or polygons. For raster output maps you can attach a color table using the option -ct.

OPTIONS

-t filename, --training filename: Training vector file. A single vector file contains all training features (must be set as: b0, b1, b2,...) for all classes (class numbers identified by label option). Use multiple training files for bootstrap aggregation (alternative to the --bag and --bagsize options, where a random subset is taken from a single training file).
-i filename, --input filename: Input image
-o filename, --output filename: Output classification image
-cv value, --cv value: N-fold cross validation mode (default: 0)
-tln layer, --tln layer: Training layer name(s)
-c name, --class name: List of class names.
-r value, --reclass value: List of class values (use same order as in --class option).
-of GDALformat, --oformat GDALformat: Output image format (see also gdal_translate(1)).
-f format, --f format: Output ogr format for active training sample
-co NAME=VALUE, --co NAME=VALUE: Creation option for output file. Multiple options can be specified.
-ct filename, --ct filename: Color table in ASCII format having 5 columns: id R G B ALPHA (0: transparent, 255: solid)
-label attribute, --label attribute: Identifier for class label in training vector file. (default: label)
-prior value, --prior value: Prior probabilities for each class (e.g., -prior 0.3 -prior 0.3 -prior 0.2). Used for input only (ignored for cross validation).
-g gamma, --gamma gamma: Gamma in kernel function
-cc cost, --ccost cost: The parameter C of C_SVC, epsilon_SVR, and nu_SVR
-m filename, --mask filename: Only classify within specified mask (vector or raster). For raster mask, set nodata values with the option --msknodata.
-msknodata value, --msknodata value: Mask value(s) not to consider for classification. Values will be taken over in classification image.
-nodata value, --nodata value: Nodata value to put where image is masked as nodata
-v level, --verbose level: Verbose level

Advanced options

-b band, --band band: Band index (starting from 0, either use --band option or use --startband to --endband)
-sband band, --startband band: Start band sequence number
-eband band, --endband band: End band sequence number
-bal size, --balance size: Balance the input data to this number of samples for each class
-min number, --min number: If the number of training pixels is less than min, do not take this class into account (0: consider all classes)
-bag value, --bag value: Number of bootstrap aggregations (default is no bagging: 1)
-bagsize value, --bagsize value: Percentage of features used from available training features for each bootstrap aggregation (one size for all classes, or a different size for each class respectively)
-comb rule, --comb rule: How to combine bootstrap aggregation classifiers (0: sum rule, 1: product rule, 2: max rule). Also used to aggregate classes with rc option.
-cb filename, --classbag filename: Output for each individual bootstrap aggregation
-prob filename, --prob filename: Probability image.
-offset value, --offset value: Offset value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band]
-scale value, --scale value: Scale value for each spectral band input features: refl[band]=(DN[band]-offset[band])/scale[band] (use 0 to scale the min and max in each band to -1.0 and 1.0)
-svmt type, --svmtype type: Type of SVM (C_SVC, nu_SVC, one_class, epsilon_SVR, nu_SVR)
-kt type, --kerneltype type: Type of kernel function (linear, polynomial, radial, sigmoid)
-kd value, --kd value: Degree in kernel function
-c0 value, --coef0 value: Coef0 in kernel function
-nu value, --nu value: The parameter nu of nu-SVC, one-class SVM, and nu-SVR
-eloss value, --eloss value: The epsilon in loss function of epsilon-SVR
-cache number, --cache number: Cache memory size in MB (default: 100)
-etol value, --etol value: Tolerance of the termination criterion (default: 0.001)
-shrink, --shrink: Whether to use the shrinking heuristics
-na number, --nactive number: Number of active training points
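
The rescaling formula given for --offset and --scale can be illustrated with a short Python sketch. It is not taken from the pktools source: the function name rescale and the sample digital numbers are hypothetical, and only the case of an explicit, non-zero scale is shown, since this page does not spell out the formula applied when scale is 0.

```python
def rescale(dn, offset, scale):
    """Apply refl[band] = (DN[band] - offset[band]) / scale[band] per band."""
    if not (len(dn) == len(offset) == len(scale)):
        raise ValueError("dn, offset and scale need one entry per band")
    if any(s == 0 for s in scale):
        raise ValueError("scale 0 (automatic scaling) is not covered by this sketch")
    return [(d - o) / s for d, o, s in zip(dn, offset, scale)]


# Hypothetical 3-band pixel stored as digital numbers (DN).
pixel = [1200, 3400, 10000]
refl = rescale(pixel, offset=[0, 0, 0], scale=[10000, 10000, 10000])
print(refl)
```

The same offset and scale would have to be applied to the training features and to the image for the classifier to see comparable values.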

EXAMPLE

Classify input image input.tif with a support vector machine. The training sample is provided as an OGR vector dataset. It contains all features (same dimensionality as input.tif) in its fields (please check pkextract(1) on how to obtain such a file from a "clean" vector file containing locations only). A two-fold cross validation (cv) is performed (output on screen). The parameters cost and gamma of the support vector machine are set to 1000 and 0.1 respectively. A colourtable (a five column text file: image value, RED, GREEN, BLUE, ALPHA) has also been provided.

pksvm -i input.tif -t training.sqlite -o output.tif -cv 2 -ct colourtable.txt -cc 1000 -g 0.1
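
The five column colour table mentioned above is a plain text file, so it can be generated with a few lines of Python. This is a minimal sketch: the class ids and colours are made up, the helper name colour_table_lines is hypothetical, and separating the columns with single spaces is an assumption, since this page only states that the file has five columns.

```python
def colour_table_lines(entries):
    """Format (id, R, G, B, ALPHA) tuples as one space-separated row each."""
    lines = []
    for row in entries:
        if len(row) != 5:
            raise ValueError("each entry needs 5 values: id R G B ALPHA")
        if any(not 0 <= value <= 255 for value in row[1:]):
            raise ValueError("R, G, B and ALPHA must be in the range 0..255")
        lines.append(" ".join(str(value) for value in row))
    return lines


# Hypothetical classes: 0 = transparent background, 1 = green, 2 = blue.
entries = [(0, 0, 0, 0, 0), (1, 0, 128, 0, 255), (2, 0, 0, 255, 255)]
table_text = "\n".join(colour_table_lines(entries)) + "\n"
print(table_text, end="")
```

Writing table_text to a file such as colourtable.txt gives a file in the layout described for the -ct option.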

Classification using bootstrap aggregation. The training sample is randomly split into three subsamples (33% of the original sample each).

pksvm -i input.tif -t training.sqlite -o output.tif -bagsize 33 -bag 3
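
With bootstrap aggregation, each subsample yields its own classifier, and the --comb option selects how their outputs are merged (0: sum rule, 1: product rule, 2: max rule). The Python sketch below illustrates these three rules on made-up per-class probabilities. It is an illustration of the general technique only; how pksvm applies the rules internally is not described on this page, and the names combine and winner are hypothetical.

```python
import math


def combine(prob_per_bag, rule=0):
    """Merge per-class probabilities from several classifiers.

    rule follows the numbering of --comb: 0 = sum, 1 = product, 2 = max.
    """
    n_class = len(prob_per_bag[0])
    columns = [[probs[c] for probs in prob_per_bag] for c in range(n_class)]
    if rule == 0:
        return [sum(col) for col in columns]
    if rule == 1:
        return [math.prod(col) for col in columns]
    if rule == 2:
        return [max(col) for col in columns]
    raise ValueError("rule must be 0, 1 or 2")


def winner(scores):
    """Index of the class with the highest combined score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up probabilities for one pixel: 3 classifiers (rows), 3 classes (columns).
bags = [[0.6, 0.4, 0.0],
        [0.1, 0.5, 0.4],
        [0.5, 0.2, 0.3]]
for rule in (0, 1, 2):
    print(rule, winner(combine(bags, rule)))
```

In this example the sum and max rules pick class 0, while the product rule picks class 1, because a single low probability pulls the product down much more strongly than the sum.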

Classification using prior probabilities for each class. The priors are automatically normalized. The order in which the -prior options are provided should respect the alphanumeric order of the class names (class 10 comes before 2...).

pksvm -i input.tif -t training.sqlite -o output.tif -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 1 -prior 0.2 -prior 1 -prior 1 -prior 1
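
The ordering rule is easy to get wrong, so the following Python sketch shows which class each prior would be paired with and what the values look like after normalization. It assumes that "alphanumeric order" means a plain string sort of the class names and that normalization means dividing by the sum; both are interpretations of the text above, not statements about the pksvm source. The helper name priors_by_class and the class names are hypothetical.

```python
def priors_by_class(class_names, priors):
    """Pair priors with class names sorted as strings, then normalize."""
    ordered = sorted(str(name) for name in class_names)  # "10" sorts before "2"
    if len(ordered) != len(priors):
        raise ValueError("need exactly one prior per class")
    total = sum(priors)
    if total <= 0:
        raise ValueError("priors must sum to a positive value")
    return {name: p / total for name, p in zip(ordered, priors)}


# Hypothetical classes 1, 2 and 10 with priors given in command-line order.
result = priors_by_class([1, 2, 10], [1, 1, 2])
print(result)
```

Here the third prior ends up with class 2, not class 10, because the string "10" sorts before "2".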

01 December 2022