i.cluster(1grass) | GRASS GIS User's Manual | i.cluster(1grass) |
i.cluster - Generates spectral signatures for land
cover types in an image using a clustering algorithm.
The resulting signature file is used as input for i.maxlik, to generate an
unsupervised image classification.
imagery, classification, signatures
i.cluster
i.cluster --help
i.cluster group=name subgroup=name
signaturefile=name classes=integer
[seed=name] [sample=rows,cols]
[iterations=integer] [convergence=float]
[separation=float] [min_size=integer]
[reportfile=name] [--overwrite] [--help]
[--verbose] [--quiet] [--ui]
i.cluster performs the first pass in the two-pass unsupervised classification of imagery, while the GRASS module i.maxlik executes the second pass. Both commands must be run to complete the unsupervised classification.
i.cluster is a clustering algorithm (a modification of the k-means clustering algorithm) that reads through the (raster) imagery data and builds pixel clusters based on the spectral reflectances of the pixels (see Figure). The pixel clusters are imagery categories that can be related to land cover types on the ground. The spectral distributions of the clusters (i.e., the land cover spectral signatures) are influenced by six parameters set by the user. The first of these is the initial number of clusters to be discriminated.
Fig.: Land use/land cover clustering of LANDSAT scene (simplified)
i.cluster starts by generating spectral signatures for this number of clusters and "attempts" to end up with this number of clusters during the clustering process. The resulting number of clusters and their spectral distributions, however, are also influenced by the range of the spectral values (category values) in the image files and the other parameters set by the user. These parameters are: the minimum cluster size, minimum cluster separation, the percent convergence, the maximum number of iterations, and the row and column sampling intervals.
The resulting cluster spectral signatures are composed of cluster means and covariance matrices. These cluster means and covariance matrices are used in the second pass (i.maxlik) to classify the image. The resulting clusters, or spectral classes, can be related to land cover types on the ground.
The user has to specify the name of the group, the name of the subgroup, the name of a file to contain the resulting signatures, the initial number of clusters to be discriminated, and optionally other parameters (see below). The group should contain the imagery files that the user wishes to classify. The subgroup is a subset of this group and should contain only the imagery band files that the user wishes to classify. Note that this subgroup must contain more than one band file. The user must create a group and subgroup by running the GRASS program i.group before running i.cluster. The purpose of the group and subgroup is to collect map layers for classification or analysis.
The signaturefile is the file to contain the resulting signatures, which can be used as input for i.maxlik. The classes value is the initial number of clusters to be discriminated; any parameter values left unspecified are set to their default values.
It is recommended that a semantic label be set for every raster map used to generate the signature file. Use r.support to set the semantic label of each member of the imagery group. Signatures generated for one scene are suitable for classification of other scenes as long as they consist of the same raster bands (the semantic labels match). If semantic labels are not set, the resulting signature file can be used to classify only the imagery group that was used to generate the signatures.
i.cluster does not cluster all pixels, but only a sample (see parameter sample). The clustering therefore does not assign every pixel to a cluster; it only generates signatures that are representative of each cluster. Running i.cluster on the same data with the same number of classes but with different sample sizes is likely to produce slightly different signatures for each cluster in each run.
The algorithm uses input parameters set by the user: the initial number of clusters, the minimum distance between clusters, the desired correspondence between iterations, and the minimum size for each cluster. The user also chooses whether all pixels are clustered or only every "x"th row and "y"th column (sampling), and sets the maximum number of iterations to be carried out.
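The row and column sampling described above can be illustrated with a minimal Python sketch. This is not the i.cluster source code: the function name sample_pixels, the nested-list raster, and the choice to start sampling at the first row and column are assumptions made for illustration only.

```python
def sample_pixels(raster, row_interval, col_interval):
    """Keep every row_interval-th row and every col_interval-th column.

    Assumption: sampling starts at the first row and first column.
    """
    return [row[::col_interval] for row in raster[::row_interval]]


# Hypothetical 4 x 6 single-band raster; the cell value encodes row and column
raster = [[r * 10 + c for c in range(6)] for r in range(4)]

# Sample every 2nd row and every 3rd column
sampled = sample_pixels(raster, 2, 3)
```

With larger intervals fewer pixels take part in the clustering, which is why signatures can differ slightly between runs that use different sample settings.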
In the 1st pass, initial cluster means for each band are defined by giving the first cluster a value equal to the band mean minus its standard deviation, and the last cluster a value equal to the band mean plus its standard deviation, with all other cluster means equally spaced between these. Each pixel is then assigned to the cluster it is closest to, with distance measured as Euclidean distance. All clusters separated by less than the user-specified minimum distance are then merged. If a cluster has fewer than the user-specified minimum number of pixels, all of its pixels are reassigned to the next nearest cluster. New cluster means are calculated for each band as the average of the raster pixel values in that band over all pixels in that cluster.
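The seeding of the initial cluster means and the nearest-cluster assignment can be sketched in Python as follows. This is an illustrative sketch, not the i.cluster implementation: the function names and sample values are hypothetical, the use of the population standard deviation is an assumption, and the merging and minimum-size steps are left out.

```python
import math
from statistics import mean, pstdev


def initial_band_means(band_values, classes):
    """Spread `classes` initial means evenly from mean - std to mean + std."""
    m = mean(band_values)
    s = pstdev(band_values)  # assumption: population standard deviation
    if classes == 1:
        return [float(m)]
    step = 2.0 * s / (classes - 1)
    return [m - s + i * step for i in range(classes)]


def nearest_cluster(pixel, cluster_means):
    """Index of the cluster mean with the smallest Euclidean distance to pixel."""
    return min(range(len(cluster_means)),
               key=lambda k: math.dist(pixel, cluster_means[k]))


# Hypothetical sampled values for two bands
band1 = [2, 4, 4, 4, 5, 5, 7, 9]          # mean 5, standard deviation 2
band2 = [12, 14, 14, 14, 15, 15, 17, 19]  # mean 15, standard deviation 2

# Initial means per band, then one mean vector per cluster
per_band = [initial_band_means(b, 3) for b in (band1, band2)]
cluster_means = list(zip(*per_band))

# Assign one hypothetical two-band pixel to its closest cluster
label = nearest_cluster((6.5, 16.0), cluster_means)
```

Here the three initial cluster means are (3, 13), (5, 15) and (7, 17), and the example pixel falls into the last cluster (index 2).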
In the 2nd pass, pixels are reassigned to clusters based on the new cluster means, and the cluster means are then recalculated. This process is repeated until the correspondence between iterations reaches the user-specified level or the maximum number of iterations is reached, whichever comes first.
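The reassign-and-recalculate loop and its two stopping conditions can be sketched in Python as follows. This is a simplified illustration, not the i.cluster implementation. It assumes that the correspondence between iterations is the percentage of sampled pixels whose cluster assignment did not change. The function names, the default values in the signature, and the handling of empty clusters are hypothetical, and cluster merging and the minimum cluster size are omitted.

```python
import math


def assign(pixels, means):
    """Assign each pixel to the index of its nearest cluster mean."""
    return [min(range(len(means)), key=lambda k: math.dist(p, means[k]))
            for p in pixels]


def update_means(pixels, labels, means):
    """Recompute each cluster mean as the per-band average of its pixels."""
    new_means = []
    for k, old in enumerate(means):
        members = [p for p, lab in zip(pixels, labels) if lab == k]
        if members:
            new_means.append(tuple(sum(b) / len(members) for b in zip(*members)))
        else:
            new_means.append(old)  # assumption: an empty cluster keeps its mean
    return new_means


def cluster(pixels, means, iterations=30, convergence=98.0):
    """Iterate until the percent of unchanged pixels reaches `convergence`
    or `iterations` passes have been made, whichever comes first."""
    labels = assign(pixels, means)
    done = 0
    for done in range(1, iterations + 1):
        means = update_means(pixels, labels, means)
        new_labels = assign(pixels, means)
        unchanged = sum(a == b for a, b in zip(labels, new_labels))
        labels = new_labels
        if 100.0 * unchanged / len(pixels) >= convergence:
            break
    return means, labels, done


# Hypothetical two-band sample with two well separated groups
pixels = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
means, labels, done = cluster(pixels, [(0.0, 0.0), (6.0, 6.0)])
```

In this small example no pixel changes cluster after the first recalculation, so the loop stops after one iteration.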
Preparing the statistics for unsupervised classification of a
LANDSAT scene within the North Carolina location:
# Set computational region to match the scene
g.region raster=lsat7_2002_10 -p

# store VIZ, NIR, MIR into group/subgroup (leaving out TIR)
i.group group=lsat7_2002 subgroup=res_30m \
  input=lsat7_2002_10,lsat7_2002_20,lsat7_2002_30,lsat7_2002_40,lsat7_2002_50,lsat7_2002_70

# generate signature file and report
i.cluster group=lsat7_2002 subgroup=res_30m \
  signaturefile=cluster_lsat2002 \
  classes=10 reportfile=rep_clust_lsat2002.txt
To complete the unsupervised classification, i.maxlik is subsequently used. See example in its manual page.
The signature file obtained in the example above can be used to
classify the current imagery group only (lsat7_2002). To re-use the
signature file for the classification of other imagery group(s),
semantic labels must be set for each group member beforehand,
i.e., before generating the signature file. Semantic labels are set by
means of r.support as shown below:
# Define semantic labels for all LANDSAT bands
r.support map=lsat7_2002_10 semantic_label=TM7_1
r.support map=lsat7_2002_20 semantic_label=TM7_2
r.support map=lsat7_2002_30 semantic_label=TM7_3
r.support map=lsat7_2002_40 semantic_label=TM7_4
r.support map=lsat7_2002_50 semantic_label=TM7_5
r.support map=lsat7_2002_61 semantic_label=TM7_61
r.support map=lsat7_2002_62 semantic_label=TM7_62
r.support map=lsat7_2002_70 semantic_label=TM7_7
r.support map=lsat7_2002_80 semantic_label=TM7_8
r.support, g.gui.iclass, i.group, i.gensig, i.maxlik, i.segment, i.smap, r.kappa
Michael Shapiro, U.S. Army Construction Engineering Research
Laboratory
Tao Wen, University of Illinois at Urbana-Champaign, Illinois
Semantic label support: Maris Nartiss, University of Latvia
Available at: i.cluster source code (history)
Accessed: Sunday Jan 22 07:37:24 2023
© 2003-2023 GRASS Development Team, GRASS GIS 8.2.1 Reference Manual