6  Classification

6.1 Summary

6.1.1 Overview

In the remote sensing context, classification means identifying different land uses from the observations. It is often not the final output of research, but an important step in the data processing pipeline in many use cases. Identifying urban areas or vegetation is a common goal. The main idea is to slice the data in different ways to separate the characteristics of each land use.

6.1.2 Methodology

There are many methods for classifying Earth observation data. Two of them are summarised below. I could not cover the wider family of generic supervised and unsupervised machine learning techniques.

Classification and Regression Trees (CART)

This methodology comprises two parts: classification trees, which handle discrete data, and regression trees, which deal with continuous data.

Classification trees take multiple columns of data as independent variables that contribute to the outcome decision. The more criteria the decision tree has, the smaller and purer each category (leaf) becomes, but splitting into groups that are too small is not useful, because the tree then fits the training data too closely. The stopping point may be set by a minimum number of observations per leaf, or by pruning leaves based on tree scores.
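The trade-off between leaf purity and leaf size can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular library: it assumes Gini impurity as the purity measure (one common choice, not named in the text above), and the function names, the `min_leaf` parameter and the NDVI-like values are all hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels, min_leaf=2):
    """Return (threshold, weighted_gini) for the best split on one variable,
    or None if no split leaves at least `min_leaf` observations per side."""
    pairs = sorted(zip(values, labels))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Made-up NDVI-like values with made-up labels, for illustration only
ndvi = [0.05, 0.10, 0.15, 0.55, 0.60, 0.70]
label = ["urban", "urban", "urban", "veg", "veg", "veg"]

split = best_split(ndvi, label, min_leaf=2)      # splits near 0.35, both leaves pure
too_strict = best_split(ndvi, label, min_leaf=4) # None: leaves would be too small
```

Raising `min_leaf` is the stopping rule in action: once no candidate split leaves enough observations on each side, the node stays a leaf.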

Regression trees are used for continuous values whose observations can be divided into groups that each (appear to) follow a different pattern. The separating threshold is placed where the sum of squared residuals is lowest, which indicates the split that best fits the data.
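The threshold search described above can be sketched as follows. This is a simplified, single-variable version under my own assumptions: each side of the split is predicted by its mean, and the function names and data are made up for illustration.

```python
def ssr(values):
    """Sum of squared residuals around the mean of `values`."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_threshold(x, y):
    """Try every midpoint between neighbouring x values and return
    (threshold, total_ssr) for the split with the lowest total SSR."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no midpoint between identical x values
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        total = ssr(left) + ssr(right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

# Made-up data: y jumps from about 10 to about 30 between x = 3 and x = 4
x = [1, 2, 3, 4, 5, 6]
y = [10.0, 11.0, 9.0, 30.0, 31.0, 29.0]
threshold, total = best_threshold(x, y)  # threshold 3.5, total SSR 4.0
```

A full regression tree would repeat this search recursively inside each side of the split, across all available variables.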

Using only one tree may result in overfitting, so the model does not generalise to new data. To overcome this issue, the random forest methodology builds many different trees (a collection of many trees = forest) and combines their predictions to even out the impact of any single tree.
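To make the "many trees, one vote" idea concrete, here is a heavily simplified sketch. It is not a full random forest: each "tree" is only a one-split stump fitted on a bootstrap sample and a single randomly chosen variable, and the final class is the majority vote. All names and the two-band sample values are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature):
    """Fit a one-split tree on one feature: choose the threshold that
    misclassifies the fewest training rows."""
    best = None
    values = sorted({r[feature] for r in rows})
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [lab for r, lab in zip(rows, labels) if r[feature] <= t]
        right = [lab for r, lab in zip(rows, labels) if r[feature] > t]
        left_lab = Counter(left).most_common(1)[0][0]
        right_lab = Counter(right).most_common(1)[0][0]
        errors = (sum(lab != left_lab for lab in left)
                  + sum(lab != right_lab for lab in right))
        if best is None or errors < best[0]:
            best = (errors, t, left_lab, right_lab)
    if best is None:  # the sample held a single value for this feature
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    return (feature, best[1], best[2], best[3])

def predict_stump(stump, row):
    feature, t, left_lab, right_lab = stump
    return left_lab if row[feature] <= t else right_lab

def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))            # random variable
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps in the forest."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]

# Made-up two-band values (for example an NDVI-like and an NIR-like band)
rows = [(0.10, 0.20), (0.12, 0.22), (0.15, 0.25), (0.18, 0.28),
        (0.60, 0.50), (0.65, 0.55), (0.70, 0.58), (0.75, 0.62)]
labels = ["urban"] * 4 + ["veg"] * 4

forest = fit_forest(rows, labels, n_trees=25, seed=0)
```

Because every stump sees a different resample and variable, an unlucky stump is outvoted by the rest, which is the "evening out" that a single tree cannot provide.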

Support Vector Machine

The support vector machine (SVM) is a linear binary classifier with soft margins. It finds the hyperplane in a high-dimensional space that divides the groups with the largest margin between the support vectors (the observations on either side closest to the divide) while minimising misclassification.

Example of SVM methodology. Excerpt from Sheykhmousa et al. (2020)
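For reference, the soft-margin idea described above is usually written as the following optimisation problem. This is the standard textbook formulation rather than something taken from the cited paper, and the symbols are my own notation: \(\mathbf{x}_i\) is an observation, \(y_i \in \{-1, +1\}\) its class, \(\mathbf{w}\) and \(b\) define the hyperplane, and \(\xi_i\) measures how far observation \(i\) violates the margin.

\[
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \; \frac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0 .
\]

The margin width is \(2 / \lVert \mathbf{w} \rVert\), so minimising \(\lVert \mathbf{w} \rVert\) widens the margin, while the second term penalises observations that fall inside the margin or on the wrong side of it.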

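Hyperparameters that affect performance include \(C\) and \(\gamma\). \(C\) controls the penalty for misclassified training points, and hence how soft the margin is, while \(\gamma\) controls how far the influence of a single training point reaches.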
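The role of \(\gamma\) is easiest to see in the radial basis function (RBF) kernel, which I am assuming here since the text does not name a kernel. The sketch below only evaluates the kernel between two made-up points; it does not train an SVM.

```python
import math

def rbf_kernel(a, b, gamma):
    """RBF kernel value exp(-gamma * ||a - b||^2) for two points."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

point = (0.0, 0.0)
neighbour = (1.0, 1.0)  # squared distance to `point` is 2

# Similarity between the same two points for increasing gamma
similarities = {g: rbf_kernel(point, neighbour, g) for g in (0.1, 1.0, 10.0)}
for g, s in similarities.items():
    print(f"gamma={g}: similarity={s:.4f}")
```

With a small \(\gamma\) the neighbour still counts as similar, so each training point influences a wide area and the boundary is smooth. With a large \(\gamma\) the similarity drops towards zero, so influence is very local and the boundary can wrap tightly around individual points.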

Because SVM is a binary classifier, problems with more than two classes are handled in one of two ways, and the two behave differently. A one-vs-one approach tries to identify a border between each pair of labelled groups, while a one-vs-rest approach tries to draw a border between the group of interest and the rest of the observations.
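The difference between the two strategies shows up in how many binary problems must be solved. The sketch below only enumerates those problems for a made-up list of classes; the function names are hypothetical and no classifier is trained.

```python
from itertools import combinations

def one_vs_one_tasks(classes):
    """One binary problem per pair of classes: k * (k - 1) / 2 in total."""
    return list(combinations(classes, 2))

def one_vs_rest_tasks(classes):
    """One binary problem per class against all the others: k in total."""
    return [(c, [o for o in classes if o != c]) for c in classes]

classes = ["urban", "forest", "water", "bare"]
ovo = one_vs_one_tasks(classes)   # 6 pairwise problems
ovr = one_vs_rest_tasks(classes)  # 4 class-against-rest problems
```

One-vs-one needs more classifiers as the number of classes grows, but each is trained on only two classes. One-vs-rest needs fewer classifiers, but each one faces a "rest" group that is usually larger and more mixed than the class of interest.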

6.2 Applications

Many projects integrate land cover classification into their workflow. Sheykhmousa et al. (2020) examined the accuracy of the two major models discussed above, SVM and random forest. They show that each model is superior in different scenarios: random forest performs better overall, but SVM outperforms random forest on low-resolution imagery and on observations that include multiple features.

Kussul et al. (2017) is an example of an attempt to identify different crop types in Ukraine using remote sensing data. The crops were identified using supervised classification into 7 crop groups plus forest, grassland, bare land and water, as shown in the figure below. Most of the models recorded an overall accuracy of over 90%, but the study did not consider spatial autocorrelation between the training and test data. The paper concludes that the approach may be applicable to a wider region for a more systematic analysis.

Classification using Landsat 8 imagery, by Kussul et al. (2017).
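One way to respect spatial autocorrelation when evaluating a classifier is to split the data by spatial block rather than by individual pixel, so that neighbouring pixels never end up on both sides of the split. The sketch below is my own illustration, not the procedure of the paper: the grid, the block size and the function names are all hypothetical.

```python
def block_id(x, y, block_size):
    """Map pixel coordinates to the id of the square block containing them."""
    return (x // block_size, y // block_size)

def spatial_block_split(pixels, block_size, test_blocks):
    """Split pixels into train and test sets so that every block
    falls entirely into one of the two sets."""
    train, test = [], []
    for x, y in pixels:
        if block_id(x, y, block_size) in test_blocks:
            test.append((x, y))
        else:
            train.append((x, y))
    return train, test

# Made-up 8 x 8 pixel grid, divided into four 4 x 4 blocks
pixels = [(x, y) for x in range(8) for y in range(8)]
train, test = spatial_block_split(pixels, block_size=4, test_blocks={(1, 1)})
```

Here one whole block is held out for testing, so the accuracy is measured on an area the model has not seen, instead of on pixels that sit next to training pixels.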

An example of research with classification as part of the data processing pipeline is Bhatia, Patil, and Buddhiraju (2023), who investigate the urban sprawl of the Mumbai metropolitan area. They used satellite imagery to identify urban areas with supervised classification techniques. They observed how the urban area of the city has grown over the past few decades and identified the areas with the most urban sprawl, which are the areas that require stronger intervention at the policy level.

Research regarding classification can be categorised into 2 streams: papers that try to improve the performance of classification (such as the former example) and papers that address regional problems using the classification of urban areas. Although often not named explicitly in the title, classification is an underlying process in many applications. One thing to keep in mind, however, is how accuracy is measured. The latter used overall accuracy and the kappa value, both of which have potential issues if not interpreted correctly. The train-test split should be made at a regional scale rather than at the pixel level, because at the pixel level spatial autocorrelation between neighbouring pixels may inflate the measured accuracy. The kappa value can take a range of values for the same overall accuracy, which makes it less than ideal for measuring performance.
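The concern about kappa can be illustrated with two made-up confusion matrices. The sketch below computes overall accuracy and Cohen's kappa from their standard definitions; the matrices and function names are hypothetical and are not taken from any of the cited studies.

```python
def overall_accuracy(matrix):
    """Share of observations on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond what the row and column totals
    would produce by chance. Undefined if chance agreement equals 1."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_tot[k] * col_tot[k] for k in range(n)) / total ** 2
    return (observed - expected) / (1 - expected)

# Rows = reference class, columns = predicted class (made-up counts)
balanced = [[45, 5], [5, 45]]    # two equally common classes
imbalanced = [[85, 5], [5, 5]]   # one dominant class

for name, m in (("balanced", balanced), ("imbalanced", imbalanced)):
    print(name, round(overall_accuracy(m), 2), round(cohen_kappa(m), 2))
```

Both matrices have an overall accuracy of 0.90, yet kappa is 0.80 for the balanced case and about 0.44 for the imbalanced one, so the same kappa threshold cannot be read as the same level of performance across studies.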

6.3 Reflections

Identifying different land uses from remote sensing is without a doubt an important topic, and I got the chance to learn what is going on behind the scenes. I found the idea of the regression tree interesting, as it can quantify gaps that could not be quantified using the original values. In our group project, we used this algorithm to measure urban growth and extrapolate it into the future, gaining insights into the interventions needed in the Da Nang area in central Viet Nam.