Tutorial 2: Chlorophyll Estimation from UAV-borne Multispectral Image
Preface
The purpose of this tutorial is to estimate chlorophyll concentration of plants using machine learning. That’s why a multispectral image with blue, green, red, red-edge, and near infrared band is provided. A shapefile of some sample plots is also given. In the field, actual chlorophyll values of the plots were measured using destructive sampling. These can be used as ground truth for the machine learning model. However, to train a machine learning model, we need to extract remote sensing features or vegetation indices under each plot. We can extract the average values of every vegetation indices for each plot and then train the model.
Datasets
The dataset for this tutorial can be found in the raster4ml_data
google drive repository.
Download the chlorophyll.tar.gz
file and extract in the data directory.
1. Micasense Altum Image
The image provided was collected from a UAV-borne multispectral sensor called Micasense Altum. More information about Altum camera can be found here. It has blue, green, red, red-edge, and near infrared (NIR) bands. The channels in the attached image also follows the previously mentioned order of bands. The center wavelengths of the bandas can be found in this link.
2. Visualize
Try to visualize the image using the plotting
functionality of raster4ml
.
2. Calculate Vegetation Indices
Calculate the vegetation indices from the micasense-altum-5bands.tif
image. Please
remember to consider the threshold
parameter. Depending on the threshold
you will
get the desired amount of VIs. The image provided has a reflectance value ranging from 0
to 1, which is why there is no need to provide any bit_depth information.
# Define the VegetationIndices object
VI = VegetationIndices(`...`)
# Run the process while providing the output directory
VI.calculate(out_dir='...')
3. Extract Values based on plot shape
Locate the sample polygon shapefile in the extracted data folder. The name of the shapefile
is plot-shapefile.shp
. There are two columns in this shape, i.e., plotid
and
chl
, where the first one is the unique id and the later is the measured chlorophyll
value. You can use the batch_extract_by_polygons
from the extraction
module.
You can only extract the mean
as the statistics.
# Batch extract values by polygons
values = batch_extract_by_polygons(`...`)
4. Machine Learning Training
Now train at least 3 machine learning regression models of your choice. Remember to do feature scaling, feature selection, and hyperparameter optimization. Also, perform a 70/30 split of training and testing, where the models should be evaluated using the root mean squared error of the test set.