For today’s lab we will be looking at image segmentation using scikit-image. There are other software platforms we could use, but scikit-image offers a large selection of algorithms and provides a good framework for testing hyperparameters (the variables used to control machine learning processes).
We will be working primarily in Jupyter Notebooks, and will use ArcGIS Pro for some data conversions.
Go to https://jupyter.gis.unbc.ca. JupyterLab has been upgraded since we last used it in this course. After logging in you will be able to start your server; when asked to select an image, choose “Datascience”. The Datascience environment takes slightly longer to start, but contains extra libraries for machine learning, of most interest today being scikit-image.
Create a new notebook, and in the first code block add the following import statements.
```python
import rasterio as rio
import rasterio.plot
import matplotlib.pyplot as plt
import matplotlib.collections as plt_collection
from skimage import io, segmentation, filters, metrics
import numpy as np
from osgeo import ogr
from osgeo import gdal
import geopandas
```
Confirm this runs without any errors before proceeding.
Next, define a function to plot image segments. It is not important at this stage to understand exactly what this function does, just that it overlays segments on top of the image they are derived from.
```python
def seg_show(segments, image, raster):
    # Write the segments to a GeoTIFF with the same profile as the original image
    dst_layername = "temp_segments.tif"
    img_profile = raster.profile
    img_profile.update(dtype=raster.meta['dtype'], count=1)
    with rio.open(dst_layername, 'w', **img_profile) as dst:
        dst.write(segments.astype(raster.meta['dtype']), 1)

    # Polygonize the segment raster into a shapefile with GDAL
    src_ds = gdal.Open(dst_layername)
    srcband = src_ds.GetRasterBand(1)
    drv = ogr.GetDriverByName("ESRI Shapefile")
    dst_ds = drv.CreateDataSource(dst_layername + ".shp")
    dst_layer = dst_ds.CreateLayer(dst_layername, srs=None)
    gdal.Polygonize(srcband, None, dst_layer, -1, [], callback=None)
    dst_layer = None
    dst_ds = None

    # Plot the polygon edges over the original image
    fig, ax = plt.subplots(figsize=(15, 15))
    rasterio.plot.show(raster, ax=ax)
    edges = geopandas.read_file(dst_layername + ".shp")
    edges.plot(ax=ax, facecolor='none', edgecolor='red')
```
For those trying to follow along, the function has 3 inputs:
- segments: a set of polygons to graph.
- image: an RGB image, specifically with band stored in axis 2.
- raster: the original image, which contains the geocoding information. (This function could be rewritten to not take image, using only raster; however, raster would need to be reshaped, as its band is axis 0.)
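The band-axis difference between the two inputs can be sketched with a toy array. This is an illustrative example, not part of the lab code: rasterio reads rasters as [band, row, col], while the skimage functions used today expect [row, col, band], and numpy’s `moveaxis` performs the swap.

```python
import numpy as np

# Toy 3-band, 4x5 raster in rasterio's [band, row, col] order
bands_first = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Move the band axis to the end to get the [row, col, band]
# order expected by skimage's segmentation functions
bands_last = np.moveaxis(bands_first, 0, -1)

print(bands_first.shape)  # (3, 4, 5)
print(bands_last.shape)   # (4, 5, 3)
```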
This is far from an optimal solution; however, it is functional and will be good enough for today. To plot the segments over the image they need to be geocoded. This is done by writing them to a tif with the same profile as the original image, then using GDAL to make a shapefile from this tif; finally, GeoPandas is used to plot this shapefile over the original image. Performing the conversion Vector > Raster > Vector should not be taken as an example of proper coding, but rather The Art of the Bodge.
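As an aside, when geocoded output is not needed, skimage itself can draw segment boundaries directly with `segmentation.mark_boundaries`. A minimal sketch on synthetic data (the image and labels here are made up for illustration):

```python
import numpy as np
from skimage import segmentation

# Synthetic 20x20 RGB image and a two-segment label array
rng = np.random.default_rng(42)
image = rng.random((20, 20, 3))
labels = np.zeros((20, 20), dtype=int)
labels[:, 10:] = 1  # left half is segment 0, right half segment 1

# Returns a copy of the image with segment boundaries painted in
outlined = segmentation.mark_boundaries(image, labels, color=(1, 0, 0))
print(outlined.shape)  # (20, 20, 3)
```

This skips the geocoding entirely, so the result cannot be overlaid on other GIS layers, but it is handy for quick visual checks.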
Open and Plot an Image
In JupyterLab upload the tif file PGDowntownSmall.tif from L:\GEOG457\Lab7\pg, remembering to add all 4 parts of the file.
You should now be able to open the image. Line 2 opens a greyscale copy, raster is the original data, and rgb_image is reshaped so that the structure changes from [band, x, y] to [x, y, band] for compatibility with the functions used today.
```python
image_path = "PGDowntownSmall.tif"
grey_image = io.imread(image_path, as_gray=True)
raster = rio.open(image_path)
rgb_image = np.empty((raster.height, raster.width, raster.count-1), raster.meta['dtype'])
for band in range(rgb_image.shape[2]):
    rgb_image[:, :, band] = raster.read(band+1)
```
And display the image with plt
```python
plt.figure(figsize=(15, 12))
plt.imshow(rgb_image)
plt.show()
```
How Segmentation Works
Broadly speaking, there are 3 types of data used to segment an image:
- Geometry: Many algorithms make segments of approximate size and shape. This could be defined as how many segments to generate over the entire image, how many pixels are in each segment, or how smooth the edges should be.
- Watershed/Maxima: Not used in today’s lab; however, this technique was used in the LiDAR lab for tree detection. It is particularly well suited to elevation data, though also useful with uniform colours.
- Edges (Colour): Edges appear in images as high-contrast lines. They are detected as a rapid change of colour in the image, and have direction and intensity. Some older digital cameras focus by maximizing edge intensity.
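To make “rapid change of colour” concrete, here is a hand-rolled sketch of what a Roberts-style edge filter computes on a tiny greyscale step image. This is pure numpy for illustration only; in the lab we use the skimage implementations.

```python
import numpy as np

# A tiny 4x4 greyscale image with a vertical brightness step
img = np.array([
    [0., 0., 1., 1.],
    [0., 0., 1., 1.],
    [0., 0., 1., 1.],
    [0., 0., 1., 1.],
])

# Roberts cross kernels take differences between diagonal neighbours;
# combined they approximate the local gradient magnitude
gx = img[:-1, :-1] - img[1:, 1:]   # difference along one diagonal
gy = img[:-1, 1:] - img[1:, :-1]   # difference along the other diagonal
edge = np.sqrt(gx**2 + gy**2)      # gradient magnitude per pixel

print(edge)  # nonzero only in the middle column, where the step is
```

Flat regions produce zero response; the only strong values land on the column where the brightness changes, which is exactly the “high contrast line” described above.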
Look at the edges in the image
```python
edge_roberts = filters.roberts(grey_image)
plt.figure(figsize=(15, 12))
plt.imshow(edge_roberts)
plt.show()
```
Also note that there are multiple algorithms that may perform better or worse depending on the source data. Try the Sobel filter and compare the results.
```python
edge_sobel = filters.sobel(grey_image)
plt.figure(figsize=(15, 12))
plt.imshow(edge_sobel)
plt.show()
```
At this stage it should be clear that there is as much art as science in the segmentation process.
Generate Segments
One of the great features of scikit-image is that it provides many algorithms that all have a similar structure, making comparison easy. They will all be of the format segmentation.<algorithm>(image, hyperparameters…).
```python
slic_segments = segmentation.slic(rgb_image, convert2lab=False, start_label=1,
                                  compactness=0.5, n_segments=150)
seg_show(slic_segments, rgb_image, raster)
```
*Note the parameter convert2lab: by default, many of the segmentation algorithms convert RGB colour to Lab colour. However, a bug in one of the currently installed Python libraries causes this process to stall, so we want to suppress this behaviour. This likely has an impact on accuracy, but a proper fix is not practical in JupyterLab at this time.
Similarly, the image could also be segmented using Felzenszwalb’s Efficient Graph
```python
fGraph_segments = segmentation.felzenszwalb(rgb_image, scale=3000.0, sigma=0.99,
                                            min_size=150)
seg_show(fGraph_segments, rgb_image, raster)
```
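Because the algorithms share this calling convention, a hyperparameter sweep is just a loop. A minimal sketch on a small synthetic image (the image and parameter values are illustrative, not the lab data):

```python
import numpy as np
from skimage import segmentation

# Synthetic RGB image: two coloured halves with a little noise
rng = np.random.default_rng(0)
img = np.zeros((40, 40, 3))
img[:, 20:] = [0.9, 0.2, 0.1]
img += rng.normal(0, 0.02, size=img.shape)

# Same structure for each run: segmentation.<algorithm>(image, hyperparameters...)
for scale in (10.0, 100.0, 1000.0):
    labels = segmentation.felzenszwalb(img, scale=scale, sigma=0.8, min_size=10)
    print("scale =", scale, "->", len(np.unique(labels)), "segments")
```

In general, larger scale values favour fewer, larger segments; swapping in `segmentation.slic` with its own hyperparameters follows the same pattern.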
Segmentation Quality
The naive approach to segmentation is to visually assess the segments and adjust the hyperparameters; however, it is possible to quantify segmentation quality. Five common metrics for this are:
- False Merge
- False Split
- Adapted Random Error
- Adapted Random Precision
- Adapted Random Recall

Adapted Random Precision (ARP) is the probability that a pixel of a given class in the results is the same class in the truthing data, normalized over the total number of pixels in the classified data; in simple terms, it can be thought of as the percentage of pixels correctly segmented (Arganda-Carreras et al. 2015). Higher values are better for this base metric. ARP relates to false merges: if a segment is too large it will lower the ARP, in proportion to the area of over-segmentation rather than the quantity of segments.
Adapted Random Recall (ARR) is the probability that a pixel of a given class in the results is the same class in the truthing data, normalized over the number of pixels in the truthing data (Arganda-Carreras et al. 2015). Higher values are better for this base metric. As ARP relates to False Merges, ARR relates to False Splits, again measuring the number of pixels placed into alternate segments rather than the number of additional segments produced.
ARP and ARR are closely related statistics, so it may be useful to take an average of them. Adapted Random Error is defined by the equation ARE = 1 - 2(ARP*ARR)/(ARP+ARR). In this case we seek a lower value, with 0 being the perfect classification: a value of 0 means all pixels can be directly mapped from the classified data to the training mask.
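Plugging illustrative numbers into this formula (values made up for demonstration) shows how the averaging behaves:

```python
# Illustrative values, not results from the lab data
arp = 0.8   # Adapted Random Precision
arr = 0.9   # Adapted Random Recall

# ARE is one minus the harmonic mean of ARP and ARR,
# so ARP = ARR = 1 gives the perfect score of 0
are = 1 - 2 * (arp * arr) / (arp + arr)
print(round(are, 4))  # 0.1529
```

Because the harmonic mean is dominated by the smaller of the two inputs, a segmentation cannot hide a poor ARP behind a high ARR, or vice versa.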
Below are some graphs from Matt’s thesis showing plots of these metrics against a brute-force exploration of the SLIC hyperparameter space (about 30,000 tests).

False Merges compete with False Splits, and Adapted Random Precision with Adapted Random Recall, so a balance must be found. At the extreme, if only one segment is produced there will be no False Splits; conversely, if every pixel were its own segment there would be no False Merges.
Make a Segmentation Mask
To generate the above statistics we need a segmentation mask to compare the generated segments against. You can create this in ArcGIS Pro: create a new project and add the PGDowntownSmall.tif image.
- Right click on the home folder in the Catalog pane and create a new shapefile with a geometry type of polygon.
- Use the Edit tab to create new features, drawing polygons over the roofs (at least 10).
- Use the Feature to Raster tool to create a tif file, setting the value to FID (every segment should be a unique colour), and making sure the output raster has the .tif extension (the default format is ESRI GRID). Under Environments, the Output Coordinate System, Processing Extent, Cell Size, and Snap Raster should all be set to PGDowntownSmall.tif. The file generated must be exactly the same dimensions as the original image; even one pixel difference in resolution will cause the metrics to fail.
For a real project, the mask would likely contain hundreds of segments representative of all classes to be detected and encompassing the range of sizes and shapes present in the scene.

Open the truthing data:

```python
ground_truth = rio.open("pg_downtown_truth.tif")
ground_truth_np = ground_truth.read(1)
```
And compute the metrics
```python
# Further reading on error metrics:
# https://www.frontiersin.org/articles/10.3389/fnana.2015.00142/full
error, precision, recall = metrics.adapted_rand_error(ground_truth_np, slic_segments,
                                                      ignore_labels=[65535])
splits, merges = metrics.variation_of_information(ground_truth_np, slic_segments,
                                                  ignore_labels=[65535])
print("ARE: " + str(error) + " ARP: " + str(precision) + " ARR: " + str(recall)
      + " FS: " + str(splits) + " FM: " + str(merges)
      + " Segments: " + str(len(np.unique(slic_segments))))
```
```python
error, precision, recall = metrics.adapted_rand_error(ground_truth_np, fGraph_segments,
                                                      ignore_labels=[65535])
splits, merges = metrics.variation_of_information(ground_truth_np, fGraph_segments,
                                                  ignore_labels=[65535])
print("ARE: " + str(error) + " ARP: " + str(precision) + " ARR: " + str(recall)
      + " FS: " + str(splits) + " FM: " + str(merges)
      + " Segments: " + str(len(np.unique(fGraph_segments))))
```
You may notice the ignore_labels value of 65535; this is the default nodata value from ArcGIS. As not all objects have been masked, this prevents the background from being interpreted as a single monolithic segment.
Assignment
This lab is intentionally short to allow time to explore the hyperparameters; as such, this assignment may take more time than previous assignments, but the two should balance out overall.
- (1.5 points) Produce a research log exploring the SLIC hyperparameters. Starting with the lab example, modify one parameter at a time; in your Word document, paste a screenshot of the segments and a brief statement of whether the change was an improvement or not. Modify a hyperparameter that you think will improve the results, and repeat until your document has 8 different segmentations. Finish with a brief concluding statement on the impact of each hyperparameter.
- (1.5 points) Do the same with Felzenszwalb’s Efficient Graph, increasing to a minimum of 10 repetitions as there is an extra hyperparameter.
- (2 points) Implement the QuickShift algorithm (see the list of modules here: https://scikit-image.org/docs/stable/api/skimage.segmentation.html) and, as in questions 1 & 2, investigate the ratio and kernel_size parameters (restrict kernel_size options to {1, 3, 5}; 4 repetitions). If you have trouble getting this to run, e-mail Matt.