

Current Mandate Schedule: October 2020 to September 2021 (with extension options until 2024)

This document describes the state and roadmap of an ongoing task (Imagery Detections - IDET) and is subject to daily revision and evolution


This STDL task consists of the automated analysis of geospatial images using deep learning, with practical applications for specific use cases. The overall goal is to extract semantic information from remote sensing data. The early case studies revolve around concrete object detection use cases that apply modern machine learning methods to a multitude of available datasets. Later, full semantic surface layers can be produced from the datasets obtained, leading to a prototype platform for object detection that is useful to decision makers in many parts of society.

Background and Potential Use Cases

Swimming Pool Detection

Label inputs for deep learning derived from cadastral data

Reliable detection of swimming pools allows authorities to assess the status quo, update archival datasets and reinforce administrative construction permit processes. The status quo is based on manually digitized cadastral information. This data is used to extract feature masks that can be applied to orthophoto imagery such as the SWISSIMAGE dataset or aerial photos provided by the end users. Deep learning algorithms such as Faster R-CNN or Mask R-CNN then allow the detection of previously unregistered swimming pools within a defined perimeter. Achievable detection accuracies exceed 90% (F1 score). Current users of the technology include the Canton of Geneva and the Canton of Neuchâtel.
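As a minimal sketch, an F1 score of this kind could be computed by matching detections against cadastral labels. The box format (xmin, ymin, xmax, ymax), the IoU threshold of 0.5 and the greedy one-to-one matching are assumptions made for illustration, not the evaluation procedure used in the project:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_detections(detections, labels, threshold=0.5):
    """Greedily match each detection to at most one label; return (tp, fp, fn)."""
    unmatched = list(labels)
    tp = 0
    for det in detections:
        # Pick the remaining label that overlaps this detection the most.
        best = max(unmatched, key=lambda lab: iou(det, lab), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)
            tp += 1
    fp = len(detections) - tp   # detections without a matching label
    fn = len(unmatched)         # labels that no detection matched
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical boxes in map units.
labels = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(1, 1, 10, 10), (50, 50, 60, 60)]
tp, fp, fn = match_detections(detections, labels)
score = f1_score(tp, fp, fn)
```

When the labels come from the cadastre, detections counted here as false positives are also the candidates for previously unregistered pools, so they would typically be reviewed rather than discarded.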

Detected Swimming Pools in the Canton of Neuchâtel

Solar Panel Detection

The project «SolAi» was launched in 2018 at the Institute of Geomatics (IGEO) of the University of Applied Sciences Northwestern Switzerland (FHNW), in collaboration with the Swiss Federal Office of Energy (SFOE), and will be finished by the end of 2020. The project aims to use Mask R-CNN algorithms to automatically identify and quantify existing solar installations from SWISSIMAGE orthophotos. Such an approach should then serve as a basis for the implementation of the energy strategy and for statistical estimation models of the solar market. A solar register based on applications for government subsidies is already available in Switzerland. However, this dataset is incomplete and does not record the shape and absolute position of the installations. To date, Switzerland therefore still lacks reliable position and area data for the photovoltaic and solar thermal systems already installed, which would be needed for a complete evaluation in conjunction with the solar cadastre data. In the scope of the project, over 30'000 polygons of solar panels, classified into "Photovoltaic" and "Thermal" installations, were drawn over the SWISSIMAGE dataset to generate training data, currently yielding a mean average prediction accuracy of ~87%.
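To illustrate what "quantifying" installations from polygons can mean in practice, the following sketch sums polygon areas per class with the shoelace formula. It assumes that vertices are given in a projected coordinate system with metre units and that the result is the footprint area seen in the orthophoto, not the tilted panel surface. The class names follow the two categories named above; the coordinates are hypothetical:

```python
def polygon_area(ring):
    """Area of a simple polygon from its vertices [(x, y), ...] (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def area_by_class(installations):
    """Sum footprint areas per class for a list of (class_name, ring) pairs."""
    totals = {}
    for class_name, ring in installations:
        totals[class_name] = totals.get(class_name, 0.0) + polygon_area(ring)
    return totals


# Hypothetical installations, coordinates in metres.
installations = [
    ("Photovoltaic", [(0, 0), (4, 0), (4, 2.5), (0, 2.5)]),
    ("Photovoltaic", [(10, 10), (12, 10), (12, 13), (10, 13)]),
    ("Thermal", [(20, 0), (22, 0), (22, 1), (20, 1)]),
]
totals = area_by_class(installations)
```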

The predecessor project «SolAi» was funded by the Swiss Federal Office of Energy

Various cantonal and federal authorities as well as research groups have shown exceptional interest in obtaining the dataset and building on these scientific findings. The use case will therefore be continued within the Swiss Territorial Data Lab project in order to refine the scope of the outcomes to end user needs and to mature the classification results through hyperparameter optimization, retraining on multispectral imagery and evaluation of inference robustness in different radiometric scenarios.

Early Detection Result of a ResNet-50 Mask R-CNN architecture

Surface Classification for the Area Statistics

The Area Statistics of Switzerland classifies land use (LU) and land cover (LC) based on a regular 100x100m grid of 4.2 million sample points in 72 categories combined from 46 LU and 27 LC classes. The arduous manual labelling process performed by experts relies mainly on aerial imagery but also uses a catalogue of additional data to increase reliability. The goal of this study is to investigate the benefits of using multimodal data from different sources and sensors with modern machine learning (ML) algorithms. Deep Convolutional Neural Networks (CNN) as well as Random Forest (RF) architectures provide automated classification. The models are trained on aerial RGB and FCIR images, as well as auxiliary datasets such as satellite-derived time series indices and GIS data.

The accurate, manually annotated Area Statistics serve as "ground truth" to study the performance of ML algorithms in a setting that is challenging, in particular because of the high level of detail of the LU/LC categories. For underrepresented categories with low sample counts in particular, the benefits of transfer learning were exploited by applying the CNN architecture "Xception" independently to 50x50m RGB and FCIR orthophoto tiles. The best preliminary CNN classification accuracies reached over 80% on individual major classes, but the overall accuracy plateaued at 52% because tiles with visually very similar characteristics belong to different smaller ground truth classes. To further improve the classification, the resulting CNN probabilities were used together with extrapolated Landsat-based index time series, digital elevation data, vegetation canopy height models and categorical cadastral information as a combined input vector for RF post-classification. The RF achieved reproducible overall accuracies of 84% for LU and 89% for LC. Certain major classes, which so far required an especially monotonous manual process, are classified with very high specific accuracies (>90%).

We conclude that multimodal data, as provided by sensor fusion and auxiliary data sources, in a two-stage system of ML algorithms has the potential to support the expert-based classification and reduce the manual work. As a result, the reporting period could be shortened significantly and the spatial resolution could be increased further in the future.
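The combined input vector for the RF post-classification described above could be assembled along the following lines. This is a sketch under stated assumptions: the feature order, the one-hot encoding of the cadastral category, and all function and parameter names are hypothetical and not taken from the study.

```python
def one_hot(value, categories):
    """Encode a categorical value as a list of 0.0/1.0 flags."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def build_feature_vector(cnn_probs_rgb, cnn_probs_fcir, index_series,
                         elevation, canopy_height,
                         cadastral_class, cadastral_categories):
    """Concatenate all per-sample-point inputs into one flat feature vector."""
    return (
        list(cnn_probs_rgb)        # class probabilities from the CNN on the RGB tile
        + list(cnn_probs_fcir)     # class probabilities from the CNN on the FCIR tile
        + list(index_series)       # satellite-derived index time series
        + [float(elevation), float(canopy_height)]
        + one_hot(cadastral_class, cadastral_categories)
    )


# Hypothetical values for a single sample point.
vector = build_feature_vector(
    cnn_probs_rgb=[0.7, 0.3],
    cnn_probs_fcir=[0.6, 0.4],
    index_series=[0.21, 0.35, 0.50],
    elevation=432.0,
    canopy_height=12.5,
    cadastral_class="forest",
    cadastral_categories=["building", "forest", "water"],
)
```

One such vector per sample point could then be stacked into a matrix and passed to a Random Forest implementation, for example scikit-learn's `RandomForestClassifier`.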


Environments and Frameworks

  • High-Performance Computing Cluster at FHNW

HPE Apollo 6500 with 4 NVIDIA V100 GPUs

  • Google Colab

Google Colab as a rapid prototyping environment

  • PyTorch

  • Detectron2

  • TensorFlow

  • COCO
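For orientation, the sketch below shows how a single label polygon could be expressed as an annotation record in the COCO format, which frameworks such as Detectron2 can read. The function name, the identifiers and the pixel coordinates are hypothetical. The polygon is assumed to be already converted from map coordinates to image pixel coordinates.

```python
def polygon_to_coco_annotation(ring, annotation_id, image_id, category_id):
    """Build one COCO-style annotation dict from a polygon [(x, y), ...] in pixels."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    # Shoelace formula over consecutive vertex pairs, closing the ring.
    area = abs(sum(x1 * y2 - x2 * y1
                   for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]))) / 2.0
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores polygons as flat lists [x1, y1, x2, y2, ...].
        "segmentation": [[float(c) for point in ring for c in point]],
        # COCO boxes are [x, y, width, height] with (x, y) the top-left corner.
        "bbox": [float(min(xs)), float(min(ys)),
                 float(max(xs) - min(xs)), float(max(ys) - min(ys))],
        "area": area,
        "iscrowd": 0,
    }


# Hypothetical swimming pool outline in pixel coordinates.
annotation = polygon_to_coco_annotation(
    [(10, 20), (30, 20), (30, 50), (10, 50)],
    annotation_id=1, image_id=1, category_id=1,
)
```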

Data Sources

  • SWISSIMAGE RGB 10cm by swisstopo

  • SWISSIMAGE RS 10cm by swisstopo

  • Labels drawn from the official Swiss cadastral services. The Swiss cadastral system comprises the cadastral surveying, the Cadastre of Public-law Restrictions on landownership (PLR-cadastre) and the land register.

  • SITN

  • SITG - Geneva Geodata Services


Deep Learning Benchmarking Tests

Swimming Pool Detector

  • Masks vs. Bbox with ResNet-50 Mask R-CNN

  • ResNet-50 vs. ResNet-101

  • Tile Size Dependencies
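A tile size test needs the same perimeter cut into tiles of different sizes. The following sketch generates a regular tile grid over a bounding box and reports the pixel dimensions of a tile. The ground sampling distance of 0.1 m matches the 10cm SWISSIMAGE products listed under Data Sources. The function names, the absence of tile overlap and the example extent are assumptions for illustration.

```python
import math


def tile_grid(xmin, ymin, xmax, ymax, tile_size):
    """Return (xmin, ymin, xmax, ymax) tiles of tile_size metres covering the box."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    nx = math.ceil((xmax - xmin) / tile_size)
    ny = math.ceil((ymax - ymin) / tile_size)
    tiles = []
    for row in range(ny):
        for col in range(nx):
            x0 = xmin + col * tile_size
            y0 = ymin + row * tile_size
            # Tiles on the far edges may extend beyond the perimeter.
            tiles.append((x0, y0, x0 + tile_size, y0 + tile_size))
    return tiles


def tile_pixels(tile_size, gsd=0.1):
    """Edge length of a tile in pixels for a ground sampling distance in metres."""
    return round(tile_size / gsd)


# Hypothetical 1000 m x 500 m perimeter, compared for three tile sizes.
counts = {size: len(tile_grid(0, 0, 1000, 500, size)) for size in (125, 250, 500)}
```

Smaller tiles yield more training samples but show less context around each object, which is the trade-off such a benchmark would examine.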

Multi-Class Detectors

  • Combining Solar Detector and Swimming Pool Detector vs. Single Class Paradigms

Base Technology

  • Detectron2 / PyTorch vs. TensorFlow


Ultimately, an automated system for surface classification based on aerial imagery and additional data sources will be proposed that allows consistent differential surface segmentation as a basis for differential change analysis. Spatiotemporal data storage provides insight into current, historical and future territorial features, scalable from small communal objects to the environmental and landscape levels.

  • Institute Geomatics FHNW

    • Project Area Statistics
    • Project SolAi
    • Project Animal Detection
  • Swisstopo

  • BFS / OFS

  • BFE / Energiestrategie 2050