Semantic Segmentation of High-Resolution Airborne Images with Dual-Stream DeepLabV3+


Creative Commons License

AKÇAY Ö., KINACI A. C., AVŞAR E. Ö., AYDAR U.

ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, vol.11, no.1, 2022 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 11 Issue: 1
  • Publication Date: 2022
  • Doi Number: 10.3390/ijgi11010023
  • Journal Name: ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Agricultural & Environmental Science Database, CAB Abstracts, INSPEC, Veterinary Science Database, Directory of Open Access Journals
  • Keywords: deep learning, semantic segmentation, photogrammetry, multi-spectral aerial imagery, digital surface model, vegetation index, land cover classification, CLASSIFICATION, BOUNDARY
  • Çanakkale Onsekiz Mart University Affiliated: Yes

Abstract

In geospatial applications such as urban planning and land use management, the automatic detection and classification of earth objects are essential tasks. Among the major semantic segmentation algorithms, DeepLabV3+ stands out as a state-of-the-art CNN. Although DeepLabV3+ can extract multi-scale contextual information, there is still a need for multi-stream architectures and alternative training approaches that can leverage multi-modal geographic datasets. In this study, a new end-to-end dual-stream architecture for geospatial imagery was developed based on the DeepLabV3+ architecture. The results show that spectral datasets other than RGB increased semantic segmentation accuracy when they were used as additional channels to height information. Furthermore, both the applied data augmentation and the Tversky loss function, which is sensitive to imbalanced data, yielded better overall accuracies. The new dual-stream architecture achieved overall semantic segmentation accuracies of 88.87% and 87.39% on the Potsdam and Vaihingen datasets, respectively. Overall, enhancing established semantic segmentation networks has great potential to deliver higher model performance, and the contribution of geospatial data as a second stream alongside RGB was explicitly demonstrated.
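
The abstract names the Tversky loss but does not describe its form or the weights used. As a rough illustration only, the standard definition for a single class can be sketched in plain Python as below. The function name `tversky_loss`, the weights `alpha=0.3` and `beta=0.7`, and the smoothing term `eps` are assumptions chosen for this example and are not taken from the paper.

```python
from typing import Sequence


def tversky_loss(
    probs: Sequence[float],
    targets: Sequence[float],
    alpha: float = 0.3,  # assumed weight on false positives (not from the paper)
    beta: float = 0.7,   # assumed weight on false negatives (not from the paper)
    eps: float = 1e-7,   # small constant to avoid division by zero
) -> float:
    """Return 1 - Tversky index for one class over flattened pixels.

    probs   : predicted probabilities in [0, 1], one per pixel
    targets : ground-truth labels (0 or 1), one per pixel
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")

    # Soft counts of true positives, false positives and false negatives.
    tp = sum(p * t for p, t in zip(probs, targets))
    fp = sum(p * (1.0 - t) for p, t in zip(probs, targets))
    fn = sum((1.0 - p) * t for p, t in zip(probs, targets))

    index = (tp + eps) / (tp + alpha * fp + beta * fn + eps)
    return 1.0 - index


if __name__ == "__main__":
    # Toy example: three pixels, two of which belong to the class.
    print(tversky_loss([0.8, 0.2, 0.6], [1, 0, 1]))
```

With `alpha = beta = 0.5` the index reduces to the Dice coefficient. Setting `beta` above `alpha` penalizes missed pixels of a class more heavily than false alarms, which is the usual reason for choosing this loss on imbalanced data.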