Abstract
Semantic segmentation of land cover types is a pivotal task in remote sensing, essential for
applications in urban planning, environmental monitoring, disaster management, and agriculture. Accurate segmentation is challenged by imbalanced class distributions and ambiguous
boundaries. This paper introduces AdaptiveFusionNet, a novel architecture designed to address
heterogeneous complexities in remote sensing imagery by leveraging adaptive, multi-scale feature
extraction and efficient fusion mechanisms. The architecture comprises three core modules: the
Adaptive Pixel Encoder (APE), which enhances pixel-level feature extraction across multiple
scales; the Fusion Atrous Pooling (FAP), which effectively integrates contextual information using atrous convolutions; and the Parallel Attention Decoder (PAD), which refines segmentation
boundaries through attention-enhanced upsampling. Evaluated on the high-resolution Gaofen 2
dataset, AdaptiveFusionNet demonstrates substantial improvements in key performance metrics,
achieving an overall Intersection over Union (IoU) of 71% and excelling in Precision, Recall, and
F1 score across various land cover classes, including urban areas, vegetation, water bodies, and
infrastructure. An ablation study is presented to validate AdaptiveFusionNet’s superiority over
existing architectures. The results establish AdaptiveFusionNet as an improved architecture for
high-resolution land cover segmentation in terms of both accuracy and computational efficiency.
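The Fusion Atrous Pooling module described above builds on atrous (dilated) convolutions to aggregate context at multiple scales. As background only (the paper's own implementation is not reproduced here), the following minimal pure-Python sketch of a 1-D dilated convolution illustrates the core idea: inserting gaps between kernel taps enlarges the receptive field without adding parameters.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1-D atrous (dilated) convolution with 'valid' padding.

    Gaps of size (dilation - 1) are inserted between kernel taps,
    so a k-tap kernel spans (k - 1) * dilation + 1 input samples.
    """
    k = len(kernel)
    span = (k - 1) * dilation + 1  # effective receptive field
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(kernel[i] * signal[start + i * dilation]
                       for i in range(k)))
    return out

x = [1, 2, 3, 4, 5, 6, 7, 8]
# dilation=1 is an ordinary convolution; each output sums 3 adjacent samples
print(dilated_conv1d(x, [1, 1, 1], dilation=1))  # [6, 9, 12, 15, 18, 21]
# dilation=2 skips every other sample; 3 taps cover a 5-sample window
print(dilated_conv1d(x, [1, 1, 1], dilation=2))  # [9, 12, 15, 18]
```

Modules in the DeepLab family (which popularised atrous pooling) run several such convolutions with different dilation rates in parallel and fuse the results, which is the general pattern the FAP module follows.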
| Original language | English |
|---|---|
| Pages (from-to) | 101679 |
| Number of pages | 17 |
| Journal | Remote Sensing Applications: Society and Environment |
| Volume | 39 |
| DOIs | |
| Publication status | Published - 8 Aug 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 11 Sustainable Cities and Communities
Keywords
- Computer vision
- Convolutional neural networks
- Deep learning
- Land cover
- Machine learning
- Remote sensing
- Semantic segmentation
Fingerprint
Dive into the research topics of 'Adaptive feature extraction and attention-based segmentation network for remote sensing imagery'. Together they form a unique fingerprint.