Attention Mechanisms in Semantic Segmentation of Remote Sensing Images

Vaibhav V. Godase; Swapnil R. Takale; Rahul G. Ghodake; Altaf Mulani

Authors

Vaibhav V. Godase, Assistant Professor
Swapnil R. Takale
Rahul G. Ghodake
Altaf Mulani

Keywords:

Aerial imagery, Attention mechanism, CNN, Deep learning, Image analysis, Land cover classification, Remote sensing, Semantic segmentation, Transformer, Vision transformers

Abstract

This research addresses the persistent challenge of accurately segmenting complex, high-resolution aerial and satellite imagery. The objective is to improve semantic segmentation performance by integrating attention mechanisms into deep convolutional neural network (CNN) architectures. Traditional segmentation approaches often struggle with heterogeneous landscapes and intricate object boundaries. Our methodology instead integrates channel attention, spatial attention, and transformer-based attention modules into standard encoder-decoder CNN frameworks. Channel attention strengthens feature representation by adaptively recalibrating channel-wise responses, while spatial attention guides the network to prioritize salient regions across the spatial dimensions. The transformer-based attention component captures long-range dependencies and so aggregates global context more coherently, which is crucial in remote sensing scenes characterized by scale variation and spatial complexity.
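The three attention types named above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the squeeze-and-excitation style channel gate, the simplified spatial gate (scalar mixing weights in place of the convolution usually used), the single-head self-attention, and all function names, weight shapes, and the random feature map are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def channel_attention(feat, w1, w2):
    """Recalibrate channels of feat (C, H, W) with a squeeze-and-excitation style gate."""
    squeezed = feat.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # bottleneck + ReLU -> (C // r,)
    gate = sigmoid(w2 @ hidden)                # per-channel weight in (0, 1)
    return feat * gate[:, None, None]

def spatial_attention(feat, w_avg, w_max):
    """Weight each pixel using channel-pooled maps (scalar mixing is an assumption)."""
    avg_map = feat.mean(axis=0)                # (H, W)
    max_map = feat.max(axis=0)                 # (H, W)
    gate = sigmoid(w_avg * avg_map + w_max * max_map)
    return feat * gate[None, :, :]

def self_attention(feat, wq, wk, wv):
    """Single-head scaled dot-product attention over all H*W positions."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T          # one token per pixel -> (N, C)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)  # (N, N), rows sum to 1
    out = weights @ v                          # every pixel mixes in all other pixels
    return out.T.reshape(-1, h, w)

# Hypothetical 8-channel, 4x4 feature map and random weights, for shape checking only.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
w1, w2 = rng.standard_normal((2, 8)), rng.standard_normal((8, 2))
wq, wk, wv = (rng.standard_normal((8, 8)) for _ in range(3))

out_c = channel_attention(feat, w1, w2)
out_s = spatial_attention(feat, 1.0, 1.0)
out_t = self_attention(feat, wq, wk, wv)
print(out_c.shape, out_s.shape, out_t.shape)
```

The channel and spatial gates only rescale existing activations, whereas the self-attention step lets each position draw on every other position, which is the sense in which it captures long-range dependencies.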

The proposed approach is evaluated on benchmark datasets widely used in the remote sensing field, including ISPRS Potsdam, DeepGlobe Land Cover Classification, and SpaceNet. These datasets cover diverse urban and rural scenes and require segmentation models to generalize across varied geographic and environmental contexts. Empirical results show that the proposed attention-based models consistently outperform baseline CNN architectures in both overall segmentation accuracy and the delineation of fine-grained boundaries. For instance, on the ISPRS Potsdam dataset, our best-performing model achieves an absolute improvement of 3.8 percentage points in mean Intersection-over-Union (mIoU) over established baselines.
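For readers unfamiliar with the metric, mIoU can be computed from a confusion matrix as sketched below. The function names and the tiny label maps are hypothetical and are not taken from the paper's evaluation code. Per class, IoU = TP / (TP + FP + FN), and mIoU averages this over the classes that occur.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = y_true.ravel() * num_classes + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def mean_iou(y_true, y_pred, num_classes):
    cm = confusion_matrix(y_true, y_pred, num_classes)
    tp = np.diag(cm).astype(float)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp   # TP + FP + FN per class
    valid = union > 0                              # skip classes absent from both maps
    return float((tp[valid] / union[valid]).mean())

# Hypothetical 2x3 label maps with three classes.
y_true = np.array([[0, 0, 1], [1, 2, 2]])
y_pred = np.array([[0, 1, 1], [1, 2, 0]])
miou = mean_iou(y_true, y_pred, num_classes=3)
print(round(miou, 4))  # per-class IoU is 1/3, 2/3, 1/2, so the mean is 0.5
```

An absolute improvement is the difference between two such scores expressed in percentage points, not a ratio of one score to the other.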
