TY - GEN
T1 - Multimodal Land Use Classification
T2 - 25th International Conference on Digital Image Computing: Techniques and Applications, DICTA 2024
AU - Ur Rehman, Muhammad Zia
AU - Islam, Syed Mohammed Shamsul
AU - Ulhaq, Anwaar
AU - Janjua, Naeem
AU - Blake, David
PY - 2024
Y1 - 2024
AB - Recently, the integration of multiple remote sensing modalities has gained significant attention in land use classification research, offering improved performance. However, this approach comes with additional challenges such as modality-specific feature extraction and effective feature fusion. In this work, a deep learning (DL)-based technique is proposed that utilizes dual remote sensing modalities, hyperspectral imagery (HSI) and LiDAR, for land use classification. The proposed technique consists of three modules: 1) a CNN-based feature extraction module; 2) attention modules designed specifically for each modality, i.e., a Convolutional Block Attention Module (CBAM) for the HSI features and a spatial attention module for the LiDAR features; and 3) a fusion module that fuses the separately extracted features of both modalities. The features extracted from the convolution blocks are first enhanced by the attention modules, feature-level fusion is then performed, and the final classification is produced. The novel combination of these modules demonstrates a notable performance gain over CNN-based approaches across different classes and metrics on the Trento dataset, achieving 98.21% average accuracy, which shows its significant potential for application in resource management, planning, and environmental monitoring.
KW - Convolutional Neural Networks
KW - Hyperspectral Image
KW - Land Use Classification
KW - LiDAR
KW - Multimodal Fusion
UR - http://www.scopus.com/inward/record.url?scp=85219518957&partnerID=8YFLogxK
U2 - 10.1109/DICTA63115.2024.00099
DO - 10.1109/DICTA63115.2024.00099
M3 - Conference contribution
AN - SCOPUS:85219518957
T3 - Proceedings - 2024 25th International Conference on Digital Image Computing: Techniques and Applications, DICTA 2024
SP - 655
EP - 661
BT - Proceedings - 2024 25th International Conference on Digital Image Computing: Techniques and Applications, DICTA 2024
PB - Institute of Electrical and Electronics Engineers
Y2 - 27 November 2024 through 29 November 2024
ER -
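
As a reading aid for the record above, the following is a minimal, illustrative PyTorch sketch of the dual-branch architecture the abstract outlines: per-modality CNN feature extraction, CBAM on the HSI branch, spatial attention on the LiDAR branch, and feature-level fusion before classification. The class names, layer widths, 11x11 patch size, 63 HSI bands, single LiDAR band, and 6 classes are assumptions based on common Trento-dataset setups, not details taken from the paper.

# Illustrative sketch only: a dual-branch HSI + LiDAR classifier with CBAM on the
# HSI branch and spatial attention on the LiDAR branch. Layer sizes and names are
# assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3)))           # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))            # global max pooling
        w = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
        return x * w


class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)            # channel-wise average
        mx = x.amax(dim=1, keepdim=True)             # channel-wise max
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w


class CBAM(nn.Module):
    """Channel attention followed by spatial attention."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))


def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class DualModalityClassifier(nn.Module):
    def __init__(self, hsi_bands=63, lidar_bands=1, num_classes=6):
        super().__init__()
        # HSI branch: CNN features refined by CBAM (channel + spatial attention)
        self.hsi_branch = nn.Sequential(
            conv_block(hsi_bands, 64), conv_block(64, 64), CBAM(64)
        )
        # LiDAR branch: CNN features refined by spatial attention only
        self.lidar_branch = nn.Sequential(
            conv_block(lidar_bands, 64), conv_block(64, 64), SpatialAttention()
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, num_classes),             # 128 = concatenated features
        )

    def forward(self, hsi_patch, lidar_patch):
        f_hsi = self.hsi_branch(hsi_patch)
        f_lidar = self.lidar_branch(lidar_patch)
        fused = torch.cat([f_hsi, f_lidar], dim=1)   # feature-level fusion
        return self.head(fused)


if __name__ == "__main__":
    model = DualModalityClassifier()
    logits = model(torch.randn(2, 63, 11, 11), torch.randn(2, 1, 11, 11))
    print(logits.shape)  # torch.Size([2, 6])

In this sketch, fusion is a simple channel-wise concatenation of the attention-refined feature maps; the fusion module described in the paper may differ.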